

I also have a 5060 Ti with 16GB of VRAM. I tend to use GPT-OSS:20B or Qwen3:14B with a context of ~30k, plus a custom system prompt in Open WebUI for the response style I like. That takes up about 14GB of my 16GB of VRAM.
But yeah, it's slower and not as “smart” as the cloud-based models. Still, I think the inconvenience of the speed, and having to fact-check and test the code, is worth the privacy and environmental trade-offs.
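For anyone wanting a similar setup: if you're serving the models to Open WebUI through Ollama (an assumption on my part, it's just the most common backend), the extended context and a baked-in system prompt can be set in a custom Modelfile rather than per-chat. Something like:

```
# Hypothetical Modelfile sketch — model tag and prompt text are placeholders
FROM gpt-oss:20b

# Raise the context window to ~30k tokens (default is much smaller)
PARAMETER num_ctx 30720

# Bake in a system prompt so every chat uses your preferred style
SYSTEM """Respond concisely. Prefer bulleted answers with code examples."""
```

Then `ollama create my-gpt-oss -f Modelfile` gives you a model tag that shows up in Open WebUI with those settings already applied. Note that `num_ctx 30720` is what pushes VRAM use up toward that 14GB, since the KV cache grows with context length.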
I have one of the newer Ender 3 V3 SE models. It's actually pretty good! I very rarely have to tinker with it, unlike the older models such as my Ender 3 Pro.