@iceberg314

iceberg314@midwest.social · 10 hours ago

I also have a 5060 Ti with 16GB of VRAM. I tend to use GPT-OSS:20B or Qwen3:14B with a context of ~30k. I have a custom system prompt in Open WebUI for the style of response I like. All of that takes up about 14GB of my 16GB of VRAM.
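For anyone wondering where that ~14GB goes, here is a rough back-of-the-envelope sketch. The layer count, KV head count, head size, and the 9 GiB weight file are assumed, illustrative values for a 14B-class model at 4-bit quantization, not measured numbers, and the 16-bit cache is also an assumption. Check your model's own config before trusting any of it.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Estimate KV cache size: one key and one value vector (the factor of 2)
    per layer, per KV head, per cached token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * context_tokens


# Assumed architecture values (hypothetical, for illustration only).
kv = kv_cache_bytes(n_layers=40, n_kv_heads=8, head_dim=128, context_tokens=30_000)

# Assumed size of the quantized weights once loaded on the GPU.
weights = 9 * GIB

print(f"KV cache: {kv / GIB:.1f} GiB")
print(f"Weights + KV cache: {(kv + weights) / GIB:.1f} GiB")
```

Under those assumptions the context alone costs several GiB, so cutting the context length is usually the quickest way to free up VRAM.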

But yeah, it is slower and not as “smart” as the cloud-based models. Still, I think the inconvenience of the slower speed and having to fact-check answers and test code is worth it for the privacy and environmental benefits.

iceberg314@midwest.social · 21 hours ago

That is why I like small, specialized, locally hosted AI. It runs acceptably fast and quiet on my gaming PC, it’s private, and I can give it knowledge in small doses on specific topics and projects.