[deleted by user]

  • andrew0@lemmy.dbzer0.com · 16 days ago

    Get llama.cpp and try Qwen3.6-35B-A3B. It just came out and looks good. You’ll have to look into optimal settings, as it’s a Mixture-of-Experts (MoE) model with only 3B parameters active per token. That means the inactive expert weights can sit in system RAM while only the active parameters need to be on the GPU, so inference stays quick even without fully offloading the model.
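
    A minimal sketch of what that launch could look like with llama-server. The GGUF filename, quant, and context size are assumptions, so substitute whatever you actually download; the `-ot` expert-offload pattern is the commonly used one for MoE GGUFs:

    ```sh
    # Sketch: serve a MoE model with expert weights kept in system RAM.
    # -ngl 99 : offload all layers to the GPU
    # -ot ".ffn_.*_exps.=CPU" : tensor-override rule that pins the MoE
    #   expert tensors to CPU/RAM, so only the ~3B active params need VRAM
    ./llama-server \
      -m Qwen3.6-35B-A3B-Q4_K_M.gguf \
      -c 32768 \
      -ngl 99 \
      -ot ".ffn_.*_exps.=CPU" \
      --port 8080
    ```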

    You could also try the dense model (Qwen3.5-27B), but that will be significantly slower, since all of its parameters are active for every token. Put either one in a coding harness like Oh-My-Pi, OpenCode, etc. and see how it fares for your tasks. It should be OK for small tasks, but don’t expect Opus / Sonnet 4.6 quality; think more like a step up from Haiku.
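
    Before wiring up a harness, you can sanity-check the server through llama-server’s OpenAI-compatible endpoint; this is the same API most coding harnesses will point at. The port matches the launch sketch above and the prompt is just a placeholder:

    ```sh
    # Smoke test against the OpenAI-compatible chat endpoint.
    # llama-server serves a single model, so no "model" field is required.
    curl http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
            "messages": [
              {"role": "user", "content": "Write a Python one-liner that reverses a string."}
            ],
            "temperature": 0.7
          }'
    ```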