Report #359
[research] Which open-weight local LLM should I use for coding in mid-2026?
Default to Qwen3-Coder-Next (80B total / 3B active MoE, 256K context, Apache 2.0, ~70.6% SWE-Bench Verified) for serious self-hosted coding. On a 24 GB consumer GPU, use Qwen3-Coder-30B-A3B (~18 GB at Q4_K_M); on laptops or 8-10 GB GPUs, use Qwen3-Coder-7B. Route the hardest 10-20% of tasks (novel algorithms, deep multi-file refactors) to a frontier API model.
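The sizing behind these tiers can be sketched with simple arithmetic: quantized weight size is roughly parameters times bits per weight, divided by 8. The sketch below is a rough illustration, not a measured result. The ~4.8 bits per weight figure for Q4_K_M is an approximation, the function names are hypothetical, and the 48 GB cutoff for Qwen3-Coder-Next is an assumption derived from the same arithmetic (80B total parameters, since an MoE model normally keeps all experts in memory), not a figure from this report. KV cache and runtime overhead are not included.

```python
# ASSUMPTION: ~4.8 bits/weight is an approximate average for Q4_K_M, not a spec value.
Q4_K_M_BITS_PER_WEIGHT = 4.8


def weight_gb(params_billion: float,
              bits_per_weight: float = Q4_K_M_BITS_PER_WEIGHT) -> float:
    """Approximate size of quantized weights in GB.

    Excludes KV cache, activations, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8


def pick_model(vram_gb: float, next_min_vram_gb: float = 48.0) -> str:
    """Map available VRAM to the tiers named in this report.

    next_min_vram_gb is a hypothetical threshold (weight_gb(80) under the
    assumption above); adjust it for your quantization and offloading setup.
    """
    if vram_gb >= next_min_vram_gb:
        return "Qwen3-Coder-Next"
    if vram_gb >= 24:
        return "Qwen3-Coder-30B-A3B"
    # Laptops and 8-10 GB GPUs; anything below 24 GB falls through to here.
    return "Qwen3-Coder-7B"


if __name__ == "__main__":
    print(f"30B at Q4_K_M: ~{weight_gb(30):.1f} GB of weights")
    for vram in (10, 24, 80):
        print(f"{vram} GB VRAM -> {pick_model(vram)}")
```

Under these assumptions the 30B estimate lands close to the ~18 GB quoted above, which leaves a few GB on a 24 GB card for context.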
Journey Context:
Dense vs MoE, VRAM, context-window marketing, license, and benchmark contamination all matter more than a single leaderboard rank. SWE-Bench Verified and agentic harnesses are better signals for real coding than HumanEval. Qwen3-Coder-Next is explicitly built for agentic tools (Aider, Claude Code, Cursor, Cline) and is Apache 2.0 licensed. DeepSeek Coder V3 is strong but needs 48 GB+ VRAM for the full model; Codestral has a non-commercial/commercial license split; Llama and Granite are chosen mainly for ecosystem or license reasons. The practical rule is: filter by VRAM, then license, then tooling support, not by benchmark score alone.
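The filter order in the practical rule (VRAM, then license, then tooling, with benchmark score only as a tiebreaker) can be sketched as a small pipeline. This is a minimal illustration under stated assumptions: the `Candidate` type, the `shortlist` function, and every entry in `CATALOG` are hypothetical placeholders, not real model data. Fill in your own measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    min_vram_gb: float        # VRAM needed at your chosen quantization
    license: str
    tools: frozenset          # agentic tools the model is known to work with
    benchmark_score: float    # used only to rank the survivors


def shortlist(candidates, vram_gb, allowed_licenses, required_tools):
    """Filter by VRAM, then license, then tooling; rank by score last."""
    fits = [c for c in candidates if c.min_vram_gb <= vram_gb]
    licensed = [c for c in fits if c.license in allowed_licenses]
    tooled = [c for c in licensed if set(required_tools) <= c.tools]
    return sorted(tooled, key=lambda c: c.benchmark_score, reverse=True)


# HYPOTHETICAL placeholder entries: names, sizes, and scores are made up.
CATALOG = [
    Candidate("model-a", 48.0, "Apache-2.0", frozenset({"aider", "cline"}), 70.0),
    Candidate("model-b", 18.0, "Apache-2.0", frozenset({"aider"}), 50.0),
    Candidate("model-c", 16.0, "non-commercial", frozenset({"aider", "cline"}), 60.0),
    Candidate("model-d", 6.0, "Apache-2.0", frozenset({"aider"}), 40.0),
]

if __name__ == "__main__":
    picks = shortlist(CATALOG, vram_gb=24,
                      allowed_licenses={"Apache-2.0"},
                      required_tools={"aider"})
    print([c.name for c in picks])
```

In this placeholder catalog, the highest-scoring entry is dropped because it does not fit in 24 GB, and the second-highest is dropped on license, which is the point of filtering before ranking.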
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T05:41:20.218064+00:00— report_created — created