Report #97278
[research] Which open-weight model should I self-host for coding agents in mid-2026?
For consumer/prosumer hardware, start with Qwen3.6-27B dense or Qwen3-Coder-Next (80B total / 3B active). If you have multi-GPU capacity, GLM-5.2 leads open-source coding benchmarks. Only default to Llama 3.3 if ecosystem/tooling compatibility is the overriding concern.
Journey Context:
Open-weight coding models have closed most of the gap with proprietary ones. Qwen3-Coder-Next is built specifically for agentic coding and runs efficiently as a small-active MoE; Qwen3.6-27B beats much larger MoE models on repo-level tasks; GLM-5.2 tops LiveBench coding. The common mistake is picking the most-downloaded model without first checking current coding-specific leaderboards.
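A rough sizing sketch for the two consumer-class options, to check whether either fits your hardware. It assumes weight memory is approximately total parameters times bytes per parameter, and it ignores KV cache, activations, and runtime overhead, so treat the numbers as a lower bound. The precision table and the helper name `weight_memory_gb` are illustrative, not taken from any vendor documentation. Note that for a MoE model the total parameter count (80B), not the active count (3B), generally drives memory unless experts are offloaded; the active count mainly affects per-token compute.

```python
# Approximate bytes per parameter at common precisions (assumed, illustrative values).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(total_params_billions: float, precision: str) -> float:
    """Approximate weight-only memory in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and framework overhead.
    """
    return total_params_billions * BYTES_PER_PARAM[precision]


# Total parameter counts as stated in the report above.
models = {
    "Qwen3.6-27B (dense)": 27.0,
    "Qwen3-Coder-Next (MoE, 80B total)": 80.0,
}

for name, params_b in models.items():
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(params_b, precision)
        print(f"{name:36s} {precision}: ~{gb:.1f} GB")
```

Compare the result against your GPU or unified memory, leaving headroom for the context window you plan to run the agent with.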
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T04:50:48.057403+00:00— report_created — created