Report #55471
[cost_intel] Instruct models fail on multi-table joins and implicit schema reasoning in text-to-SQL
Use reasoning models for complex schemas (>5 tables, implicit joins, nested queries); use instruct models with schema-specific few-shot examples for simple single-table queries
Journey Context:
On the Spider and BIRD benchmarks, reasoning models (o1) close much of the gap to gold SQL on hard samples (execution accuracy of 65% vs. 45% for GPT-4o on BIRD dev). The gain comes from inferring implicit foreign-key relationships and handling date arithmetic across multiple tables. For dashboard filters or simple SELECTs from a single table, however, reasoning adds 20-50x cost and 30s of latency with no accuracy gain. Routing heuristic: if the query requires joining >2 tables or involves ambiguous aggregation (e.g., 'average revenue per user by region'), use a reasoning model; if it is a simple lookup such as 'find user by email', use a cheap instruct model. Cost per query matters here because text-to-SQL is often user-facing and high-volume.
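A minimal sketch of the routing heuristic, in Python. Everything named here is an assumption for illustration: the `route_query` function, the `Route` type, the tier labels, and the keyword patterns used to guess at "ambiguous aggregation" are hypothetical. The list of candidate tables is assumed to come from an upstream schema-linking step that this report does not describe.

```python
import re
from dataclasses import dataclass

# Hypothetical tier labels, not real model identifiers.
REASONING = "reasoning"
INSTRUCT = "instruct"

# Assumed phrasing cues; a real deployment would tune these against its own traffic.
AGG_WORDS = re.compile(r"\b(average|avg|mean|sum|total|count|median)\b", re.I)
GROUP_WORDS = re.compile(r"\b(per|by|for each)\b", re.I)


@dataclass
class Route:
    model: str   # which tier to call
    reason: str  # short explanation, useful for logging


def route_query(question: str, candidate_tables: list[str]) -> Route:
    """Pick a model tier from the table count and aggregation cues."""
    n_tables = len(set(candidate_tables))
    # Rule 1: more than two tables to join goes to the reasoning tier.
    if n_tables > 2:
        return Route(REASONING, f"joins {n_tables} tables")
    # Rule 2: an aggregate combined with two or more grouping cues
    # ("per user by region") is treated as ambiguous aggregation.
    if AGG_WORDS.search(question) and len(GROUP_WORDS.findall(question)) >= 2:
        return Route(REASONING, "ambiguous aggregation")
    # Default: simple lookups stay on the cheap instruct tier.
    return Route(INSTRUCT, "simple lookup")


if __name__ == "__main__":
    print(route_query("average revenue per user by region", ["orders", "users"]))
    print(route_query("find user by email", ["users"]))
```

The keyword check is deliberately crude; it only shows where the decision sits in the request path. Logging `reason` alongside each query makes it easier to check later whether the thresholds match observed accuracy and cost.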
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:36:11.587391+00:00— report_created — created