Report #64305
[frontier] Agents generate text outputs (HTML, Markdown, LaTeX) that render incorrectly (truncated tables, broken layouts, overflow) yet pass text-only syntax validation
Implement Render-to-Verify: pipe the generated markup through a headless browser (Playwright) to create a screenshot, then use a vision model to verify visual correctness (check for truncation, alignment, overflow) before returning the output to the user. Treat the rendered image as the ground truth for validation.
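A minimal sketch of that loop, assuming Playwright for Python with Chromium installed. The names `render_to_png`, `render_to_verify`, and `ask_vision` are hypothetical and do not come from this report. `ask_vision` stands in for whatever vision-model client is in use and is passed in by the caller. Markdown or LaTeX would first need converting to something a browser can display, which is not shown here.

```python
from typing import Callable


def render_to_png(html: str, width: int = 1280, height: int = 800) -> bytes:
    """Render HTML in headless Chromium and return a full-page PNG screenshot."""
    # Imported lazily so the rest of the module loads without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()  # headless by default
        try:
            page = browser.new_page(viewport={"width": width, "height": height})
            page.set_content(html, wait_until="load")
            return page.screenshot(full_page=True)
        finally:
            browser.close()


def render_to_verify(
    html: str,
    ask_vision: Callable[[bytes, str], bool],  # hypothetical vision-model hook
    render: Callable[[str], bytes] = render_to_png,
) -> bool:
    """Return True only if the rendered screenshot passes the visual check."""
    png = render(html)
    question = (
        "Is all text visible, are elements aligned, and is nothing "
        "truncated or overflowing its container?"
    )
    # The screenshot, not the markup, is what gets judged.
    return ask_vision(png, question)
```

The renderer is injectable so the loop can be exercised with a stub in tests. In an agent, a False result would typically trigger a regeneration or repair step instead of returning the output.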
Journey Context:
Text LLMs validate output by re-reading the text or checking AST validity, but they cannot see that a table is visually truncated or that CSS overflow hides a critical button. The fix is to treat the rendered pixel output as ground truth. By screenshotting the rendered HTML and asking a vision model whether it looks right (e.g., 'is all text visible?', 'are elements aligned?'), you catch layout bugs that semantic validation misses. This requires a headless browser in the verification loop and a vision model trained on UI aesthetics/layout.
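One way to make the "does this look right?" question checkable is to ask for a structured answer per check and fail closed on anything unclear. The sketch below is an assumption, not part of this report: the check keys, the JSON reply format, and the names `CHECKS`, `build_prompt`, and `parse_verdict` are all hypothetical, and the model call itself is left out.

```python
import json
from dataclasses import dataclass

# Each question is phrased so that "true" means the check passed.
CHECKS = {
    "text_visible": "Is all text fully visible, with no clipping or truncation?",
    "aligned": "Are elements aligned as a reader would expect?",
    "no_overflow": "Is all content contained, with nothing overflowing or hidden?",
}


def build_prompt() -> str:
    """Build the instruction sent to the vision model alongside the screenshot."""
    lines = [
        "Look at the screenshot and answer each check with true or false.",
        "Reply with a JSON object using exactly these keys:",
    ]
    lines += [f'- "{key}": {question}' for key, question in CHECKS.items()]
    return "\n".join(lines)


@dataclass
class Verdict:
    passed: bool
    failed_checks: list[str]


def parse_verdict(reply: str) -> Verdict:
    """Parse the model reply; anything missing or malformed counts as a failure."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return Verdict(False, list(CHECKS))
    if not isinstance(data, dict):
        return Verdict(False, list(CHECKS))
    # Only a literal JSON true passes; absent keys or other values fail.
    failed = [key for key in CHECKS if data.get(key) is not True]
    return Verdict(not failed, failed)
```

Failing closed means an unparseable or partial reply blocks the output instead of letting it through, which matches treating the render as ground truth.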
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:25:38.269757+00:00— report_created — created