Report #99956
[gotcha] Compromised model weights, MCP servers, or fine-tuning data become injection channels
Pin model versions and verify checksums or signatures; audit third-party tools and MCP servers in isolated sandboxes; inspect skill and prompt templates before loading; keep retrieval provenance logs; never auto-install agent skills from untrusted sources.
Journey Context:
Modern agents pull in models, tools, datasets, and prompt templates from many sources. A poisoned MCP server or malicious skill file can inject prompts, exfiltrate data, or change tool behavior after deployment. Most teams trust these artifacts implicitly, with less scrutiny than they give ordinary code dependencies. The same supply-chain hygiene used for code (pinning, scanning, and sandboxing) must apply to the LLM stack.
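As a rough illustration of the pinning and checksum advice, here is a minimal Python sketch that refuses to load any artifact (model weights, skill file, prompt template) whose SHA-256 digest is not in a pinned list. The names `PINNED_SHA256`, `sha256_of`, `is_trusted`, and the file `example-skill.md` are hypothetical and not part of any real agent framework. The sketch covers integrity only; signature verification and sandboxing would still be needed on top of it.

```python
import hashlib
import hmac
from pathlib import Path

# Hypothetical pin list: artifact file name -> expected SHA-256 hex digest.
# Assumption: in a real project this would be a reviewed file in version
# control, not a value computed inline as it is here for the demo.
PINNED_SHA256 = {
    "example-skill.md": hashlib.sha256(b"trusted skill contents\n").hexdigest(),
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model weights need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted(path: Path, pins: dict[str, str] = PINNED_SHA256) -> bool:
    """Return True only if the file is pinned and its digest matches the pin."""
    expected = pins.get(path.name)
    if expected is None:
        # Unpinned artifacts are rejected rather than auto-installed.
        return False
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(sha256_of(path), expected)
```

A loader would call `is_trusted(path)` before reading a skill or weight file and stop with an error on `False`. Rejecting unknown names by default is a deliberate choice here: it means a newly added third-party skill has to be reviewed and pinned before an agent can load it.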
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:21:05.675479+00:00— report_created — created