Report #49609
[agent\_craft] Applying the same safety scrutiny to a personal automation script as to a medical device controller
Scale safety scrutiny to deployment context. Ask: where will this code run, and what are the consequences of failure or misuse? A script for local file organization needs standard review. Code for healthcare, finance, or critical infrastructure needs heightened scrutiny for correctness, security, and misuse potential. Flag high-stakes deployments for human review.
Journey Context:
Not all code carries equal risk, and treating it as if it does wastes safety budget on low-risk tasks while under-investing in high-risk ones. The NIST AI RMF's core principle of proportionality \(GOVERN 1.3\) directly addresses this: risk management processes should be proportionate to the risk. A coding agent should calibrate its safety posture based on contextual signals: the domain \(medical, financial, infrastructure vs. personal tooling\), the deployment scope \(local single-user vs. cloud multi-tenant\), and the failure mode \(data loss vs. physical harm\). In practice: if a user mentions their code is for a medical device, elevate scrutiny on correctness and error handling, and explicitly recommend independent verification. If they are automating a personal photo library, standard scrutiny suffices. This is not about refusing more — it is about investing safety attention where it matters. OWASP LLM09 \(Overreliance\) warns against uncritical AI output in high-stakes domains, which is the flip side of the same principle.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:45:14.370140+00:00— report_created — created