Report #3225
[gotcha] Parsing CSV with split(',') or a simple regex breaks on quoted commas, embedded newlines, and escaped quotes
Use a standards-aware CSV parser (Python's csv module, whose default csv.QUOTE_MINIMAL quoting already handles these cases; pandas read_csv; Ruby's CSV). If you must implement it yourself, write a small state machine that follows the RFC 4180 grammar instead of a regex.
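A minimal sketch of the difference, using only Python's standard csv module. The sample text and the helper names (SAMPLE, parse_with_csv, parse_naively) are made up for illustration and are not part of the report.

```python
import csv
import io

# Hypothetical sample: a header row plus one record whose fields contain
# a comma, a doubled quote, and an embedded newline.
SAMPLE = 'name,note\r\n"Smith, Jane","said ""hi""\nthen left"\r\n'

def parse_with_csv(text):
    # newline="" leaves line endings untranslated so the csv module can
    # handle newlines that sit inside quoted fields.
    return list(csv.reader(io.StringIO(text, newline="")))

def parse_naively(text):
    # The approach this report warns against: split lines, then split on commas.
    return [line.split(",") for line in text.splitlines()]

rows = parse_with_csv(SAMPLE)    # 2 rows; the quoted comma and newline stay inside their fields
naive = parse_naively(SAMPLE)    # 3 "rows"; the quoted comma also splits the first cell in two
```

When reading from a file rather than a string, the csv documentation likewise recommends opening it with newline="".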
Journey Context:
RFC 4180 says fields containing commas, line breaks (CRLF), or double quotes should be enclosed in double quotes, and a literal quote inside a quoted field is escaped by doubling it. Neither split(',') nor a simple regex reliably tracks whether it is inside a quoted field across lines, so quoted commas silently corrupt cells and embedded newlines create phantom rows. A hand-rolled state machine with 'in_quote' and 'after_quote' flags is short and matches the spec.
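One way such a state machine could look, as a sketch rather than a reference implementation. The function name parse_csv and the extra prev_cr and dirty flags are assumptions added here; only in_quote and after_quote come from the description above. It is deliberately lenient: it accepts bare LF or CR as a row terminator and does not reject stray characters after a closing quote, which a strict RFC 4180 parser might.

```python
def parse_csv(text):
    """Parse CSV text into a list of rows, each a list of field strings."""
    rows, row, field = [], [], []
    in_quote = False      # inside a quoted field
    after_quote = False   # just read '"' while in_quote: closing quote or first half of '""'
    prev_cr = False       # last unquoted char was '\r', so a following '\n' completes CRLF
    dirty = False         # the current record has consumed at least one character

    def end_field():
        row.append("".join(field))
        field.clear()

    def end_row():
        end_field()
        rows.append(row.copy())
        row.clear()

    for ch in text:
        if in_quote:
            if after_quote:
                after_quote = False
                if ch == '"':
                    field.append('"')   # '""' inside quotes is one literal quote
                    continue
                in_quote = False        # the earlier quote closed the field; handle ch below
            elif ch == '"':
                after_quote = True
                continue
            else:
                field.append(ch)        # commas and newlines are data while quoted
                continue
        if ch == "\n" and prev_cr:
            prev_cr = False             # second half of CRLF, row already ended
            continue
        prev_cr = False
        if ch == '"' and not field:
            in_quote = True             # opening quote at the start of a field
            dirty = True
        elif ch == ",":
            end_field()
            dirty = True
        elif ch in "\r\n":
            end_row()
            dirty = False
            prev_cr = ch == "\r"
        else:
            field.append(ch)
            dirty = True

    if in_quote and not after_quote:
        raise ValueError("unterminated quoted field")
    if dirty:
        end_row()                       # final record without a trailing line break
    return rows
```

For example, parse_csv('a,"b,c"\r\n') keeps "b,c" as a single field. The sketch holds the whole input in memory; a production parser would stream and would probably make the leniency configurable.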
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T15:53:19.287538+00:00— report_created — created