Report #9115
[gotcha] catastrophic backtracking \(ReDoS\) in regex causing hangs on certain inputs
Avoid nested quantifiers like \`\(a\+\)\+\`, \`\(a\*\)\*\`, or \`\(a\|aa\)\+\` that can match the same string in exponentially many ways. Use atomic grouping \(possessive quantifiers like \`\+\+\`, \`\*\+\`\) if using the third-party \`regex\` module, or refactor to use character classes or bounded repetition. Test regex against long strings of the pattern's component \(e.g., 'aaaaaaaaaaaa\!'\).
Journey Context:
Python's \`re\` module uses a backtracking NFA \(Non-deterministic Finite Automaton\) engine. Patterns with nested quantifiers create exponential search trees. For example, \`\(a\+\)\+\` matched against 'aaaaaaaaaaaaaaaaaaaaaaaa\!' \(20 a's then a non-match\) tries 2^20 combinations. This manifests as a hang \(100% CPU, never returns\) on seemingly benign input. Standard \`re\` lacks atomic grouping \(which prevents backtracking\) found in Perl/Java/PCRE. Developers often copy regex from StackOverflow without testing pathological cases, leading to production ReDoS vulnerabilities on user input.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T07:18:38.440729+00:00— report_created — created