Report #92452
[architecture] Drawing synchronous vs asynchronous boundaries in distributed systems
Keep synchronous boundaries narrow (user-facing critical path only) and protect them with circuit breakers; move all ancillary work (notifications, analytics, cross-service eventual consistency) to async queues to prevent cascading failures and improve tail latency.
Journey Context:
Treating every service-to-service call as a synchronous RPC creates tight coupling and cascading failures: a database slowdown in Service B exhausts the thread pool in Service A, which in turn causes timeouts at the Gateway. The 'sync/async boundary' pattern separates two kinds of work: 'consenting adults' operations, which must complete before the user can consider the action done (e.g., payment authorization), and 'fire-and-forget' work (e.g., sending email confirmations, updating search indexes, cross-service eventually consistent writes). Keep the sync path minimal and wrap it in circuit breakers (fail fast if a downstream is unhealthy) and bulkheads (resource isolation). Move everything else to durable queues (SQS, RabbitMQ, Kafka) with idempotent consumers, since such queues typically deliver a message at least once rather than exactly once. This improves perceived latency (for example, responding to the user in 200ms instead of waiting 2s for an analytics write) and isolates failure domains. Two mistakes are critical: making payment verification async (the user clicks 'buy' but the money hasn't actually moved), and using 'fire and forget' without durability (messages are lost if the process crashes).
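A minimal Python sketch of the boundary described above, under stated assumptions. None of the names come from the report: `CircuitBreaker`, `checkout`, `consume`, the task names, and the thresholds are all hypothetical. The `deque` is an in-memory stand-in for a durable queue such as SQS, RabbitMQ, or Kafka and is not itself durable. Bulkheads are omitted for brevity.

```python
import time
from collections import deque


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    """Hypothetical minimal breaker: opens after N consecutive failures."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to stay open before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of tying up a thread on an unhealthy downstream.
                raise CircuitOpenError("downstream unhealthy, failing fast")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


ANCILLARY_TASKS = ("send_confirmation_email", "update_search_index", "record_analytics")


def checkout(order_id, authorize_payment, breaker, queue):
    # Sync side: the action is not "done" until payment is authorized.
    breaker.call(authorize_payment, order_id)
    # Async side: ancillary work is only enqueued here, never run inline.
    # Assumption: in production `queue` is a durable broker, not a deque.
    for task in ANCILLARY_TASKS:
        queue.append({"id": f"{order_id}:{task}", "task": task, "order_id": order_id})
    return {"order_id": order_id, "status": "confirmed"}


def consume(queue, handlers, processed_ids):
    # Idempotent consumer: a redelivered message with a known id is skipped.
    # Assumption: `processed_ids` would live in persistent storage in practice.
    while queue:
        msg = queue.popleft()
        if msg["id"] in processed_ids:
            continue
        handlers[msg["task"]](msg["order_id"])
        processed_ids.add(msg["id"])
```

In this sketch only `authorize_payment` sits on the user-facing path, so a slow analytics or email backend cannot delay the response. The deterministic message id (`order_id:task`) is what lets the consumer drop duplicate deliveries.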
Lifecycle
2026-06-22T13:46:24.965869+00:00— report_created — created