
The Problem It Solves

In a monolithic application, a database transaction spanning multiple tables is atomic — either all changes commit or none do. In a microservices architecture, each service owns its own database. No single transaction can span service boundaries. A business operation that must update Orders, Inventory, and Payments becomes three separate local transactions in three separate databases.

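Without the Saga pattern, a failure after updating Orders but before updating Inventory leaves the system in an inconsistent state with no automated recovery path. A saga closes this gap by running the operation as a sequence of local transactions, each paired with a compensating transaction that undoes its effect if a later step fails.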

Pattern Structure

Two styles implement the saga: choreography and orchestration. The right choice depends on the complexity of the workflow and the team's operational preferences.

Choreography — Services React to Events

Each service publishes an event after completing its local transaction. Other services listen and react. No central coordinator exists. The workflow emerges from the chain of events.

%%{init:{'theme':'base','themeVariables':{'fontSize':'14px','fontFamily':'IBM Plex Sans, system-ui, sans-serif','primaryColor':'#DBEAFE','primaryTextColor':'#1e3a5f','primaryBorderColor':'#2563EB','lineColor':'#374151','clusterBkg':'#F9FAFB','clusterBorder':'#D1D5DB','edgeLabelBackground':'#FFFFFF'},'flowchart':{'curve':'orthogonal','padding':30,'nodeSpacing':65,'rankSpacing':75,'useMaxWidth':true}}}%%
flowchart TD
    START([Customer Places Order])
    START --> OS[Order Service\nCreate order - PENDING\nPublish OrderCreated event]
    OS --> IS[Inventory Service\nReserve stock\nPublish StockReserved event]
    IS --> PS[Payment Service\nCharge customer\nPublish PaymentProcessed event]
    PS --> OS2[Order Service\nUpdate order - CONFIRMED\nPublish OrderConfirmed event]
    OS2 --> DONE([Order Complete])
    IS --> FAIL1{Stock\nAvailable?}
    FAIL1 -->|No| COMP1[Publish StockUnavailable\nOrder Service cancels order\nCustomer notified]
    PS --> FAIL2{Payment\nSucceeded?}
    FAIL2 -->|No| COMP2[Publish PaymentFailed\nInventory releases reservation\nOrder Service cancels]
    style START fill:#4f8ef7,color:#fff
    style DONE fill:#10b981,color:#fff
    style COMP1 fill:#fef3c7
    style COMP2 fill:#fef3c7
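
The choreography flow can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `EventBus`, `wire_services`, and `place_order` are hypothetical names, the bus is an in-memory, synchronous stand-in for a real message broker, and the `payment_ok` flag simply simulates the payment outcome.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """In-memory stand-in for a message broker (hypothetical, synchronous)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)
        self.log: list[str] = []  # event names in publication order

    def subscribe(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def publish(self, event: str, payload: dict) -> None:
        self.log.append(event)
        for handler in list(self._handlers[event]):
            handler(payload)


def wire_services(bus: EventBus, stock: dict, orders: dict, payment_ok: bool) -> None:
    """Subscribe each service to the events it reacts to; no coordinator exists."""

    def reserve_stock(evt: dict) -> None:  # Inventory Service
        if stock.get(evt["sku"], 0) > 0:
            stock[evt["sku"]] -= 1
            bus.publish("StockReserved", evt)
        else:
            bus.publish("StockUnavailable", evt)

    def charge_customer(evt: dict) -> None:  # Payment Service (outcome simulated)
        bus.publish("PaymentProcessed" if payment_ok else "PaymentFailed", evt)

    def release_stock(evt: dict) -> None:  # Inventory Service compensation
        stock[evt["sku"]] += 1

    def confirm_order(evt: dict) -> None:  # Order Service
        orders[evt["order_id"]] = "CONFIRMED"
        bus.publish("OrderConfirmed", evt)

    def cancel_order(evt: dict) -> None:  # Order Service compensation
        orders[evt["order_id"]] = "CANCELLED"

    bus.subscribe("OrderCreated", reserve_stock)
    bus.subscribe("StockReserved", charge_customer)
    bus.subscribe("PaymentProcessed", confirm_order)
    bus.subscribe("PaymentFailed", release_stock)
    bus.subscribe("PaymentFailed", cancel_order)
    bus.subscribe("StockUnavailable", cancel_order)


def place_order(bus: EventBus, orders: dict, order_id: str, sku: str) -> None:
    orders[order_id] = "PENDING"  # Order Service local transaction
    bus.publish("OrderCreated", {"order_id": order_id, "sku": sku})
```

Note that the workflow is visible only in the subscription list at the bottom of `wire_services`. In a real system those subscriptions live in separate services, which is why choreography gets harder to follow as the number of steps grows.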

Orchestration — Central Coordinator Controls the Workflow

An orchestrator service sends commands to each participant and waits for responses. The orchestrator holds the saga state and decides what happens next, including which compensating transactions to run on failure.

%%{init:{'theme':'base','themeVariables':{'fontSize':'14px','fontFamily':'IBM Plex Sans, system-ui, sans-serif','primaryColor':'#DBEAFE','primaryTextColor':'#1e3a5f','primaryBorderColor':'#2563EB','lineColor':'#374151','clusterBkg':'#F9FAFB','clusterBorder':'#D1D5DB','edgeLabelBackground':'#FFFFFF'},'flowchart':{'curve':'orthogonal','padding':30,'nodeSpacing':65,'rankSpacing':75,'useMaxWidth':true}}}%%
flowchart TD
    START([Order Request Received])
    ORCH[Saga Orchestrator\nHolds saga state\nCoordinates all steps]
    START --> ORCH
    ORCH -->|Command: ReserveStock| IS2[Inventory Service]
    IS2 -->|Reply: StockReserved| ORCH
    ORCH -->|Command: ProcessPayment| PS2[Payment Service]
    PS2 -->|Reply: PaymentProcessed| ORCH
    ORCH -->|Command: ConfirmOrder| OS3[Order Service]
    OS3 -->|Reply: OrderConfirmed| ORCH
    ORCH --> DONE2([Saga Complete])
    IS2 -->|Reply: StockUnavailable| ORCH
    PS2 -->|Reply: PaymentFailed| ORCH
    ORCH -->|Compensate: CancelOrder| COMP_O[Order Service\nCancel and notify]
    style START fill:#4f8ef7,color:#fff
    style DONE2 fill:#10b981,color:#fff
    style ORCH fill:#e0f2fe
    style COMP_O fill:#fef3c7
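
A minimal sketch of the orchestrator's control loop follows. It assumes participants are plain in-process functions that raise on failure; in a real deployment each `action` and `compensate` call would be a command sent over messaging. `Step`, `SagaOrchestrator`, and `order_saga` are illustrative names, not part of any framework.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[dict], None]      # forward local transaction; raises on failure
    compensate: Callable[[dict], None]  # undoes the forward transaction


@dataclass
class SagaOrchestrator:
    steps: list
    history: list = field(default_factory=list)  # saga state, kept in memory here

    def run(self, ctx: dict) -> bool:
        completed = []
        for step in self.steps:
            try:
                step.action(ctx)
            except Exception:
                self.history.append(f"{step.name}:failed")
                # Undo every completed step, most recent first.
                for done in reversed(completed):
                    done.compensate(ctx)
                    self.history.append(f"{done.name}:compensated")
                return False
            completed.append(step)
            self.history.append(f"{step.name}:ok")
        return True


def process_payment(ctx: dict) -> None:
    if not ctx["payment_ok"]:  # simulated payment outcome
        raise RuntimeError("payment declined")
    ctx["charged"] = True


def order_saga() -> SagaOrchestrator:
    """Hypothetical three-step order saga matching the diagram above."""
    return SagaOrchestrator(steps=[
        Step("ReserveStock",
             lambda c: c.update(stock=c["stock"] - 1),
             lambda c: c.update(stock=c["stock"] + 1)),
        Step("ProcessPayment",
             process_payment,
             lambda c: c.update(charged=False)),
        Step("ConfirmOrder",
             lambda c: c.update(status="CONFIRMED"),
             lambda c: c.update(status="CANCELLED")),
    ])
```

Because the step list and the compensation order both live in one place, the orchestrator's `history` doubles as an audit trail of what ran and what was undone.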

When to Use

  • Business operations that span multiple services and must either complete fully or be fully undone
  • Workflows where partial completion leaves the system in a visible inconsistent state
  • Long-running processes where a two-phase commit would hold locks for an unacceptable duration
  • Systems where each service must remain independently deployable and scalable

When Not to Use

  • Operations that can be made idempotent and retried without compensation — prefer simpler retry logic
  • High-frequency, low-latency transactions where the extra messaging hops and state persistence of a saga would breach the latency budget
  • Systems where you control all the services and could share a single database — prefer a database transaction
  • Simple two-service interactions — consider the Outbox Pattern instead

Trade-offs

| Benefit | Cost |
|---|---|
| No distributed locking, so services remain available | Compensating transactions must be designed and tested for every step |
| Each service can use its own database technology | Eventual consistency means brief windows of observable inconsistency |
| Independent deployability preserved | Debugging failed sagas requires distributed tracing across services |
| Scales horizontally with each service | Orchestrator becomes a single point of failure in the orchestration style |

Implementation Approach

Choreography is preferred when the workflow is simple (three to four steps), the team is comfortable with event-driven debugging, and you want to avoid a central coordinator.

Orchestration is preferred when the workflow has branching logic, compensations are complex, or you need a clear audit trail of saga state. AWS Step Functions, Temporal, and Apache Camel implement the orchestrator role.

Three principles apply to both styles:

  1. Compensating transactions must be idempotent. The compensation may be called multiple times due to retries. Releasing a stock reservation twice must be safe.

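  2. Saga state must be persisted before sending commands. Record the intent to send a command durably, then send it. If the orchestrator crashes at any point after that record is written, it resends on recovery every command it has not recorded as acknowledged, so no step is lost. The cost is that a command may arrive more than once, and the participant must handle duplicates.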

  3. Use a correlation ID. Every message in a saga carries the same correlation ID. Observability and debugging depend on being able to reconstruct the full saga from distributed logs.
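
The three principles can be sketched together. The example below is illustrative only: `InventoryService`, `SagaLog`, and `send_reserve` are hypothetical names, and the in-memory set and list stand in for what would be durable tables in a real system.

```python
import uuid


class InventoryService:
    """Hypothetical participant whose commands are idempotent per correlation ID."""

    def __init__(self, stock: int) -> None:
        self.stock = stock
        self._reserved: set = set()  # correlation IDs that currently hold a reservation

    def reserve(self, correlation_id: str) -> None:
        if correlation_id in self._reserved:  # duplicate command: already applied
            return
        if self.stock == 0:
            raise RuntimeError("out of stock")
        self.stock -= 1
        self._reserved.add(correlation_id)

    def release(self, correlation_id: str) -> None:
        """Compensation: safe to call any number of times."""
        if correlation_id not in self._reserved:  # duplicate or nothing to undo
            return
        self._reserved.discard(correlation_id)
        self.stock += 1


class SagaLog:
    """Stand-in for durable saga state (a list here, a database table in practice)."""

    def __init__(self) -> None:
        self.entries: list = []

    def record(self, correlation_id: str, step: str, status: str) -> None:
        self.entries.append((correlation_id, step, status))


def send_reserve(log: SagaLog, inventory: InventoryService, correlation_id: str) -> None:
    log.record(correlation_id, "ReserveStock", "SENDING")  # persist intent first
    inventory.reserve(correlation_id)                      # then send the command
    log.record(correlation_id, "ReserveStock", "DONE")


def new_correlation_id() -> str:
    # One ID per saga; every command, event, and log line carries it.
    return str(uuid.uuid4())
```

Calling `send_reserve` twice with the same ID, as an orchestrator would after a crash and recovery, decrements stock only once, and filtering `SagaLog.entries` by correlation ID reconstructs the full history of one saga.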

Anti-Patterns to Avoid

⚠ 1. Saga Without Compensating Transactions

Designing the happy path of the saga but not implementing compensations. The gap goes unnoticed in development, where failures are rare, and surfaces in production when a payment fails after inventory has already been reserved.

↺ Correct Approach

Design compensations alongside forward steps. Every step that changes state must have a compensation. Test failure scenarios explicitly in integration tests.
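
One way to test failure scenarios explicitly is to inject a failure at every step in turn and assert that the system returns to its starting state. The sketch below assumes a toy saga runner: `run_saga`, `order_steps`, and the `fail_at` hook are hypothetical and exist only for the test.

```python
import unittest


def run_saga(steps, state, fail_at=None):
    """Run (name, action, compensation) steps; undo completed ones if a step fails."""
    done = []
    for name, action, compensation in steps:
        if name == fail_at:  # injected failure for the test
            for undo in reversed(done):
                undo(state)
            return False
        action(state)
        done.append(compensation)
    return True


def order_steps():
    return [
        ("reserve_stock",
         lambda s: s.update(stock=s["stock"] - 1),
         lambda s: s.update(stock=s["stock"] + 1)),
        ("charge_payment",
         lambda s: s.update(balance=s["balance"] - 10),
         lambda s: s.update(balance=s["balance"] + 10)),
        ("confirm_order",
         lambda s: s.update(status="CONFIRMED"),
         lambda s: s.update(status="PENDING")),
    ]


class SagaFailureTests(unittest.TestCase):
    def test_failure_at_every_step_restores_initial_state(self):
        initial = {"stock": 5, "balance": 100, "status": "PENDING"}
        for name, _, _ in order_steps():
            with self.subTest(fail_at=name):
                state = dict(initial)
                self.assertFalse(run_saga(order_steps(), state, fail_at=name))
                self.assertEqual(state, initial)
```

A forward step added without a matching compensation makes this test fail for every later injection point, which catches the anti-pattern before production does.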

⚠ 2. Synchronous Saga Steps

Implementing saga steps as synchronous HTTP calls between services, waiting for each response before proceeding. A slow or unavailable downstream service blocks the entire saga and holds the caller's thread.

↺ Correct Approach

Use asynchronous messaging between saga steps. Each service processes its command from a queue, publishes its result to a topic, and returns immediately. The saga progresses asynchronously.
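
As a rough illustration of the asynchronous approach, the sketch below uses `asyncio.Queue` as a stand-in for a real command queue and result topic. `inventory_worker` and `run_sagas` are hypothetical names, and the short sleep simulates a slow local transaction.

```python
import asyncio


async def inventory_worker(commands: asyncio.Queue, results: asyncio.Queue) -> None:
    """Hypothetical participant: consumes commands from its queue, publishes replies."""
    while True:
        cmd = await commands.get()
        if cmd is None:  # sentinel: shut the worker down
            break
        await asyncio.sleep(0.01)  # simulated slow local transaction
        await results.put(
            {"correlation_id": cmd["correlation_id"], "type": "StockReserved"}
        )


async def run_sagas(order_ids: list) -> list:
    commands: asyncio.Queue = asyncio.Queue()
    results: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(inventory_worker(commands, results))

    # The caller enqueues every command without waiting for any reply,
    # so a slow participant never blocks the sender.
    for order_id in order_ids:
        await commands.put({"correlation_id": order_id, "type": "ReserveStock"})

    # Replies arrive later and are matched to their saga by correlation ID.
    replies = [await results.get() for _ in order_ids]

    await commands.put(None)
    await worker
    return replies
```

For example, `asyncio.run(run_sagas(["o-1", "o-2", "o-3"]))` enqueues all three commands before the first reply is produced.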

Integration Platform Implementations

  • MuleSoft: Implement saga orchestration using MuleSoft flows with Until Successful scopes for retries and error handlers as compensating transaction triggers. See MuleSoft Platform.
  • OIC (Oracle Integration Cloud): OIC Process Automation implements choreography-style sagas where each step is a human task or system integration. See OIC Platform.

Cloud-Specific Implementations

  • AWS: Implement the orchestration style using AWS Step Functions with Lambda functions as saga participants. Express Workflows suit short-lived sagas (executions are capped at five minutes); Standard Workflows suit longer-running ones. For choreography, use EventBridge + SQS.
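
A Step Functions saga is typically expressed as a state machine in which each Task state has a `Catch` clause that routes to the compensation chain. The sketch below builds such an Amazon States Language definition as a Python dictionary. The state names, Lambda function names, account ID, and region are placeholders, and retries and error handling for the compensation steps themselves are omitted.

```python
import json
from typing import Optional

# Placeholder ARN template: account ID, region, and function names are hypothetical.
LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:{name}"


def task(name: str, next_state: Optional[str], on_error: Optional[str] = None) -> dict:
    """Build an Amazon States Language Task state that invokes one Lambda."""
    state: dict = {"Type": "Task", "Resource": LAMBDA_ARN.format(name=name)}
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    if on_error is not None:
        # Any error in this step routes to the first compensation to run.
        state["Catch"] = [
            {"ErrorEquals": ["States.ALL"], "ResultPath": "$.error", "Next": on_error}
        ]
    return state


definition = {
    "Comment": "Order saga sketch: forward steps plus a compensation chain",
    "StartAt": "ReserveStock",
    "States": {
        # Forward path
        "ReserveStock": task("reserve-stock", "ProcessPayment", on_error="CancelOrder"),
        "ProcessPayment": task("process-payment", "ConfirmOrder", on_error="ReleaseStock"),
        "ConfirmOrder": task("confirm-order", None),
        # Compensation path, in reverse order of the forward steps
        "ReleaseStock": task("release-stock", "CancelOrder"),
        "CancelOrder": task("cancel-order", "SagaFailed"),
        "SagaFailed": {"Type": "Fail", "Error": "SagaCompensated"},
    },
}

if __name__ == "__main__":
    print(json.dumps(definition, indent=2))
```

The JSON this prints could be used as the state machine definition once the placeholder ARNs are replaced. A payment failure enters the compensation chain at `ReleaseStock`, while a stock failure skips straight to `CancelOrder` because nothing was reserved.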

Flowchart

%%{init:{'theme':'base','themeVariables':{'fontSize':'14px','fontFamily':'IBM Plex Sans, system-ui, sans-serif','primaryColor':'#DBEAFE','primaryTextColor':'#1e3a5f','primaryBorderColor':'#2563EB','lineColor':'#374151','clusterBkg':'#F9FAFB','clusterBorder':'#D1D5DB','edgeLabelBackground':'#FFFFFF'},'flowchart':{'curve':'orthogonal','padding':30,'nodeSpacing':65,'rankSpacing':75,'useMaxWidth':true}}}%%
flowchart TD
    START([Distributed Business Transaction Required])
    START --> CHOOSE{Workflow Complexity}
    CHOOSE -->|Simple linear flow| CHOREO[Choreography Style\nServices react to events\nNo central coordinator]
    CHOOSE -->|Branching or complex compensation| ORCH_S[Orchestration Style\nCentral coordinator holds state\nStep Functions or Temporal]
    CHOREO --> STEP1[Step 1 - Local Transaction\nPublish success event]
    STEP1 --> STEP2[Step 2 - Local Transaction\nPublish success event]
    STEP2 --> STEP3[Step 3 - Local Transaction\nPublish completion event]
    ORCH_S --> CMD[Send Command to Participant\nWait for reply\nPersist saga state]
    CMD --> REPLY{Reply Received}
    REPLY -->|Success| NEXT[Proceed to next step]
    REPLY -->|Failure| ROLLBACK[Execute compensating\ntransactions in reverse order]
    STEP1 & STEP2 & STEP3 -->|Any step fails| COMP_S[Compensating Transactions\nUndo completed steps\nIdempotent and retryable]
    NEXT --> CMD
    STEP3 & NEXT --> DONE([Eventual Consistency Achieved])
    COMP_S & ROLLBACK --> REVERTED([System Reverted to Consistent State])
    style START fill:#4f8ef7,color:#fff
    style DONE fill:#10b981,color:#fff
    style REVERTED fill:#10b981,color:#fff
    style ROLLBACK fill:#fef3c7
    style COMP_S fill:#fef3c7
    style ORCH_S fill:#e0f2fe

References

  1. Richardson, Chris — Microservices Patterns. Manning, 2018. Pattern: Saga
  2. Garcia-Molina, H. and Salem, K. — Sagas. ACM SIGMOD, 1987.
  3. AWS — Saga orchestration with Step Functions. docs.aws.amazon.com/step-functions/latest/dg/concepts-saga-pattern
  4. Temporal — Workflow orchestration. temporal.io/docs