Contents
1. Overview
2. Architecture Overview
3. Implementation Guide
4. Decision Criteria — Pillar Trade-offs
5. Anti-Patterns to Avoid
6. References
Overview
The AWS Well-Architected Framework distils lessons from thousands of customer architectures into six pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimisation, and Sustainability. Each pillar contains design principles and best practices. The framework does not prescribe a specific architecture; it provides lenses through which to evaluate the trade-offs in any architecture.
The Well-Architected Tool in the AWS console lets you conduct formal reviews that generate a risk report. Ascendion's Solutions Architecture practice uses this tool as a structured input to ADR documentation and architecture review board submissions.
Architecture Overview
%%{init:{'theme':'base','themeVariables':{'fontSize':'14px','fontFamily':'IBM Plex Sans, system-ui, sans-serif','primaryColor':'#DBEAFE','primaryTextColor':'#1e3a5f','primaryBorderColor':'#2563EB','lineColor':'#374151','clusterBkg':'#F9FAFB','clusterBorder':'#D1D5DB','edgeLabelBackground':'#FFFFFF'},'flowchart':{'curve':'orthogonal','padding':30,'nodeSpacing':65,'rankSpacing':75,'useMaxWidth':true}}}%%
flowchart TD
    START([Architecture Decision Required])
    START --> WA[Apply Well-Architected Framework\nReview all 6 pillars\nDocument trade-offs as ADRs]
    WA --> P1[Pillar 1: Operational Excellence\nInfrastructure as Code\nObservability and runbooks]
    WA --> P2[Pillar 2: Security\nLeast privilege IAM\nEncryption at rest and in transit]
    WA --> P3[Pillar 3: Reliability\nMulti-AZ deployment\nHealth checks and auto-recovery]
    WA --> P4[Pillar 4: Performance Efficiency\nRight-sized resources\nCaching and CDN strategy]
    WA --> P5[Pillar 5: Cost Optimisation\nSavings Plans and Spot\nRight-sizing and lifecycle]
    WA --> P6[Pillar 6: Sustainability\nGraviton ARM instances\nAuto-scaling to zero when idle]
    P1 & P2 & P3 & P4 & P5 & P6 --> REVIEW{Well-Architected\nTool Review}
    REVIEW -->|High Risk Issues| HRI[Document and remediate\nwithin 30 days\nARB notification]
    REVIEW -->|Medium Risk Issues| MRI[Plan to address\nin next quarter\nOwner assigned]
    REVIEW -->|No Issues| APPROVED([Architecture Approved\nADR record created])
    HRI & MRI --> APPROVED
    style START fill:#4f8ef7,color:#fff
    style APPROVED fill:#10b981,color:#fff
    style HRI fill:#fef3c7
    style MRI fill:#e0f2fe
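The review-outcome branch of the diagram can be sketched as a small triage function. This is only an illustration of the flow, not part of the Well-Architected Tool: the `Finding` and `Action` types and the `triage` function are hypothetical names, while the 30-day and next-quarter follow-ups come from the diagram.

```typescript
// Hypothetical model of a single review finding.
type Risk = "HIGH" | "MEDIUM" | "NONE";

interface Finding {
  pillar: string;
  risk: Risk;
}

interface Action {
  pillar: string;
  action: string;
  notifyArb: boolean; // ARB = architecture review board
}

// Maps each finding to the follow-up shown in the flowchart.
// Findings with no risk need no action and are dropped.
function triage(findings: Finding[]): Action[] {
  return findings
    .filter((f) => f.risk !== "NONE")
    .map((f) =>
      f.risk === "HIGH"
        ? { pillar: f.pillar, action: "Document and remediate within 30 days", notifyArb: true }
        : { pillar: f.pillar, action: "Plan for next quarter and assign an owner", notifyArb: false }
    );
}

const actions = triage([
  { pillar: "Security", risk: "HIGH" },
  { pillar: "Cost Optimisation", risk: "MEDIUM" },
  { pillar: "Reliability", risk: "NONE" },
]);
console.log(actions);
```

An empty result corresponds to the "No Issues" branch, where the architecture is approved and the ADR record is created directly.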
Implementation Guide
Pillar 1 — Operational Excellence
Infrastructure as Code is the foundation. Every resource deployed to production exists as CDK code in version control. No console-click deployments. Runbooks are automation documents — not Word files.
// CloudWatch dashboard: observability as code
import * as cloudwatch from 'aws-cdk-lib/aws-cloudwatch';

// Assumes this code runs inside a Stack or Construct, and that
// `errorRateAlarm` (a cloudwatch.Alarm) is defined elsewhere in the same scope.
const dashboard = new cloudwatch.Dashboard(this, 'ServiceDashboard', {
  dashboardName: 'order-service-production',
});

// One Latency metric per percentile. API Gateway publishes Latency per API,
// so in practice each metric also needs a dimensionsMap (for example ApiName).
const latency = (statistic: string) =>
  new cloudwatch.Metric({
    namespace: 'AWS/ApiGateway',
    metricName: 'Latency',
    statistic,
  });

dashboard.addWidgets(
  new cloudwatch.GraphWidget({
    title: 'API Latency P50 P95 P99',
    left: ['p50', 'p95', 'p99'].map(latency),
  }),
  new cloudwatch.AlarmWidget({
    title: 'Active Alarms',
    alarm: errorRateAlarm,
  })
);
Pillar 2 — Security
Security is structural. Every service runs in a private subnet. Every secret is in Secrets Manager. Every role has least-privilege permissions. See the AWS Security Baseline page for the full security stack implementation.
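Least privilege can be spot-checked mechanically. The sketch below scans a policy document in the standard IAM JSON shape for `Allow` statements that use wildcard actions or resources. The `findWildcards` helper, the example policy, and the bucket ARN are hypothetical and for illustration only; a check like this complements a proper policy review rather than replacing it.

```typescript
interface Statement {
  Effect: "Allow" | "Deny";
  Action: string | string[];
  Resource: string | string[];
}

interface PolicyDocument {
  Version: string;
  Statement: Statement[];
}

const asArray = (v: string | string[]): string[] => (Array.isArray(v) ? v : [v]);

// Returns the indexes of Allow statements that grant "*" or "service:*"
// actions, or that apply to every resource ("*").
function findWildcards(policy: PolicyDocument): number[] {
  const flagged: number[] = [];
  policy.Statement.forEach((s, i) => {
    if (s.Effect !== "Allow") return;
    const broadAction = asArray(s.Action).some((a) => a === "*" || a.slice(-2) === ":*");
    const broadResource = asArray(s.Resource).some((r) => r === "*");
    if (broadAction || broadResource) flagged.push(i);
  });
  return flagged;
}

// Hypothetical policy: the first statement is scoped, the second is not.
const examplePolicy: PolicyDocument = {
  Version: "2012-10-17",
  Statement: [
    { Effect: "Allow", Action: ["s3:GetObject"], Resource: "arn:aws:s3:::example-orders-bucket/*" },
    { Effect: "Allow", Action: "dynamodb:*", Resource: "*" },
  ],
};

console.log(findWildcards(examplePolicy));
```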
Pillar 3 — Reliability
// Multi-AZ RDS with automated failover
import { Duration } from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as rds from 'aws-cdk-lib/aws-rds';

// Assumes `vpc` (an ec2.IVpc) is defined elsewhere in the stack.
const database = new rds.DatabaseInstance(this, 'AppDatabase', {
  engine: rds.DatabaseInstanceEngine.postgres({
    version: rds.PostgresEngineVersion.VER_16,
  }),
  instanceType: ec2.InstanceType.of(
    ec2.InstanceClass.T4G, ec2.InstanceSize.MEDIUM),
  multiAz: true, // Synchronous standby replica in a second AZ
  autoMinorVersionUpgrade: true,
  backupRetention: Duration.days(7),
  deletionProtection: true,
  vpc,
  vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
});
Pillar 4 — Performance Efficiency
// CloudFront distribution: cache at the edge
import * as cloudfront from 'aws-cdk-lib/aws-cloudfront';
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins';

// Assumes `alb` (an Application Load Balancer) is defined elsewhere in the stack.
const distribution = new cloudfront.Distribution(this, 'AppCDN', {
  defaultBehavior: {
    origin: new origins.LoadBalancerV2Origin(alb),
    cachePolicy: cloudfront.CachePolicy.CACHING_OPTIMIZED,
    viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
    compress: true,
  },
  priceClass: cloudfront.PriceClass.PRICE_CLASS_100, // Cheapest class, fewest edge regions
});
Pillar 5 — Cost Optimisation
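// S3 lifecycle: move to cheaper storage tiers automatically
import { Duration } from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';

const dataBucket = new s3.Bucket(this, 'DataBucket', {
  lifecycleRules: [
    {
      transitions: [
        {
          // After 30 days: Standard-Infrequent Access
          storageClass: s3.StorageClass.INFREQUENT_ACCESS,
          transitionAfter: Duration.days(30),
        },
        {
          // After 90 days: Glacier Instant Retrieval
          storageClass: s3.StorageClass.GLACIER_INSTANT_RETRIEVAL,
          transitionAfter: Duration.days(90),
        },
      ],
      // After 365 days: delete the object
      expiration: Duration.days(365),
    },
  ],
});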
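To make the lifecycle rule easier to reason about, the following sketch maps an object's age to the tier that rule places it in. The `tierAtAge` function and `Tier` type are hypothetical helpers, not part of the CDK or S3 APIs; the thresholds are the ones configured above. S3 applies lifecycle actions asynchronously, so real transitions can lag the configured day.

```typescript
type Tier = "STANDARD" | "INFREQUENT_ACCESS" | "GLACIER_INSTANT_RETRIEVAL" | "EXPIRED";

// Returns the tier an object of the given age (in days) should be in
// under the 30 / 90 / 365-day rule.
function tierAtAge(ageDays: number): Tier {
  if (ageDays >= 365) return "EXPIRED";
  if (ageDays >= 90) return "GLACIER_INSTANT_RETRIEVAL";
  if (ageDays >= 30) return "INFREQUENT_ACCESS";
  return "STANDARD";
}

for (const age of [0, 30, 90, 365]) {
  console.log(age, tierAtAge(age));
}
```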
Pillar 6 — Sustainability
// Lambda on Graviton2 (arm64). AWS cites roughly 20% lower duration pricing
// than x86, and up to 60% less energy for comparable performance on Graviton.
import * as lambda from 'aws-cdk-lib/aws-lambda';

const fn = new lambda.Function(this, 'GravitonFunction', {
  runtime: lambda.Runtime.NODEJS_20_X,
  architecture: lambda.Architecture.ARM_64, // Graviton2
  handler: 'index.handler',
  code: lambda.Code.fromAsset('lambda'),
});
Decision Criteria — Pillar Trade-offs
Every architecture involves trade-offs across pillars. Document these explicitly in your ADRs.
| Decision | Pillar Gained | Pillar Traded Off | Accepted trade-off |
|---|---|---|---|
| Multi-AZ database | Reliability | Cost | Standby replica cost accepted for 99.95% SLA |
| Provisioned Concurrency | Performance | Cost | Latency SLA of p99 under 200ms justifies pre-warm cost |
| Single-region deployment | Cost, Simplicity | Reliability | RPO/RTO requirements do not justify multi-region cost |
| Spot instances for batch | Cost, Sustainability | Reliability | Batch jobs are idempotent and restartable |
| Reserved instances | Cost | Flexibility | Baseline load predictable for 12-month commitment |
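One way to keep these trade-offs consistent across ADRs is to capture them as structured records and render the table from data. The sketch below is a minimal illustration; the `TradeOff` type and `toTableRow` function are hypothetical names, and the example record reproduces the first row of the table above.

```typescript
interface TradeOff {
  decision: string;
  gained: string[]; // pillars (or qualities) the decision improves
  cost: string[];   // pillars (or qualities) the decision gives up
  rationale: string;
}

// Renders one trade-off as a table row. Rejects records that do not
// name both sides of the trade-off or that lack a rationale.
function toTableRow(t: TradeOff): string {
  if (t.gained.length === 0 || t.cost.length === 0) {
    throw new Error("A trade-off must name at least one gain and one cost");
  }
  if (t.rationale.trim() === "") {
    throw new Error("A trade-off must state why it was accepted");
  }
  return `| ${t.decision} | ${t.gained.join(", ")} | ${t.cost.join(", ")} | ${t.rationale} |`;
}

const multiAz: TradeOff = {
  decision: "Multi-AZ database",
  gained: ["Reliability"],
  cost: ["Cost"],
  rationale: "Standby replica cost accepted for 99.95% SLA",
};

console.log(toTableRow(multiAz));
```

Requiring both a gain and a cost forces the author to state what was given up, which is the part most often left out of an ADR.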
Anti-Patterns to Avoid
Anti-pattern: conducting a Well-Architected review at project launch, addressing the High Risk Issues (HRIs) to get sign-off, and never reviewing again. The architecture evolves, traffic patterns change, new threats emerge, and the original review goes stale within months.
Instead: schedule quarterly Well-Architected reviews for all production workloads. Treat HRIs as P1 issues. Track Medium Risk Issues (MRIs) in the team backlog with an owner and a target quarter. Link findings to ADRs.
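The quarterly cadence can be checked mechanically, for example in a scheduled job that flags workloads whose last review is more than three months old. This is a rough sketch under that assumption; `addMonths`, `nextReviewDue`, and `isReviewOverdue` are hypothetical helpers, and dates are handled in UTC at day granularity.

```typescript
// Adds whole calendar months in UTC, clamping the day when the target
// month is shorter (30 Nov + 3 months gives the last day of February).
function addMonths(date: Date, months: number): Date {
  const result = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth() + months, 1));
  const lastDay = new Date(
    Date.UTC(result.getUTCFullYear(), result.getUTCMonth() + 1, 0)
  ).getUTCDate();
  result.setUTCDate(Math.min(date.getUTCDate(), lastDay));
  return result;
}

// A review is due three calendar months after the previous one.
function nextReviewDue(lastReview: Date): Date {
  return addMonths(lastReview, 3);
}

function isReviewOverdue(lastReview: Date, today: Date): boolean {
  return today.getTime() > nextReviewDue(lastReview).getTime();
}

const lastReview = new Date("2024-01-15T00:00:00Z");
console.log(nextReviewDue(lastReview).toISOString());
console.log(isReviewOverdue(lastReview, new Date("2024-04-16T00:00:00Z")));
```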
Anti-pattern: spending equal time on all six pillars for a system where the business risk is entirely in reliability and security. A batch reporting system and a real-time payment processor have very different pillar priorities.
Instead: weight pillars by business context. For financial transaction processing, Security and Reliability dominate. For a content delivery system, Performance Efficiency and Cost Optimisation dominate. Document the weighting in the review record.
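Pillar weighting can be made explicit by splitting the review time budget in proportion to agreed weights. The sketch below is only one possible approach; the weights, the 20-hour budget, and the `allocateHours` function are hypothetical values and names chosen for illustration, not recommendations.

```typescript
type Pillar =
  | "Operational Excellence"
  | "Security"
  | "Reliability"
  | "Performance Efficiency"
  | "Cost Optimisation"
  | "Sustainability";

type PerPillar = Record<Pillar, number>;

const PILLARS: Pillar[] = [
  "Operational Excellence",
  "Security",
  "Reliability",
  "Performance Efficiency",
  "Cost Optimisation",
  "Sustainability",
];

// Splits a review budget (in hours) across pillars in proportion to weight.
function allocateHours(weights: PerPillar, totalHours: number): PerPillar {
  const totalWeight = PILLARS.reduce((sum, p) => sum + weights[p], 0);
  if (totalWeight <= 0) throw new Error("Weights must sum to a positive number");
  const hours = {} as PerPillar;
  PILLARS.forEach((p) => {
    hours[p] = (totalHours * weights[p]) / totalWeight;
  });
  return hours;
}

// Hypothetical weighting for a payment processor: Security and Reliability dominate.
const paymentWeights: PerPillar = {
  "Operational Excellence": 1,
  Security: 3,
  Reliability: 3,
  "Performance Efficiency": 1,
  "Cost Optimisation": 1,
  Sustainability: 1,
};

console.log(allocateHours(paymentWeights, 20));
```

Recording the weights object alongside the review gives later reviewers the reasoning behind the time split.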
References
- AWS — Well-Architected Framework. docs.aws.amazon.com/wellarchitected
- AWS — Well-Architected Tool. console.aws.amazon.com/wellarchitected
- AWS — Well-Architected Labs. wellarchitectedlabs.com
- AWS — Sustainability Pillar Whitepaper. docs.aws.amazon.com/wellarchitected/sustainability
- AWS CDK — API Reference. docs.aws.amazon.com/cdk/api/v2