AWS | SAP-C02 | 73 concepts
SAP-C02 Cheat Sheet
Quick reference for the AWS Certified Solutions Architect – Professional exam.
Multi-Account Strategy
- AWS Organizations — SCP deny-list vs allow-list
- Deny-list: attach FullAWSAccess SCP by default, then add explicit Deny SCPs to block specific actions. Allow-list: detach FullAWSAccess, then attach SCPs with only permitted actions. Deny-list is easier to manage at scale; allow-list gives tighter control but requires enumerating all allowed actions.
- SCP that denies leaving the Organization
- { "Effect": "Deny", "Action": "organizations:LeaveOrganization", "Resource": "*" }
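The statement above is only the inner statement of a policy. As a minimal sketch, assuming the usual IAM policy document envelope (`Version` plus a `Statement` list), the complete SCP could be assembled like this. The `Sid` value and the function name are hypothetical labels chosen for illustration.

```python
import json


def build_deny_leave_org_scp() -> dict:
    """Wrap the Deny statement from the cheat sheet in a full policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyLeaveOrganization",  # hypothetical label, not from the source
                "Effect": "Deny",
                "Action": "organizations:LeaveOrganization",
                "Resource": "*",
            }
        ],
    }


# Serialized form, ready to paste into the Organizations console or pass to an API call.
scp_json = json.dumps(build_deny_leave_org_scp(), indent=2)
print(scp_json)
```

Under a deny-list strategy this document would sit next to FullAWSAccess rather than replace it.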
- AWS Control Tower — Landing Zone setup
- Provisions a multi-account environment with a management account, Log Archive account (S3 centralized logs), and Audit account (Security Hub, Config aggregator). Account Factory automates new account vending with guardrails (mandatory preventive SCPs + detective Config rules). Use Customizations for Control Tower (CfCT) to deploy custom CloudFormation templates and SCPs alongside the landing zone.
- Cross-account AssumeRole pattern
- Trust policy on the target account role allows the source account principal. Source calls sts:AssumeRole to receive temporary credentials. Use ExternalId condition to prevent confused deputy attacks when delegating to third-party vendors.
- AssumeRole with ExternalId (trust policy snippet)
- { "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::PARTNER_ACCT:root" }, "Action": "sts:AssumeRole", "Condition": { "StringEquals": { "sts:ExternalId": "unique-external-id" } } }
- Centralized logging architecture
- CloudTrail Organization trail writes to a central Log Archive S3 bucket. S3 bucket policy uses aws:SourceOrgID condition to accept logs only from org accounts. VPC Flow Logs and Config snapshots also aggregate to this bucket. Athena or Security Lake can query across all accounts.
- IAM Identity Center (formerly SSO) — permission sets
- Single sign-on to multiple accounts using permission sets mapped to AD groups via SAML/SCIM. Permission sets are deployed as IAM roles in each account. Centrally managed — changes propagate automatically. Preferred over creating individual IAM users in each account.
Advanced Networking
- Transit Gateway — attachments and route tables
- TGW connects VPCs, VPNs, and Direct Connect Gateways via attachments. Each attachment is associated with a TGW route table. Use multiple route tables for segmentation (e.g., shared-services table propagates to all; production VPCs only propagate to a restricted table). TGW peering enables cross-region connectivity.
- Direct Connect — VIF types
- Private VIF: connects to VPC resources via a Virtual Private Gateway (single VPC) or Direct Connect Gateway (multiple VPCs across Regions). Transit VIF: connects to a TGW (requires DXGW) — preferred for multi-VPC. Public VIF: accesses AWS public endpoints (S3, DynamoDB) without traversing internet. Hosted VIF: shared by a Direct Connect partner. LAG (Link Aggregation Group): bonds multiple ports for higher bandwidth and redundancy.
- Direct Connect — MACsec encryption
- Layer 2 encryption on dedicated 10 Gbps/100 Gbps/400 Gbps DX connections. Configured on the connection (not VIF level). Requires MACsec-capable hardware on customer side. Use when data-in-transit encryption must occur before the AWS network.
- Hybrid DNS — Route 53 Resolver endpoints
- Inbound endpoint: on-premises resolvers forward queries for AWS-hosted zones (VPC DNS, private hosted zones) to Route 53 via this ENI IP. Outbound endpoint: Route 53 forwards queries for on-premises domains to on-premises resolvers using forwarding rules. Create each endpoint with at least two IP addresses, placed in different AZs for resilience.
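A rough sketch of how outbound forwarding rules select a target, assuming the most specific matching domain wins when several rules cover a query. The rule table, the IP addresses, and the `route53-resolver-default` placeholder are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical forwarding rules: domain suffix -> on-premises resolver IPs.
FORWARD_RULES: Dict[str, List[str]] = {
    "corp.example.com": ["10.0.0.53", "10.0.1.53"],
    "example.com": ["10.0.2.53"],
}


def match_rule(query_name: str, rules: Dict[str, List[str]]) -> Optional[str]:
    """Return the most specific rule domain that covers the query, or None."""
    name = query_name.rstrip(".").lower()
    best: Optional[str] = None
    for domain in rules:
        d = domain.rstrip(".").lower()
        # Match the domain itself or any name underneath it.
        if name == d or name.endswith("." + d):
            if best is None or len(d) > len(best.rstrip(".")):
                best = domain
    return best


def resolve_target(query_name: str, rules: Dict[str, List[str]] = FORWARD_RULES):
    """Return the forwarder IPs, or a placeholder meaning 'answered inside AWS'."""
    rule = match_rule(query_name, rules)
    return rules[rule] if rule is not None else "route53-resolver-default"
```

For example, `resolve_target("db.corp.example.com")` picks the `corp.example.com` forwarders, while a name with no matching rule falls through to the default resolver.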
- PrivateLink cross-account service sharing
- Provider creates a Network Load Balancer and VPC Endpoint Service. Consumers create an Interface VPC Endpoint in their VPC targeting the service name. Traffic stays on AWS network; no VPC peering or route table changes needed. Supports sharing across accounts and organizations via allowed principals.
- VPC sharing via AWS RAM
- Owner account creates VPC and subnets, shares specific subnets to participant accounts via AWS RAM. Participants deploy resources into the shared subnets but do not manage the VPC. Reduces VPC sprawl and simplifies inter-service networking without peering.
- Network Firewall vs WAF vs Shield
- Network Firewall: stateful/stateless Layer 3-7 inspection at VPC perimeter, supports Suricata rules. WAF: HTTP/S Layer 7 protection for ALB, CloudFront, API Gateway, AppSync. Shield Standard: free DDoS protection. Shield Advanced: enhanced DDoS protection with 24/7 access to the Shield Response Team (SRT, formerly DRT) and cost protection.
Migration Strategies
- 6 Rs of cloud migration
- Rehost (lift-and-shift): move VMs as-is using MGN. Replatform (lift-and-reshape): minor optimizations (e.g., RDS instead of self-managed DB). Refactor/Re-architect: redesign for cloud-native (microservices, serverless). Repurchase: move to SaaS (e.g., Salesforce). Retire: decommission unused apps. Retain: keep on-premises (compliance, latency).
- AWS Application Migration Service (MGN)
- Replaces the legacy AWS Server Migration Service (SMS). Installs a lightweight agent on source servers and continuously replicates data to a staging area in AWS. On cutover, launches target EC2 instances. Supports Windows and Linux. Removes the need for manual AMI creation and is the standard rehost path.
- DMS — full load + CDC
- Full load: initial bulk copy of existing data. CDC (Change Data Capture): captures ongoing changes from source transaction logs during and after full load. Minimal-downtime migration pattern: run full load + CDC, let target catch up, then switch application connection string.
- Schema Conversion Tool (SCT) + DMS together
- SCT converts source schema and stored procedures to the target engine dialect (e.g., Oracle → Aurora PostgreSQL). DMS handles the actual data migration. Use SCT assessment report to estimate conversion effort before committing to a heterogeneous migration.
- Large data transfer decision tree
- Transfer fits in < 1 week over the existing internet link: S3 Transfer Acceleration or multipart upload. > 1 week or > 10 TB: Snowball Edge (up to 80 TB usable). > 10 PB: Snowmobile. Ongoing high-throughput: Direct Connect (dedicated ports at 1/10/100/400 Gbps). DataSync: automated recurring transfers from on-premises NFS/SMB/HDFS to S3, EFS, or FSx.
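The "1 week" threshold is easier to apply with the arithmetic written out. A minimal sketch, assuming decimal terabytes and that only about 80% of the link is usable for the transfer. That utilization figure and both function names are assumptions, not values from the cheat sheet.

```python
def transfer_days(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Days needed to move data_tb terabytes over a link of link_mbps megabits/s."""
    bits = data_tb * 1e12 * 8                      # decimal terabytes -> bits
    effective_bps = link_mbps * 1e6 * utilization  # usable bits per second
    return bits / effective_bps / 86_400           # seconds -> days


def suggest_transfer(data_tb: float, link_mbps: float) -> str:
    """Apply the cheat sheet's thresholds to a one-off transfer."""
    if data_tb > 10_000:  # more than 10 PB
        return "Snowmobile"
    if transfer_days(data_tb, link_mbps) > 7 or data_tb > 10:
        return "Snowball Edge"
    return "online transfer (S3 Transfer Acceleration / multipart upload)"
```

For instance, 50 TB over a 1 Gbps link at 80% utilization takes a little under 6 days, yet still lands on Snowball Edge because it exceeds the 10 TB threshold.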
- Migration Hub — tracking migrations
- Central dashboard aggregating migration status from MGN, DMS, and partner tools. Choose a home region — all tracking data is stored there. Discovery tools (Application Discovery Service agent or agentless) feed inventory data into Migration Hub to build a dependency map before migration.
Disaster Recovery
- RPO vs RTO definitions
- RPO (Recovery Point Objective): maximum acceptable data loss, measured in time (e.g., 1-hour RPO means you can lose up to 1 hour of data). RTO (Recovery Time Objective): maximum acceptable downtime — how quickly the system must be restored. DR strategy selection is driven by how low your RPO/RTO requirements are vs cost.
- DR strategies comparison
- Backup & Restore: RPO hours, RTO hours-days, lowest cost. Pilot Light: RPO minutes, RTO tens of minutes, minimal running infra. Warm Standby: RPO seconds-minutes, RTO minutes, scaled-down active environment. Active-Active (Multi-Site): RPO near zero, RTO near zero, highest cost. Each step up reduces RPO/RTO but increases cost.
- Aurora Global Database failover
- Up to 5 secondary read-only regions with < 1 second replication lag. Managed planned failover: promotes a secondary to primary in < 1 minute with no data loss. Unplanned (detach-and-promote): RPO measured in seconds. Application must update the writer endpoint (use Route 53 CNAME or Global Database endpoint).
- S3 cross-region replication (CRR) for DR
- Replicates new objects asynchronously to a destination bucket in another region. Enable versioning on both source and destination. S3 Replication Time Control (RTC) replicates 99.99% of new objects within 15 minutes, backed by an SLA. Replicate existing objects using S3 Batch Replication.
- AWS Elastic Disaster Recovery (DRS)
- Continuous block-level replication of servers to a low-cost staging area. On failover, launches fully provisioned instances. Supports cloud-to-cloud and on-premises-to-cloud DR scenarios.
- Multi-Region DNS failover with Route 53
- Primary record with health check; secondary failover record points to DR environment. Route 53 health checks monitor endpoint health (HTTP/HTTPS/TCP) and can fail over in roughly 60 seconds: a fast 10-second check interval times the default failure threshold of 3 gives about 30 s to detect the failure, plus a 30 s record TTL. Note: DNS resolver caching at clients and intermediate resolvers can delay actual failover beyond the TTL. Use Application Recovery Controller for more controlled failover with readiness checks.
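To compare health-check settings, the failover window can be estimated as detection time plus record TTL. This is a back-of-the-envelope sketch, not a guarantee: it assumes a failure threshold of 3 consecutive failed checks and ignores the resolver caching mentioned above. The function name is hypothetical.

```python
def estimated_failover_seconds(check_interval_s: int,
                               failure_threshold: int,
                               record_ttl_s: int) -> int:
    """Time to mark the endpoint unhealthy plus time for cached answers to expire."""
    detection = check_interval_s * failure_threshold
    return detection + record_ttl_s


# Fast (10 s) checks with a 30 s TTL versus standard (30 s) checks with a 60 s TTL.
fast = estimated_failover_seconds(10, 3, 30)
standard = estimated_failover_seconds(30, 3, 60)
print(f"fast checks: ~{fast} s, standard checks: ~{standard} s")
```

The gap between the two settings is why fast health checks and short TTLs are usually paired on failover records.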
Advanced IAM & Security
- Permission boundaries
- An IAM policy attached to a role or user that sets the maximum permissions it can have. Effective permissions = intersection of identity policy AND permission boundary. Used to safely delegate role/policy creation to developers without escalation risk (developers cannot create roles with more permissions than their boundary allows).
- Policy evaluation order (cross-account)
- For cross-account access: (1) requestor account's identity policy must allow the action, (2) target resource-based policy or target role trust policy must allow the requestor. SCPs restrict but do not grant — they only filter what IAM policies can allow. Explicit Deny anywhere overrides all Allows.
- SCP vs IAM policy evaluation
- SCPs apply to every principal in member accounts, including the root user, but never to the management account. A principal needs both an SCP that allows and an IAM policy that allows. SCPs do not affect service-linked roles. Use SCPs for org-wide guardrails (deny specific regions, require MFA for root, deny disabling CloudTrail).
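The evaluation rules in this section (explicit Deny wins, SCPs and permission boundaries only filter, the identity policy must grant) can be sketched as one function. This is a deliberately simplified model: it looks at action names only and ignores resources, conditions, resource-based policies, and session policies. All names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Optional


def _matches(action: str, patterns: Iterable[str]) -> bool:
    """Case-insensitive wildcard match of an action against policy patterns."""
    a = action.lower()
    return any(fnmatchcase(a, p.lower()) for p in patterns)


def is_allowed(action: str,
               identity_allow: Iterable[str],
               scp_allow: Iterable[str],
               boundary_allow: Optional[Iterable[str]] = None,
               explicit_deny: Iterable[str] = ()) -> bool:
    """Return True only if every layer permits the action."""
    if _matches(action, explicit_deny):
        return False                      # explicit Deny overrides all Allows
    if not _matches(action, scp_allow):
        return False                      # SCP filters, never grants
    if boundary_allow is not None and not _matches(action, boundary_allow):
        return False                      # boundary caps the identity policy
    return _matches(action, identity_allow)
```

Note the last line: even with a permissive SCP and boundary, nothing is allowed unless the identity policy itself grants the action.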
- AWS RAM — resource sharing
- Share AWS resources (subnets, TGW, Route 53 Resolver rules, License Manager configs) across accounts without VPC peering. Works within an Organization (no invitation required if sharing is enabled) or via individual account invitations. Shared resources appear in the recipient account but are managed by the owner.
- Cognito — user pools vs identity pools
- User Pool: user directory for authentication (sign-up/sign-in), issues JWTs (ID token, access token, refresh token). Identity Pool (Federated Identities): exchanges tokens (from Cognito User Pool, SAML, OIDC, social IdPs) for temporary AWS credentials via STS AssumeRoleWithWebIdentity. Use both together: User Pool authenticates → Identity Pool authorizes AWS resource access.
- SAML 2.0 federation with IAM
- IdP (e.g., AD FS) sends SAML assertion to AWS. STS AssumeRoleWithSAML returns temporary credentials. Role trust policy must reference the SAML provider ARN. Attribute mapping (e.g., AD group → IAM role) is configured in IdP. Maximum session duration 12 hours for SAML-federated roles.
- Resource-based policy for cross-account S3 access
- { "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::ACCOUNT_B:role/MyRole" }, "Action": ["s3:GetObject", "s3:PutObject"], "Resource": "arn:aws:s3:::my-bucket/*" }
Cost Optimization at Scale
- Reserved Instances across consolidated billing
- RI discounts are shared across all accounts in an Organization by default. The payer/management account benefits from RI utilization of linked accounts. Use RI sharing (enabled by default) so a purchased RI in one account can be applied to matching usage in any linked account.
- Savings Plans vs Reserved Instances
- Compute Savings Plans: most flexible — apply to EC2 (any instance family/region/OS), Fargate, and Lambda. Up to 66% savings. EC2 Instance Savings Plans: specific instance family + region, up to 72% savings. Standard RIs: specific instance type + region, up to 72%. Convertible RIs: can change instance family; up to 66%. Savings Plans are preferred for modern architectures due to flexibility.
- AWS Compute Optimizer
- Analyzes CloudWatch metrics to recommend optimal EC2 instance types, ASG configurations, Lambda memory, and EBS volume types. Identifies over-provisioned (rightsizing down) and under-provisioned resources. Integrates with Cost Explorer for estimated savings. Requires opt-in per account or org-wide.
- S3 Intelligent-Tiering
- Automatically moves objects between access tiers (Frequent, Infrequent, Archive Instant, plus the opt-in Archive and Deep Archive tiers) based on access patterns. No retrieval fees. Small monthly monitoring fee per object. Best for unpredictable or unknown access patterns. Objects smaller than 128 KB are not monitored or auto-tiered and always stay in the Frequent Access tier.
- Scheduled scaling for predictable workloads
- EC2 Auto Scaling scheduled actions set min/max/desired at specific times (e.g., scale up Monday 8am, scale down 6pm). More cost-effective than reactive scaling for known load patterns. Combine with target tracking scaling for unexpected spikes. Use predictive scaling (ML-based) for cyclical patterns.
- Cost allocation tags and chargeback
- Activate cost allocation tags in Billing console. Use tags like CostCenter, Project, Environment consistently across accounts. AWS Cost Explorer groups/filters by tag. AWS Budgets alerts by tag. For true multi-account chargeback, use Cost and Usage Report (CUR) exported to S3, queried via Athena.
Serverless & Event-Driven Architecture
- Lambda@Edge vs CloudFront Functions
- CloudFront Functions: lightweight JS, sub-millisecond execution, viewer request/response only, no network calls, 10 KB code limit — use for URL rewrites, header manipulation, cache key normalization. Lambda@Edge: Node.js/Python, up to 5 seconds (viewer) / 30 seconds (origin), can call external services — use for A/B testing, auth at edge, origin selection.
- Step Functions — Standard vs Express
- Standard: exactly-once execution, up to 1 year duration, auditable history in console, charged per state transition — use for long-running business workflows. Express: at-least-once, up to 5 minutes, high throughput (100K+ executions/sec), charged per duration — use for high-volume event processing, IoT, streaming pipelines.
- API Gateway — REST vs HTTP vs WebSocket
- HTTP API: lower cost (70% cheaper), lower latency, supports OIDC/JWT auth, Lambda and HTTP backends. REST API: full feature set — API keys, WAF, usage plans, request/response transformation, resource policies, custom domain + stage variables. WebSocket API: bidirectional persistent connections, $connect/$disconnect/$default routes — use for real-time chat, dashboards.
- EventBridge cross-account event buses
- Create a custom event bus in the central account. Add a resource-based policy allowing source accounts to send events. Source accounts create EventBridge rules with the target as the custom bus ARN in the central account. Enables centralized event routing and processing across an org.
- Lambda concurrency controls
- Reserved concurrency: caps a function's max concurrency, guaranteeing capacity and throttling above the limit. Provisioned concurrency: pre-initializes execution environments to eliminate cold starts — pay for it even when idle. Use reserved concurrency to protect downstream services from Lambda bursts.
- SQS + Lambda integration patterns
- Lambda polls SQS (event source mapping). Batch size and batch window control how many messages per invocation. On failure, messages return to queue after visibility timeout; after maxReceiveCount, move to DLQ. Use FIFO SQS for ordering + deduplication; design Lambda handlers as idempotent. Lambda reserved concurrency should be set to avoid overwhelming downstream resources.
Data Architecture
- AWS Lake Formation — data lake governance
- Centralizes access control for S3-based data lakes. Grant/revoke table- and column-level permissions on Glue Data Catalog databases/tables. LF-Tags (attribute-based access control) scale better than resource-based grants. Cross-account sharing: grant LF permissions to external account; recipient registers the data catalog.
- AWS Glue — crawlers, ETL, Data Catalog
- Crawlers: scan S3, JDBC, DynamoDB to auto-discover schema and populate the Data Catalog. Data Catalog: central Hive-compatible metastore used by Athena, Redshift Spectrum, and EMR. ETL jobs: Spark-based transformations; supports Python Shell and Streaming. Glue Studio provides visual ETL. Use Job Bookmarks to process only new/changed data incrementally.
- Athena vs Redshift Spectrum
- Athena: serverless SQL on S3, pay per TB scanned, best for ad-hoc queries on raw/semi-structured data, no infrastructure. Redshift Spectrum: extends Redshift cluster to query S3 data, joins between Redshift local tables and S3, better for complex analytics with existing Redshift investment. Both use Glue Data Catalog.
- Kinesis Data Streams vs Amazon Data Firehose
- Kinesis Data Streams: custom consumers (KCL, Lambda), configurable retention (24 hours to 365 days), low latency (~200 ms), and shard-based capacity that you manage in provisioned mode. Processing is at-least-once with checkpointing, so design consumers to be idempotent to get effectively-once results. Amazon Data Firehose: fully managed, automatic scaling, delivers to S3/Redshift/OpenSearch/Splunk/HTTP. It buffers by size or time (classically a 60-second minimum interval; newer configurations allow lower values), so it is near-real-time rather than real-time.
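Shard capacity planning for Kinesis Data Streams reduces to a small calculation. This sketch assumes the commonly quoted per-shard limits for provisioned mode (1 MB/s or 1,000 records/s in, 2 MB/s out with shared throughput). Those figures are not stated in the cheat sheet, so verify them against current quotas before relying on the result.

```python
import math

# Assumed per-shard limits (provisioned mode, shared-throughput consumers).
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0


def shards_needed(write_mb_s: float, records_s: float, read_mb_s: float) -> int:
    """Smallest shard count that satisfies all three throughput dimensions."""
    by_write_bytes = write_mb_s / WRITE_MB_PER_SHARD
    by_write_records = records_s / WRITE_RECORDS_PER_SHARD
    by_read = read_mb_s / READ_MB_PER_SHARD
    return max(1, math.ceil(max(by_write_bytes, by_write_records, by_read)))
```

A stream of many small records is often bound by the record-count limit rather than bytes: 0.5 MB/s spread over 4,500 records/s still needs 5 shards.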
- Amazon MSK (Managed Streaming for Kafka)
- Fully managed Apache Kafka. Use when migrating existing Kafka workloads or when Kafka-specific features are required (consumer groups, topic compaction, Kafka Streams, Kafka Connect). MSK Serverless: auto-scales capacity. MSK Connect: managed Kafka Connect workers. Prefer Kinesis for greenfield AWS-native streaming.
- DynamoDB Global Tables for active-active multi-region
- Multi-master replication across regions — writes accepted in any region, replicated asynchronously (typically < 1 second). Last-writer-wins conflict resolution. Use for global low-latency reads/writes and active-active DR. Requires DynamoDB Streams enabled. Version 2019.11.21 is current (simpler setup, no manual stream management).
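Last-writer-wins means a concurrent write with the older timestamp is silently discarded. The toy model below only illustrates that consequence. It is not how DynamoDB implements reconciliation, and the tie-break on equal timestamps is an arbitrary choice made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemVersion:
    region: str
    value: str
    written_at_ms: int  # write timestamp in milliseconds


def last_writer_wins(a: ItemVersion, b: ItemVersion) -> ItemVersion:
    """Keep the version with the later timestamp (ties keep `a`, arbitrarily)."""
    return a if a.written_at_ms >= b.written_at_ms else b


# Two regions update the same item 5 ms apart; the earlier update is lost.
us = ItemVersion("us-east-1", "status=shipped", 1_000)
eu = ItemVersion("eu-west-1", "status=cancelled", 1_005)
winner = last_writer_wins(us, eu)
print(winner.region, winner.value)
```

If losing the earlier write is unacceptable, route writes for a given item to a single region or model the updates so they do not conflict.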
Container Architecture
- ECS vs EKS decision
- ECS: AWS-native, simpler operational model, tighter AWS service integration (IAM task roles, Service Connect, ALB integration), lower overhead. EKS: Kubernetes-compatible, required if migrating existing K8s workloads or needing K8s ecosystem (Helm, operators, KEDA). ECS is preferred for greenfield AWS-only projects; EKS for multi-cloud portability or K8s-specific requirements.
- Fargate vs EC2 launch type
- Fargate: serverless containers — no cluster capacity management, pay per vCPU/memory per second, isolated task execution environment, slower cold start. EC2: manage instances (or use Managed Node Groups for EKS), supports GPU instances, Windows containers (ECS), spot-based cluster for cost savings. Use Fargate for variable/bursty workloads; EC2 for steady-state high-throughput or specialized hardware.
- VPC Lattice — service-to-service networking
- Managed service-to-service networking (App Mesh is deprecated). It provides application-layer (L7) load balancing, traffic management, and mutual TLS between services across VPCs and accounts. Define service networks and associate VPCs and services. Auth policies control which services can communicate. Fully managed: no sidecar proxies required.
- ECS service discovery with AWS Cloud Map
- Register ECS tasks as Cloud Map service instances. DNS-based discovery: tasks resolve service names to IPs via Route 53 private hosted zones. API-based discovery: DiscoverInstances API for health-aware lookups. ECS Service Connect (newer): sidecar-based service mesh built on Cloud Map and Envoy, provides metrics and retries.
- ECR image security scanning
- Basic scanning: AWS-native CVE-based OS vulnerability scans, on-push or manual. Enhanced scanning: Amazon Inspector + ECR integration, continuous scanning of OS packages and application dependencies, findings in Security Hub. Enable enhanced scanning at registry level. Use lifecycle policies to expire old/untagged images and control storage costs.
Infrastructure as Code
- CloudFormation StackSets — self-managed vs service-managed
- Self-managed: manually create admin and execution IAM roles in each account; use for accounts outside AWS Organizations. Service-managed: integrates with Organizations, auto-deploys to new accounts, manages IAM roles automatically. Service-managed supports trusted access and delegated administrator for StackSets.
- CloudFormation nested stacks vs stack sets
- Nested stacks: parent stack references child stack templates via AWS::CloudFormation::Stack resource; shares outputs between stacks in the same account/region; good for modular reuse. StackSets: deploy the same stack to multiple accounts and/or regions simultaneously.
- CloudFormation drift detection
- Detects when actual resource configuration differs from the template-defined expected configuration. Run DetectStackDrift via console or CLI. Drift status: IN_SYNC, DRIFTED, NOT_CHECKED. Does not auto-remediate — use Config rules + SSM Automation for auto-remediation of drifted resources.
- CloudFormation custom resources
- Lambda-backed or SNS-backed resources for provisioning non-AWS resources or performing actions during stack operations. Lambda receives Create/Update/Delete events and must send a response (SUCCESS/FAILED) to a pre-signed S3 URL. Use for seeding databases, registering DNS in external providers, or calling third-party APIs.
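A sketch of the response contract for a Lambda-backed custom resource, using only the standard library. The response field names follow the documented custom resource response object. The `demo-resource` physical ID, the `Seeded` output, and the injectable `send` parameter (added here so the handler can be exercised without a network call) are assumptions for illustration.

```python
import json
import urllib.request


def build_response(event, status, physical_id, reason="", data=None):
    """Assemble the JSON body CloudFormation expects at the pre-signed URL."""
    return {
        "Status": status,  # "SUCCESS" or "FAILED"
        "Reason": reason or "See CloudWatch Logs for details",
        "PhysicalResourceId": physical_id,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data or {},
    }


def send_response(event, body):
    """PUT the body to the pre-signed S3 URL supplied in the event."""
    payload = json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        event["ResponseURL"], data=payload, method="PUT",
        headers={"Content-Type": ""},  # empty so urllib does not add a form content type
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def handler(event, context=None, send=send_response):
    physical_id = event.get("PhysicalResourceId", "demo-resource")  # hypothetical ID
    try:
        data = {}
        if event["RequestType"] in ("Create", "Update"):
            data = {"Seeded": "true"}  # placeholder for real work, e.g. seeding a database
        body = build_response(event, "SUCCESS", physical_id, data=data)
    except Exception as exc:
        body = build_response(event, "FAILED", physical_id, reason=str(exc))
    send(event, body)  # always respond, or the stack waits until it times out
    return body
```

The key design point is that a response is sent on every path, including Delete and failures, because CloudFormation otherwise blocks waiting for it.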
- Blue/green deployment with CodeDeploy + CloudFormation
- For ECS: CodeDeploy automates blue/green by creating a new task set (green) behind the ALB test listener, runs validation hooks, then shifts traffic. CloudFormation can trigger CodeDeploy via transform (AWS::CodeDeployBlueGreen). For EC2: all-at-once, rolling, or blue/green using ASG + Launch Template swap.
- CDK vs CloudFormation
- CDK synthesizes to CloudFormation templates. Use constructs at L1 (CFn resources), L2 (opinionated defaults), or L3 (patterns) levels. CDK enables code reuse (loops, conditions in TypeScript/Python/Java) and type safety. CDK Bootstrapping required (CDKToolkit stack creates S3 bucket + ECR repo for assets).
High Availability Patterns
- Multi-Region active-active vs active-passive
- Active-active: traffic routes to multiple regions simultaneously (Route 53 weighted/latency routing or Global Accelerator). Both regions serve traffic and accept writes, so the data layer must handle multi-Region writes: DynamoDB Global Tables resolve conflicts with last-writer-wins, while Aurora Global Database keeps a single writer Region (secondaries can use write forwarding). Active-passive: secondary region is on standby; Route 53 failover routing or Global Accelerator health checks redirect on failure.
- AWS Global Accelerator vs CloudFront
- Global Accelerator: routes TCP/UDP traffic to optimal AWS endpoint using Anycast IPs on AWS global network, health-check-based failover in < 30 seconds, static IPs (good for IP whitelisting). Not a CDN — no caching. CloudFront: HTTP/S CDN with edge caching, Lambda@Edge, signed URLs. Use GA for non-HTTP or when static IPs are required.
- Route 53 routing policies for HA
- Failover: primary/secondary with health checks. Latency-based: route to region with lowest latency. Weighted: distribute traffic by percentage (blue/green, canary). Geolocation: route by user's geographic location. Geoproximity: route by proximity with bias adjustment. Multivalue Answer: up to 8 healthy records returned.
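For weighted routing, each record's share of traffic is its weight divided by the sum of weights in the group. A small sketch of that arithmetic follows. The record names are hypothetical, and the equal split when every weight is zero reflects the documented Route 53 behavior as understood here.

```python
from typing import Dict


def weighted_shares(weights: Dict[str, int]) -> Dict[str, float]:
    """Fraction of DNS responses each record is expected to receive."""
    total = sum(weights.values())
    if total == 0:
        # All weights zero: traffic is spread evenly across the records.
        return {name: 1 / len(weights) for name in weights}
    return {name: w / total for name, w in weights.items()}


# Canary release: send 10% of lookups to the green environment.
print(weighted_shares({"blue": 90, "green": 10}))
```

Setting a single record's weight to 0 (with others non-zero) drains it, which is a common way to finish a blue/green cutover.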
- S3 Multi-Region Access Points
- Single global endpoint that routes requests to the nearest S3 bucket replica based on network latency. Pair it with S3 Cross-Region Replication between the associated buckets so each Region holds the same objects. Supports failover controls: Active or Passive per bucket. Simplifies multi-region S3 architectures without DNS complexity.
- ALB + ASG multi-AZ design
- Deploy ASG with instances spread across 3+ AZs using AZ-balanced capacity rebalancing. Enable ALB cross-zone load balancing (default for ALB). Set minimum healthy percentage in rolling update policy to maintain availability during deployments. Use Connection Draining (deregistration delay) to let in-flight requests complete before instance termination.
Hybrid & Edge
- AWS Outposts
- AWS-managed rack of servers installed on-premises. Runs native AWS services (EC2, EBS, RDS, ECS, EKS, S3 Outposts) locally. Connected to parent AWS region via Service Link. Use for workloads requiring ultra-low latency to on-premises systems, local data residency, or processing near on-premises equipment. Fully managed by AWS.
- Local Zones vs Wavelength
- Local Zones: AWS infrastructure in metro areas closer to end users (e.g., Los Angeles). Extension of a parent region. Deploy EC2, EBS, ECS for single-digit ms latency to city-level users. Wavelength: AWS compute embedded in 5G carrier networks (e.g., Verizon). Ultra-low latency (< 10ms) for mobile/5G applications — gaming, AR/VR, connected vehicles.
- Snow Family comparison
- Snowcone: 8 TB HDD or 14 TB SSD, 2 vCPUs, lightest (4.5 lbs), DataSync agent pre-installed, battery-powered option. Snowball Edge Storage Optimized: 80 TB usable (newer models support 210 TB), 40 vCPUs, 1 GB network. Snowball Edge Compute Optimized: 28 TB usable, 52 vCPUs, GPU option. Snowmobile: exabyte-scale (100 PB per truck). Choose based on data volume and edge compute needs.
- Storage Gateway modes
- File Gateway: NFS/SMB access to S3, locally caches frequently accessed files. Volume Gateway (Stored): primary data on-premises, snapshots to S3 as EBS snapshots. Volume Gateway (Cached): primary data in S3, local cache for active data. Tape Gateway: virtual tape library (VTL) interface replacing physical tape, archives to S3 Glacier.
- DataSync vs Storage Gateway
- DataSync: online data migration and recurring scheduled transfers (NFS/SMB/HDFS/S3/EFS/FSx), point-in-time transfer with scheduling, up to 10x faster than open-source tools. Storage Gateway: ongoing hybrid access — on-premises applications access cloud storage as if it were local. Use DataSync for migration; Storage Gateway for permanent hybrid integration.
- Direct Connect + VPN as backup
- Primary path: Direct Connect for consistent low-latency high-throughput connectivity. Backup path: Site-to-Site VPN over internet as failover. Run BGP on both paths. On the AWS side, Direct Connect routes are preferred over Site-to-Site VPN routes for the same prefix; on the customer router, prefer DX with a higher local preference (or AS path prepending on the VPN). If DX fails, BGP converges to VPN. This combination meets both HA and performance requirements for hybrid architectures.