SAA-C03 Exam Notes

Last-minute traps, must-know facts, and scenario tips for the AWS Certified Solutions Architect – Associate exam.

General Exam Tips

  • 1. Read EVERY answer option before selecting. The distractor is always plausible: AWS writes options that are almost right.
  • 2. Find the constraint keyword first: 'most cost-effective', 'minimum operational overhead', 'highest availability'. That keyword eliminates 2–3 answers immediately.
  • 3. The passing score is 720 on a scaled range of 100–1,000, so you can miss a fair number of the 65 questions and still pass. Don't spiral on hard questions: flag them and move on.
  • 4. 15 of the 65 questions are unscored experimental questions. A question that seems completely out of scope may be one of them, so don't panic.
  • 5. Eliminate vertical scaling (bigger instance) answers first. AWS almost always wants horizontal scaling.
  • 6. When two answers both work technically, the one with LESS operational overhead wins. AWS favors managed services over custom solutions.
  • 7. Multiple-response questions (select 2 or 3) often have two obvious correct answers and one that sounds right but violates a constraint in the question.
  • 8. Pace yourself: 2 minutes per question. Flag anything taking more than 90 seconds. Return with fresh eyes.
  • 9. When the question says 'MOST cost-effective', eliminate active-active multi-region setups immediately, since those are always the most expensive.
  • 10. IAM questions: if the scenario involves an EC2 instance or Lambda needing to access AWS, the answer is almost always an IAM role, never access keys.
Domain 1: 30% of exam

Design Secure Architectures

Must-Know Facts

  • IAM policy evaluation order: explicit Deny always wins, then Organizations SCP, then resource-based policy, then identity-based policy, then implicit Deny. Cross-account requires BOTH resource-based AND identity-based policies to allow.
  • Security Groups are stateful (return traffic auto-allowed) and can only ALLOW. NACLs are stateless (return traffic needs explicit allow) and can DENY. Use NACLs to block specific IPs.
  • VPC Gateway endpoints (S3, DynamoDB) are FREE and keep traffic off the internet. VPC Interface endpoints (all other services) cost money per hour + per GB.
  • KMS key policies are the PRIMARY authorization for KMS keys — IAM policies alone are not sufficient without a key policy allowing the principal.
  • SSE-KMS gives customer-managed encryption with CloudTrail audit visibility. SSE-S3 is simpler but no customer key control. SSE-C requires sending keys with every request.
  • IAM roles use temporary STS credentials. Always use roles for EC2 instance profiles, Lambda, and cross-account — never embed long-term access keys.
  • Permissions boundaries set the MAXIMUM permissions an identity policy can grant — they don't grant anything themselves.
  • AWS Organizations SCPs restrict the maximum permissions available to member accounts — even root can be limited. They don't grant permissions.
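  • For cross-account S3 access through role assumption, the IAM role in the bucket's account (Account B) needs a trust policy allowing Account A to assume it. If Account A's principals access the bucket directly instead, the bucket policy must allow them AND their own identity-based policies must allow the S3 actions.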
  • Shield Standard is free for all customers (Layer 3/4). Shield Advanced ($3,000/month) adds Layer 7 DDoS protection, 24/7 DRT access, and cost protection.
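
The evaluation order described above can be sketched as a small decision function. This is a deliberately simplified model for study purposes, not AWS's actual evaluation engine: it ignores permissions boundaries and session policies, and the function and parameter names are hypothetical.

```python
def is_request_allowed(*, explicit_deny, scp_allows, resource_policy_allows,
                       identity_policy_allows, same_account):
    """Simplified model of the policy evaluation order in these notes."""
    # 1. An explicit Deny in any applicable policy always wins.
    if explicit_deny:
        return False
    # 2. An SCP never grants access, but it can remove it.
    if not scp_allows:
        return False
    # 3. Same account: either policy type can grant access on its own.
    if same_account:
        return resource_policy_allows or identity_policy_allows
    # 4. Cross-account: BOTH sides must allow, otherwise implicit Deny.
    return resource_policy_allows and identity_policy_allows
```

For example, a cross-account request where only the bucket policy allows access returns False, while the same inputs with `same_account=True` return True.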

Common Traps

Trap: Assuming IAM identity-based policies alone are sufficient to authorize KMS key usage
Reality: KMS key policies are the primary control mechanism. If the key policy neither allows the principal directly nor delegates access to IAM policies in the account, access is denied even when the IAM policy says Allow.
Trap: Using Security Groups to block a specific malicious IP address
Reality: Security Groups only support Allow rules, so you cannot create a Deny rule. Use NACLs to block specific IPs at the subnet level. NACLs evaluate rules in number order (lowest first), so give the Deny rule a lower number than the Allow-all rule.
Trap: Thinking VPC Gateway Endpoints and Interface Endpoints work the same way
Reality: Gateway endpoints (S3, DynamoDB only) modify the route table and are free. Interface endpoints create an ENI in your subnet and have hourly + data charges. For cost-sensitive scenarios requiring S3 access from a private subnet, Gateway endpoint is always the right answer.
Trap: Assuming resource-based policies alone grant cross-account access without the other account's IAM policy
Reality: Cross-account access requires both the resource-based policy (e.g., S3 bucket policy in Account B allows Account A's role) AND an identity-based policy in Account A allowing the role to access Account B's resource. Within the same account, a resource-based policy is sufficient on its own.
Trap: Choosing IAM user with access keys for an application running on EC2
Reality: Applications on EC2 should use IAM instance profiles (roles). Access keys embedded in applications or environment variables are a security risk. AWS rotates temporary role credentials automatically.
Trap: Thinking CloudTrail data events are enabled by default
Reality: CloudTrail management events (API calls on resources like CreateBucket, RunInstances) are enabled by default. Data events (S3 object-level operations, Lambda invocations) must be explicitly enabled and cost extra.
Trap: Believing NACL rules are evaluated all at once like Security Groups
Reality: NACL rules are evaluated in number order, lowest first. The first matching rule wins. If rule 100 allows all traffic and rule 200 denies a specific IP, the deny never fires. Place deny rules with lower numbers than the corresponding allow rules.
Trap: Using AWS WAF to protect against network-level (Layer 3/4) DDoS attacks
Reality: WAF operates at Layer 7 (HTTP/HTTPS). It protects against SQLi, XSS, bad bots, and HTTP floods. Shield Standard/Advanced handles Layer 3/4 volumetric DDoS attacks. Use both together for comprehensive protection.
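
The NACL first-match behavior from the traps above is easy to see in a toy simulation. The sketch below is not an AWS API; the rule tuples, the function name, and the documentation-range IP address are all illustrative assumptions.

```python
import ipaddress

def nacl_decision(rules, source_ip):
    """Return 'allow' or 'deny' for source_ip using first-match semantics.

    rules: iterable of (rule_number, cidr, action) tuples (hypothetical shape).
    """
    ip = ipaddress.ip_address(source_ip)
    # Rules are evaluated in ascending rule-number order; first match wins.
    for _number, cidr, action in sorted(rules):
        if ip in ipaddress.ip_network(cidr):
            return action
    # Nothing matched: the implicit catch-all rule denies the traffic.
    return "deny"

# Deny placed AFTER the allow-all rule: it never fires.
misordered = [(100, "0.0.0.0/0", "allow"), (200, "203.0.113.7/32", "deny")]
# Deny placed BEFORE the allow-all rule: the IP is blocked.
ordered = [(90, "203.0.113.7/32", "deny"), (100, "0.0.0.0/0", "allow")]
```

With `misordered`, the address 203.0.113.7 is allowed because rule 100 matches first; with `ordered`, rule 90 denies it while every other address still reaches rule 100.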

Confusing Pairs

Security Groups vs. NACLs

Security Groups = stateful, instance-level, Allow-only, all rules evaluated. NACLs = stateless, subnet-level, Allow+Deny, first-match wins. Key decision: need to BLOCK a specific IP? Must use NACL. Need automatic return traffic? Use Security Group.

VPC Gateway Endpoint vs. VPC Interface Endpoint (PrivateLink)

Gateway = free, route-table based, S3 and DynamoDB only. Interface = paid (hourly + data), ENI in subnet, 100+ services. Questions asking 'most cost-effective way to access S3 from private subnet without internet' → Gateway Endpoint.

SSE-S3 vs. SSE-KMS vs. SSE-C

SSE-S3 = AWS manages key entirely, no audit trail, free. SSE-KMS = you manage CMK, audit via CloudTrail, KMS API call cost. SSE-C = you provide key per-request, AWS doesn't store key. If question says 'full control of encryption keys' or 'key rotation' → SSE-KMS with CMK.

IAM Identity Center vs. IAM Federation (SAML/OIDC)

IAM Identity Center = centralized SSO for multi-account AWS access, integrates with IdPs, assigns permission sets. Direct IAM federation = configure trust directly with an IdP in a single account. Multi-account org with human users → IAM Identity Center.

Permissions Boundary vs. SCP (Service Control Policy)

Permissions Boundary = limits the MAX permissions a specific IAM entity (user or role) can have. SCP = limits the MAX permissions for an entire AWS account or OU in Organizations. Both are guards, not grants. Neither grants any permissions on their own.

AWS WAF vs. AWS Shield

WAF = Layer 7 (HTTP), protects against web exploits (SQLi, XSS, HTTP floods), attach to ALB/CloudFront/API GW. Shield = Layer 3/4 DDoS protection, Standard is automatic/free, Advanced costs $3K/month with 24/7 support and cost protection.

Scenario Tips

If the question asks about:

Question asks for private connectivity from EC2 to S3/DynamoDB without traversing the internet, and asks for 'most cost-effective'

Answer:

VPC Gateway Endpoint — free, no data transfer charges, modifies route table. Works for S3 and DynamoDB only.

Distractor to avoid:

NAT Gateway — also works but has hourly + per-GB charges, making it wrong for 'most cost-effective'.

If the question asks about:

Question describes an application on EC2 that needs to access another AWS service (S3, DynamoDB, Secrets Manager)

Answer:

Attach an IAM role to the EC2 instance (instance profile). The application uses the instance metadata service to get temporary credentials automatically.

Distractor to avoid:

Creating an IAM user with access keys and configuring them in the application — violates least privilege and is a security anti-pattern.

If the question asks about:

Question asks how to grant cross-account access to an S3 bucket 'without creating IAM users'

Answer:

Create an IAM role in the bucket's account with S3 permissions and a trust policy that allows the other account's principal to AssumeRole. The other account's users assume the role.

Distractor to avoid:

Making the S3 bucket public or sharing access keys — both violate security best practices and will always be wrong answers.
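
As a rough illustration of the role-based answer above, the trust policy on the role in the bucket's account might look like the document built below. The account ID 111122223333 is a placeholder and the helper name is hypothetical; a real policy would usually name a specific role or add conditions rather than trusting the whole account.

```python
import json

def build_trust_policy(trusted_account_id):
    """Build a minimal role trust policy letting another account assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # ':root' here means "principals in that account", subject to
                # that account's own IAM policies allowing sts:AssumeRole.
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }

# Placeholder account ID, used for illustration only.
policy_json = json.dumps(build_trust_policy("111122223333"), indent=2)
```

The S3 permissions themselves live in a separate permissions policy attached to the same role; the trust policy only controls who may assume it.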

If the question asks about:

Question asks about 'encrypting S3 data with keys the company fully controls and requires annual key rotation audit trail'

Answer:

SSE-KMS with a customer managed key (CMK) and automatic annual rotation enabled. CloudTrail logs all key usage.

Distractor to avoid:

SSE-S3 — AWS manages the key, company has no visibility or control. Wrong when question specifies 'full control'.

If the question asks about:

Question involves a web application exposed to the internet that needs protection from SQL injection and cross-site scripting

Answer:

AWS WAF attached to the ALB or CloudFront distribution, with managed rule groups for SQL injection and XSS patterns.

Distractor to avoid:

Security Groups or NACLs — these are network-level controls that cannot inspect HTTP payloads.

Last-Minute Facts

1. IAM policy evaluation: explicit Deny > SCP > resource policy > identity policy > implicit Deny
2. Security Groups: stateful, allow only, instance level. NACLs: stateless, allow+deny, subnet level, number order
3. VPC Gateway Endpoints: FREE, only S3 and DynamoDB. Interface Endpoints: PAID, any service
4. KMS CMK: $1/month, full policy control, CloudTrail audit. AWS Managed Key: free, no customer control
5. Shield Standard: free for all. Shield Advanced: $3,000/month, Layer 7 DDoS + DRT
6. Permissions boundary: limits max permissions for a single IAM entity. SCP: limits max for entire AWS account
7. CloudTrail management events: ON by default. Data events (S3 object ops): OFF by default, extra cost
8. WAF attaches to ALB, CloudFront, API Gateway, or AppSync, NOT directly to EC2 or Route 53
9. Direct Connect does NOT encrypt by default: add an IPsec VPN over DX or use MACsec
10. ACM certificates for CloudFront MUST be provisioned in us-east-1 (N. Virginia)
Domain 2: 26% of exam

Design Resilient Architectures

Must-Know Facts

  • DR strategies by cost: Backup & Restore (hours RTO) < Pilot Light (minutes-hours RTO) < Warm Standby (minutes RTO) < Active-Active (near zero RTO). Match the cheapest strategy to the RTO/RPO requirement.
  • Multi-AZ = high availability within a Region. Multi-Region = disaster recovery. Do not confuse them — a question about 'if the entire region fails' needs Multi-Region, not Multi-AZ.
  • RDS Multi-AZ: synchronous replication, automatic failover 60–120 seconds, standby CANNOT serve read traffic. Read Replicas: asynchronous, can serve reads, manual promotion, cross-Region supported.
  • Aurora: 6 copies across 3 AZs, automatic failover in <30 seconds, up to 15 read replicas. Aurora Global Database: <1 second RPO, secondary Region becomes primary in under 1 minute.
  • SQS visibility timeout MUST exceed the processing time, or the same message will be processed twice. Set it to ~6x the expected processing time.
  • ALB sticky sessions break if the instance terminates — externalize session state to ElastiCache or DynamoDB instead.
  • Auto Scaling across 2 AZs with min=2 means 1 instance per AZ. If one AZ fails, you have 1 instance until ASG recovers — design for n+1 redundancy.
  • Route 53 failover routing requires health checks on the primary endpoint. If you forget health checks, failover never triggers.
  • SQS FIFO guarantees exactly-once processing and ordering. Default throughput: 300 TPS (3,000 with batching). High-throughput mode (opt-in) supports up to 70,000 TPS in supported regions. Standard has unlimited throughput but at-least-once delivery (can duplicate).
  • DynamoDB Global Tables = multi-Region, multi-active. Any Region can serve reads and writes. Conflict resolution is last-writer-wins.

Common Traps

Trap: Assuming RDS Multi-AZ standby serves read traffic
Reality: RDS Multi-AZ standby is solely for failover: it sits idle and replicates synchronously. Zero read traffic. For read scaling, add Read Replicas. Questions asking for 'both HA and improved read performance' need Multi-AZ PLUS Read Replicas.
Trap: Treating ALB sticky sessions as a valid HA solution for stateful apps
Reality: Sticky sessions keep users on the same instance, but if that instance terminates (Auto Scaling event or failure), the session is lost anyway. The real fix is externalizing session state to ElastiCache Redis or DynamoDB.
Trap: Thinking Pilot Light means a fully running scaled-down system
Reality: Pilot Light = only the critical core (usually just the database) is running in the DR Region. Everything else is off/terminated and must be launched during recovery. Warm Standby = fully functional but small version running continuously. Know which one has which RTO.
Trap: Selecting Multi-AZ as the answer when the question describes a full Region outage
Reality: Multi-AZ only protects within a Region. If a Region fails, Multi-AZ provides no protection. Full Region failure requires Multi-Region DR strategy (Aurora Global Database, DynamoDB Global Tables, S3 CRR + Route 53 failover).
Trap: Setting SQS visibility timeout shorter than the Lambda processing time
Reality: If Lambda takes 5 minutes to process but visibility timeout is 30 seconds, the message reappears on the queue after 30 seconds and gets processed again by another consumer, causing double processing. Visibility timeout should be approximately 6x the function timeout.
Trap: Thinking that Auto Scaling alone provides high availability
Reality: An ASG in a single AZ can replace failed instances, but if the AZ itself fails, you have zero capacity. ASGs MUST span multiple AZs for true high availability. Multi-AZ ASG is the minimum HA configuration.
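
The 6x visibility-timeout guideline from the traps above reduces to a one-line calculation. The helper below is a minimal sketch with hypothetical names; the 12-hour cap is the SQS maximum visibility timeout.

```python
SQS_MAX_VISIBILITY_TIMEOUT_S = 12 * 60 * 60  # 12 hours, the SQS maximum

def recommended_visibility_timeout(function_timeout_s, multiplier=6):
    """Apply the ~6x rule of thumb, capped at the SQS maximum."""
    return min(function_timeout_s * multiplier, SQS_MAX_VISIBILITY_TIMEOUT_S)
```

For the 5-minute Lambda in the trap above, `recommended_visibility_timeout(300)` gives 1,800 seconds (30 minutes), compared with the 30-second default that causes the duplicate processing.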

Confusing Pairs

RDS Multi-AZ vs. RDS Read Replicas

Multi-AZ = HA (synchronous standby, automatic failover, cannot serve reads). Read Replicas = read scalability (asynchronous, can serve reads, manual promotion). Question says 'minimize downtime' or 'automatic failover' → Multi-AZ. Question says 'scale reads' or 'reduce read load' → Read Replicas.

Pilot Light vs. Warm Standby

Pilot Light = only the database runs continuously in DR Region, compute is off (RTO: minutes-hours). Warm Standby = fully functional mini environment runs continuously (RTO: minutes). Warm Standby costs more but recovers faster. The exam loves testing this distinction with specific RTO requirements.

SQS Standard Queue vs. SQS FIFO Queue

Standard = unlimited throughput, at-least-once delivery (can duplicate!), best-effort ordering. FIFO = exactly-once, strict ordering within message group, 300 TPS default (3,000 batched; high-throughput mode up to 70,000 TPS opt-in). FIFO queue name must end in .fifo. If the question mentions 'financial transactions', 'no duplicate processing', or 'order matters' → FIFO.

Multi-AZ vs. Multi-Region

Multi-AZ = HA against AZ failure, automatic (RDS Multi-AZ, Aurora, ELB). Multi-Region = DR against regional failure, requires explicit design (Aurora Global, DynamoDB Global Tables, S3 CRR). If question mentions 'region goes down' or 'compliance requiring geographic separation' → Multi-Region.

Aurora vs. RDS MySQL/PostgreSQL

Aurora = up to 5x MySQL / 3x PostgreSQL throughput, 6 copies across 3 AZs auto, failover <30 sec, up to 15 replicas, 128 TB auto-scaling storage. RDS = standard managed database, Multi-AZ is separate purchase, failover 60–120 sec, up to 5 replicas. For 'highly available', 'automatic failover', 'minimal downtime' → Aurora wins.

Scenario Tips

If the question asks about:

Users report losing session data when instances are replaced during Auto Scaling events

Answer:

Externalize session state to ElastiCache for Redis or DynamoDB. This makes instances truly stateless so any instance can serve any user.

Distractor to avoid:

Enable ALB sticky sessions — this only delays the problem, it doesn't fix it. If the instance terminates, session data is still lost.

If the question asks about:

Company needs DR with RPO of 1 hour and RTO of 15 minutes, minimize cost

Answer:

Cross-Region Read Replica with a promote-on-failure procedure. Replication lag is typically seconds (well within 1-hr RPO), promotion takes minutes (within 15-min RTO).

Distractor to avoid:

Aurora Global Database — meets requirements but costs more than a Read Replica solution when cost minimization is the constraint.

If the question asks about:

Application processes orders and loses them during peak traffic spikes

Answer:

SQS queue between the web tier and processing tier. SQS buffers the orders and they persist until processed — zero order loss with eventual processing.

Distractor to avoid:

Increasing Auto Scaling maximum capacity — may still drop requests at the extreme peak moment before scaling completes.

If the question asks about:

Company runs a database and asks for 'both high availability and improved read performance'

Answer:

RDS Multi-AZ for automatic failover HA PLUS Read Replicas for read traffic distribution. Both are needed because Multi-AZ standby cannot serve reads.

Distractor to avoid:

RDS Multi-AZ alone — provides HA but zero read performance improvement since standby is idle.

If the question asks about:

Question requires exactly-once message processing for a payment system

Answer:

SQS FIFO queue with exactly-once processing guarantees. Financial transactions require no duplicates.

Distractor to avoid:

SQS Standard queue — at-least-once delivery means duplicate processing is possible, which is catastrophic for payments.
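
To make the FIFO answer above concrete, the sketch below builds the parameters a producer might pass to boto3's `send_message` for a FIFO queue. It only constructs the dictionary, so it runs without AWS credentials. The helper name, the queue URL, and the choice to hash the body as the deduplication ID are illustrative assumptions, not a prescribed design.

```python
import hashlib

def fifo_send_params(queue_url, body, group_id):
    """Build send_message keyword arguments for a FIFO queue (sketch)."""
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        # Ordering is strict within a message group.
        "MessageGroupId": group_id,
        # Repeated IDs inside the deduplication window are dropped by SQS.
        "MessageDeduplicationId": hashlib.sha256(body.encode("utf-8")).hexdigest(),
    }

# Hypothetical queue URL; FIFO queue names must end in .fifo.
params = fifo_send_params(
    "https://sqs.us-east-1.amazonaws.com/111122223333/payments.fifo",
    '{"order_id": 42, "amount": "19.99"}',
    "customer-7",
)
# With boto3 installed and credentials configured, the call would be:
# boto3.client("sqs").send_message(**params)
```

A producer that retries the same payment body sends the same deduplication ID, so the retry does not create a second message.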

Last-Minute Facts

1. RDS Multi-AZ failover time: 60–120 seconds. Aurora failover: <30 seconds.
2. Aurora Global Database RPO: <1 second. RTO (regional failover): <1 minute.
3. SQS visibility timeout default: 30 seconds. Maximum: 12 hours.
4. SQS message retention: 4 days default, max 14 days.
5. SQS FIFO default throughput: 300 msg/sec (3,000 with batching). High-throughput mode (opt-in): up to 70,000 TPS. Standard: unlimited.
6. DR cost order (cheapest to most): Backup & Restore → Pilot Light → Warm Standby → Active-Active
7. Route 53 failover routing requires health checks: without them, failover never occurs automatically
8. DynamoDB Global Tables: last-writer-wins for conflict resolution, requires on-demand or auto-scaling mode
9. Auto Scaling groups: always deploy across 2+ AZs, because a single-AZ group has zero capacity if that AZ fails
10. S3 replication (CRR/SRR) requires versioning enabled on BOTH source and destination buckets
Domain 3: 24% of exam

Design High-Performing Architectures

Must-Know Facts

  • Caching layers from closest to farthest from user: CloudFront edge (global, HTTP/HTTPS content) → API Gateway cache (API responses) → ElastiCache (application-level, sub-ms) → DAX (DynamoDB only, in-memory). Each layer reduces load on the next.
  • EBS gp3: 3,000 IOPS baseline regardless of size (unlike gp2 which ties IOPS to GB). gp3 is cheaper and more flexible than gp2 — default choice for most workloads.
  • ElastiCache Redis cluster mode enables sharding for datasets larger than a single node's memory. Non-cluster mode = single primary, up to 5 replicas — limited to single-node RAM.
  • DynamoDB on-demand mode scales instantly without capacity planning but costs ~2x provisioned at sustained load. Use on-demand for truly unpredictable/spiky traffic; provisioned + auto-scaling for consistent or predictable loads.
  • S3 performance: 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD per second PER PREFIX. Use multiple prefixes to scale beyond this. Multipart upload required for files >5 GB, recommended for >100 MB.
  • EC2 placement groups: Cluster (same rack, lowest latency, 10 Gbps, all-or-nothing failure risk), Spread (different racks, max 7 per AZ, best fault isolation), Partition (large distributed systems like Hadoop, 7 partitions per AZ).
  • Global Accelerator = static anycast IPs + AWS backbone routing for TCP/UDP. CloudFront = HTTP/HTTPS CDN with edge caching. Different tools for different problems.
  • RDS Proxy: connection pooling for Lambda connecting to RDS. Lambda creates new connections per invocation — without proxy, database connection exhaustion is common at scale.

Common Traps

Trap: Choosing CloudFront to optimize TCP/UDP game traffic or VoIP
Reality: CloudFront is an HTTP/HTTPS content CDN. For TCP/UDP optimization (gaming, VoIP, real-time APIs), Global Accelerator is the right tool. It uses the AWS global backbone but doesn't cache; it just routes faster.
Trap: Choosing Global Accelerator to cache content and reduce origin load
Reality: Global Accelerator does NOT cache anything. It just routes traffic more efficiently to your origin. CloudFront caches content at edge locations, reducing origin requests. 'Reduce origin load' → CloudFront. 'Static IPs required' or 'TCP/UDP' → Global Accelerator.
Trap: Assuming gp2 and gp3 perform identically
Reality: gp3 provides 3,000 IOPS baseline regardless of volume size. gp2 delivers a baseline of 3 IOPS/GB (so a 100 GB volume only gets 300 baseline IOPS). For small volumes needing performance, gp3 is dramatically better AND cheaper.
Trap: Choosing Read Replicas as the caching solution for a read-heavy database
Reality: Read Replicas still hit a database engine for every query. ElastiCache serves cached responses in sub-millisecond without touching the database at all. For 'repeatedly reading the same data' scenarios, ElastiCache is the right answer. Read Replicas help with different read queries, not repeated identical ones.
Trap: Thinking DynamoDB auto-scaling responds instantly to traffic spikes
Reality: DynamoDB auto-scaling has a reaction delay of several minutes. For truly unpredictable bursty workloads (sudden traffic spikes within seconds), use DynamoDB on-demand mode instead of provisioned + auto-scaling.
Trap: Using S3 Transfer Acceleration for downloads to speed up access for distant users
Reality: S3 Transfer Acceleration is optimized for UPLOADS from distant locations to S3. For download performance (serving content to users globally), use CloudFront. Transfer Acceleration charges apply only when it's actually faster than standard upload.
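
The gp2 versus gp3 difference in the traps above comes down to simple arithmetic. The sketch below uses the 3 IOPS/GB, 3,000 IOPS, and 16,000 IOPS figures from these notes; the 100 IOPS floor for very small gp2 volumes is not stated in the notes and is added here from the EBS documentation. Function names are hypothetical, and gp2 burst credits are ignored.

```python
GP2_IOPS_PER_GB = 3
GP2_MIN_IOPS = 100      # floor for small gp2 volumes (not stated in the notes)
GP2_MAX_IOPS = 16_000
GP3_BASELINE_IOPS = 3_000

def gp2_baseline_iops(size_gb):
    """gp2 baseline scales with size, between the floor and the cap."""
    return min(max(GP2_MIN_IOPS, GP2_IOPS_PER_GB * size_gb), GP2_MAX_IOPS)

def gp3_baseline_iops(size_gb):
    """gp3 baseline is flat regardless of size (extra IOPS cost more)."""
    return GP3_BASELINE_IOPS
```

A 100 GB gp2 volume gets a 300 IOPS baseline against 3,000 for gp3, and gp2 only reaches its 16,000 IOPS cap at 5,334 GB.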

Confusing Pairs

CloudFront vs. Global Accelerator

CloudFront = HTTP/HTTPS CDN, caches content at edge, reduces origin load. Global Accelerator = TCP/UDP anycast routing, no caching, static IPs, routes over AWS backbone. If question says 'static IP required' or 'non-HTTP' → Global Accelerator. If question says 'cache', 'CDN', 'reduce origin requests' → CloudFront.

ElastiCache Redis vs. ElastiCache Memcached

Redis = persistence, replication, Multi-AZ failover, cluster mode (sharding), sorted sets, pub/sub, Lua. Memcached = pure cache, multi-threaded, no persistence, no replication, no failover. Session storage or leaderboards → Redis. Simple object caching, multi-threaded scaling → Memcached.

DAX (DynamoDB Accelerator) vs. ElastiCache (general purpose)

DAX = purpose-built in-memory cache for DynamoDB only, API-compatible (just change endpoint), microsecond reads. ElastiCache = general-purpose cache for any data source including RDS, application objects, or session data. Questions about caching DynamoDB reads → DAX. Questions about caching RDS or application data → ElastiCache.

EBS gp3 vs. EBS io2

gp3 = default SSD, up to 16,000 IOPS, great for most workloads. io2 Block Express = up to 256,000 IOPS, 99.999% durability, multi-attach supported, mission-critical databases. If the question asks for 'maximum IOPS' or 'mission-critical database' → io2. If it just needs good performance at low cost → gp3.

S3 Transfer Acceleration vs. CloudFront

Transfer Acceleration = speeds up UPLOADS to S3 from distant locations via CloudFront edge. CloudFront = caches and delivers content (downloads) globally. 'Users globally uploading large files to S3' → Transfer Acceleration. 'Users globally accessing static content' → CloudFront.

Scenario Tips

If the question asks about:

RDS database at high CPU due to repeated reads of the same popular data records

Answer:

ElastiCache for Redis as a read-through cache. Popular records are served from memory (sub-ms) without hitting the database at all.

Distractor to avoid:

RDS Read Replicas — these distribute different read queries but still execute SQL for every request, not solving the repeated identical-query problem.

If the question asks about:

Application requires static IP addresses that customers can whitelist in their firewalls, with global performance

Answer:

AWS Global Accelerator provides 2 static Anycast IPs that route traffic to your endpoints over the AWS backbone.

Distractor to avoid:

CloudFront — doesn't support static IPs. ALB — IPs can change. Neither satisfies the 'static IP for firewall whitelist' requirement.

If the question asks about:

Lambda functions connecting to RDS causing 'too many connections' errors at scale

Answer:

RDS Proxy — pools connections between Lambda invocations. Lambda creates a new connection per invocation, which exhausts database connection limits. Proxy reuses pooled connections.

Distractor to avoid:

Increasing RDS max_connections — you can increase the limit but Lambda's connection-per-invocation pattern will still exhaust any limit at sufficient scale.

If the question asks about:

ML training job on EC2 needs lowest inter-node network latency for distributed training

Answer:

EC2 Cluster Placement Group — places instances on the same physical rack, enabling 10 Gbps enhanced networking. Required for tightly coupled HPC/ML workloads.

Distractor to avoid:

Spread Placement Group — distributes instances across racks for fault tolerance, which INCREASES latency between nodes.

If the question asks about:

Company wants to run containerized microservices without managing EC2 servers and needs auto-scaling to zero at off-peak hours

Answer:

ECS or EKS with Fargate launch type — serverless containers, no EC2 fleet to manage, scales to zero, pay per task. Fargate removes the operational overhead of patching and capacity planning.

Distractor to avoid:

ECS with EC2 launch type — you still manage and pay for EC2 instances even when no tasks are running. Not truly serverless and doesn't scale to zero on infrastructure.

Last-Minute Facts

1. EBS gp3: 3,000 IOPS baseline, up to 16,000 IOPS. gp2: 3 IOPS/GB, max 16,000 IOPS (needs 5,334 GB to reach max).
2. EBS io2 Block Express: up to 256,000 IOPS, 4,000 MiB/s, 99.999% durability, supports Multi-Attach.
3. S3: 3,500 PUT/5,500 GET per second PER PREFIX. Add more prefixes to scale.
4. Lambda: 15-minute max execution, 10 GB max memory, 10 GB /tmp storage.
5. DynamoDB: 1 RCU = 1 strongly consistent 4KB read (or 2 eventually consistent reads). 1 WCU = 1 write of 1KB.
6. ElastiCache Redis cluster mode: horizontal sharding across nodes, supports datasets larger than single-node RAM.
7. RDS Proxy reduces Lambda-to-RDS failover time by up to 66% in addition to connection pooling.
8. Global Accelerator: 2 static Anycast IPs, routes over AWS backbone, <30-second failover.
9. Aurora Serverless v2: scales in fine-grained ACU increments, better for variable workloads vs v1.
10. Kinesis: 1 MB/s ingest and 2 MB/s read per shard. Hot partition keys cause throttling, so add a random prefix.
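
The DynamoDB capacity-unit definitions in the list above translate into a short sizing calculation. This is a minimal sketch with hypothetical function names; rates are per second, and item sizes round up to the next 4 KB (reads) or 1 KB (writes).

```python
import math

def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """RCUs needed for a steady read rate."""
    units_per_read = math.ceil(item_size_kb / 4)   # 4 KB per RCU
    total = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)

def write_capacity_units(item_size_kb, writes_per_second):
    """WCUs needed for a steady write rate."""
    return math.ceil(item_size_kb) * writes_per_second  # 1 KB per WCU
```

For instance, ten strongly consistent reads per second of 6 KB items need 20 RCUs, because each read rounds up to two 4 KB units.
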
Domain 4: 20% of exam

Design Cost-Optimized Architectures

Must-Know Facts

  • EC2 savings in order (highest to lowest cost): On-Demand > Standard RI 1yr > Convertible RI > Compute Savings Plans > Standard RI 3yr All Upfront > Spot. Spot can reach ~90% off; Standard RI 3yr all-upfront gives ~72% for committed workloads.
  • Spot Instances: up to 90% savings, interrupted with 2-minute warning. ONLY for fault-tolerant, stateless workloads. Never for databases, master nodes, or time-sensitive single-instance work.
  • Serverless services have zero cost at zero load: Lambda, Fargate, DynamoDB on-demand, Aurora Serverless, SQS, SNS. Perfect for sporadic or unpredictable workloads.
  • Data transfer costs: same AZ = free between services. Cross-AZ = $0.01/GB each direction. Cross-Region = varies (~$0.02/GB). Internet egress = $0.09/GB (US). Minimize cross-AZ traffic in cost-sensitive systems.
  • NAT Gateway charges hourly ($0.045/hr) PLUS per-GB processed ($0.045/GB). High-data-transfer applications through NAT can get very expensive. Use VPC Gateway Endpoints for S3/DynamoDB to bypass NAT.
  • S3 storage class selection: Standard-IA has 30-day minimum charge AND retrieval fee. Glacier has 90-day minimum AND retrieval cost/time. Pick the right class or the 'savings' become penalties.
  • Compute Savings Plans are more flexible than EC2 RIs — apply across EC2 (any family/size/region/OS), Fargate, and Lambda. EC2 Instance Savings Plans apply within a specific family+region.
  • Savings Plans vs RIs: Standard RI gives the deepest discount for known, stable workloads. Savings Plans give flexibility (apply to Fargate, Lambda) at slightly less discount.
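
To see why NAT Gateway data processing adds up, the rough estimate below applies the hourly and per-GB rates quoted above. The 730-hour month is an assumption, the function name is hypothetical, and actual prices vary by Region and change over time.

```python
NAT_HOURLY_USD = 0.045   # rate quoted in these notes
NAT_PER_GB_USD = 0.045   # rate quoted in these notes
HOURS_PER_MONTH = 730    # assumption: average month length

def nat_gateway_monthly_cost(gb_processed):
    """Estimate one NAT Gateway's monthly cost for a given data volume."""
    return NAT_HOURLY_USD * HOURS_PER_MONTH + NAT_PER_GB_USD * gb_processed
```

An idle gateway costs about $32.85 a month under these assumptions, and pushing 10 TB (10,000 GB) of S3 traffic through it adds another $450, traffic that a free Gateway endpoint would carry at no charge.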

Common Traps

Trap: Recommending Reserved Instances for non-24/7 workloads like dev/test environments
Reality: RIs are only cost-effective if the instance runs nearly continuously. A dev environment that runs 50 hours/week should use On-Demand + scheduled Auto Scaling (start/stop). Paying for 168 hours/week of RI capacity on a 50-hour workload wastes ~70% of the commitment.
Trap: Choosing Spot Instances for a workload that has any time-critical completion deadline or maintains state
Reality: Spot can be reclaimed with only 2 minutes notice. Any workload that can't be interrupted (databases, master nodes, time-critical batch with hard deadlines) must use On-Demand or Reserved. Use Spot only for jobs that can checkpoint and resume.
Trap: Thinking Convertible RIs provide the same discount as Standard RIs
Reality: Convertible RIs cost more (smaller discount, ~54% vs ~72%) but let you change instance family, OS, or tenancy. Standard RIs give deeper discounts but are locked to the original configuration. If flexibility is explicitly stated in the question, pick Convertible.
Trap: Using NAT Gateway for EC2 instances that only need to access S3 or DynamoDB
Reality: VPC Gateway Endpoints for S3 and DynamoDB are free and bypass NAT entirely. Using a NAT Gateway when an endpoint would work is a common cost mistake on the exam. Look for the VPC Gateway Endpoint as the cost-optimized answer whenever the destination is S3 or DynamoDB.
Trap: Recommending S3 One Zone-IA for 'rarely accessed but critical data that cannot be recreated'
Reality: One Zone-IA stores data in a single AZ. If that AZ fails, the data is gone. One Zone-IA is only appropriate for data that can be recreated (thumbnails, derivatives). For irreplaceable data, use Standard-IA or Glacier (which use multi-AZ) instead.
Trap: Thinking Savings Plans cover all services like EC2 RIs do for RDS
Reality: Compute Savings Plans cover EC2, Fargate, and Lambda only. For RDS, ElastiCache, Redshift, OpenSearch, you still need their respective Reserved Instances/Nodes, which Savings Plans do not cover.

Confusing Pairs

Standard Reserved Instances vs. Convertible Reserved Instances

Standard RI = up to 72% off, locked to specific instance family/region/OS/tenancy. Convertible RI = up to 54% off, can change attributes mid-term. If question says 'maximum savings' with stable workload → Standard RI. If question says 'flexibility to change' → Convertible RI.

EC2 Instance Savings Plans vs. Compute Savings Plans

EC2 Instance Savings Plans = ~72% off, locked to specific instance family+region, highest discount. Compute Savings Plans = ~66% off, applies to EC2/Fargate/Lambda across any region/family. If the workload is purely EC2 in one region → EC2 Instance Savings Plans. Mixed EC2+Lambda+Fargate → Compute Savings Plans.

S3 Standard-IA vs. S3 One Zone-IA

Standard-IA = multi-AZ storage, retrieval fee, 30-day minimum, for infrequently accessed durable data. One Zone-IA = single AZ, 20% cheaper, for recreatable data only. If data 'cannot be recreated' → never One Zone-IA. If data 'can be regenerated/recreated' → One Zone-IA acceptable.

Savings Plans vs. Reserved Instances for other services

Savings Plans cover EC2, Lambda, Fargate. RDS/ElastiCache/Redshift/OpenSearch use their own reserved instance/node programs — NOT covered by Savings Plans. Exam may list 'Compute Savings Plans' as an option for an RDS cost optimization question — it's a distractor.

Scenario Tips

If the question asks about:

Company runs identical workload 24/7 for 3 years on specific EC2 instance type, asking for 'maximum cost savings'

Answer:

3-year Standard Reserved Instance with All Upfront payment — maximum ~72% discount for predictable, committed workload.

Distractor to avoid:

Compute Savings Plans — slightly less savings (~66%) but more flexibility. If the question says 'maximum savings' with a known, stable workload, Standard RI wins.

If the question asks about:

Dev/test environment used only Monday–Friday 8am–6pm (50 hours/week)

Answer:

On-Demand instances with scheduled Auto Scaling to stop instances outside business hours. Saves ~70% compared to running 24/7.

Distractor to avoid:

Reserved Instances — you pay for 24/7 even when instances are off, wasting ~70% of the cost.
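
The ~70% figure in this scenario comes from utilization arithmetic, sketched below with hypothetical helper names. The comparison treats a Reserved Instance as costing a fixed fraction of 24/7 On-Demand, which is a simplification of real pricing; the 40% discount in the example is the approximate 1-year Standard RI figure from these notes.

```python
HOURS_PER_WEEK = 168

def utilization(hours_used_per_week):
    """Fraction of the week the instance actually runs."""
    return hours_used_per_week / HOURS_PER_WEEK

def ri_is_cheaper(hours_used_per_week, ri_discount):
    """Compare RI cost with stop/start On-Demand, both relative to 24/7 On-Demand.

    RI cost fraction is (1 - discount); On-Demand cost fraction is utilization.
    """
    return (1 - ri_discount) < utilization(hours_used_per_week)

idle_fraction = 1 - utilization(50)  # share of the week the dev box is off
```

At 50 hours per week the instance is idle about 70% of the time, and with a 40% discount the Reserved Instance only wins above 60% utilization (roughly 101 hours per week). A deeper discount lowers that break-even point, so the comparison depends on the term you would actually commit to.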

If the question asks about:

Nightly batch job processing large files, 2-hour window, tolerates restarts, needs cheapest compute

Answer:

EC2 Spot Instances in an Auto Scaling Group — up to 90% savings. Batch job is fault-tolerant (can checkpoint/restart), exactly the Spot use case.

Distractor to avoid:

On-Demand — correct but overpriced. Reserved — doesn't make sense for nightly 2-hour jobs.

If the question asks about:

Company has 50 TB of log data: accessed daily for first 30 days, never accessed again but must be kept 5 years

Answer:

Lifecycle policy: S3 Standard for 30 days, transition to S3 Glacier Deep Archive after 30 days (cheapest long-term storage, $0.00099/GB/month). Glacier Deep Archive satisfies the 5-year retention at minimal cost.

Distractor to avoid:

S3 Intelligent-Tiering — adds unnecessary monitoring cost when the access pattern is perfectly predictable.
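
The lifecycle answer for this scenario could be expressed as the configuration below, in the shape boto3's `put_bucket_lifecycle_configuration` accepts. The rule ID, the `logs/` prefix, and the bucket name are hypothetical. Deleting objects after five years is also an assumption, since the scenario only states the minimum retention.

```python
RETENTION_DAYS = 5 * 365  # assumption: expire once the 5-year retention ends

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "logs-to-deep-archive",       # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},      # hypothetical key prefix
            # Move objects straight to Deep Archive after the 30-day hot period.
            "Transitions": [{"Days": 30, "StorageClass": "DEEP_ARCHIVE"}],
            "Expiration": {"Days": RETENTION_DAYS},
        }
    ]
}
# With boto3 installed and credentials configured, this could be applied with:
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-log-bucket",
#     LifecycleConfiguration=lifecycle_configuration,
# )
```

Because the access pattern is known in advance, a single transition rule is enough and no per-object monitoring is involved.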

Last-Minute Facts

1. EC2 savings levels (most expensive → cheapest): On-Demand → Standard RI 1yr (~40% off) → Convertible RI (~54% off) → Compute Savings Plans (~66% off) → Standard RI 3yr / EC2 Instance SP (~72% off) → Spot (~90% off)
2. NAT Gateway: $0.045/hr + $0.045/GB processed. High traffic makes it very expensive.
3. Cross-AZ data transfer: $0.01/GB each direction. Keep traffic within AZ when possible.
4. S3 Standard-IA: 30-day minimum charge + retrieval fee. Glacier: 90-day minimum. Glacier Deep Archive: 180-day minimum.
5. S3 One Zone-IA: single AZ only, appropriate only for recreatable data.
6. Compute Savings Plans: covers EC2 + Fargate + Lambda. RDS needs separate RDS Reserved Instances.
7. Spot 2-minute termination notice: only for fault-tolerant, stateless, interruptible workloads.
8. DynamoDB on-demand: ~2x cost of provisioned at sustained load, but zero cost at zero load.
9. AWS Compute Optimizer: analyzes CloudWatch metrics, recommends right-sizing for EC2/Lambda/EBS/ECS Fargate.
10. S3 Intelligent-Tiering: monitoring fee per object, no retrieval fees. Use it when the access pattern is genuinely unknown.
