General Exam Tips
- 1.Read ALL answer choices before selecting — two options often look identical but differ on one critical word like 'structure' vs 'data' or 'automatic' vs 'manual'
- 2.PBQs (drag-and-drop, simulations) appear early in the exam. Flag complex ones and return after completing all multiple-choice questions — you cannot go back to virtual simulation PBQs so answer those when you reach them
- 3.90 questions in 90 minutes is 1 minute per question maximum. Difficult scenario questions can consume 3-4 minutes — skip and flag instead of losing time
- 4.There is NO penalty for wrong answers. If unsure, eliminate obviously wrong options and guess from the remaining choices — never leave a question blank
- 5.The exam is scenario-heavy. Identify the constraint in the scenario first (near-zero RPO, high-write workload, non-production use, EU data) before evaluating answer options
- 6.Domain 3 (Database Management, 25%) and Domain 1 (Database Fundamentals, 24%) together equal nearly half the exam. These two domains alone determine whether you pass
- 7.Performance-based questions may ask you to write SQL, match backup strategies to RPO/RTO requirements, or identify which normal form a table violates — hands-on practice is the only real preparation for these
- 8.When a question gives you a recovery scenario, draw out the backup chain on scratch paper before selecting an answer — this prevents arithmetic errors on restore sequence questions
Quick Navigation
Database Fundamentals
Must-Know Facts
- DDL = CREATE, ALTER, DROP, TRUNCATE (modifies database STRUCTURE). DML = SELECT, INSERT, UPDATE, DELETE (manipulates DATA). TCL = COMMIT, ROLLBACK, SAVEPOINT (manages TRANSACTIONS). DCL = GRANT, REVOKE (manages PERMISSIONS). Know which category every SQL statement belongs to
- ACID: Atomicity (transaction is all-or-nothing), Consistency (DB moves between valid states only), Isolation (concurrent transactions do not interfere), Durability (committed data survives failures)
- Normalization: 1NF requires atomic column values and no repeating groups. 2NF requires full functional dependency on the entire primary key (no partial dependencies). 3NF eliminates transitive dependencies. BCNF is stricter: every determinant must be a candidate key
- NoSQL types and their use cases: Document (MongoDB, Cosmos DB) for nested JSON-like data; Key-Value (DynamoDB, Redis) for fast session storage and caching; Column-Oriented (Cassandra) for wide time-series data; Graph (Neo4j) for relationship traversal
- Views vs materialized views: a view is a saved query that always executes against live data. A materialized view physically stores pre-computed results and must be refreshed — used for reporting against large transactional tables without impacting production
- Stored procedures are explicitly called by applications or users. Triggers execute AUTOMATICALLY in response to data events (INSERT, UPDATE, DELETE). Functions return a value and can be called inside SELECT statements
- ORM tools (Hibernate, Entity Framework, Ebean) abstract SQL from application code. Performance trap: ORMs can generate inefficient queries — review generated SQL and add database indexes to compensate
- JOIN type determines whether missing matches are included or dropped. INNER JOIN drops unmatched rows from both sides. LEFT JOIN keeps all left-table rows even if the right side has no match (NULLs fill missing right columns). FULL OUTER JOIN keeps unmatched rows from BOTH sides. Exam trap: 'return all customers even those with no orders' requires LEFT JOIN not INNER JOIN — INNER JOIN silently drops customers with no orders
Common Traps
Confusing Pairs
Scenario Tips
A table has a single column storing multiple email addresses separated by commas. Which normal form does this violate?
First Normal Form (1NF) — columns must contain atomic (single) values. Multiple values in one cell violates atomicity.
Candidates pick 2NF (partial dependency). 2NF only applies to tables with composite primary keys. You cannot violate 2NF if 1NF is already violated.
An application needs to track friend connections in a social network and query 'friends of friends' efficiently
Graph database (Neo4j). Relationship traversal is the defining use case for graph databases.
Relational database with foreign keys. While this works, deep graph traversal across many joins is extremely slow in relational databases.
A developer creates a rule that automatically logs any DELETE operation on the Customers table to an audit table
Trigger — it fires automatically on a DELETE event. This is the canonical use case for database triggers.
Stored procedure. Stored procedures must be explicitly called by code or a user — they cannot react to table events automatically.
Last-Minute Facts
Database Deployment
Must-Know Facts
- Cloud deployment models: IaaS (you manage OS, DB engine, and everything above hardware — e.g., PostgreSQL on EC2), PaaS (vendor manages OS and infrastructure, you manage DB schema and queries — e.g., AWS RDS, Azure SQL), SaaS (vendor manages everything including the application layer)
- Schema abstraction levels: Logical schema describes WHAT entities and relationships exist (business view). Physical schema describes HOW data is stored on disk (tables, columns, indexes, file groups). View schema is the user-facing perspective (what a specific user can see)
- Testing types for database deployments: Stress testing = push to capacity limits. Regression testing = verify new changes did not break existing functionality. Negative testing = submit invalid inputs to verify error handling. Quality check = verify data accuracy and completeness
- Data dictionary vs ERD: a data dictionary is a TEXT-BASED metadata catalog (column names, data types, constraints, descriptions) used for governance and developer reference. An ERD is a VISUAL DIAGRAM showing entities, attributes, and cardinality relationships. The exam will offer both as answers when describing metadata documentation — choose data dictionary for 'documenting column definitions and data types,' choose ERD for 'showing entity relationships visually'
- Referential integrity: foreign keys must point to valid primary keys. Orphaned records (foreign keys pointing to deleted primary keys) indicate referential integrity violations. Enforce through FOREIGN KEY constraints and ON DELETE rules
Common Traps
Confusing Pairs
Scenario Tips
An organization wants a database where the vendor handles OS patching and backups, but the DBA controls schema and query optimization
PaaS — the vendor manages infrastructure, the organization controls the database layer.
IaaS (EC2 with MySQL). In IaaS the organization manages everything above the hardware, including OS patches and backups.
A DBA submits a 10,000-character string into a VARCHAR(255) field to verify the database rejects it gracefully
Negative testing — intentionally invalid input to verify error handling.
Stress testing. Stress testing uses valid inputs at high volume. Negative testing uses invalid inputs at any volume.
Last-Minute Facts
Database Management and Maintenance
Must-Know Facts
- Baseline metrics are required for alert interpretation. CPU at 80% might be normal for a reporting cluster. Always compare against the established baseline, not an absolute threshold
- Index fragmentation is the first thing to investigate when SELECT queries slow down on tables with frequent INSERT/UPDATE/DELETE operations. Fragmentation occurs because page splits disrupt the physical order of index pages
- Index rebuilding restores index page order and dramatically improves read performance but temporarily consumes more disk space and blocks writes during the rebuild. Schedule during maintenance windows
- Too many indexes hurt write performance: every INSERT, UPDATE, or DELETE must update ALL indexes on the table. An index strategy must balance read performance against write overhead
- Change management process: plan the change, test in non-production, get approvals, schedule during a maintenance window, document the change, execute, verify, have a rollback plan ready
- Monitoring metrics: CPU utilization, memory usage, disk I/O, storage capacity, throughput, concurrent connections, failed login attempts, replication lag, job completion status
- Materialized views are a performance tool: pre-compute expensive queries and refresh during off-peak hours. Reports run against the cached result, not the live transactional table
- Audit logs serve both security AND operational purposes: failed logins indicate security threats, failed jobs indicate operational problems, slow query logs indicate performance issues
Common Traps
Confusing Pairs
Scenario Tips
A DBA notices SELECT queries degrading over time on a table that receives thousands of daily INSERTs. No hardware changes have occurred.
Index fragmentation — frequent inserts cause page splits over time. Rebuild or reorganize the index.
Network issues or permissions. Fragmentation is the classic cause of gradual read degradation on write-heavy tables.
A team wants to run complex aggregate reports against a large transactional table without slowing production OLTP queries
Create a materialized view refreshed during off-peak hours. Reports run against the stored snapshot, not the live table.
Create a standard view (no performance benefit) or add indexes (helps reads but creates write overhead on the production table).
A DBA must apply a critical security patch to a production database server
Test the patch in non-production first. Even critical patches follow change management — test, approve, schedule, execute, verify, have a rollback plan.
Apply immediately due to criticality. Change management process is not optional even for security patches.
After an index rebuild, write operations on the table slow down significantly compared to before the rebuild
The table likely has too many indexes. The rebuild improved read performance but re-exposed the write overhead from maintaining all indexes on every INSERT, UPDATE, and DELETE. Evaluate which indexes are unused and drop them.
The rebuild was performed incorrectly. Index rebuilds do not cause write slowdowns by themselves — the issue is index count, not rebuild quality.
Last-Minute Facts
Data and Database Security
Must-Know Facts
- Encryption at rest protects data stored on disk (database files, backups, storage media). Encryption in transit (TLS/SSL) protects data moving across a network between client and server. Both can and should be implemented simultaneously
- Data masking: replaces sensitive values with realistic but fictional substitutes while preserving FORMAT. A masked credit card number still looks like a 16-digit number. Used for non-production environments (dev, QA, testing). Does not protect production data
- SQL injection is the most common database attack: malicious SQL is inserted into application input fields to manipulate queries. Prevention: parameterized queries (prepared statements), input validation, stored procedures, and least-privilege DB accounts
- Authentication (who are you?) happens BEFORE authorization (what can you do?). Authentication failure = cannot log in. Authorization failure = logged in but permission denied. The exam tests diagnosis of which type of failure is occurring
- Least privilege: grant users and service accounts only the minimum permissions needed for their function. Service accounts must use dedicated credentials — never shared human accounts
- GDPR applies to ANY organization processing personal data of EU residents, regardless of where the organization or its servers are located. Processing EU data from US servers still requires GDPR compliance
- PCI DSS applies specifically to systems that store, process, or transmit payment card data. Both GDPR and PCI DSS can apply to the same system if it processes EU cardholder data
- On-path attacks (man-in-the-middle): attacker intercepts data IN TRANSIT. Primary mitigation is encryption in transit (TLS). Encryption at rest does not protect against on-path attacks
Common Traps
Confusing Pairs
Scenario Tips
A developer is given production data with real customer names and email addresses to use in a testing environment
Apply data masking to the sensitive fields before handing data to developers. Masking preserves realistic data format without exposing actual values.
Encrypt the database at rest. Encryption protects stored data but authorized users (including developers) can still see real values when they query it.
A US-based company collects and stores personal data from EU website visitors in a US database. Which regulation applies?
GDPR — it applies based on where the data subjects (EU residents) are located, not where the company or servers are.
GDPR does not apply because the company and servers are in the US. This is the most common GDPR misconception.
An attacker is sniffing network traffic between a web application and the database server to capture query results
Implement TLS/SSL encryption in transit. This is an on-path (MITM) attack — the data is being intercepted while moving across the network.
Enable encryption at rest. Encryption at rest protects stored data on disk, not data moving across the network.
A user can log into the database but receives 'Permission denied' when running SELECT on the Orders table
Authorization failure — the user is authenticated but lacks the required GRANT to access that table. Fix: GRANT SELECT on Orders to the user or role.
Authentication failure — authentication was successful (they logged in). The issue is authorization (what they are allowed to do after login).
Last-Minute Facts
Business Continuity
Must-Know Facts
- Backup types: Full backup copies ALL data. Incremental backs up only changes since the LAST BACKUP OF ANY TYPE (full or incremental). Differential backs up all changes since the LAST FULL BACKUP only
- Restore chain for incrementals: to restore, you need the last full backup PLUS EVERY incremental in sequence (full, Inc1, Inc2, Inc3...). One missing incremental breaks the chain
- Restore chain for differentials: you only need the last full backup PLUS THE SINGLE MOST RECENT differential (full + latest diff). Much simpler than incremental restore
- RPO (Recovery Point Objective) = maximum acceptable DATA LOSS in time. Near-zero RPO requires real-time or synchronous replication
- RTO (Recovery Time Objective) = maximum acceptable DOWNTIME before recovery. Near-zero RTO requires hot standby with automatic failover
- Synchronous replication: every write confirmed on all nodes before acknowledging success. Zero data loss (RPO=0). Adds latency
- Asynchronous replication: write acknowledged on primary immediately, replicated to secondary later. Lower latency. Risk of data loss equal to the replication lag
- Log shipping: transaction log backups shipped to standby server on a schedule (every N minutes). RPO = shipping interval. Manual failover typically required
- Database mirroring decision: when the scenario requires BOTH near-zero data loss AND automatic failover, the answer is synchronous mirroring with a witness server — not log shipping (which has delay) and not asynchronous mirroring (which risks data loss on failover). The witness server is the enabler of automatic failover; without it, mirroring requires manual failover just like log shipping
- Failback: returning operations from the standby back to the original primary AFTER the primary is repaired. Failback must be planned and tested separately from failover
Common Traps
Confusing Pairs
Scenario Tips
Full backups on Sunday, incrementals every other night. Database fails Thursday morning. What backups are needed to restore?
Sunday full + Monday incremental + Tuesday incremental + Wednesday incremental — applied in order. Every incremental since the last full is required.
Sunday full + Wednesday incremental only. This is the differential restore logic, not incremental. Incremental restore requires the complete chain.
A financial trading database requires zero data loss and recovery within 5 minutes of any failure
Synchronous replication with automatic failover. Synchronous = zero data loss (RPO=0). Automatic failover = near-zero downtime (RTO approaches 0).
Log shipping every 5 minutes. Log shipping with a 5-minute interval means up to 5 minutes of data loss — RPO is NOT zero. It also typically requires manual failover.
After a failover to the standby server, the original primary has been repaired and tested. What is the process of returning production traffic to the original primary?
Failback — the deliberate, planned process of returning operations from the standby to the repaired primary.
Failover. Failover switches TO the standby. Failback returns FROM the standby to the primary. They are opposite operations.