CompTIADS0-0015 domains

DS0-001 Exam Notes

Last-minute traps, must-know facts, and scenario tips for the CompTIA DataSys+ exam.

General Exam Tips

1.Read ALL answer choices before selecting — two options often look identical but differ on one critical word like 'structure' vs 'data' or 'automatic' vs 'manual'
2.PBQs (drag-and-drop, simulations) appear early in the exam. Flag complex ones and return after completing all multiple-choice questions — you cannot go back to virtual simulation PBQs so answer those when you reach them
3.90 questions in 90 minutes is 1 minute per question maximum. Difficult scenario questions can consume 3-4 minutes — skip and flag instead of losing time
4.There is NO penalty for wrong answers. If unsure, eliminate obviously wrong options and guess from the remaining choices — never leave a question blank
5.The exam is scenario-heavy. Identify the constraint in the scenario first (near-zero RPO, high-write workload, non-production use, EU data) before evaluating answer options
6.Domain 3 (Database Management, 25%) and Domain 1 (Database Fundamentals, 24%) together equal nearly half the exam. These two domains alone determine whether you pass
7.Performance-based questions may ask you to write SQL, match backup strategies to RPO/RTO requirements, or identify which normal form a table violates — hands-on practice is the only real preparation for these
8.When a question gives you a recovery scenario, draw out the backup chain on scratch paper before selecting an answer — this prevents arithmetic errors on restore sequence questions

Quick Navigation

Database Fundamentals Database Deployment Database Management and Maintenance Data and Database Security Business Continuity

Domain 124% of exam

Database Fundamentals

Must-Know Facts

DDL = CREATE, ALTER, DROP, TRUNCATE (modifies database STRUCTURE). DML = SELECT, INSERT, UPDATE, DELETE (manipulates DATA). TCL = COMMIT, ROLLBACK, SAVEPOINT (manages TRANSACTIONS). DCL = GRANT, REVOKE (manages PERMISSIONS). Know which category every SQL statement belongs to
ACID: Atomicity (transaction is all-or-nothing), Consistency (DB moves between valid states only), Isolation (concurrent transactions do not interfere), Durability (committed data survives failures)
Normalization: 1NF requires atomic column values and no repeating groups. 2NF requires full functional dependency on the entire primary key (no partial dependencies). 3NF eliminates transitive dependencies. BCNF is stricter: every determinant must be a candidate key
NoSQL types and their use cases: Document (MongoDB, Cosmos DB) for nested JSON-like data; Key-Value (DynamoDB, Redis) for fast session storage and caching; Column-Oriented (Cassandra) for wide time-series data; Graph (Neo4j) for relationship traversal
Views vs materialized views: a view is a saved query that always executes against live data. A materialized view physically stores pre-computed results and must be refreshed — used for reporting against large transactional tables without impacting production
Stored procedures are explicitly called by applications or users. Triggers execute AUTOMATICALLY in response to data events (INSERT, UPDATE, DELETE). Functions return a value and can be called inside SELECT statements
ORM tools (Hibernate, Entity Framework, Ebean) abstract SQL from application code. Performance trap: ORMs can generate inefficient queries — review generated SQL and add database indexes to compensate
JOIN type determines whether missing matches are included or dropped. INNER JOIN drops unmatched rows from both sides. LEFT JOIN keeps all left-table rows even if the right side has no match (NULLs fill missing right columns). FULL OUTER JOIN keeps unmatched rows from BOTH sides. Exam trap: 'return all customers even those with no orders' requires LEFT JOIN not INNER JOIN — INNER JOIN silently drops customers with no orders

Common Traps

TrapClassifying TRUNCATE as DML because it removes data

RealityTRUNCATE is DDL because it resets the table structure (including auto-increment counters and HWM). DELETE is DML because it removes specific rows and can be rolled back. The exam frequently includes TRUNCATE in the DML options as a distractor.

TrapAssuming NoSQL databases cannot use SQL

RealityMany NoSQL databases support SQL-like query languages. The term 'NoSQL' refers to the data model (document, key-value, column, graph) not the query language. DynamoDB has PartiQL, Cassandra has CQL. Do not confuse 'no schema enforcement' with 'no SQL'.

TrapPicking a relational database for questions about relationship traversal

RealityWhen a scenario mentions traversing connections (social networks, org hierarchies, recommendation engines), the answer is a graph database — not a relational database. Relational databases can model relationships but are poor at deep traversals across many joins.

TrapConfusing partial dependency (2NF violation) with transitive dependency (3NF violation)

RealityPartial dependency means a non-key column depends on PART of a composite primary key — this violates 2NF. Transitive dependency means a non-key column depends on another non-key column — this violates 3NF. Both can exist in the same table. Fix partial dependency first (2NF), then address transitive dependencies (3NF).

TrapSelecting a stored procedure when the scenario says 'automatically when data changes'

RealityStored procedures must be explicitly called. If the question says the logic should execute automatically when an INSERT, UPDATE, or DELETE occurs, the answer is a trigger. This distinction appears directly in PBQ-style scenario questions.

Confusing Pairs

DDLDML

DDL (Data Definition Language) defines STRUCTURE: CREATE, ALTER, DROP, TRUNCATE. DML (Data Manipulation Language) manipulates DATA: SELECT, INSERT, UPDATE, DELETE. Key catch: TRUNCATE is DDL even though it empties a table. CREATE INDEX is DDL even though you might think it is performance tuning.

Stored ProcedureTrigger

Stored procedure = reusable code block called explicitly by an application or user. Trigger = code that fires automatically when a DML event (INSERT, UPDATE, DELETE) occurs on a table. Use stored procedures for business logic. Use triggers for audit logging, auto-validation, or enforcing referential integrity that constraints alone cannot handle.

ViewMaterialized View

View = virtual table that re-executes its query against live data every time it is accessed. Always current but can be slow on large tables. Materialized view = physically stored snapshot of query results. Fast for reporting but must be refreshed to reflect new data. Exam scenario: reporting against a large transactional table without slowing production → materialized view refreshed during off-peak hours.

Relational DatabaseNoSQL Database

Relational: structured tables, enforced schema, ACID transactions, SQL queries, best for consistency and complex joins. NoSQL: flexible schema, horizontal scaling, eventual consistency (typically), best for high-volume unstructured or semi-structured data. Neither is universally better — the exam tests which fits the described use case.

Scenario Tips

If the question asks about:

A table has a single column storing multiple email addresses separated by commas. Which normal form does this violate?

Answer:

First Normal Form (1NF) — columns must contain atomic (single) values. Multiple values in one cell violates atomicity.

Distractor to avoid:

Candidates pick 2NF (partial dependency). 2NF only applies to tables with composite primary keys. You cannot violate 2NF if 1NF is already violated.

If the question asks about:

An application needs to track friend connections in a social network and query 'friends of friends' efficiently

Answer:

Graph database (Neo4j). Relationship traversal is the defining use case for graph databases.

Distractor to avoid:

Relational database with foreign keys. While this works, deep graph traversal across many joins is extremely slow in relational databases.

If the question asks about:

A developer creates a rule that automatically logs any DELETE operation on the Customers table to an audit table

Answer:

Trigger — it fires automatically on a DELETE event. This is the canonical use case for database triggers.

Distractor to avoid:

Stored procedure. Stored procedures must be explicitly called by code or a user — they cannot react to table events automatically.

Last-Minute Facts

1DDL commands cannot be rolled back in most databases (DROP TABLE is permanent without transaction wrapping)

21NF: atomic values. 2NF: no partial key dependency. 3NF: no transitive dependency. BCNF: every determinant is a candidate key

3Four NoSQL types: Document, Key-Value, Column-Oriented, Graph

4TCL commands: COMMIT (save), ROLLBACK (undo), SAVEPOINT (mark point to roll back to)

5A table can have only ONE clustered index but many non-clustered indexes

Domain 216% of exam

Database Deployment

Must-Know Facts

Cloud deployment models: IaaS (you manage OS, DB engine, and everything above hardware — e.g., PostgreSQL on EC2), PaaS (vendor manages OS and infrastructure, you manage DB schema and queries — e.g., AWS RDS, Azure SQL), SaaS (vendor manages everything including the application layer)
Schema abstraction levels: Logical schema describes WHAT entities and relationships exist (business view). Physical schema describes HOW data is stored on disk (tables, columns, indexes, file groups). View schema is the user-facing perspective (what a specific user can see)
Testing types for database deployments: Stress testing = push to capacity limits. Regression testing = verify new changes did not break existing functionality. Negative testing = submit invalid inputs to verify error handling. Quality check = verify data accuracy and completeness
Data dictionary vs ERD: a data dictionary is a TEXT-BASED metadata catalog (column names, data types, constraints, descriptions) used for governance and developer reference. An ERD is a VISUAL DIAGRAM showing entities, attributes, and cardinality relationships. The exam will offer both as answers when describing metadata documentation — choose data dictionary for 'documenting column definitions and data types,' choose ERD for 'showing entity relationships visually'
Referential integrity: foreign keys must point to valid primary keys. Orphaned records (foreign keys pointing to deleted primary keys) indicate referential integrity violations. Enforce through FOREIGN KEY constraints and ON DELETE rules

Common Traps

TrapChoosing IaaS when the question says the vendor manages patching and backups

RealityIf the vendor handles the OS, patching, and backups, that is PaaS. IaaS gives you infrastructure only — you are responsible for everything above the hypervisor. Key phrase: 'vendor manages backups' → PaaS.

TrapConfusing stress testing with regression testing

RealityStress testing finds your capacity ceiling (how many concurrent connections before it fails). Regression testing verifies that after a change, previously working functionality still works. Both are important after a deployment but test completely different things.

TrapTreating a data dictionary as an ERD

RealityA data dictionary is text-based metadata (column names, data types, descriptions). An ERD is a visual diagram showing entities and their relationships with cardinality notation. Both are documentation artifacts but serve different purposes.

TrapSelecting negative testing when the scenario describes high load

RealityHigh load (thousands of concurrent users, large datasets) = stress testing. Negative testing = invalid input (wrong data types, null values in required fields, strings too long). The distinction is invalid DATA vs excessive LOAD.

Confusing Pairs

IaaSPaaS

IaaS = you manage the OS, DB engine, patches, and backups (e.g., running MySQL on a raw EC2 instance). PaaS = vendor manages all infrastructure, you manage the database schema and data (e.g., AWS RDS, Azure SQL Database). PaaS reduces operational overhead but less control. Exam cue: 'managed service' or 'vendor handles patching' = PaaS.

Logical SchemaPhysical Schema

Logical schema = conceptual design showing entities, attributes, and relationships without implementation details. Answers 'what data exists.' Physical schema = actual implementation: tables, columns, data types, indexes, storage parameters. Answers 'how is it stored.' ERDs represent logical schema. The actual CREATE TABLE statements represent physical schema.

Stress TestingNegative Testing

Stress testing = push the system with extreme LOAD (high concurrent connections, large data volumes) to find breaking points. Negative testing = submit deliberately INVALID INPUT (wrong data types, nulls in required fields, oversized strings) to verify error handling. Both are deployment validation techniques but test completely different failure modes.

Scenario Tips

If the question asks about:

An organization wants a database where the vendor handles OS patching and backups, but the DBA controls schema and query optimization

Answer:

PaaS — the vendor manages infrastructure, the organization controls the database layer.

Distractor to avoid:

IaaS (EC2 with MySQL). In IaaS the organization manages everything above the hardware, including OS patches and backups.

If the question asks about:

A DBA submits a 10,000-character string into a VARCHAR(255) field to verify the database rejects it gracefully

Answer:

Negative testing — intentionally invalid input to verify error handling.

Distractor to avoid:

Stress testing. Stress testing uses valid inputs at high volume. Negative testing uses invalid inputs at any volume.

Last-Minute Facts

1IaaS: you manage OS up. PaaS: you manage schema up. SaaS: you manage nothing.

2Logical schema = business view (entities/relationships). Physical schema = implementation view (tables/columns/indexes)

3Negative testing = invalid input. Stress testing = extreme load. Regression testing = verify no regressions after change

4Referential integrity violations result in orphan records — foreign keys pointing to deleted parents

Domain 325% of exam

Database Management and Maintenance

Must-Know Facts

Baseline metrics are required for alert interpretation. CPU at 80% might be normal for a reporting cluster. Always compare against the established baseline, not an absolute threshold
Index fragmentation is the first thing to investigate when SELECT queries slow down on tables with frequent INSERT/UPDATE/DELETE operations. Fragmentation occurs because page splits disrupt the physical order of index pages
Index rebuilding restores index page order and dramatically improves read performance but temporarily consumes more disk space and blocks writes during the rebuild. Schedule during maintenance windows
Too many indexes hurt write performance: every INSERT, UPDATE, or DELETE must update ALL indexes on the table. An index strategy must balance read performance against write overhead
Change management process: plan the change, test in non-production, get approvals, schedule during a maintenance window, document the change, execute, verify, have a rollback plan ready
Monitoring metrics: CPU utilization, memory usage, disk I/O, storage capacity, throughput, concurrent connections, failed login attempts, replication lag, job completion status
Materialized views are a performance tool: pre-compute expensive queries and refresh during off-peak hours. Reports run against the cached result, not the live transactional table
Audit logs serve both security AND operational purposes: failed logins indicate security threats, failed jobs indicate operational problems, slow query logs indicate performance issues

Common Traps

TrapApplying a patch directly to production because it is critical

RealityEven critical security patches must be tested in a non-production environment first. Applying an untested patch to production can break applications. The change management process does not get bypassed for urgency — it gets expedited, not skipped.

TrapAdding more indexes to improve slow query performance without considering write workload

RealityIndexes accelerate reads but slow down writes. On a write-heavy transactional table, excessive indexes may worsen overall throughput. The exam will present a scenario where a table has frequent writes — 'add indexes' is a trap if the write workload is mentioned.

TrapTreating high CPU as a problem without comparing to baseline

Reality90% CPU may be perfectly normal during a scheduled batch job. Alerting should trigger when metrics DEVIATE from baseline, not when they cross an absolute threshold. Establish baselines before configuring alerts.

TrapUsing a standard view to improve reporting performance on large tables

RealityA view re-runs its underlying query on every access — it offers zero performance benefit over querying the table directly. A materialized view pre-computes and stores results. Exam scenario: reporting on a large table without impacting production → materialized view.

Confusing Pairs

Index RebuildIndex Reorganize

Index rebuild drops and recreates the index from scratch — fixes all fragmentation, fully optimizes the index, requires more resources, and may lock the table (in older systems). Index reorganize defragments in place — online operation, less resource-intensive, but does not compact sparse pages as effectively. Rebuild for high fragmentation (>30%). Reorganize for moderate fragmentation (10-30%).

Clustered IndexNon-Clustered Index

Clustered index = determines the physical sort order of table rows. Only ONE per table. The table itself IS the clustered index. Best for primary keys and range queries. Non-clustered index = separate structure with pointers back to table rows. Multiple per table. Best for columns frequently used in WHERE clauses, JOINs, or ORDER BY. Too many non-clustered indexes hurt write performance.

Monitoring AlertBaseline Threshold

Alerts should fire when metrics deviate significantly from baseline, not when they cross an absolute number. A CPU at 85% might be fine for a reporting server during peak hours. The same reading on an OLTP server at 2am is alarming. Baselines make alerts meaningful.

Scenario Tips

If the question asks about:

A DBA notices SELECT queries degrading over time on a table that receives thousands of daily INSERTs. No hardware changes have occurred.

Answer:

Index fragmentation — frequent inserts cause page splits over time. Rebuild or reorganize the index.

Distractor to avoid:

Network issues or permissions. Fragmentation is the classic cause of gradual read degradation on write-heavy tables.

If the question asks about:

A team wants to run complex aggregate reports against a large transactional table without slowing production OLTP queries

Answer:

Create a materialized view refreshed during off-peak hours. Reports run against the stored snapshot, not the live table.

Distractor to avoid:

Create a standard view (no performance benefit) or add indexes (helps reads but creates write overhead on the production table).

If the question asks about:

A DBA must apply a critical security patch to a production database server

Answer:

Test the patch in non-production first. Even critical patches follow change management — test, approve, schedule, execute, verify, have a rollback plan.

Distractor to avoid:

Apply immediately due to criticality. Change management process is not optional even for security patches.

If the question asks about:

After an index rebuild, write operations on the table slow down significantly compared to before the rebuild

Answer:

The table likely has too many indexes. The rebuild improved read performance but re-exposed the write overhead from maintaining all indexes on every INSERT, UPDATE, and DELETE. Evaluate which indexes are unused and drop them.

Distractor to avoid:

The rebuild was performed incorrectly. Index rebuilds do not cause write slowdowns by themselves — the issue is index count, not rebuild quality.

Last-Minute Facts

1Domain 3 is the heaviest domain at 25% — roughly 22-23 questions out of 90

2Index fragmentation threshold: rebuild if >30% fragmented, reorganize if 10-30% fragmented

3A table can have exactly ONE clustered index (physical sort order) and many non-clustered indexes

4Every index added to a table increases write overhead on INSERT, UPDATE, DELETE operations

5Audit logs = security AND operational tool (failed logins, slow queries, failed jobs)

6Change management: test → approve → schedule maintenance window → execute → verify → document → rollback plan. Urgency expedites the process but does not skip it

7EXPLAIN / query execution plan: the first diagnostic tool when a query is suddenly slow — reveals full table scans, missing indexes, and inefficient join methods before any hardware is blamed

Domain 423% of exam

Data and Database Security

Must-Know Facts

Encryption at rest protects data stored on disk (database files, backups, storage media). Encryption in transit (TLS/SSL) protects data moving across a network between client and server. Both can and should be implemented simultaneously
Data masking: replaces sensitive values with realistic but fictional substitutes while preserving FORMAT. A masked credit card number still looks like a 16-digit number. Used for non-production environments (dev, QA, testing). Does not protect production data
SQL injection is the most common database attack: malicious SQL is inserted into application input fields to manipulate queries. Prevention: parameterized queries (prepared statements), input validation, stored procedures, and least-privilege DB accounts
Authentication (who are you?) happens BEFORE authorization (what can you do?). Authentication failure = cannot log in. Authorization failure = logged in but permission denied. The exam tests diagnosis of which type of failure is occurring
Least privilege: grant users and service accounts only the minimum permissions needed for their function. Service accounts must use dedicated credentials — never shared human accounts
GDPR applies to ANY organization processing personal data of EU residents, regardless of where the organization or its servers are located. Processing EU data from US servers still requires GDPR compliance
PCI DSS applies specifically to systems that store, process, or transmit payment card data. Both GDPR and PCI DSS can apply to the same system if it processes EU cardholder data
On-path attacks (man-in-the-middle): attacker intercepts data IN TRANSIT. Primary mitigation is encryption in transit (TLS). Encryption at rest does not protect against on-path attacks

Common Traps

TrapUsing data masking to protect production data

RealityData masking is for non-production environments. It makes sensitive data usable in testing without exposing real values. Production data requires encryption. The exam will present a scenario where developers need realistic test data — the answer is masking, not encryption.

TrapApplying encryption at rest as the solution to an on-path (MITM) attack

RealityOn-path attacks intercept data IN TRANSIT across a network. Encryption at rest protects data ON DISK. To mitigate an on-path attack, you need encryption in transit (TLS/SSL), not encryption at rest.

TrapAssuming GDPR only applies if the company is located in the EU

RealityGDPR is triggered by the location of the DATA SUBJECT (the EU resident), not the location of the company or server. A US company processing EU customer data is subject to GDPR.

TrapConfusing authentication failure with authorization failure

RealityIf the user CANNOT LOG IN at all, that is an authentication failure (wrong password, expired account, MFA issue). If the user IS logged in but cannot access a specific database or table, that is an authorization failure (missing GRANT). The correct fix depends on which type of failure it is.

TrapChoosing a shared admin account for automated database backup jobs

RealityAutomated processes (backup jobs, monitoring agents) must use dedicated service accounts with the minimum required permissions. Using shared human admin credentials for service accounts violates least privilege and creates accountability issues.

Confusing Pairs

Data MaskingData Encryption

Masking = replaces data values with realistic fakes that PRESERVE FORMAT (masked SSN still looks like an SSN). Not cryptographically reversible — once masked, original may be lost. For non-production use. Encryption = transforms data into unreadable ciphertext using a key. Fully reversible with the correct key. For production data protection. Exam cue: 'developers need test data without seeing real values' → masking. 'Protect data stored in the database from unauthorized access' → encryption.

Encryption at RestEncryption in Transit

At rest = protects data ON DISK (database files, backups, storage media). Mitigates: stolen hard drive, data center physical breach. In transit = protects data MOVING ACROSS A NETWORK via TLS/SSL. Mitigates: on-path/MITM attacks, network sniffing. Exam cue: 'attacker intercepts traffic between application and database' → encryption in transit. 'Attacker steals backup tapes' → encryption at rest.

AuthenticationAuthorization

Authentication = verifying IDENTITY ('who are you?'). Happens first. Methods: passwords, certificates, biometrics, MFA. Failure symptom: cannot log in. Authorization = determining PERMISSIONS ('what can you do?'). Happens after authentication. Methods: GRANT/REVOKE, roles, row-level security. Failure symptom: logged in but access denied to specific resource.

GDPRPCI DSS

GDPR = protects ANY personal data of EU residents. Applies based on data SUBJECT location, not company location. Covers all personal data (names, emails, IPs). PCI DSS = protects payment card data specifically (cardholder data, card numbers, CVV). Applies to any entity that stores, processes, or transmits cardholder data. Both can apply simultaneously if you process EU customer payment cards.

Scenario Tips

If the question asks about:

A developer is given production data with real customer names and email addresses to use in a testing environment

Answer:

Apply data masking to the sensitive fields before handing data to developers. Masking preserves realistic data format without exposing actual values.

Distractor to avoid:

Encrypt the database at rest. Encryption protects stored data but authorized users (including developers) can still see real values when they query it.

If the question asks about:

A US-based company collects and stores personal data from EU website visitors in a US database. Which regulation applies?

Answer:

GDPR — it applies based on where the data subjects (EU residents) are located, not where the company or servers are.

Distractor to avoid:

GDPR does not apply because the company and servers are in the US. This is the most common GDPR misconception.

If the question asks about:

An attacker is sniffing network traffic between a web application and the database server to capture query results

Answer:

Implement TLS/SSL encryption in transit. This is an on-path (MITM) attack — the data is being intercepted while moving across the network.

Distractor to avoid:

Enable encryption at rest. Encryption at rest protects stored data on disk, not data moving across the network.

If the question asks about:

A user can log into the database but receives 'Permission denied' when running SELECT on the Orders table

Answer:

Authorization failure — the user is authenticated but lacks the required GRANT to access that table. Fix: GRANT SELECT on Orders to the user or role.

Distractor to avoid:

Authentication failure — authentication was successful (they logged in). The issue is authorization (what they are allowed to do after login).

Last-Minute Facts

1SQL injection prevention: parameterized queries (prepared statements) are the primary technical control

2Least privilege for service accounts: use dedicated accounts with minimum required permissions — never shared admin credentials

3GDPR triggers: EU data subject location, not server or company location

4PCI DSS: payment card data only. GDPR: all personal data of EU residents

5On-path attack mitigation: TLS/SSL (encryption in transit). Stolen disk mitigation: encryption at rest

6Data masking = format-preserving, for non-production. Encryption = unreadable without key, for production

Domain 512% of exam

Business Continuity

Must-Know Facts

Backup types: Full backup copies ALL data. Incremental backs up only changes since the LAST BACKUP OF ANY TYPE (full or incremental). Differential backs up all changes since the LAST FULL BACKUP only
Restore chain for incrementals: to restore, you need the last full backup PLUS EVERY incremental in sequence (full, Inc1, Inc2, Inc3...). One missing incremental breaks the chain
Restore chain for differentials: you only need the last full backup PLUS THE SINGLE MOST RECENT differential (full + latest diff). Much simpler than incremental restore
RPO (Recovery Point Objective) = maximum acceptable DATA LOSS in time. Near-zero RPO requires real-time or synchronous replication
RTO (Recovery Time Objective) = maximum acceptable DOWNTIME before recovery. Near-zero RTO requires hot standby with automatic failover
Synchronous replication: every write confirmed on all nodes before acknowledging success. Zero data loss (RPO=0). Adds latency
Asynchronous replication: write acknowledged on primary immediately, replicated to secondary later. Lower latency. Risk of data loss equal to the replication lag
Log shipping: transaction log backups shipped to standby server on a schedule (every N minutes). RPO = shipping interval. Manual failover typically required
Database mirroring decision: when the scenario requires BOTH near-zero data loss AND automatic failover, the answer is synchronous mirroring with a witness server — not log shipping (which has delay) and not asynchronous mirroring (which risks data loss on failover). The witness server is the enabler of automatic failover; without it, mirroring requires manual failover just like log shipping
Failback: returning operations from the standby back to the original primary AFTER the primary is repaired. Failback must be planned and tested separately from failover

Common Traps

TrapThinking incremental restore needs only the last full plus the last incremental

RealityIncremental restore requires the last full backup PLUS EVERY incremental in order since that full. If you have Full + Inc1 + Inc2 + Inc3 and you only apply Full + Inc3, you miss Inc1 and Inc2. You need the complete chain, applied in sequence.

TrapConfusing RPO and RTO because both are measured in time

RealityRPO = DATA LOSS. How much data can you afford to lose? Drives backup/replication frequency. RTO = DOWNTIME. How long can the system be offline? Drives DR architecture complexity. A financial trading system with 'zero data loss and 5-minute max downtime' has an RPO of 0 and an RTO of 5 minutes — these are separate requirements demanding different solutions.

TrapTreating failback as automatic after a failover

RealityFailover (switching to standby) can be automatic. Failback (returning to the original primary) is almost always a separate, planned operation. You must re-synchronize the primary, verify health, and then deliberately switch back. Assuming failback happens automatically is a gap in DR planning.

TrapChoosing log shipping when the question requires zero data loss

RealityLog shipping always has a delay equal to the shipping interval. Even if logs ship every 5 minutes, up to 5 minutes of data can be lost. Only synchronous replication guarantees zero data loss (RPO=0). For near-zero RPO, the answer is synchronous replication, not log shipping.

TrapNever testing backups

RealityThe exam treats untested backups as effectively no backup. Backup validation through test restores and hash verification is a required practice, not optional. Questions will specifically ask about backup validation — 'taking backups' and 'validating backups' are distinct practices.

Confusing Pairs

Incremental BackupDifferential Backup

Incremental = backs up changes since last backup OF ANY TYPE. Resets the archive bit. Short backup time, long restore time (need full + every incremental). Differential = backs up ALL changes since last FULL backup. Does not reset the archive bit. Grows in size each day. Longer backup time than incremental, shorter restore time (need only full + latest differential). Exam cue: 'which backup chain is faster to restore?' → differential. 'Which backup uses least storage per run?' → incremental.

RPORTO

RPO (Recovery Point Objective) = maximum acceptable DATA LOSS measured in time. 'We can lose up to 1 hour of data.' Drives backup frequency and replication method. RTO (Recovery Time Objective) = maximum acceptable DOWNTIME before recovery. 'We must be back online within 4 hours.' Drives standby architecture and failover automation. Both are measured in time but they quantify completely different business risks.

Log ShippingDatabase Mirroring

Log shipping = scheduled backup of transaction logs shipped to standby. Built-in delay = data loss risk equal to shipping interval. Usually manual failover. Simple to configure. Mirroring = real-time synchronized copy on mirror server. Can be synchronous (zero data loss) or asynchronous. Supports automatic failover with a witness server. Exam cue: 'minimize data loss and automate failover' → mirroring with synchronous mode + witness. 'Simple standby with acceptable delay' → log shipping.

FailoverFailback

Failover = switching from the failed primary to the standby server. Can be automatic (with a witness server or cluster) or manual. Failback = returning operations from the standby back to the repaired primary. Almost always manual and planned. Requires re-synchronization of the primary before switching back. Both must be documented and tested in the DR plan.

Scenario Tips

If the question asks about:

Full backups on Sunday, incrementals every other night. Database fails Thursday morning. What backups are needed to restore?

Answer:

Sunday full + Monday incremental + Tuesday incremental + Wednesday incremental — applied in order. Every incremental since the last full is required.

Distractor to avoid:

Sunday full + Wednesday incremental only. This is the differential restore logic, not incremental. Incremental restore requires the complete chain.

If the question asks about:

A financial trading database requires zero data loss and recovery within 5 minutes of any failure

Answer:

Synchronous replication with automatic failover. Synchronous = zero data loss (RPO=0). Automatic failover = near-zero downtime (RTO approaches 0).

Distractor to avoid:

Log shipping every 5 minutes. Log shipping with a 5-minute interval means up to 5 minutes of data loss — RPO is NOT zero. It also typically requires manual failover.

If the question asks about:

After a failover to the standby server, the original primary has been repaired and tested. What is the process of returning production traffic to the original primary?

Answer:

Failback — the deliberate, planned process of returning operations from the standby to the repaired primary.

Distractor to avoid:

Failover. Failover switches TO the standby. Failback returns FROM the standby to the primary. They are opposite operations.

Last-Minute Facts

1Incremental restore: full + every incremental in sequence. Differential restore: full + latest differential only

2RPO = data loss. RTO = downtime. Both measured in time. Different solutions.

3Synchronous replication = zero data loss, adds latency. Asynchronous = lower latency, small data loss risk

4Log shipping RPO = shipping interval (not zero). Mirroring (synchronous) RPO = 0

5Failback is NOT automatic — it is a separate planned operation requiring primary re-synchronization

6Backup validation is a separate exam question category from backup creation — candidates who 'take backups' but skip test restores will be marked wrong on questions asking what step is missing from a backup strategy. The exam tests whether you know that an untested backup is equivalent to no backup for DR purposes

Feeling confident?

Put your knowledge to the test with a timed DS0-001 mock exam.