You Can Pass This Exam For Free
Choose Your Study Path
Limited SQL or database development experience. You need to build foundational T-SQL skills and database design knowledge before tackling AI integration topics.
Exam Overview
Format
Approximately 50 questions, 120 minutes. Multiple choice, multiple select, case studies, and interactive lab-style questions.
Scoring
Scaled score 100-1000. Passing: 700. No penalty for wrong answers — always answer every question.
Domains & Weights
- Design and Develop Database Solutions37%
- Secure, Optimize, and Deploy Database Solutions37%
- Implement AI Capabilities in Database Solutions26%
Registration
$165 USD. Available at Pearson VUE testing centers or online proctored from home. Exam fee is $165 USD.
Topic Priority Table
Not all topics are tested equally. Focus your study time on Tier 1 first, then Tier 2. Tier 3 topics rarely appear — just recognize what they do.
Design and Develop Database Solutions
This domain covers the full breadth of database design and T-SQL development. You need to design tables with appropriate data types, indexes, and constraints, implement programmability objects like views and stored procedures, write advanced T-SQL including JSON functions, regular expressions, and graph queries, and use AI-assisted development tools like GitHub Copilot and MCP server connections.
Key Topics
Must-Know Concepts
- Table design: choosing appropriate data types, sizes, columns, clustered and nonclustered indexes, and columnstore indexes for the workload type
- Specialized table types: in-memory (MEMORY_OPTIMIZED), temporal (system-versioned with history table), external (referencing remote data), ledger (tamper-evident with cryptographic hashing), and graph (node and edge tables)
- JSON column design: creating JSON columns, JSON indexes, and using JSON_OBJECT, JSON_ARRAY, JSON_ARRAYAGG, JSON_CONTAINS, OPENJSON, and JSON_VALUE for querying semi-structured data
- Constraints: PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, and DEFAULT — know when to use each and how they enforce data integrity
- SEQUENCES: how to create and use sequences for generating ordered numeric values independently of tables
- Table and index partitioning: partition functions, partition schemes, and how partitioning improves manageability and query performance for large tables
- Programmability objects: creating views, scalar functions, table-valued functions, stored procedures, and triggers — know the use cases and limitations of each
- CTEs (Common Table Expressions): syntax, recursive CTEs for hierarchical data, and when CTEs are better than subqueries or temp tables
- Window functions: ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, LEAD, SUM/AVG/COUNT with OVER and PARTITION BY
- JSON functions: constructing JSON (JSON_OBJECT, JSON_ARRAY), parsing JSON (OPENJSON, JSON_VALUE), aggregating JSON (JSON_ARRAYAGG), and testing containment (JSON_CONTAINS)
- Regular expressions: REGEXP_LIKE (matching), REGEXP_REPLACE (replacement), REGEXP_SUBSTR (extraction), REGEXP_INSTR (position), REGEXP_COUNT (counting), REGEXP_MATCHES, and REGEXP_SPLIT_TO_TABLE
- Fuzzy string matching: EDIT_DISTANCE (Levenshtein distance), EDIT_DISTANCE_SIMILARITY (percentage match), and JARO_WINKLER_DISTANCE (positional similarity) for approximate matching
- Graph queries: creating node tables and edge tables, writing MATCH operator queries to traverse graph relationships
- Error handling: TRY/CATCH blocks, THROW, RAISERROR, @@ERROR, and XACT_ABORT for transactional error management
- AI-assisted tools: enabling GitHub Copilot, configuring instruction files, connecting to MCP server endpoints for SQL Server and Fabric lakehouse, and understanding the security implications of AI-assisted code generation
Common Exam Traps
Secure, Optimize, and Deploy Database Solutions
This domain covers data security features, performance optimization, CI/CD pipeline implementation with SQL Database Projects, and integration with Azure services including Data API builder. Together with Domain 1, these two domains represent about 74% of the exam. You need both conceptual knowledge and practical ability to implement these features.
Key Topics
Must-Know Concepts
- Always Encrypted: deterministic vs randomized encryption, enclave-enabled Always Encrypted, column master keys vs column encryption keys, and that the database engine NEVER sees plaintext
- Column-level encryption: encrypting specific columns using symmetric keys and certificates, different from Always Encrypted which is client-side
- Dynamic Data Masking: default, email, random, and custom string masking functions. Users with UNMASK permission bypass masking
- Row-Level Security: filter predicates (control which rows are returned by SELECT), block predicates (control which rows can be modified by INSERT/UPDATE/DELETE), and security policy objects
- Object-level permissions: GRANT, DENY, REVOKE on tables, views, stored procedures, schemas. Understand permission inheritance and the principle of least privilege
- Passwordless database access: Azure AD authentication, Managed Identity, eliminating connection strings with passwords
- Auditing: server-level and database-level auditing, audit action groups, writing to Azure Blob Storage or Log Analytics
- Securing endpoints: Managed Identity for model endpoints, authentication/authorization for GraphQL, REST, and MCP endpoints
- Query execution plans: reading actual vs estimated plans, identifying expensive operators (Table Scan, Key Lookup, Sort), and using plans to diagnose performance issues
- Dynamic Management Views (DMVs): sys.dm_exec_query_stats, sys.dm_exec_requests, sys.dm_os_wait_stats for monitoring query and server performance
- Query Store: enabling, configuring retention, identifying regressed queries, forcing plans, and comparing performance across time periods
- Query Performance Insight: Azure SQL dashboard for identifying top resource-consuming queries
- Blocking and deadlocks: identifying blocking chains, using sys.dm_exec_requests and sys.dm_tran_locks, implementing deadlock retry logic, and designing to minimize lock contention
- Transaction isolation levels: READ UNCOMMITTED (dirty reads), READ COMMITTED (default), REPEATABLE READ, SERIALIZABLE, and SNAPSHOT — know which phenomena each prevents
- SQL Database Projects: SDK-style .sqlproj files, build with dotnet build, schema comparison, deployment with SqlPackage or dacpac, and schema drift detection
- CI/CD pipeline design: source control integration, branching policies, pull request workflows, deployment triggers, approval gates, code owners, secrets management, and unit/integration testing strategies
- Data API builder (DAB): JSON configuration files, entity definitions for tables/views/stored procedures, REST and GraphQL endpoint configuration, pagination, filtering, searching, caching, and GraphQL relationships
- Azure Monitor: Application Insights for application-level telemetry, Log Analytics for centralized log queries, and diagnostic settings
- Change patterns: Change Data Capture (CDC), Change Tracking, Change Event Streaming (CES), Azure Functions with SQL trigger binding, and Azure Logic Apps for event-driven architectures
Common Exam Traps
Implement AI Capabilities in Database Solutions
This domain covers integrating AI directly into SQL databases: evaluating and managing external AI models, generating and maintaining embeddings, implementing intelligent search (full-text, vector, and hybrid), and building RAG workflows using sp_invoke_external_rest_endpoint. While the smallest domain by weight, this is the most novel material and a key differentiator for the certification.
Key Topics
Must-Know Concepts
- Evaluating external models: multimodal capabilities, multilanguage support, model sizes, structured output support, and cost considerations when choosing an AI model for your database solution
- Creating and managing external models: registering models with CREATE EXTERNAL MODEL, configuring credentials, and managing model lifecycle
- Embedding maintenance methods: table triggers, Change Tracking, Azure Functions with SQL trigger binding, Azure Logic Apps, CDC, Change Event Streaming (CES), and Microsoft Foundry — know which to use for different freshness requirements
- Choosing which columns to include in embeddings: identifying the most semantically meaningful columns, combining multiple columns for richer embeddings, and excluding irrelevant data
- Designing and implementing chunks for embeddings: splitting large text into appropriately sized chunks with overlap to maintain context across chunk boundaries
- Generating embeddings: using AI_GENERATE_EMBEDDINGS (the native T-SQL function that calls a registered external model) or sp_invoke_external_rest_endpoint (direct REST call to an AI API). AI_GENERATE_EMBEDDINGS requires a CREATE EXTERNAL MODEL registration and is the more concise, exam-preferred method
- Full-text search: creating full-text indexes, full-text catalogs, and using CONTAINS, FREETEXT, CONTAINSTABLE, and FREETEXTTABLE for keyword search
- Vector data type: declaring vector columns with dimensions, understanding the vector data type limitations, and storing embeddings
- VECTOR_DISTANCE: syntax with three distance metrics (cosine, Euclidean, dot product), when to use each metric, and the performance characteristics of exact search
- VECTOR_NORMALIZE: normalizes a vector to unit length using a specified norm type (norm1, norm2, norminf). Use before storing or comparing embeddings when magnitude should not affect similarity scores
- VECTORPROPERTY: returns metadata about a vector column — Dimensions (the count of dimensions as an integer) or BaseType (the underlying data type name). Used for inspecting vector schema programmatically
- VECTOR_SEARCH: using DiskANN indexes for approximate nearest neighbor search, understanding recall vs speed tradeoffs, and configuring vector index parameters
- ANN vs ENN: Approximate Nearest Neighbor (VECTOR_SEARCH with indexes, fast but approximate) vs Exact Nearest Neighbor (VECTOR_DISTANCE scanning all vectors, slow but precise)
- Hybrid search: combining full-text and vector search results for improved relevance using Reciprocal Rank Fusion (RRF) to merge ranked lists
- Evaluating search performance: measuring recall, precision, latency, and relevance of vector and hybrid search implementations
- RAG workflow: (1) receive user query, (2) generate query embedding, (3) search for relevant context using vector/hybrid search, (4) construct prompt with retrieved context, (5) send to language model via sp_invoke_external_rest_endpoint, (6) extract and return the response
- Converting structured data to JSON for language model processing: using FOR JSON, JSON_OBJECT, and JSON_ARRAY to format query results as model input
Common Exam Traps
Technologies and Concepts You Must Not Confuse
These pairs appear on nearly every exam. Learn the difference and you'll avoid the most common traps.
Top Mistakes to Avoid
Exam-Ready Checklist
Recommended Resources
Free & Official Resources
Paid Courses & Practice Exams
These are recommended if you prefer a structured learning path. They can save time but are not required to pass.