CertPrepNow
MicrosoftDP-600Updated 2026-06-15

DP-600 Study Guide

Everything you need to pass the Microsoft Fabric Analytics Engineer Associate exam. Structured study plans, key services, common traps, and practice questions.

You Can Pass This Exam For Free

The DP-600 exam is passable with free resources alone if you study consistently for 6-10 weeks:

  • Microsoft Learn DP-600 official learning path (free, self-paced)
  • Microsoft Fabric free 60-day trial for hands-on practice
  • Official DP-600 exam study guide with objectives breakdown (free)
  • Microsoft Fabric documentation on Lakehouse, Warehouse, and semantic models (free)
  • Microsoft Learn free practice assessment for DP-600
  • 500+ free practice questions on this site

Microsoft Fabric has a generous free trial (60 days) that gives you full access to build lakehouses, warehouses, dataflows, and semantic models. Combined with the official Microsoft Learn paths, you can cover every exam objective without spending anything beyond the $165 exam fee.

Choose Your Study Path

Limited experience with Power BI, data warehousing, or data engineering. You need to learn the Fabric ecosystem from scratch before tackling analytics engineering topics.

Week 1Learn Microsoft Fabric fundamentals: OneLake architecture, workspaces, capacity model, and how lakehouses, warehouses, and semantic models relate to each other
Week 2Study data ingestion methods: Data Pipelines, Dataflows Gen2, Notebooks, and the Copy activity. Understand when to use each tool and practice creating connections
Week 3Deep dive into Lakehouse: create tables, work with Delta format, use Apache Spark notebooks, and understand the SQL analytics endpoint (read-only vs. read-write)
Week 4Study Warehouse: T-SQL for creating views, stored procedures, and functions. Understand ACID transactions and how Warehouse differs from Lakehouse
Week 5Learn star schema design: fact tables, dimension tables, Type 1 and Type 2 slowly changing dimensions, bridge tables, and denormalization patterns
Week 6Study semantic models: storage modes (Import, DirectQuery, Direct Lake), relationships, DAX basics (CALCULATE, FILTER, iterators, variables)
Week 7Learn security and governance: workspace access controls, RLS, OLS, CLS, sensitivity labels, endorsement, and deployment pipelines with Git integration
Week 8Study query languages: SQL for warehouses, KQL for Real-Time Intelligence, DAX for semantic models. Practice writing queries in each language
Week 9Learn performance optimization: Direct Lake configuration, incremental refresh, DAX performance tuning, and monitoring capacity usage
Week 10Take full practice exams, review all incorrect answers. Focus on Prepare Data domain which is 47.5% of the exam. Schedule your exam when scoring 75%+

Exam Overview

Format

40-60 questions, 100 minutes. Multiple choice, case studies, drag-and-drop, and interactive scenario-based questions.

Scoring

Scaled score 100-1000. Passing: 700. No penalty for wrong answers — always guess if unsure.

Domains & Weights

  • Maintain a Data Analytics Solution27.5%
  • Prepare Data47.5%
  • Implement and Manage Semantic Models25%

Registration

$165 USD. Available at Pearson VUE testing centers or online proctored from home. Exam fee is $165 USD.

Topic Priority Table

Not all topics are tested equally. Focus your study time on Tier 1 first, then Tier 2. Tier 3 topics rarely appear — just recognize what they do.

Tier 1: Must KnowYou must understand these services deeply, know their capabilities and limitations, and be able to apply them in scenario-based questions. These appear across multiple domains.
Tier 2: Should KnowUnderstand what these are and their key characteristics. May appear in 2-5 questions each.
Tier 3: Recognize OnlyKnow what these are at a high level. Rarely more than 1-2 questions each.
Domain 127.5% of exam

Maintain a Data Analytics Solution

This domain covers security, governance, and lifecycle management for Fabric analytics solutions. You need to implement granular access controls (workspace, item, row, column, object, and file level), apply sensitivity labels, manage deployment pipelines, configure Git integration, and work with reusable assets like PBIP files and shared semantic models. While not the largest domain, getting security wrong means losing easy points.

Key Topics

Workspace Access ControlsRLSOLSCLSSensitivity LabelsDeployment PipelinesGit IntegrationXMLA EndpointPBIPImpact Analysis

Must-Know Concepts

  • Workspace-level access: Admin, Member, Contributor, and Viewer roles with specific permissions for each
  • Item-level access: per-item permissions that override workspace roles for specific lakehouses, warehouses, or semantic models
  • Row-level security (RLS): DAX-based row filtering defined in roles. Configure in Power BI Desktop, test and manage in the service
  • Column-level security (CLS): restricts access to specific columns at the data layer in Warehouse or Lakehouse SQL endpoint
  • Object-level security (OLS): hides entire tables or columns from the semantic model. Requires Tabular Editor — cannot be configured natively in Power BI Desktop
  • File-level access control: managing who can access files stored in the Lakehouse Files section
  • Sensitivity labels: Microsoft Purview labels (Confidential, Internal, Public) applied to Fabric items that propagate to downstream assets
  • Endorsement: Promoted and Certified badges that signal trustworthy data assets to consumers
  • Git integration: connecting workspaces to Azure DevOps or GitHub repos for version control, branching, and pull requests
  • Deployment pipelines: promoting content through Dev, Test, and Production stages with comparison and selective deployment
  • PBIP files: Power BI Project format that stores reports as JSON for Git-friendly version control
  • XMLA endpoint: external tool connectivity for deploying and managing semantic models via SSMS, Tabular Editor, or ALM Toolkit
  • Impact analysis: understanding downstream dependencies when changes are made to lakehouses, warehouses, dataflows, or semantic models
  • Reusable assets: PBIT templates (no data), PBIDS data source files, and shared semantic models

Common Exam Traps

OLS requires Tabular Editor or another external tool. You CANNOT create OLS rules natively in Power BI Desktop — only RLS can be configured there
Workspace Viewer role can view content but cannot build reports on shared semantic models. You need Build permission for that
Sensitivity labels propagate DOWNSTREAM. If a Lakehouse is labeled Confidential, items derived from it inherit the label
Deployment pipelines and Git integration serve different purposes: pipelines for environment promotion, Git for version control. They are complementary
The SQL analytics endpoint of a Lakehouse is READ-ONLY. You cannot implement CLS through the Lakehouse SQL endpoint the same way as in a Warehouse
Quick Check: Maintain a Data Analytics Solution

Question 1 of 3

A team needs to ensure that sales managers can only see revenue data for their own region in Power BI reports built on a Fabric semantic model. Which security feature should they implement?

Domain 247.5% of exam

Prepare Data

The heaviest domain at 47.5% — expect roughly 20-25 questions on this topic. Covers the full data lifecycle: getting data into Fabric, transforming it, and querying it for analysis. You must know how to create connections, ingest data, choose between data stores, implement star schemas, clean and transform data, and write queries in SQL, KQL, and DAX. Master this domain or you will not pass.

Key Topics

LakehouseWarehouseOneLakeDataflows Gen2Data PipelinesSpark NotebooksSQLKQLDAXStar Schema

Must-Know Concepts

  • Data connections: how to create connections to external sources and Fabric internal items for ingestion
  • Data discovery: using OneLake catalog to browse organizational data assets and Real-Time Hub to find streaming data sources
  • Choosing between data stores: Lakehouse for mixed data types and Spark workloads, Warehouse for structured SQL-heavy analytics, Eventhouse for real-time streaming data
  • OneLake integration: how Eventhouse and semantic models connect to OneLake for unified data access
  • Creating views, functions, and stored procedures in Warehouse using T-SQL for reusable transformation logic
  • Enriching data: adding new calculated columns or lookup tables to enhance analytical value
  • Star schema implementation: fact tables, dimension tables, SCD Type 1 (overwrite) and Type 2 (history tracking), bridge tables for many-to-many relationships
  • Denormalization: flattening normalized tables into wider tables for query performance in analytics scenarios
  • Aggregation: pre-computing summary data to improve query performance on large datasets
  • Data quality: identifying and resolving duplicate data, handling missing data and null values, converting column data types
  • Merge and join operations: combining data from multiple sources using inner, outer, left, right, and cross joins
  • SQL querying: SELECT, WHERE, GROUP BY, HAVING, JOIN, window functions, and CTEs for warehouse and lakehouse data
  • KQL querying: basic Kusto syntax for filtering, aggregating, and analyzing real-time data in Eventhouse
  • DAX querying: using DAX to query semantic models including EVALUATE, SUMMARIZE, CALCULATE, and table functions
  • Visual Query Editor: building queries graphically without writing code

Common Exam Traps

Lakehouse SQL analytics endpoint is READ-ONLY. If a question asks you to create views or stored procedures, the answer is Warehouse, not Lakehouse
SCD Type 1 OVERWRITES the old value with the new one (no history). SCD Type 2 ADDS a new row preserving the old record with effective dates. The exam tests both
Dataflows Gen2 output to destinations (Lakehouse, Warehouse). They do NOT replace Notebooks for complex transformations — the exam tests when each tool is appropriate
KQL is for Real-Time Intelligence (Eventhouse), NOT for Lakehouse or Warehouse queries. Do not confuse which query language works with which data store
Bridge tables solve many-to-many relationships in star schemas. The exam may present a scenario where a direct many-to-many relationship seems simpler but a bridge table is the correct answer
Quick Check: Prepare Data

Question 1 of 4

A data engineer needs to transform raw JSON files into structured Delta tables with complex schema inference and incremental loading logic. The data volume is 500GB. Which Fabric tool is most appropriate?

Domain 325% of exam

Implement and Manage Semantic Models

This domain covers designing, building, and optimizing Power BI semantic models within Fabric. You need to understand storage modes (Import, DirectQuery, Direct Lake), relationship design including bridge tables and many-to-many, DAX calculations (variables, iterators, table filtering, windowing), calculation groups, large model storage format, composite models, and enterprise-scale optimization including Direct Lake configuration and incremental refresh.

Key Topics

Semantic ModelsDirect LakeDAXComposite ModelsCalculation GroupsIncremental RefreshImport ModeDirectQuery

Must-Know Concepts

  • Storage modes: Import (data copied into model), DirectQuery (live queries to source), Direct Lake (reads Delta tables from OneLake into memory)
  • Direct Lake configuration: two variants exist — Direct Lake on OneLake (no DirectQuery fallback, multi-item support, composite models with Import supported) vs. Direct Lake on SQL endpoints (falls back to DirectQuery, single SQL endpoint only, no composite model support with DirectQuery/Dual tables). Know refresh settings and the Direct Lake behavior property that controls fallback for SQL endpoints
  • Direct Lake on OneLake: semantic model can reference tables from multiple Fabric items (e.g., Lakehouse A + Warehouse B in one model). No DirectQuery fallback — guardrail violations cause refresh failure
  • Composite models: Direct Lake on OneLake supports mixing with Import tables from any data source. Direct Lake on SQL endpoints does NOT support composite models with DirectQuery or Dual tables
  • Relationships: one-to-many, many-to-one, many-to-many, and bridge tables to resolve complex relationships
  • DAX variables: using VAR/RETURN to improve readability and performance by storing intermediate calculations
  • DAX iterators: SUMX, AVERAGEX, MAXX, MINX — row-by-row calculations that iterate over tables
  • DAX table filtering: CALCULATE with FILTER, ALL, ALLEXCEPT, KEEPFILTERS for modifying filter context
  • DAX windowing functions: OFFSET, INDEX, WINDOW for calculations relative to the current row position
  • DAX information functions: ISBLANK, ISERROR, HASONEVALUE for conditional logic in measures
  • Calculation groups: reusable calculation patterns (YTD, QTD, same period last year) applied across multiple measures
  • Dynamic format strings: DAX expressions that change measure formatting based on context
  • Field parameters: enabling users to dynamically swap axes and measures in report visuals
  • Large semantic model storage format: enabling models larger than the default 10GB per dataset to support enterprise workloads
  • Incremental refresh: configuring date-based partitioning to refresh only new or changed data partitions
  • DAX performance: minimizing iterator nesting, using variables, avoiding CALCULATE within iterators, and leveraging aggregations

Common Exam Traps

Direct Lake fallback behavior differs by variant: Direct Lake on SQL endpoints falls back to DirectQuery when it cannot load data directly (e.g., SQL views, RLS enforcement, or guardrail breaches). Direct Lake on OneLake does NOT fall back to DirectQuery — if guardrails are exceeded or data cannot be framed, refresh fails and queries fail. Know both variants and their fallback behavior
Direct Lake on OneLake can mix tables from multiple Fabric items and supports composite models with Import tables. Direct Lake on SQL endpoints is limited to a single SQL endpoint and does NOT support composite models with DirectQuery or Dual tables. The exam tests both distinctions
Composite models mixing Direct Lake and Import tables have different performance characteristics than pure Direct Lake models — the Import portion requires scheduled refresh
Calculation groups apply to ALL measures by default. You must explicitly exclude measures that should not be affected
CALCULATE changes the filter context. FILTER is a table function that returns rows. Using FILTER inside CALCULATE is common but understand that CALCULATE alone can handle simple filter modifications
Incremental refresh requires a DateTime column and configured RangeStart/RangeEnd parameters. Without these parameters, incremental refresh cannot be set up
Quick Check: Implement and Manage Semantic Models

Question 1 of 3

A company wants their Power BI semantic model to access Delta tables from both a Lakehouse and a Warehouse in the same Fabric workspace without data duplication. Which configuration should they use?

Fabric Services You Must Not Confuse

These pairs appear on nearly every exam. Learn the difference and you'll avoid the most common traps.

Lakehouse vs Warehouse

Use Lakehouse when…

Stores structured, semi-structured, and unstructured data. Uses Apache Spark for transformations. SQL analytics endpoint is read-only. Best for data engineers working with raw data and Spark-based ETL.

Use Warehouse when…

Stores structured data only. Full T-SQL read-write support with ACID transactions, views, stored procedures, and functions. Best for SQL developers building reporting layers.

Exam trap

Lakehouse SQL endpoint is READ-ONLY. Only the Warehouse supports full T-SQL write operations (INSERT, UPDATE, DELETE). If a question requires DML statements, the answer is Warehouse.

Direct Lake vs DirectQuery

Use Direct Lake when…

Reads Delta tables from OneLake directly into memory. Near-real-time data without scheduled refresh overhead. Direct Lake on OneLake does NOT fall back — if guardrails are exceeded, refresh fails and queries fail. Direct Lake on SQL endpoints falls back to DirectQuery.

Use DirectQuery when…

Sends queries to the source system at report interaction time. Always live data but slower due to query translation and network round trips.

Exam trap

Direct Lake loads data INTO MEMORY from OneLake. Whether fallback to DirectQuery occurs depends on the variant: Direct Lake on SQL endpoints falls back to DirectQuery for SQL views, RLS enforcement, or guardrail breaches. Direct Lake on OneLake does NOT fall back — guardrail violations cause refresh failure instead. The exam explicitly tests this distinction.

Dataflows Gen2 vs Spark Notebooks

Use Dataflows Gen2 when…

Low-code/no-code Power Query interface. Best for simple to moderate transformations, 100+ connectors, and business users. Higher compute cost per operation.

Use Spark Notebooks when…

Code-first environment (PySpark, Spark SQL, Scala, R). Best for complex transformations, large datasets, machine learning, and cost-efficient production pipelines.

Exam trap

Dataflows Gen2 are easier but more expensive per CU. Notebooks are more flexible and cheaper at scale. The exam tests scenarios where one is clearly better than the other — look for keywords like 'complex joins,' 'schema drift,' or 'low-code.'

Row-Level Security (RLS) vs Object-Level Security (OLS)

Use Row-Level Security (RLS) when…

Filters data ROWS dynamically using DAX expressions. Users see the table and columns but only their permitted rows. Configured in Power BI Desktop.

Use Object-Level Security (OLS) when…

Hides entire tables or columns completely. Users cannot see hidden objects in the field list at all. Requires external tools like Tabular Editor.

Exam trap

RLS filters ROWS (users see the table but restricted data). OLS hides OBJECTS (tables/columns are completely invisible). Also know Column-Level Security (CLS) which restricts specific columns but at the data layer, not the model layer.

Import Mode vs Direct Lake Mode

Use Import Mode when…

Copies data into the Power BI engine at scheduled refresh time. Best for small to medium datasets. Requires explicit refresh to get new data.

Use Direct Lake Mode when…

Reads Delta tables directly from OneLake into memory. No scheduled import needed. Near-real-time data availability with high performance.

Exam trap

Import mode creates a COPY of the data inside the model. Direct Lake reads from OneLake WITHOUT copying — it loads directly into memory. Import mode requires scheduled refresh; Direct Lake auto-syncs when Delta tables change.

Deployment Pipelines vs Git Integration

Use Deployment Pipelines when…

Promotes Fabric items through Dev/Test/Prod stages within the Fabric service. Focuses on deployment and environment management.

Use Git Integration when…

Syncs workspace items to a Git repository (Azure DevOps or GitHub) for version control, branching, and pull request workflows.

Exam trap

Deployment pipelines handle PROMOTION between environments. Git integration handles VERSION CONTROL. They are complementary, not alternatives. Use Git for source control and deployment pipelines for staging promotions.

Star Schema in Lakehouse vs Star Schema in Semantic Model

Use Star Schema in Lakehouse when…

Physical star schema created with Delta tables in the Lakehouse using Spark or SQL. Data is stored in fact and dimension tables in OneLake.

Use Star Schema in Semantic Model when…

Logical star schema built in the semantic model layer through relationships between tables. May reference the same physical tables via Direct Lake.

Exam trap

You can build star schemas at BOTH levels. The exam tests when to create the star schema upstream (in Lakehouse/Warehouse) vs. in the semantic model. Best practice is to model the star schema in the data layer and have the semantic model reflect it.

PBIX File vs PBIP (Power BI Project)

Use PBIX File when…

Binary file format for Power BI reports. Contains the report, data model, and data in a single compressed file. Not Git-friendly.

Use PBIP (Power BI Project) when…

Folder-based format that stores report components as human-readable JSON files. Designed for Git version control and team collaboration.

Exam trap

PBIP is the Git-friendly format. PBIX is the traditional binary format. If a question asks about version control or team collaboration for Power BI development, the answer involves PBIP, not PBIX.

Top Mistakes to Avoid

Confusing Lakehouse SQL analytics endpoint (read-only) with Warehouse (full T-SQL read-write) — stored procedures, views with DML, and functions require the Warehouse
Mixing up Direct Lake (reads from OneLake into memory) with DirectQuery (queries source system live) — Direct Lake is faster because data is loaded into memory, not queried remotely
Assuming Direct Lake always falls back to DirectQuery — only Direct Lake on SQL endpoints falls back; Direct Lake on OneLake does NOT fall back to DirectQuery (guardrail violations cause refresh failure instead)
Confusing RLS (filters rows with DAX) with OLS (hides entire tables or columns) — and forgetting that OLS requires Tabular Editor, not Power BI Desktop
Using Dataflows Gen2 when Spark Notebooks are more appropriate — Dataflows are low-code but expensive and limited for complex transformations on large datasets
Confusing SCD Type 1 (overwrites old value, no history) with SCD Type 2 (adds new row, preserves history with effective dates)
Thinking PBIX and PBIP are interchangeable — PBIP is the Git-friendly folder format designed for version control, PBIX is the binary format
Using KQL syntax for Warehouse queries or T-SQL for Eventhouse queries — each data store has its specific query language
Confusing deployment pipelines (environment promotion: Dev to Test to Prod) with Git integration (version control: commits, branches, pull requests)
Forgetting that calculation groups affect ALL measures by default — you must explicitly exclude measures that should not be modified
Not setting up RangeStart and RangeEnd parameters before configuring incremental refresh — these parameters are required prerequisites
Thinking Direct Lake on OneLake and Direct Lake on SQL endpoints are the same — OneLake supports multiple source items, SQL endpoints are limited to one

Exam-Ready Checklist

Can explain all 3 exam domains and their relative weights (27.5% Maintain, 47.5% Prepare Data, 25% Semantic Models)
Know when to use Lakehouse vs. Warehouse vs. Eventhouse and the specific capabilities of each
Can configure all security layers: workspace access, item-level, RLS, CLS, OLS, file-level access, and sensitivity labels
Understand the three semantic model storage modes (Import, DirectQuery, Direct Lake) and when to use each
Can explain Direct Lake variants: OneLake variant does NOT fall back to DirectQuery (guardrail violations cause refresh failure); SQL endpoints variant DOES fall back to DirectQuery for SQL views, RLS, and guardrail breaches
Know how to implement star schemas: fact tables, dimension tables, SCD Type 1 and Type 2, bridge tables, and denormalization
Can write DAX measures using variables, iterators (SUMX, AVERAGEX), CALCULATE with filter modifications, and windowing functions
Understand calculation groups, field parameters, and dynamic format strings and when to apply each
Know the full lifecycle management stack: Git integration, deployment pipelines, PBIP format, and XMLA endpoint
Can choose the right data ingestion tool: Dataflows Gen2 for low-code, Notebooks for complex ETL, Pipelines for orchestration
Understand incremental refresh setup including RangeStart/RangeEnd parameters and partition management
Can write basic queries in SQL (for Warehouse), KQL (for Eventhouse), and DAX (for semantic models)
Scored 75%+ on at least two full mock exams (700/1000 passing score)
Reviewed all incorrect answers — Prepare Data domain is 47.5% of the exam, so weak performance there is fatal

Recommended Resources

Free & Official Resources

Paid Courses & Practice Exams

These are recommended if you prefer a structured learning path. They can save time but are not required to pass.

Frequently Asked Questions