You Can Pass This Exam For Free
Choose Your Study Path
You have general Python and data engineering skills but limited experience with LLMs, RAG, or Databricks Mosaic AI. You need to build foundational GenAI knowledge before tackling Databricks-specific implementation.
Exam Overview
Format
45 multiple-choice questions, 90 minutes. Proctored through Kryterion Webassessor online or at testing centers.
Scoring
Percentage-based scoring. Passing: 70% (32 out of 45 questions, since 70% of 45 is 31.5). There is no penalty for wrong answers, so answer every question.
Domains & Weights
- Design Applications: 14%
- Data Preparation: 14%
- Application Development: 30%
- Assembling and Deploying Applications: 22%
- Governance: 8%
- Evaluation and Monitoring: 12%
Registration
$200 USD. Register at Databricks Academy (academy.databricks.com). The exam is delivered through Kryterion Webassessor, either online proctored or at select testing centers.
Topic Priority Table
Not all topics are tested equally. Focus your study time on Tier 1 first, then Tier 2. Tier 3 topics rarely appear — just recognize what they do.
Design Applications
This domain covers designing GenAI application architectures: crafting effective prompts, selecting appropriate models for business requirements, choosing chain components, converting business goals into AI pipeline specifications, and designing agentic systems with tools and Agent Bricks.
Key Topics
Must-Know Concepts
- Craft prompts that yield specifically formatted responses: use explicit format instructions, JSON schema examples in the prompt, and output parsers to enforce structure
- Select model tasks by matching LLM capabilities to business requirements: instruction following, classification, extraction, summarization, code generation, and multi-step reasoning
- Chain components: retriever, prompt template, LLM, output parser, memory. Know what each does and how they connect in LCEL
- Convert a business goal (e.g., 'answer employee HR questions') into an AI pipeline: identify inputs, required context sources, output format, and quality constraints
- Define and sequence tools for multi-stage reasoning: each tool needs a clear name, description, and typed input/output schema so the LLM can decide when to use it
- Determine Agent Bricks usage: Knowledge Assistant for RAG Q&A, Multiagent Supervisor for routing to specialized agents, Information Extraction for structured output from text
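The first concept above (explicit format instructions plus an output parser) can be sketched in plain Python. This is a minimal illustration, not a LangChain or Databricks API: the template wording, the `category`/`urgency` fields, and the helper names are all hypothetical.

```python
import json

# Hypothetical prompt template: field names and wording are illustrative only.
# Doubled braces are literal braces after str.format().
PROMPT_TEMPLATE = (
    "Classify the support ticket below.\n"
    "Respond with JSON only, matching the shape of this example:\n"
    '{{"category": "billing", "urgency": "high"}}\n\n'
    "Ticket: {ticket}"
)

REQUIRED_KEYS = {"category", "urgency"}


def build_prompt(ticket: str) -> str:
    """Fill the template with the user's ticket text."""
    return PROMPT_TEMPLATE.format(ticket=ticket)


def parse_response(raw: str) -> dict:
    """Parse the model's reply and fail loudly if the structure is wrong."""
    data = json.loads(raw)  # raises a ValueError subclass on non-JSON replies
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

The prompt states the format and shows an example; the parser enforces it, so a malformed reply is caught before it reaches downstream code.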
Common Exam Traps
Data Preparation
This domain covers preparing data for RAG applications: selecting appropriate chunking strategies, cleaning source documents, choosing extraction libraries, writing chunks to Delta Lake in Unity Catalog, identifying source documents, evaluating retrieval quality, and understanding re-ranking.
Key Topics
Must-Know Concepts
- Chunking strategies and when to use each: fixed-size (uniform content), sentence/paragraph (preserves semantic units), recursive (hierarchy-aware for documents with headers/sections), semantic (topic-based clustering)
- Chunk overlap: include overlapping tokens between adjacent chunks to avoid losing context at chunk boundaries. Typical: 10-20% of chunk size
- Document content extraction libraries: PyPDF2/pdfplumber for PDFs, python-docx for Word, BeautifulSoup/html2text for HTML, unstructured for mixed types
- Extraneous content to remove before chunking: headers, footers, page numbers, navigation menus, legal disclaimers, ads, formatting artifacts. These degrade retrieval relevance
- Writing chunks to Delta Lake: store in a Unity Catalog table with columns for chunk_id, source_document, chunk_text, metadata. This table is the source for Delta Sync Vector Search index
- Retrieval evaluation metrics: Precision@k (what fraction of retrieved chunks are relevant), Recall@k (what fraction of all relevant chunks are retrieved), MRR (Mean Reciprocal Rank of first relevant result)
- Re-ranking: apply a cross-encoder model after initial vector search to re-score and reorder results. Improves precision at the cost of added latency
- Source document identification: determine which documents are authoritative for the RAG application. Not all available data should be in the vector store
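To make chunk overlap and the retrieval metrics concrete, here is a small stdlib-only sketch. It assumes the text is already tokenized into a list, and it divides Precision@k by k even when fewer than k results come back (conventions differ on that point). All function names are illustrative.

```python
def chunk_with_overlap(tokens, chunk_size, overlap):
    """Fixed-size chunks where adjacent chunks share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the document
    return chunks


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved chunks that are relevant."""
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant chunks that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)


def mean_reciprocal_rank(retrieved_per_query, relevant_per_query):
    """Average of 1/rank of the first relevant result (0 if none is found)."""
    total = 0.0
    for retrieved, relevant in zip(retrieved_per_query, relevant_per_query):
        for rank, doc in enumerate(retrieved, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(retrieved_per_query)
```

For example, with `chunk_size=4` and `overlap=1`, each chunk repeats the last token of the previous one, so a sentence that straddles a boundary is still partly visible in both chunks.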
Common Exam Traps
Application Development
The heaviest domain at 30%. Covers LangChain and tool selection, response quality assessment, chunking strategy selection based on evaluation, prompt augmentation, guardrail implementation, LLM selection, embedding model selection, model hub usage, MLflow lifecycle, and agentic system development.
Key Topics
Must-Know Concepts
- LangChain LCEL: pipe syntax for composing chains (retriever | prompt | llm | output_parser). Each component is a Runnable. Know ChatPromptTemplate, retriever integration, and chain invocation
- Assessing response quality qualitatively: hallucination, incomplete answers, incorrect tone, format violations, safety issues. Know how to identify these without automated metrics
- Choosing chunking strategy based on model context length and retrieval evaluation results: if eval metrics are poor, iterate on chunking strategy
- Augmenting prompts with context: inject retrieved chunks into the prompt template using {context} placeholder. Structure the prompt to clearly separate context from the user question
- LLM guardrails: select techniques based on threat type — input classifiers for injection/harmful requests, output validators for PII/hallucination, topic classifiers for relevance
- Selecting LLMs based on application attributes: task type (instruction following, code gen, multi-step reasoning), latency requirements, cost constraints, context window needs, and multilingual requirements
- Embedding model context length: the embedding model must support a context length at least as long as the largest chunk. Chunks exceeding context length are truncated, degrading quality
- Selecting models from hubs: use metadata filters (task, context length, benchmark scores, license, cost) to shortlist candidates
- MLflow for GenAI lifecycle: log experiments, compare evaluation metrics across runs, register best model to Unity Catalog, track which prompt/model/data version produced which results
- Agentic systems with the Mosaic AI Agent Framework and MLflow: define tools, build the agent loop, trace executions, evaluate agent performance
- Multi-agent systems: Genie Spaces for data access, conversational APIs for inter-agent communication, supervisor pattern for routing
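Prompt augmentation with a `{context}` placeholder can be sketched without any framework. The template text, section markers, and helper names below are assumptions for illustration; in LangChain the same idea is usually expressed with a ChatPromptTemplate.

```python
# Hypothetical RAG template: the "### Context" / "### Question" markers are one
# possible way to separate retrieved text from the user's question.
RAG_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the context does not contain the answer, say you do not know.\n\n"
    "### Context\n{context}\n\n"
    "### Question\n{question}"
)


def format_chunks(chunks):
    """Number each retrieved chunk so the model (and a reviewer) can refer to it."""
    return "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )


def augment_prompt(question, chunks):
    """Inject retrieved chunks and the user question into the template."""
    return RAG_TEMPLATE.format(context=format_chunks(chunks), question=question)
```

Keeping context and question in clearly labeled sections makes it easier for the model to ground its answer and harder for text inside a retrieved chunk to be mistaken for the user's request.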
Common Exam Traps
Assembling and Deploying Applications
This domain covers the technical implementation of deploying GenAI applications: coding pyfunc models, managing resource access, creating Vector Search indexes, registering models to Unity Catalog via MLflow, serving LLM applications, batch inference with ai_query(), CI/CD practices, MCP server integration, prompt lifecycle management, and building user interfaces.
Key Topics
Must-Know Concepts
- Code pyfunc models: implement a class extending mlflow.pyfunc.PythonModel with a predict(context, model_input) method. Use for chains with custom pre/post-processing
- Resource access from serving endpoints: grant the endpoint's service principal permissions to Vector Search indexes, Delta tables, secrets, and external APIs
- Coding simple chains: retriever + prompt + LLM using LCEL. Know the minimal implementation of a functional RAG chain
- RAG application MLflow model elements: model flavor (langchain or pyfunc), embedding model reference, retriever configuration, Unity Catalog dependencies, input examples, and model signature
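- Registering models: mlflow.<flavor>.log_model() (for example mlflow.pyfunc.log_model() or mlflow.langchain.log_model()) logs the artifact; mlflow.register_model() creates a Unity Catalog model version. Both logging and registration are required for deployment; passing registered_model_name to log_model() does both in one call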
- Creating and querying Vector Search indexes: create index via SDK or UI, specify source Delta table (for Delta Sync) or schema (for Direct Access), query with similarity_search()
- Serving LLM applications: deploy registered MLflow models to Databricks Model Serving. Configure compute type, concurrency, and environment variables
- ai_query() syntax: SELECT ai_query('endpoint_name', prompt_column) FROM source_table. The first argument is the name of a Model Serving endpoint, not a three-level Unity Catalog name. Enables batch inference in SQL
- Configuring Vector Search parameters: embedding model, index type (Delta Sync vs Direct Access), sync schedule/trigger, similarity metric (cosine vs dot product), and latency/cost trade-offs
- CI/CD for GenAI: automate Vector Search index updates when source Delta table changes, promote tested prompts from dev to prod via version control, run component integration tests
- MCP server types: Managed (Unity Catalog functions), External (third-party tool providers), Custom (user-implemented Python servers)
- Prompt version control: track prompt versions as MLflow artifacts, use lifecycle stages (development, staging, production) for promotion
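The cosine versus dot product distinction mentioned under Vector Search parameters is easy to see with toy vectors. This sketch is independent of any Databricks API; the vectors and function names are made up for illustration.

```python
import math


def dot(a, b):
    """Dot product: grows with both direction agreement and vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: direction agreement only, length is normalized away."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / norm


query = [1.0, 0.0]
same_direction = [1.0, 0.0]   # identical direction, length 1
longer_off_axis = [3.0, 3.0]  # 45 degrees away, but a much larger magnitude
```

Here the dot product ranks `longer_off_axis` above `same_direction` purely because of its length, while cosine ranks `same_direction` first. When embeddings are normalized to unit length, the two metrics produce the same ordering.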
Common Exam Traps
Governance
This domain covers governance of GenAI applications: applying guardrails for performance and safety objectives, selecting guardrail techniques against specific threats, addressing legal and licensing requirements for data sources, and recommending alternatives for problematic text in GenAI data.
Key Topics
Must-Know Concepts
- PII masking as a guardrail: detect and replace PII (names, emails, SSNs, phone numbers, addresses) in both inputs and outputs using NER models or regex patterns
- Guardrail techniques for malicious inputs: input classifiers for harmful content, prompt injection detectors, topic scope enforcers, jailbreak detection
- Legal and licensing requirements: understand Creative Commons licenses (CC-BY, CC-BY-SA, CC-BY-NC), copyright restrictions on training data and RAG source documents
- Commercial restrictions: some licenses (CC-BY-NC) prohibit commercial use. Validate all RAG data sources for license compatibility before production deployment
- Alternatives for problematic text: replace with filtered/curated datasets, use content policy flags to skip problematic documents, or apply post-processing to sanitize outputs
- Unity Catalog permissions for GenAI: model serving endpoints run with service principal identity. Grant only necessary privileges following least-privilege principles
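A regex-based PII mask, the simpler of the two approaches named above, might look like the following. The patterns are deliberately narrow assumptions (US-style SSNs and phone numbers, common email shapes); names and street addresses generally need an NER model and are not covered here.

```python
import re

# Illustrative patterns only; real deployments need broader, tested coverage.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def mask_pii(text: str) -> str:
    """Replace each detected PII span with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

The same function can be applied to the user input before it reaches the LLM and to the model output before it is returned, which matches the "both inputs and outputs" guidance.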
Common Exam Traps
Evaluation and Monitoring
This domain covers evaluating and monitoring deployed GenAI applications: selecting LLMs using quantitative metrics, monitoring deployed endpoints, evaluating agents with MLflow, using inference logging, cost control, tracking with Agent Monitoring, identifying evaluation judges, using AI Gateway features, applying custom Scorers, and incorporating SME feedback.
Key Topics
Must-Know Concepts
- mlflow.evaluate() API: pass model URI or function, evaluation dataset, targets column (for ground truth metrics), and list of evaluators/metrics. Results logged to the MLflow run
- LLM-judge metrics (no ground truth needed): faithfulness, answer_relevance, harmfulness, coherence, fluency. Use a powerful LLM endpoint as the judge
- Ground truth metrics (require labeled answers): answer_correctness, exact_match, ROUGE, BLEU. Need a curated test dataset with reference answers
- MLflow Tracing for agents: automatically captures the full execution trace. Identify which tool calls failed, which reasoning steps were wrong, and where latency is spent
- Inference Tables: auto-log every request and response at a model serving endpoint to a Delta table. Enable with one configuration setting on the endpoint
- Agent Monitoring (Lakehouse Monitoring): analyze inference table data to track quality metrics, latency distributions, error rates, and drift over time
- AI Gateway rate limiting: configure requests-per-minute or tokens-per-minute limits per endpoint or user to control costs and prevent abuse
- Usage Tables: log token counts and cost estimates per request through AI Gateway. Join with Inference Tables for cost-quality analysis
- Databricks Scorers: register Python functions as custom mlflow evaluators that score outputs on domain-specific criteria beyond built-in metrics
- SME feedback: collect expert ratings via review apps, annotate correct/incorrect responses, use annotations to identify prompt weaknesses and update RAG content
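The difference between a ground-truth metric and a custom domain-specific scorer can be sketched in plain Python. This is not the MLflow evaluator or Scorer API; the row layout, the `[doc-N]` citation convention, and all function names are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def exact_match(row: dict) -> float:
    """Ground-truth metric: needs a labeled reference answer."""
    return 1.0 if normalize(row["prediction"]) == normalize(row["reference"]) else 0.0


def cites_source(row: dict) -> float:
    """Custom criterion (hypothetical): the answer must cite a tag like [doc-12]."""
    return 1.0 if re.search(r"\[doc-\d+\]", row["prediction"]) else 0.0


def evaluate(rows, scorers):
    """Average each scorer over the evaluation rows."""
    if not rows:
        raise ValueError("evaluation set is empty")
    return {
        name: sum(scorer(row) for row in rows) / len(rows)
        for name, scorer in scorers.items()
    }
```

A usage example: `evaluate(rows, {"exact_match": exact_match, "cites_source": cites_source})` returns one averaged score per scorer. Note that `cites_source` never reads the reference column, which is why a scorer of that kind can run without labeled answers.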
Common Exam Traps
Concepts You Must Not Confuse
These pairs appear on nearly every exam. Learn the difference and you'll avoid the most common traps.
Top Mistakes to Avoid
Exam-Ready Checklist
Recommended Resources
Free & Official Resources
Paid Courses & Practice Exams
These are recommended if you prefer a structured learning path. They can save time but are not required to pass.