How long should I study for the SnowPro Advanced MLOps Engineer exam?

It depends heavily on your existing Snowflake ML experience. If you actively build and deploy ML models on Snowflake, 2-3 weeks of focused review should suffice. If you have Snowflake experience but are new to the ML platform features (Feature Store, Model Registry, ML Jobs), plan for 4-6 weeks. If you are coming from another cloud ML platform with no Snowflake experience, budget 8-10 weeks to learn both Snowflake fundamentals and the ML-specific capabilities.

What are the prerequisites for the SnowPro Advanced MLOps Engineer certification?

Snowflake recommends at least two years of experience architecting and operationalizing enterprise-scale ML lifecycles in cloud environments, plus one or more years using Snowflake Data Cloud functionality. The SnowPro Core certification is a prerequisite. The experience recommendations are not strictly enforced (you can register without verification), but real hands-on experience is essential because the exam tests practical scenarios, not just theoretical knowledge.

Is the beta exam easier or harder than the GA version?

Beta exams are not inherently easier or harder — they use the same question pool that will become the final exam. However, beta exams are significantly cheaper ($188 vs $375) and your score may take 6-8 weeks to be released as Snowflake performs psychometric analysis on the questions. Some beta questions may be removed from the final version. The main risk is the longer wait for results, but the cost savings make the beta worthwhile if you are already prepared.

Can I pass using only free resources?

Yes. Snowflake's official documentation is exceptionally comprehensive and covers everything on the exam. The Snowflake ML documentation, quickstart guides, Builders Blog articles, and GitHub sample repositories are all free. Combined with hands-on practice in a Snowflake trial account and free practice questions on this site, you can prepare thoroughly without paid courses. The key is hands-on experience — reading documentation alone is insufficient for the scenario-based questions.

What score do I need to pass?

You need a scaled score of 750 out of 1000. Snowflake uses scaled scoring to ensure consistency across exam versions and item difficulty levels. There is no penalty for wrong answers, so always answer every question even if you are unsure. The exact number of correct answers needed varies by exam form, but aiming for roughly 75-80% correct answers should put you safely above the threshold.

How does this compare to other MLOps certifications like AWS MLA-C01 or Azure AI-300?

The SnowPro Advanced MLOps Engineer is focused specifically on Snowflake's ML platform — Feature Store, Model Registry, SPCS, ML Jobs, and ML Observability. AWS MLA-C01 covers SageMaker, Step Functions, and the broader AWS ML ecosystem. Azure AI-300 covers Azure ML, Microsoft Foundry, and GenAIOps. If your organization uses Snowflake as its primary data platform, this certification validates deep expertise in Snowflake-native ML operations. It complements rather than competes with cloud-provider-specific ML certifications.

Do I need to know how to write Python code?

You should be comfortable reading Python code, especially Snowpark ML APIs, but you will not write code from scratch during the exam. The exam uses multiple choice and multiple select questions, not coding challenges. However, you need to understand Snowpark ML Python APIs (preprocessing, Pipeline objects, GridSearchCV), Feature Store Python API, and ML Jobs configuration. Being able to read code snippets and identify what they do is important for many questions.

Which domain should I focus on most?

Domain 2 (MLOps Infrastructure and Management) at 24% and Domain 4 (Pipeline Orchestration and CI/CD) at 22% together make up 46% of the exam. These are your highest-impact study areas. Domain 1 (Data Preparation and Feature Engineering) at 20% is also substantial. Domain 3 (Model Serving, 18%) and Domain 5 (Governance, 16%) are important but have fewer questions. A common strategy is to master Domains 2 and 4 first, then cover Domain 1, then finish with Domains 3 and 5.

Is hands-on experience required to pass?

While technically you could pass with purely theoretical study, the exam heavily tests scenario-based questions that are much easier if you have hands-on experience. The questions present real-world situations like choosing between SPCS and warehouse compute, or deciding between Dynamic Tables and Streams+Tasks for a feature pipeline. Snowflake offers free trial accounts where you can practice with Feature Store, Model Registry, and Snowpark ML. Even a few hours of hands-on labs significantly improve your exam readiness.

What happens if I fail the exam?

You can retake the exam after a 14-day waiting period. The retake costs the full exam fee ($375 for GA, $188 during beta). There is no limit on the number of retake attempts, but each requires the waiting period and full payment. Use the time between attempts to review the domains where you felt weakest. Snowflake provides a domain-level performance breakdown with your score report, which helps target your restudy efforts.

How long is the certification valid?

SnowPro Advanced certifications are valid for two years. To maintain your certification, you can either retake the current exam or pass a renewal exam before the expiration date. Snowflake occasionally updates exam versions (e.g., DEA-C01 to DEA-C02), so renewal may require studying new features added to the platform since your original certification.

Is this certification worth it for career advancement?

If you work with Snowflake in an ML engineering or MLOps role, this certification demonstrates specialized expertise that is increasingly in demand. Snowflake is one of the fastest-growing data platforms, and organizations are actively adopting its ML capabilities (Feature Store, Model Registry, SPCS). The certification differentiates you from generalist ML engineers and signals to employers that you can build production ML systems specifically on Snowflake. The beta exam at $188 is a particularly good value compared to the $375 GA price.

SnowPro Advanced: MLOps Engineer (MLA-B01) Free Study Guide 2026

You Can Pass This Exam For Free

The MLA-B01 exam is passable with free resources if you have hands-on Snowflake ML experience and study consistently for 4-8 weeks:

Snowflake official documentation for Snowflake ML, Feature Store, Model Registry, and ML Observability (free)
Snowflake Quickstart Guides for ML pipelines, model serving, and feature engineering (free)
Snowflake Builders Blog on Medium — deep-dive articles on ML Jobs, SPCS, and pipeline orchestration (free)
Snowflake community forums and knowledge base (free)
Snowflake ML GitHub sample repositories and notebooks (free)
500+ free practice questions on this site

This is a brand-new beta certification launching June 15, 2026. Official study guides are still emerging. Snowflake documentation and hands-on labs are your primary free resources. The beta exam fee is reduced to $188 USD (vs. $375 for GA).

Choose Your Study Path

You have general ML/data science experience but limited Snowflake-specific knowledge. You need to learn Snowflake's ML platform from the ground up.

Week 1: Learn Snowflake fundamentals: architecture (virtual warehouses, storage, cloud services layer), Snowpark Python, SQL worksheets, and Snowflake Notebooks. Complete the 'Getting Started with Snowpark' quickstart guide.

Week 2: Study data preparation in Snowflake: Dynamic Tables, Streams and Tasks, Snowpark DataFrames, and data transformation patterns. Understand when to use Dynamic Tables vs Streams+Tasks.

Week 3: Deep dive into Feature Store: creating feature entities, feature views, managed vs external feature tables, incremental refresh, point-in-time lookups, and online vs offline serving.

Week 4: Study Snowpark ML preprocessing and modeling: distributed preprocessing (scalers, encoders), Pipeline objects, model training with scikit-learn/XGBoost/LightGBM wrappers, and distributed hyperparameter tuning with GridSearchCV.

Week 5: Learn Model Registry and Model Serving: registering models, versioning, deploying to Snowpark Container Services (SPCS), compute pools, GPU workloads, batch vs real-time inference, and autoscaling.

Week 6: Study ML Jobs and pipeline orchestration: Task Graphs (DAGs), scheduled notebooks, CI/CD with Git integration and Snowflake CLI GitHub Actions, and external orchestrators (Airflow, Prefect, Dagster).

Week 7: Cover ML Observability, ML Lineage, and governance: model monitoring, drift detection, RBAC for ML models, dynamic data masking, Snowflake Horizon, and cost management for ML workloads.

Week 8: Practice questions across all domains. Take a full mock exam. Focus on Domain 2 (MLOps Infrastructure, 24%) and Domain 4 (Pipeline Orchestration, 22%), which together are 46% of the exam.

Week 9: Review all incorrect answers and re-study weak areas. Take another mock exam, aiming for 80%+.

Week 10: Final review: focus on confusable concepts (Dynamic Tables vs Streams+Tasks, warehouse vs SPCS compute, Feature Store vs manual feature pipelines) and exam traps.

Exam Overview

Format

70 questions, 115 minutes. Multiple choice and multiple select. English only.

Scoring

Scaled score 0-1000. Passing: 750. No penalty for wrong answers — always answer every question.

Domains & Weights

Operationalize Data Preparation and Feature Engineering: 20%
MLOps Infrastructure and Management: 24%
Model Serving and Deployment Operations: 18%
Pipeline Orchestration and Automation (CI/CD): 22%
Governance, Security and Monitoring: 16%
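To turn the weights into a rough study budget, the arithmetic can be sketched as below. This assumes, purely for illustration, that the 70 questions are spread in exact proportion to the published weights; the real distribution per exam form may differ. The function names are hypothetical helpers, not part of any Snowflake tool.

```python
# Rough planning arithmetic, assuming (hypothetically) that questions are
# distributed in exact proportion to the published domain weights.
TOTAL_QUESTIONS = 70
TOTAL_MINUTES = 115

DOMAIN_WEIGHTS = {
    "Data Preparation and Feature Engineering": 20,
    "MLOps Infrastructure and Management": 24,
    "Model Serving and Deployment Operations": 18,
    "Pipeline Orchestration and Automation (CI/CD)": 22,
    "Governance, Security and Monitoring": 16,
}


def expected_questions(weights, total):
    """Return the proportional (fractional) question count per domain."""
    return {name: total * pct / 100 for name, pct in weights.items()}


def minutes_per_question(total_minutes, total_questions):
    """Average time available per question."""
    return total_minutes / total_questions


if __name__ == "__main__":
    for name, count in expected_questions(DOMAIN_WEIGHTS, TOTAL_QUESTIONS).items():
        print(f"{name}: ~{count:.1f} questions")
    pace = minutes_per_question(TOTAL_MINUTES, TOTAL_QUESTIONS)
    print(f"Pace: ~{pace:.2f} minutes per question")
```

Under that assumption, Domains 2 and 4 together account for roughly 32 of the 70 questions, which is why they deserve the largest share of study time.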

Registration

$375 USD ($188 USD during the beta period, June 15 - July 13, 2026). Delivered online via Snowflake's certification portal. Requires SnowPro Core certification as a prerequisite.

Topic Priority Table

Not all topics are tested equally. Focus your study time on Tier 1 first, then Tier 2. Tier 3 topics rarely appear — just recognize what they do.

Tier 1: Must Know. You must understand these deeply and be able to apply them in scenarios. These appear across multiple questions and domains.

Tier 2: Should Know. Understand what these are and their key characteristics. May appear in 2-5 questions each.

Tier 3: Recognize Only. Know what these are at a high level. Rarely more than 1-2 questions each.

Domain 1: 20% of exam

Operationalize Data Preparation and Feature Engineering

This domain covers building production-grade data pipelines and feature engineering workflows within Snowflake. You need to understand how to transform raw data into ML-ready features using Snowflake Feature Store, Dynamic Tables, Streams and Tasks, and Snowpark ML preprocessing APIs, with emphasis on automation, incremental refresh, and point-in-time correctness.

Key Topics

Snowflake Feature Store, Dynamic Tables, Streams and Tasks, Snowpark ML Preprocessing, Data Metric Functions, ML Lineage

Must-Know Concepts

Feature Store architecture: feature entities (business objects), feature views (transformation logic), managed vs external feature tables, and automated incremental refresh
Point-in-time feature lookups: how Feature Store ensures training data does not leak future information when generating historical feature sets
Online vs offline feature serving: online for low-latency real-time inference, offline for training dataset generation and batch scoring
Dynamic Tables for feature pipelines: declarative SQL transformations with automatic refresh, target lag configuration, and DAG chaining for multi-step feature computation
Streams and Tasks for complex feature pipelines: when to use CDC-based streams with task scheduling instead of Dynamic Tables (MERGE operations, stored procedures, custom error handling)
Snowpark ML preprocessing: distributed scalers (StandardScaler, MinMaxScaler), encoders (OneHotEncoder, OrdinalEncoder), and Pipeline objects for chaining transformations
Data quality monitoring with Data Metric Functions: freshness, completeness, and accuracy checks on feature tables
ML Lineage for feature traceability: tracking data flow from source tables through feature definitions to training datasets and models
Feature engineering patterns: group-by aggregations, window functions, pivot operations, and join strategies using Snowpark DataFrames
Incremental feature computation: how Dynamic Tables and Streams enable efficient processing of only new or changed data
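The point-in-time idea from the list above can be illustrated in plain Python. This is a conceptual sketch only, not the Feature Store API: the function name, the `(timestamp, value)` layout, and the sample data are all hypothetical. For each training example it picks the most recent feature value recorded at or before that example's timestamp, so nothing from the future leaks in.

```python
from bisect import bisect_right


def point_in_time_lookup(feature_history, as_of):
    """Return the latest feature value with timestamp <= as_of, else None.

    feature_history: list of (timestamp, value) tuples sorted by timestamp.
    """
    timestamps = [ts for ts, _ in feature_history]
    idx = bisect_right(timestamps, as_of)  # first position strictly after as_of
    return feature_history[idx - 1][1] if idx > 0 else None


# Hypothetical history of one customer's spend feature: (day number, value).
spend_history = [(1, 100.0), (5, 140.0), (9, 90.0)]

# Each training example only sees the value that was known at its timestamp.
training_rows = [(ts, point_in_time_lookup(spend_history, ts)) for ts in (0, 5, 7, 12)]
print(training_rows)
```

A naive join on the latest value would give every row 90.0, including the rows dated before day 9. That is the leakage the exam scenarios describe.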

Common Exam Traps

Dynamic Tables handle refresh timing automatically based on target lag. You do NOT need to write scheduling logic — that is the Streams+Tasks approach

Feature Store point-in-time lookups prevent data leakage. If a question describes training data that accidentally includes future information, the answer involves point-in-time correctness

Online feature serving requires additional infrastructure setup beyond standard Feature Store configuration. Not all features automatically get online serving

Snowpark ML preprocessing runs distributed on the warehouse. It is NOT the same as running pandas locally — it scales across Snowflake compute nodes

Data Metric Functions monitor data quality, not model quality. Model quality monitoring is ML Observability

Quick Check: Operationalize Data Preparation and Feature Engineering

A team needs to generate a training dataset using historical feature values without accidentally including data from after each training example's timestamp. Which Snowflake Feature Store capability should they use?

Domain 2: 24% of exam

MLOps Infrastructure and Management

The heaviest domain at 24%, covering the infrastructure and management layer for ML operations in Snowflake. This includes Snowpark Container Services compute pools, Model Registry, Experiment Tracking, Snowpark ML model training, hyperparameter tuning, and the Snowflake ML platform architecture. Master the relationship between warehouses, SPCS, and Container Runtime.

Key Topics

SPCS Compute Pools, Model Registry, Experiment Tracking, Snowpark ML Modeling, Container Runtime, ML Jobs, Snowflake Notebooks

Must-Know Concepts

SPCS compute pool architecture: instance families (CPU_X64_S/M/L for CPU, GPU_NV_S/M for A10G workloads, GPU_NV_L for A100 workloads), min/max node autoscaling, and workload-to-pool matching
GPU support in SPCS: A10G and A100 NVIDIA GPUs for model training, fine-tuning, and inference. Know when GPU compute is necessary vs CPU-sufficient
Model Registry as first-class schema objects: model versioning, default version designation, metadata storage, and deployment to warehouse or SPCS
Experiment Tracking: logging hyperparameters, metrics, and artifacts during training runs for comparison and model selection
Snowpark ML model training: built-in wrappers for scikit-learn, XGBoost, and LightGBM that run natively in Snowflake without UDF creation
Distributed hyperparameter optimization: GridSearchCV and RandomizedSearchCV execution across multiple warehouse or SPCS nodes using UDTFs
Container Runtime for Notebooks: preconfigured ML software stacks (PyTorch, TensorFlow, scikit-learn) on CPU or GPU compute pools, extensible with additional packages
ML Jobs for production training: scheduling model retraining, integrating with external IDEs, and running on Container Runtime
Warehouse compute vs SPCS: warehouse for SQL inference and supported framework training; SPCS for custom containers, arbitrary packages, GPU workloads, and HTTP serving endpoints
Model artifact management: storing, versioning, and comparing trained models and their associated metadata in the Model Registry

Common Exam Traps

Warehouse compute supports Snowpark ML model training for scikit-learn, XGBoost, and LightGBM. You do NOT always need SPCS for model training

SPCS compute pools autoscale between min and max nodes. You configure the range; Snowflake handles the scaling based on workload

Distributed hyperparameter tuning uses UDTFs to parallelize across nodes. Each hyperparameter combination runs as a separate task

Model Registry models are schema-level objects with full Snowflake governance. They are not stored in stages or external locations

Container Runtime for Notebooks is different from SPCS. Container Runtime runs notebooks; SPCS runs services and jobs. Know which to use for each scenario
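To see why grid search parallelizes well, it helps to enumerate the work units. The sketch below uses only the Python standard library and a made-up parameter grid; it does not call Snowpark ML. It shows that each hyperparameter combination is independent of the others, which is what allows them to run as separate tasks across nodes.

```python
from itertools import product


def expand_grid(param_grid):
    """Expand {name: [values]} into one dict per hyperparameter combination."""
    names = sorted(param_grid)
    value_lists = (param_grid[name] for name in names)
    return [dict(zip(names, values)) for values in product(*value_lists)]


def total_fits(param_grid, cv_folds):
    """Model fits during the search: combinations x CV folds (no final refit)."""
    return len(expand_grid(param_grid)) * cv_folds


# Hypothetical grid, for illustration only.
param_grid = {"max_depth": [3, 5, 7], "learning_rate": [0.05, 0.1]}

for combo in expand_grid(param_grid):
    print(combo)  # each combination is an independent unit of work
print(total_fits(param_grid, cv_folds=5))
```

Because the number of combinations grows multiplicatively with each added parameter, distributing them across nodes matters more as the grid gets larger.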

Quick Check: MLOps Infrastructure and Management

An ML team needs to train a large PyTorch model that requires multiple A100 GPUs and custom Python packages not available in standard Snowflake warehouses. Which compute option should they use?

Domain 3: 18% of exam

Model Serving and Deployment Operations

This domain covers deploying trained models to production for both batch and real-time inference. Key topics include Model Serving on SPCS, deployment automation, inference endpoint management, autoscaling configuration, A/B testing strategies, and the operational aspects of running models in production including rollback and blue-green deployment patterns.

Key Topics

Model Serving, SPCS Endpoints, Model Registry Deployment, Batch Inference, Real-Time Inference, Autoscaling

Must-Know Concepts

Model Serving architecture: Model Registry model deployed to SPCS as a managed HTTP endpoint. Snowflake automates container image building, deployment, and endpoint setup
Batch vs real-time inference: batch runs on warehouse or SPCS jobs for bulk processing; real-time deploys as HTTP endpoints on SPCS for individual predictions with autoscaling
Deployment process: register model in Model Registry, configure compute pool, deploy to SPCS, verify endpoint health, route traffic
Autoscaling for Model Serving: configure min/max nodes in SPCS compute pools. Snowflake scales based on request load automatically
GPU selection for inference: A10G GPUs for standard inference workloads, A100 GPUs for large models requiring more memory and compute
Model version management: default version designation in Model Registry, deploying specific versions, and rolling back to previous versions
Inference on warehouse vs SPCS: warehouse for SQL-integrated batch predictions with supported models; SPCS for custom containers, arbitrary packages, and HTTP endpoints
Multi-modal batch inference: SPCS job-based batch inference supporting GPU acceleration across multimodal datasets
Deployment patterns: blue-green deployments using multiple Model Serving endpoints, canary releases by splitting traffic between model versions
Endpoint observability: monitoring inference latency, throughput, error rates, and resource utilization for deployed models
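The canary pattern mentioned above, splitting traffic between model versions, can be sketched as client-side routing logic. This is an assumption-laden illustration, not a Snowflake feature or API: the function name, the "stable"/"canary" labels, and the hash-bucket scheme are hypothetical, and in practice the split might live in an application gateway in front of two Model Serving endpoints.

```python
import hashlib


def route_version(request_id, canary_percent):
    """Deterministically route about canary_percent% of requests to the canary."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


if __name__ == "__main__":
    # Hypothetical request IDs; the same ID always lands on the same version.
    sample = [f"req-{i}" for i in range(10_000)]
    canary_share = sum(route_version(r, 10) == "canary" for r in sample)
    print(f"{canary_share} of {len(sample)} requests went to the canary")
```

Hashing the request (or user) ID instead of drawing a random number keeps routing sticky, so a given caller sees consistent predictions while the canary is evaluated. Rolling back is then a matter of setting the canary percentage to 0.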

Common Exam Traps

Model Serving ONLY runs on SPCS, not on warehouses. If a question asks about HTTP endpoints for model inference, the answer involves SPCS

Autoscaling is configured at the compute pool level, not the model level. You set min/max nodes on the compute pool, not on individual model deployments

Snowflake automates container image building for Model Serving. You do NOT need to build Docker images manually when deploying from Model Registry

Batch inference on SPCS uses jobs (finite execution), while Model Serving uses services (long-running). Know the difference between SPCS jobs and services

Real-time Model Serving endpoints have autoscaling but also have cold-start latency when scaling from zero. Plan for warm-up in production scenarios

Quick Check: Model Serving and Deployment Operations

A production application requires sub-second predictions from a custom TensorFlow model. The model needs to handle variable traffic with automatic scaling. How should the team deploy this model?

Domain 4: 22% of exam

Pipeline Orchestration and Automation (CI/CD)

The second-heaviest domain at 22%, covering end-to-end ML pipeline orchestration and CI/CD automation. Topics include ML Jobs, Task Graphs, scheduled notebooks, Git integration, Snowflake CLI GitHub Actions, external orchestrators (Airflow, Prefect, Dagster), version control for ML artifacts, and automated deployment workflows.

Key Topics

ML Jobs, Task Graphs, Snowflake CLI, Git Integration, GitHub Actions, Airflow, Scheduled Notebooks

Must-Know Concepts

ML Jobs for pipeline orchestration: scheduling retraining pipelines, connecting multiple steps, and running on Container Runtime with support for external IDE development
Task Graphs (DAGs): chaining Snowflake Tasks using AFTER clauses to create multi-step workflows with CRON scheduling and dependency management
CI/CD with Git integration: connecting Snowflake Notebooks and code to Git repositories (GitHub, GitLab) for version control and collaborative development
Snowflake CLI GitHub Actions: automating deployment pipelines triggered by Git events (merge to main, release tags) to deploy updated DAGs and ML artifacts
External orchestrator integration: using Airflow, Prefect, or Dagster with ML Jobs when organizations have existing orchestration infrastructure
Pipeline stages: data ingestion, feature engineering, model training, model validation, model registration, model deployment, and monitoring — each as an orchestrated step
Version control for ML artifacts: tracking notebook code, model configurations, feature definitions, and pipeline definitions in Git
Automated model retraining: scheduling periodic retraining pipelines triggered by data drift detection or calendar schedules
Deployment automation: from code merge to production deployment using CI/CD pipelines that validate, test, and deploy ML models
Error handling in pipelines: task retry policies, failure notifications, and graceful degradation patterns in production ML pipelines
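The pipeline stages listed above form a dependency graph, which is the mental model behind Task Graphs. The sketch below uses Python's standard `graphlib` module to resolve a run order from "runs after" relationships, similar in spirit to a Task's AFTER clause. It is a conceptual illustration with hypothetical step names; it does not create any Snowflake Tasks.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of steps that must finish before it can start,
# mirroring how a child task declares its predecessors.
pipeline = {
    "feature_engineering": {"data_ingestion"},
    "model_training": {"feature_engineering"},
    "model_validation": {"model_training"},
    "model_registration": {"model_validation"},
    "model_deployment": {"model_registration"},
    "monitoring": {"model_deployment"},
}


def execution_order(dag):
    """Return one valid run order; raises graphlib.CycleError on circular deps."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(pipeline)
print(" -> ".join(order))
```

A step can only start once everything it depends on has completed, which is also why a failure upstream affects every step downstream of it.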

Common Exam Traps

ML Jobs and Task Graphs serve different purposes. ML Jobs run ML workloads on Container Runtime; Task Graphs orchestrate dependencies between tasks. They can work together

Snowflake CLI GitHub Actions automate CLI commands as part of CI/CD. They are NOT Snowflake Tasks — they run in GitHub, not Snowflake

External orchestrators (Airflow) connect TO Snowflake but run OUTSIDE it. They trigger ML Jobs or Tasks but do not replace Snowflake's internal scheduling

Git integration for Notebooks enables version control but does NOT automatically deploy changes. CI/CD pipelines with GitHub Actions handle automated deployment

Task Graph failures can be configured to halt downstream tasks or continue. Know the default behavior and how to configure failure handling

Quick Check: Pipeline Orchestration and Automation (CI/CD)

An organization wants to automatically redeploy updated ML pipeline code whenever a data engineer merges changes to the main branch in GitHub. Which combination of Snowflake features enables this?

Domain 5: 16% of exam

Governance, Security and Monitoring

This domain covers securing, governing, and monitoring ML systems in production. Topics include RBAC for ML artifacts, dynamic data masking, Snowflake Horizon governance suite, ML Observability for drift detection and performance monitoring, ML Lineage for compliance, cost management for ML workloads, and audit logging for regulatory requirements.

Key Topics

RBAC, Dynamic Data Masking, Snowflake Horizon, ML Observability, ML Lineage, ML Explainability, Cost Management

Must-Know Concepts

RBAC for ML artifacts: securing models, feature tables, compute pools, and inference endpoints using Snowflake's role hierarchy (access roles + functional roles)
Dynamic data masking for ML: applying column-level masking policies to ensure sensitive training data is not exposed to unauthorized roles
Snowflake Horizon governance suite: universal data discovery, Trust Center for security posture, data classification for sensitive data identification
ML Observability: configuring model monitors for regression and binary classification models, tracking performance metrics, detecting data drift using Difference of Means, and setting alerts
Drift detection: understanding data drift (input distribution changes) vs concept drift (relationship between inputs and outputs changes), and how Snowflake detects each
ML Lineage for compliance: tracing data from source through features, datasets, and models for audit trails, reproducibility, and regulatory requirements
ML Explainability with Shapley values: computing feature importance scores for model interpretability and transparency requirements
Cost management for ML workloads: monitoring compute costs across warehouses, SPCS compute pools, and serverless functions. Understanding credit consumption patterns
Audit logging: tracking who accessed which models, when inference was run, and what data was used for training — critical for regulated industries
Data Clean Rooms: privacy-preserving collaboration environments for secure cross-organizational ML without exposing raw data
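The intuition behind a Difference of Means drift check can be shown with a few lines of standard-library Python. This is a simplified sketch under stated assumptions: the function names, the sample values, and the threshold are hypothetical, and Snowflake's own metric computation may differ in detail. It compares the mean of one input feature in a baseline window against a recent production window.

```python
from statistics import mean


def difference_of_means(baseline, current):
    """Mean of the current window minus mean of the baseline window."""
    return mean(current) - mean(baseline)


def has_drifted(baseline, current, threshold):
    """Flag drift when the absolute shift in means exceeds the threshold."""
    return abs(difference_of_means(baseline, current)) > threshold


# Hypothetical values of one input feature at training time vs last week.
baseline_values = [10, 12, 11, 13]
current_values = [15, 17, 16, 18]

print(difference_of_means(baseline_values, current_values))
print(has_drifted(baseline_values, current_values, threshold=2.0))
```

Note that this only looks at inputs, so it detects data drift. Concept drift, a change in the relationship between inputs and outputs, shows up in performance metrics that compare predictions against actual outcomes.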

Common Exam Traps

ML Observability currently supports regression and binary classification models only. Multi-class classification and other model types are not yet supported for automated monitoring

Data drift and concept drift are different. Data drift means the INPUT distribution changed. Concept drift means the relationship between inputs and outputs changed. Both degrade model performance but require different responses

Dynamic data masking applies to data queries, not model weights. Masking protects sensitive data in feature tables and training datasets, not the model parameters themselves

ML Lineage tracks artifacts within Snowflake. If models are trained externally and imported, lineage may not capture the full pipeline

Cost management for SPCS compute pools is separate from warehouse credit consumption. Know how GPU pools are billed differently from CPU warehouses

Quick Check: Governance, Security and Monitoring

A production regression model's prediction accuracy has been declining over the past week. An ML engineer suspects the input data distribution has shifted. Which Snowflake ML capability should they use to confirm this?

Snowflake ML Concepts You Must Not Confuse

These pairs appear on nearly every exam. Learn the difference and you'll avoid the most common traps.

Dynamic Tables vs Streams + Tasks

Use Dynamic Tables when…

Declarative, SQL-based data transformations where you define the desired result. Snowflake handles refresh timing, dependency ordering, and incremental processing automatically.

Use Streams + Tasks when…

Imperative pipeline approach giving full control over execution logic. Required for stored procedures, MERGE operations, external function calls, custom retry logic, and explicit CRON scheduling.

Exam trap

Dynamic Tables are simpler and preferred for straightforward transformations. Streams+Tasks are needed when you require procedural logic, MERGE statements, or custom error handling. The exam tests whether you know which approach fits a given scenario.

Warehouse Compute vs SPCS Compute Pools

Use Warehouse Compute when…

Standard Snowflake virtual warehouses for SQL-based inference, Snowpark ML model training with supported frameworks, and Cortex ML functions. Limited to pre-approved packages.

Use SPCS Compute Pools when…

Container-based compute with GPU support (A10G, A100) for custom model training, real-time model serving via HTTP endpoints, and workloads requiring arbitrary Python packages or distributed GPU clusters.

Exam trap

Warehouse compute is sufficient for many ML workloads and is simpler to manage. SPCS is needed for GPU workloads, custom container images, real-time serving endpoints, and packages not available in warehouses. The exam tests your ability to choose the right compute for a scenario.
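The decision rule in this comparison can be written down as a small helper, which is a handy way to rehearse scenario questions. The `Workload` fields and the `choose_compute` function are hypothetical names invented for this sketch; they simply encode the criteria stated above and are not a Snowflake API.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an ML workload's requirements."""
    needs_gpu: bool = False
    custom_container: bool = False
    http_endpoint: bool = False          # real-time serving endpoint
    unsupported_packages: bool = False   # packages not available in warehouses


def choose_compute(workload):
    """Pick SPCS if any SPCS-only requirement applies, else a warehouse."""
    spcs_required = (
        workload.needs_gpu
        or workload.custom_container
        or workload.http_endpoint
        or workload.unsupported_packages
    )
    return "SPCS compute pool" if spcs_required else "warehouse"


# Scenario: XGBoost training with supported packages only.
print(choose_compute(Workload()))
# Scenario: large PyTorch model on A100 GPUs with custom packages.
print(choose_compute(Workload(needs_gpu=True, unsupported_packages=True)))
```

When reading an exam scenario, scan for any one of these four triggers. If none is present, the simpler warehouse option is usually the intended answer.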

Feature Store vs Manual Feature Pipelines

Use Feature Store when…

Managed feature infrastructure with versioned feature definitions, automated incremental refresh, point-in-time lookups, online serving for low-latency inference, and ML Lineage integration.

Use Manual Feature Pipelines when…

Custom-built feature engineering using Snowpark DataFrames, SQL views, or Dynamic Tables without the Feature Store abstraction layer. No built-in versioning, lineage, or online serving.

Exam trap

Feature Store provides governance, lineage, and reproducibility out of the box. Manual pipelines offer more flexibility but lose automated versioning and point-in-time correctness. The exam tests whether you understand the operational benefits of Feature Store vs manual approaches.

Batch Inference vs Real-Time Inference (Model Serving)

Use Batch Inference when…

Process large datasets in bulk using warehouse compute or SPCS job-based execution. Best for periodic scoring, reporting, and offline predictions where latency is not critical.

Use Real-Time Inference (Model Serving) when…

Deploy models as managed HTTP endpoints on SPCS with autoscaling. Best for interactive applications, APIs, and use cases requiring sub-second response times.

Exam trap

Batch inference runs on warehouse or SPCS jobs — it processes data in bulk. Real-time inference requires Model Serving on SPCS with dedicated endpoints. The exam tests when to use each and the infrastructure implications of each choice.

ML Jobs vs Scheduled Notebooks

Use ML Jobs when…

Production-grade ML pipeline orchestration supporting external IDE integration, Container Runtime execution, and integration with Task Graphs for complex DAG workflows.

Use Scheduled Notebooks when…

Lightweight scheduling of Snowflake Notebooks for simpler ML workflows. Good for exploratory-to-production transitions and workflows that fit within a single notebook.

Exam trap

ML Jobs are designed for production pipelines with multi-step dependencies and CI/CD integration. Scheduled Notebooks are simpler but limited to single-notebook execution. The exam tests which orchestration method suits production vs exploratory workflows.

Cortex ML Functions vs Custom Snowpark ML Models

Use Cortex ML Functions when…

Pre-built, SQL-callable ML functions for common tasks: forecasting, anomaly detection, and classification. No model training code needed — just provide data and call the function.

Use Custom Snowpark ML Models when…

Custom models built with scikit-learn, XGBoost, LightGBM, or PyTorch using Snowpark ML APIs. Full control over model architecture, training process, and hyperparameters.

Exam trap

Cortex ML functions are fast to deploy and require no ML expertise, but are limited to supported task types. Custom models offer complete flexibility but require ML engineering effort. The exam tests when each approach is appropriate.

Online Feature Serving vs Offline Feature Serving

Use Online Feature Serving when…

Low-latency feature retrieval for real-time inference use cases. Features are served from an optimized store designed for individual record lookups at millisecond latency.

Use Offline Feature Serving when…

Batch feature retrieval for model training and batch inference. Features are read from Snowflake tables optimized for large-scale scans and joins.

Exam trap

Online serving is for real-time applications that need individual feature vectors quickly. Offline serving is for training datasets and batch scoring. The exam tests whether you can match the serving mode to the use case and understand the infrastructure differences.

Top Mistakes to Avoid

Confusing Dynamic Tables (declarative, automatic refresh) with Streams+Tasks (imperative, manual scheduling) — Dynamic Tables cannot do MERGE or stored procedures

Thinking all model training requires SPCS — standard warehouse compute supports scikit-learn, XGBoost, and LightGBM training via Snowpark ML

Mixing up Model Serving (HTTP endpoints on SPCS for real-time inference) with batch inference on warehouses — they serve different use cases

Confusing data drift (input distribution changes) with concept drift (input-output relationship changes) — both degrade models but require different interventions

Assuming ML Observability supports all model types — currently only regression and binary classification models are supported for automated monitoring

Thinking Feature Store online serving is automatically enabled — it requires additional configuration beyond standard feature table setup

Confusing ML Jobs (production ML pipeline orchestration) with Snowflake Tasks (general SQL/Python task scheduling) — ML Jobs run on Container Runtime, Tasks run on warehouses

Not understanding that Snowflake CLI GitHub Actions run in GitHub, not in Snowflake — they trigger Snowflake operations as part of external CI/CD pipelines

Treating ML Lineage and ML Observability as the same thing — Lineage traces data provenance, Observability monitors production performance and drift

Forgetting that SPCS compute pool costs are separate from warehouse credits — GPU pools especially have different billing patterns
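The data drift versus concept drift distinction in the list above can be made concrete with a small sketch. It assumes a toy threshold model and uses the Population Stability Index (PSI) as one common drift statistic; the data, bin edges, and function names are hypothetical, and this is not a description of how Snowflake ML Observability computes its metrics.

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Fraction of values falling in each bin defined by the inner edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        idx = sum(v >= e for e in edges)  # number of edges at or below v
        counts[idx] += 1
    return [c / len(values) for c in counts]


def psi(expected: Sequence[float], actual: Sequence[float],
        edges: Sequence[float], eps: float = 1e-6) -> float:
    """Population Stability Index between a baseline and a current sample."""
    e = bin_fractions(expected, edges)
    a = bin_fractions(actual, edges)
    # eps avoids log(0) when a bin is empty in one of the samples.
    return sum((ai - ei) * math.log((ai + eps) / (ei + eps)) for ei, ai in zip(e, a))


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def model(x: float) -> int:
    """Toy model trained when the true decision boundary was x >= 5."""
    return int(x >= 5)


baseline_x = [1, 2, 3, 4, 5, 6, 7, 8]
edges = [3, 6]

# Data drift: the input distribution shifts upward.
shifted_x = [5, 6, 7, 8, 9, 10, 11, 12]
data_drift_psi = psi(baseline_x, shifted_x, edges)

# Concept drift: inputs are unchanged, but the true boundary moved from 5 to 7.
preds = [model(x) for x in baseline_x]
acc_before = accuracy([int(x >= 5) for x in baseline_x], preds)
acc_after = accuracy([int(x >= 7) for x in baseline_x], preds)
input_psi_under_concept_drift = psi(baseline_x, baseline_x, edges)
```

In the concept drift case the input PSI stays at zero while accuracy falls, which is why input monitoring alone cannot catch it and labelled performance metrics are needed as well.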

Exam-Ready Checklist

Can explain all five exam domains and their relative weights (20%, 24%, 18%, 22%, 16%)

Understand the full Snowflake ML platform: Feature Store, Model Registry, ML Jobs, ML Observability, ML Lineage, ML Explainability, and Experiment Tracking

Can choose between Dynamic Tables and Streams+Tasks for feature engineering pipelines based on requirements

Know when to use warehouse compute vs SPCS compute pools vs Container Runtime for training and inference

Understand Feature Store online vs offline serving and point-in-time lookups for training data generation

Can explain Model Serving deployment process: Model Registry to SPCS endpoint with autoscaling

Know CI/CD patterns: Git integration, Snowflake CLI GitHub Actions, and external orchestrator integration with ML Jobs

Can distinguish between batch inference (warehouse or SPCS jobs) and real-time inference (Model Serving HTTP endpoints)

Understand RBAC for ML artifacts: access roles, functional roles, masking policies, and Snowflake Horizon governance

Can explain ML Observability: drift detection methods, supported model types, alert configuration, and the difference between data drift and concept drift

Know ML Lineage: end-to-end traceability from source data through features, datasets, and models

Understand distributed hyperparameter optimization with GridSearchCV across multi-node warehouses

Scored 75%+ on at least two full practice exams (750/1000 passing score)

Reviewed all incorrect answers across all five domains with special attention to Domain 2 (24%) and Domain 4 (22%)
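The point-in-time lookup item in the checklist above is easy to misread, so here is a minimal sketch of the idea in plain Python. The HISTORY data, entity name, and as_of helper are hypothetical and do not represent the Feature Store API. For each training row, the lookup returns the latest feature value recorded at or before that row's timestamp, so no future information leaks into the training set.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

# Hypothetical feature history per entity: (timestamp, value) pairs sorted by time.
HISTORY: Dict[str, List[Tuple[int, float]]] = {
    "cust_a": [(10, 1.0), (20, 2.0), (30, 3.0)],
}


def as_of(entity: str, ts: int) -> Optional[float]:
    """Return the latest feature value with timestamp <= ts, or None if none exists."""
    rows = HISTORY.get(entity, [])
    timestamps = [t for t, _ in rows]
    pos = bisect_right(timestamps, ts)  # count of rows at or before ts
    return rows[pos - 1][1] if pos else None


# Spine of (entity, label_timestamp) rows to enrich with features.
spine = [("cust_a", 5), ("cust_a", 20), ("cust_a", 25), ("cust_a", 99)]
training = [(entity, ts, as_of(entity, ts)) for entity, ts in spine]
```

The row at timestamp 25 receives the value written at 20, not the one written at 30. A plain join on the latest value would have used the later one and leaked future data into training.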

Recommended Resources

Free & Official Resources

Snowflake ML Official Documentation

Complete documentation for Snowflake ML platform including Feature Store, Model Registry, ML Observability, ML Jobs, and all ML components.

Official

Snowflake Feature Store Documentation

Official Feature Store documentation covering feature entities, feature views, incremental refresh, online serving, and ML Lineage integration.

Official

ML Observability Documentation

Official guide to monitoring production models, detecting drift, tracking performance metrics, and configuring alerts.

Official

Getting Started with Model Serving in SPCS

Hands-on quickstart guide for deploying models from Model Registry to SPCS HTTP endpoints with autoscaling.

Official

Orchestrate ML Pipelines with ML Jobs and Task Graphs

End-to-end guide for building ML pipeline orchestration with ML Jobs, Task Graphs, and DAG workflows.

Official

Snowflake Builders Blog (Medium)

Deep-dive technical articles from Snowflake engineers covering ML Jobs, SPCS, pipeline orchestration, and ML best practices.

Free

SnowPro Certifications Portal

Official Snowflake certification portal with exam details, registration, and links to study resources.

Official

Paid Courses & Practice Exams

These are recommended if you prefer a structured learning path. They can save time but are not required to pass.

Snowflake Data Engineering Professional Certificate (Coursera)

Official Snowflake professional certificate covering data engineering fundamentals, Snowpark, and pipeline development on Snowflake.

Paid

Snowflake Hands-On Essentials (Snowflake Learning)

Official Snowflake learning platform with structured courses, hands-on labs, and guided learning paths for ML and data engineering.

Paid
