
MLA-B01 Exam Notes

Last-minute traps, must-know facts, and scenario tips for the SnowPro Advanced: MLOps Engineer exam.

General Exam Tips

1. Every question is scenario-based, so do not look for a single keyword match. Read the full scenario to identify the CONSTRAINT that eliminates wrong answers.
2. Flag and skip questions where you are genuinely stuck. Come back after answering the questions you are confident about; seeing more questions often jogs your memory.
3. There is no penalty for wrong answers. Answer every question, even if you have to guess.
4. The passing score is 750 out of 1000. Treat that as roughly 75-80% correct: you can miss about 15-18 questions and still pass.
5. Domains 2 (24%) and 4 (22%) together make up 46% of the exam. Getting those right is the single biggest lever on your score.
6. Multiple-select questions often have 2 or 3 correct answers. Select exactly the number indicated, because partial credit is not given.
7. When two answers both sound correct, look for a CONSTRAINT in the scenario (requires GPU, needs HTTP endpoint, needs MERGE, requires sub-second latency) that eliminates one.

Quick Navigation

1. Operationalize Data Preparation and Feature Engineering
2. MLOps Infrastructure and Management
3. Model Serving and Deployment Operations
4. Pipeline Orchestration and Automation (CI/CD)
5. Governance, Security and Monitoring

Domain 1: 20% of exam

Operationalize Data Preparation and Feature Engineering

Must-Know Facts

Feature Store uses ENTITIES (business object identifiers, e.g., user_id) and FEATURE VIEWS (transformation logic as SQL or DataFrames) as its two core building blocks.
Managed feature views are refreshed automatically by Snowflake on a schedule. External feature views are maintained by outside tools like dbt — Snowflake does NOT refresh them.
Point-in-time lookups use ASOF JOIN under the hood to retrieve feature values that were available at each training example's timestamp, preventing data leakage. This is only relevant when you provide a spine_timestamp_col.
Online feature serving is NOT enabled by default. It requires explicit setup and creates a separate low-latency serving infrastructure. Not all feature views automatically get an online store.
Dynamic Tables use TARGET_LAG to define the acceptable staleness. Snowflake handles scheduling — you never write a Task to refresh a Dynamic Table.
Streams + Tasks are required whenever the pipeline needs: MERGE statements, stored procedures, external function calls, custom retry logic, or explicit CRON scheduling with procedural branching.
Snowpark ML preprocessing (StandardScaler, OneHotEncoder, Pipeline) runs DISTRIBUTED across warehouse nodes — it does NOT require SPCS.
Data Metric Functions (DMFs) measure data quality: freshness, completeness, accuracy. They run on schedules and produce quality scores for feature tables, not model performance scores.
ML Lineage automatically traces: source data -> feature views -> training datasets -> registered models. This traceability is what enables compliance audits and reproducibility.

Common Traps

Trap: Dynamic Tables handle MERGE operations for feature updates.

Reality: Dynamic Tables are declarative and only support SELECT-based transformations. MERGE, INSERT, and stored procedures are not supported inside a Dynamic Table definition. Use Streams + Tasks for those patterns.

Trap: All feature views in the Feature Store automatically have both online and offline serving.

Reality: Online serving requires explicit configuration and creates separate infrastructure. By default, Feature Store provides offline (batch) access only. You must explicitly enable online serving.

Trap: Point-in-time correctness is the same as freshness monitoring.

Reality: Point-in-time lookups prevent training data leakage by ensuring features reflect values available AT each training timestamp. Freshness monitoring (done via DMFs) measures how current the data is. These are different problems.

Trap: Snowpark ML preprocessing requires Snowpark Container Services to run distributed.

Reality: Snowpark ML preprocessing (scalers, encoders, Pipeline) runs distributed on STANDARD VIRTUAL WAREHOUSES. SPCS is not needed. Only custom container dependencies or GPU workloads require SPCS.

Trap: DMFs monitor model performance and detect drift.

Reality: DMFs check DATA QUALITY dimensions (freshness, completeness, accuracy of the data itself). Drift detection and model performance monitoring are done by ML Observability, which is a separate system.

Confusing Pairs

Dynamic Tables vs. Streams + Tasks

Dynamic Tables = declarative SQL SELECT that Snowflake refreshes automatically based on TARGET_LAG. No scheduling code needed, no procedural logic allowed. Streams + Tasks = imperative pipeline where YOU control the schedule (CRON) and execution logic (stored procedures, MERGE, custom branching). Choose Dynamic Tables when the transformation is a simple SELECT. Choose Streams + Tasks when you need MERGE, stored procs, external calls, or conditional logic.
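
The decision rule above can be written down as a short checklist. The sketch below is a study aid in plain Python, not Snowflake code: `PipelineNeeds`, its field names, and `choose_refresh_mechanism` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PipelineNeeds:
    """Requirements pulled from an exam scenario (hypothetical field names)."""
    merge: bool = False
    stored_procedure: bool = False
    external_call: bool = False
    conditional_logic: bool = False
    explicit_cron: bool = False


def choose_refresh_mechanism(needs: PipelineNeeds) -> str:
    """Return the option these notes recommend for the given requirements."""
    procedural = (
        needs.merge
        or needs.stored_procedure
        or needs.external_call
        or needs.conditional_logic
        or needs.explicit_cron
    )
    # Any procedural requirement rules out a purely declarative SELECT.
    return "Streams + Tasks" if procedural else "Dynamic Tables"


print(choose_refresh_mechanism(PipelineNeeds()))
print(choose_refresh_mechanism(PipelineNeeds(merge=True)))
```

A single procedural requirement is enough to flip the answer, which is why one word such as MERGE in a scenario usually decides the question.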

Managed Feature Views vs. External Feature Views

Managed = Snowflake owns the refresh cycle, handles incremental updates automatically. External = third-party tool (dbt, Spark) writes features into Snowflake tables, Feature Store just reads them for serving and lineage. Exam tests which type is appropriate when a team already uses dbt for transformation vs when they want Snowflake to own the pipeline.

Online Feature Serving vs. Offline Feature Serving

Online = low-latency retrieval of individual feature vectors for real-time inference (millisecond lookups). Requires extra infrastructure setup. Offline = batch reads from Snowflake tables for training dataset generation and batch scoring. Offline is the default. The exam tests which to use when a question mentions 'real-time predictions' vs 'training dataset' or 'batch scoring'.

Data Metric Functions (DMFs) vs. ML Observability

DMFs = data quality checks (is the feature table fresh? are there nulls? are values in valid range?). They run on FEATURE TABLES. ML Observability = production model monitoring (is the prediction distribution drifting? is model accuracy degrading?). It runs on MODEL OUTPUTS. Both can fire alerts but they watch completely different things.

Scenario Tips

If the question asks about:

When the question describes training data that accidentally includes future information (labels or features from after the training timestamp)...

Answer:

The fix is point-in-time feature lookups via Feature Store with spine_timestamp_col. This uses ASOF JOIN to ensure only historically available feature values are included.

Distractor to avoid:

Dynamic Table target lag might seem relevant because it controls freshness, but it does not control WHICH point-in-time values are returned during training dataset generation.
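
To see why a point-in-time lookup prevents leakage, it can help to emulate the "latest value at or before the timestamp" rule by hand. The sketch below is plain Python with made-up data; it does not call the Feature Store, and matching on "at or before" is an assumption about how the join condition is set up.

```python
from bisect import bisect_right

# Feature history for one entity: (timestamp, value), sorted by timestamp.
feature_history = [(1, 10.0), (5, 12.5), (9, 20.0)]


def as_of(history, ts):
    """Return the latest value whose timestamp is <= ts, or None if none exists."""
    timestamps = [t for t, _ in history]
    idx = bisect_right(timestamps, ts)
    return history[idx - 1][1] if idx else None


# Spine rows: each training example carries its own timestamp.
spine = [0, 4, 5, 7, 100]
training_rows = [(ts, as_of(feature_history, ts)) for ts in spine]
print(training_rows)
```

The example at timestamp 4 gets the value written at timestamp 1, never the later value from timestamp 5. A plain join on the entity key alone would have no such guard.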

If the question asks about:

When the question asks about a feature pipeline that must update a staging table using MERGE and then trigger downstream processing only when specific conditions are met...

Answer:

Streams + Tasks with a stored procedure. Streams detect new/changed rows; Tasks execute the stored procedure that performs MERGE and conditional logic.

Distractor to avoid:

Dynamic Tables cannot perform MERGE. Even though they 'automatically refresh,' they only support SELECT transformations.

If the question asks about:

When the question asks how to monitor whether input data to a production model has shifted distribution compared to training...

Answer:

ML Observability drift detection. It compares inference-time input distributions against baseline (training) distributions using a drift metric such as Difference of Means.

Distractor to avoid:

DMFs check data quality but cannot compare distributions between training baseline and production inference data.
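
The idea behind a difference-of-means check is small enough to compute by hand. The sketch below shows only the plain statistical idea on made-up numbers; it is not Snowflake's implementation (which may scale or normalize the result), and the threshold is a hypothetical value.

```python
from statistics import mean


def difference_of_means(baseline, current):
    """Mean of the current window minus mean of the baseline window."""
    return mean(current) - mean(baseline)


baseline_ages = [30, 32, 34, 36, 38]    # training-time feature values
production_ages = [40, 42, 44, 46, 48]  # inference-time feature values

drift = difference_of_means(baseline_ages, production_ages)
THRESHOLD = 5  # hypothetical alert threshold
print(drift, abs(drift) > THRESHOLD)
```

A DMF could tell you that both columns are fresh and free of nulls. It would not tell you that the average input moved away from the training baseline.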

Last-Minute Facts

1. Dynamic Tables: TARGET_LAG defines max staleness. Set in seconds, minutes, hours, or days.

2. Feature Store entities: defined with a name and a join_key (the business identifier column).

3. Point-in-time lookups require spine_timestamp_col in retrieve_feature_values().

4. Snowpark ML Pipeline: chained transformers from snowflake.ml.modeling.preprocessing.

5. DMF dimensions: freshness (timeliness), completeness (null rate), accuracy (constraint validation).

Domain 2: 24% of exam

MLOps Infrastructure and Management

Must-Know Facts

Model Registry models are SCHEMA-LEVEL OBJECTS. They are not stored in stages, not in external locations — they live in a Snowflake schema and have full RBAC applied.
A model can have up to 1,000 versions. Each version is unique by name. System aliases DEFAULT, FIRST, and LAST are always available and cannot be overridden.
USAGE privilege on a model = warehouse-only inference + no access to internal details or artifacts. READ privilege = SPCS inference + metadata visibility (comments, tags, metrics, artifacts).
Model artifacts reside on internal stages accessible via snow:// URLs, not standard stage paths. Only model OWNERS can access artifacts.
SPCS compute pool instance families: CPU_X64_S/M/L for CPU workloads. GPU_NV_S = 1x A10G, GPU_NV_M = 4x A10G (standard GPU). GPU_NV_L = 8x A100 (large model GPU). Know which GPU count and type for which workload size.
Container Runtime for Notebooks is a curated ML environment running on SPCS. It is not the same as creating your own SPCS service. It runs notebooks; it does not run long-lived services.
Warehouse compute supports Snowpark ML training for scikit-learn, XGBoost, and LightGBM. You do NOT need SPCS just because you are training an ML model.
Distributed hyperparameter tuning (GridSearchCV, RandomizedSearchCV) uses UDTFs to parallelize each hyperparameter combination across multiple warehouse nodes.
Experiment Tracking logs hyperparameters, metrics, and artifacts during training runs. It feeds into Model Registry for model selection — they work together.
The target_platforms argument in log_model() determines WHERE a model can be deployed (WAREHOUSE vs SNOWPARK_CONTAINER_SERVICES). Setting the wrong target platform causes log_model() to fail if dependencies conflict.

Common Traps

Trap: You always need SPCS to train ML models in Snowflake.

Reality: Standard virtual warehouses support training scikit-learn, XGBoost, and LightGBM models via Snowpark ML APIs. SPCS is only needed for: custom Docker containers, packages not in the Anaconda channel, GPU workloads, or PyTorch/TensorFlow at scale.

Trap: Container Runtime for Notebooks is a separate option from SPCS.

Reality: Container Runtime FOR NOTEBOOKS runs on SPCS compute pools under the hood. However, it is a managed, curated experience: you don't define OCI images yourself. The key distinction: Container Runtime runs notebooks and ML jobs; SPCS runs custom services and job services you define via YAML.

Trap: USAGE privilege on a model allows SPCS-based inference.

Reality: USAGE grants warehouse inference ONLY. For SPCS inference, the grantee needs the READ privilege. This is a hard rule that frequently appears in exam access control questions.

Trap: Model Registry is like an MLflow server running inside Snowflake.

Reality: Model Registry models are schema-level database objects with Snowflake governance, not a separate MLflow tracking server. They integrate with Experiment Tracking, but the registry itself is a native Snowflake construct, not a hosted MLflow service.

Trap: Notebooks on Container Runtime only incur SPCS compute costs.

Reality: When a Notebook uses Container Runtime (SPCS) for Python execution, you can ALSO incur warehouse compute costs for SQL execution and UI rendering. You may be billed for both simultaneously.

Confusing Pairs

Warehouse Compute vs. SPCS Compute Pools

Warehouse = virtual compute for SQL, Snowpark Python, and supported ML framework training (sklearn, XGBoost, LightGBM). Billed in Snowflake credits per second. SPCS = containerized compute for custom images, arbitrary packages, GPU workloads, and HTTP service endpoints. Billed separately from warehouse credits. Use warehouse for standard ML training and SQL inference. Use SPCS when you need GPUs, custom containers, real-time HTTP endpoints, or packages outside the Anaconda channel.

Container Runtime for Notebooks vs. SPCS Job Services

Container Runtime = managed, curated ML environment for Snowflake Notebooks. Pre-built images with PyTorch, TensorFlow, sklearn. No custom OCI image required. SPCS Job Services = arbitrary containerized jobs using your own OCI images, defined via YAML spec. Container Runtime is the easy path for notebook-based training; SPCS Job Services are for production pipelines requiring custom environments.

USAGE Privilege (Model) vs. READ Privilege (Model)

USAGE = run the model for WAREHOUSE inference only, cannot see internals, cannot access artifacts. READ = run for SPCS inference AND see metadata (comments, tags, metrics). For exam questions about granting inference access: if inference is warehouse-only, USAGE is sufficient. If SPCS inference is needed, READ is required.

Experiment Tracking vs. Model Registry

Experiment Tracking = records results of training RUNS (hyperparameters, metrics logged during training). Used to COMPARE experiments and SELECT the best model. Model Registry = stores FINISHED, registered models as versioned schema objects for deployment and governance. Tracking feeds INTO the registry — you log experiments during training, then register the winner.

Scenario Tips

If the question asks about:

When the question asks about training a PyTorch model requiring multiple A100 GPUs and packages not available in Snowflake's Anaconda channel...

Answer:

SPCS compute pool with GPU_NV_L instance family (A100). Custom packages require a custom OCI image deployed as an SPCS service or job.

Distractor to avoid:

Container Runtime for Notebooks is tempting but is a curated environment. It does not support fully custom OCI images or multi-node distributed GPU training at production scale.
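
Sizing questions like this one reduce to a lookup on GPU type and count. The sketch below encodes only the three instance families listed in these notes; `pick_gpu_family` is a hypothetical helper for practice, not a Snowflake function.

```python
# Instance families and GPU specs exactly as listed in these notes.
GPU_FAMILIES = {
    "GPU_NV_S": {"gpu": "A10G", "count": 1},
    "GPU_NV_M": {"gpu": "A10G", "count": 4},
    "GPU_NV_L": {"gpu": "A100", "count": 8},
}


def pick_gpu_family(gpu_type, min_gpus):
    """Return the smallest listed family with the requested GPU type and count."""
    candidates = [
        (spec["count"], name)
        for name, spec in GPU_FAMILIES.items()
        if spec["gpu"] == gpu_type and spec["count"] >= min_gpus
    ]
    if not candidates:
        raise ValueError(f"no listed family offers {min_gpus}x {gpu_type}")
    return min(candidates)[1]


print(pick_gpu_family("A100", 2))  # the scenario above: multiple A100 GPUs
print(pick_gpu_family("A10G", 2))
```

Of the families in this table, only GPU_NV_L carries A100 GPUs, so any scenario that names A100 points to it regardless of the GPU count requested.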

If the question asks about:

When a question asks which privilege allows a data scientist to view model metrics and use the model for SPCS inference without revealing model weights...

Answer:

READ privilege. It grants SPCS inference + metadata visibility (metrics, tags, comments). OWNERSHIP is not needed and would be overly permissive.

Distractor to avoid:

USAGE grants warehouse inference only. OWNERSHIP grants full control. READ is the specific privilege designed for this access pattern.

If the question asks about:

When a scenario describes registering a scikit-learn model and running batch predictions on a 50-million-row table...

Answer:

Virtual warehouse inference (batch). Scikit-learn models registered in Model Registry can run inference on warehouses directly — no SPCS needed for batch scoring with supported frameworks.

Distractor to avoid:

Model Serving on SPCS adds unnecessary cost and complexity for batch workloads with standard packages.

Last-Minute Facts

1. Model Registry limits: 1,000 versions per model, 10 methods per version, 15 GB max model size (warehouse), 100 KB metadata ceiling.

2. GPU_NV_S = 1x A10G. GPU_NV_M = 4x A10G. GPU_NV_L = 8x A100. GPU_NV_S and GPU_NV_M are both A10G families; distinguish them by GPU count when sizing multi-GPU workloads.

3. log_model() target_platforms: 'WAREHOUSE' or 'SNOWPARK_CONTAINER_SERVICES'. Wrong value causes failure if dependencies conflict.

4. System model aliases: DEFAULT (current default), FIRST (oldest by creation), LAST (newest by creation).

5. ML Observability: max 250 model monitors per account, max 500 features per monitor.

Domain 3: 18% of exam

Model Serving and Deployment Operations

Must-Know Facts

Model Serving creates managed HTTP endpoints on SPCS. It ONLY runs on SPCS — there is no warehouse-based HTTP endpoint for real-time inference.
Snowflake automates container image building and deployment when you deploy from Model Registry to Model Serving. You do NOT build Docker images manually.
Autoscaling is configured at the COMPUTE POOL level (min_nodes, max_nodes), not at the model or endpoint level.
SPCS SERVICES are long-running (for real-time Model Serving endpoints). SPCS JOBS are finite-duration runs (for batch inference). Know which to use for each pattern.
Batch inference for large datasets uses virtual warehouses or SPCS jobs. Real-time inference uses Model Serving (SPCS service with HTTP endpoint).
Model version rollback: set the previous version as default in Model Registry and redeploy. The Model Registry maintains multiple versions exactly for this purpose.
Blue-green deployment: run two Model Serving endpoints simultaneously (one per version), then redirect traffic. Canary: split traffic between versions during gradual rollout.
A10G GPUs: GPU_NV_S (1x A10G) for single-GPU inference, GPU_NV_M (4x A10G) for heavier concurrent workloads. A100 GPUs: GPU_NV_L (8x A100) for very large models requiring maximum memory. A10G is sufficient for most standard inference workloads.
Cold-start latency exists when a Model Serving endpoint scales from zero. Plan for warm-up time or configure minimum node count to keep at least one node hot.

Common Traps

Trap: Batch inference always requires SPCS.

Reality: Batch inference for standard frameworks (sklearn, XGBoost) runs on virtual warehouses. SPCS is only needed for batch inference when you need custom containers, GPU acceleration, or a model too large for warehouse capacity.

Trap: Autoscaling is configured on the model deployment, not the compute pool.

Reality: Autoscaling parameters (min_nodes, max_nodes) are set on the SPCS compute pool itself. All services and jobs using that pool share the autoscaling configuration. You cannot set autoscaling per model.

Trap: Model Serving endpoints can also be deployed on virtual warehouses for high-availability.

Reality: Model Serving HTTP endpoints ONLY run on SPCS. There is no option to deploy real-time HTTP serving endpoints on virtual warehouses.

Trap: Rolling back a model requires retraining.

Reality: Model Registry keeps all versions. Rollback means changing the default version pointer to a previous version and redeploying. No retraining required. This is the fastest recovery path.

Confusing Pairs

Batch Inference vs. Real-Time Inference (Model Serving)

Batch = process many rows in bulk on warehouse or SPCS jobs. Scheduled execution, higher throughput, latency-insensitive. Real-Time = HTTP endpoint on SPCS, individual prediction requests, sub-second latency requirement. Key decision factor: does the application need an immediate response to a single prediction request? Yes = real-time Model Serving. No = batch on warehouse.

SPCS Services vs. SPCS Jobs

SPCS Services = long-running containers that stay up and handle requests (used for Model Serving HTTP endpoints). SPCS Jobs = finite-duration containers that run to completion and exit (used for batch inference, training runs). Model Serving uses Services. Batch GPU inference uses Jobs.

Blue-Green Deployment vs. Canary Release

Blue-Green = two complete environments running simultaneously; traffic switches entirely from old to new at a defined moment. Fast rollback by switching back. Canary = both versions receive traffic simultaneously with a percentage split (e.g., 5% to new version). Used for gradual rollout with production validation. Both use multiple Model Serving endpoints.
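
The difference between the two rollout styles is easiest to see as a traffic split. The sketch below is a generic illustration in plain Python: `route` and the hash-bucket scheme are hypothetical and do not describe how Model Serving actually routes requests.

```python
import hashlib


def route(request_id: str, new_version_percent: int) -> str:
    """Deterministically send about new_version_percent% of requests to 'new'."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return "new" if bucket < new_version_percent else "old"


request_ids = [f"req-{i}" for i in range(1000)]

# Canary: a small share of traffic goes to the new version.
canary = [route(r, 5) for r in request_ids]

# Blue-green: all-or-nothing, 0% before the cutover and 100% after it.
before_cutover = {route(r, 0) for r in request_ids}
after_cutover = {route(r, 100) for r in request_ids}

print(canary.count("new"), before_cutover, after_cutover)
```

Rolling back a blue-green switch means going from 100 back to 0 in one step. A canary is instead widened or narrowed by adjusting the percentage.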

Scenario Tips

If the question asks about:

When a question asks about deploying a model for sub-second predictions with variable traffic that can spike 10x during business hours...

Answer:

Model Serving on SPCS with autoscaling compute pool (configure max_nodes to handle peak load). The HTTP endpoint handles variable traffic and autoscaling manages node count.

Distractor to avoid:

Virtual warehouse batch inference cannot provide sub-second latency for individual prediction requests. Scheduled tasks cannot respond to real-time traffic spikes.

If the question asks about:

When a production model shows accuracy degradation after a new version was deployed and the team needs the fastest possible recovery...

Answer:

Set the previous model version as default in Model Registry and redeploy. No retraining needed — the old version is already registered.

Distractor to avoid:

Retraining takes hours or days. Deleting the compute pool causes downtime. Increasing node count addresses capacity, not accuracy.
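
Rollback is fast because only a pointer moves. The sketch below is a toy in-memory stand-in written for these notes: `ToyRegistry` and its methods are hypothetical and are not the Model Registry API. It also assumes, for simplicity, that the newest registered version becomes the default.

```python
class ToyRegistry:
    """In-memory stand-in for a model with several registered versions."""

    def __init__(self):
        self.versions = {}
        self.default = None

    def register(self, name, predict_fn):
        self.versions[name] = predict_fn
        self.default = name  # assumption: the newest version is promoted

    def rollback_to(self, name):
        if name not in self.versions:
            raise KeyError(f"unknown version: {name}")
        self.default = name  # only the pointer moves; nothing is retrained

    def predict(self, x):
        return self.versions[self.default](x)


registry = ToyRegistry()
registry.register("V1", lambda x: x * 2)
registry.register("V2", lambda x: x * 3)  # the version that misbehaves
registry.rollback_to("V1")
print(registry.default, registry.predict(10))
```

Both versions stay registered throughout, so the team can later investigate V2 without losing it.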

If the question asks about:

When the question asks about running GPU-accelerated inference on a large custom PyTorch model serving 100 concurrent real-time requests...

Answer:

Model Serving on SPCS with GPU_NV_S or GPU_NV_M compute pool (A10G GPUs). Real-time concurrent requests require the HTTP endpoint provided by Model Serving, not batch jobs.

Distractor to avoid:

SPCS Jobs handle batch inference but terminate after completion — they cannot serve concurrent real-time requests.

Last-Minute Facts

1. Model Serving = SPCS Service (long-running). Batch GPU inference = SPCS Job (finite).

2. Autoscaling: set min_nodes and max_nodes on the compute pool, not on the model.

3. Cold-start: set min_nodes >= 1 to avoid cold-start latency in production.

4. A10G: GPU_NV_S = 1x A10G, GPU_NV_M = 4x A10G. A100: GPU_NV_L = 8x A100. Use GPU_NV_S for single-GPU inference; GPU_NV_M for heavier multi-GPU inference workloads.

5. Rollback = change Model Registry default version + redeploy. No retraining.

Domain 4: 22% of exam

Pipeline Orchestration and Automation (CI/CD)

Must-Know Facts

ML Jobs are designed for PRODUCTION ML pipeline orchestration on Container Runtime. They support external IDE integration, run on SPCS Container Runtime, and integrate with Task Graphs.
Task Graphs (DAGs) chain Snowflake Tasks using AFTER clauses. Each task runs only when its predecessor succeeds. Root tasks have the CRON schedule; child tasks inherit execution.
Snowflake CLI GitHub Actions run in GITHUB, not in Snowflake. They are triggered by Git events (push, merge, tag) and execute Snowflake CLI commands to deploy artifacts.
Git integration for Notebooks enables version control and collaboration, but does NOT automatically deploy changes. CI/CD pipelines with GitHub Actions are required for automated deployment.
External orchestrators (Airflow, Prefect, Dagster) connect TO Snowflake from OUTSIDE. They trigger ML Jobs or Tasks via Snowflake connectors but do not replace Snowflake's internal scheduling.
Task failure handling: tasks can be configured to either halt the graph (default) or continue downstream tasks despite upstream failures. Know how SUSPEND_TASK_AFTER_NUM_FAILURES works.
Pipeline stages in a proper ML CI/CD: data ingestion → feature engineering → training → validation → registration → deployment → monitoring. Each step is an orchestrated task or job.
ML Jobs vs Scheduled Notebooks: ML Jobs support multi-step production pipelines with Container Runtime and external IDEs. Scheduled Notebooks are simpler, single-notebook scheduling for lighter workloads.

Common Traps

Trap: ML Jobs and Snowflake Tasks are the same thing: both orchestrate ML pipelines.

Reality: ML Jobs run on Container Runtime (SPCS) and are designed for ML workloads requiring Python ML libraries, custom packages, and GPU. Tasks run on virtual warehouses for SQL and lightweight Python. They can work together: a Task Graph can trigger ML Jobs as steps.

Trap: Connecting a Notebook to a Git repo automatically deploys code changes to production.

Reality: Git integration enables version control. Deployment requires a separate CI/CD action, specifically Snowflake CLI GitHub Actions triggered by Git events (merge to main, release tags). Connection + versioning ≠ automated deployment.

Trap: To use Airflow with Snowflake ML, you must convert all Airflow DAGs to Snowflake Task Graphs.

Reality: Airflow remains external and uses Snowflake operators or the Snowflake provider to trigger ML Jobs or Tasks. You do not need to migrate or replace Airflow. Both systems coexist with Airflow as the orchestrator calling Snowflake.

Trap: A Task Graph root task runs immediately when a child task completes.

Reality: Only ROOT tasks have CRON schedules. Child tasks run when their predecessor tasks succeed. The root task drives the entire graph on its schedule, not on external events (unless a run is started manually with EXECUTE TASK on the root task).

Confusing Pairs

ML Jobs vs. Task Graphs (DAGs)

ML Jobs = execute ML workloads (training, preprocessing, batch inference) on Container Runtime with Python ML libraries. Task Graphs = orchestrate dependencies between arbitrary SQL/Python steps on warehouse. They are complementary: Task Graphs can include steps that trigger ML Jobs. For pure orchestration, use Task Graphs. For ML compute execution, use ML Jobs.

Snowflake CLI GitHub Actions vs. Snowflake Tasks

GitHub Actions = CI/CD automation running in GitHub's infrastructure, triggered by Git events, executes Snowflake CLI commands to deploy notebooks, DAGs, and artifacts. Tasks = Snowflake-native scheduled execution inside Snowflake on a CRON or after a predecessor. GitHub Actions are external CI/CD; Tasks are internal scheduling.

Scheduled Notebooks vs. ML Jobs

Scheduled Notebooks = single-notebook execution on a schedule (lightweight, no external IDE support, good for exploratory-to-production quick wins). ML Jobs = production-grade, multi-step pipeline orchestration with Container Runtime, external IDE integration, DAG connectivity. Use Scheduled Notebooks for simple recurring workflows. Use ML Jobs for production training pipelines.

External Orchestrators (Airflow) vs. Native Task Graphs

Airflow = runs OUTSIDE Snowflake, triggers Snowflake operations via API/connector, best when organization already has Airflow infrastructure and wants centralized orchestration across multiple platforms. Task Graphs = native Snowflake-only scheduling, simpler setup, no external dependency. Choose external orchestrators when the pipeline spans Snowflake and non-Snowflake systems. Choose Task Graphs for Snowflake-native pipelines.

Scenario Tips

If the question asks about:

When the question describes a scenario where code changes to a notebook should automatically deploy to production when merged to the main branch...

Answer:

Git integration (to connect notebooks to GitHub) PLUS Snowflake CLI GitHub Actions (to trigger deployment on merge). Both are needed.

Distractor to avoid:

Git integration alone only provides version control. It does not trigger deployment. A scheduled Task cannot watch for Git events.

If the question asks about:

When the question describes a 5-step ML pipeline (ingest, feature compute, train, validate, deploy) where each step must only run after the previous succeeds...

Answer:

Task Graph with AFTER clauses chaining 5 tasks. The root task has the CRON schedule. Each subsequent task runs only when its predecessor succeeds.

Distractor to avoid:

Dynamic Tables handle data transformation, not arbitrary pipeline steps. Scheduled Notebooks cannot chain multi-step dependencies with failure gates.
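
The "runs only after the predecessor succeeds" rule can be simulated in a few lines. The sketch below uses Python's standard `graphlib` to mimic AFTER dependencies. It is a toy model of the behavior described above, not Snowflake code, and the step names and `run_graph` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# task -> set of predecessors, mirroring AFTER clauses.
after_clauses = {
    "ingest": set(),  # root task: the only one with a schedule
    "feature_compute": {"ingest"},
    "train": {"feature_compute"},
    "validate": {"train"},
    "deploy": {"validate"},
}


def run_graph(after, steps):
    """Run steps in dependency order; skip any step whose predecessors did not all succeed."""
    succeeded, log = set(), []
    for name in TopologicalSorter(after).static_order():
        if not after[name] <= succeeded:
            log.append((name, "skipped"))
        elif steps[name]():
            succeeded.add(name)
            log.append((name, "succeeded"))
        else:
            log.append((name, "failed"))
    return log


steps = {name: (lambda: True) for name in after_clauses}
steps["validate"] = lambda: False  # validation gate fails
log = run_graph(after_clauses, steps)
print(log)
```

With the validation step failing, the deploy step never runs, which is the failure gate the scenario asks for.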

If the question asks about:

When a team uses Airflow for all their orchestration but wants training to run on Snowflake's GPU Container Runtime...

Answer:

Keep Airflow as the orchestrator. Use Airflow's Snowflake provider to trigger ML Jobs that execute on Container Runtime. No migration of Airflow DAGs is needed.

Distractor to avoid:

Converting all Airflow DAGs to Snowflake Task Graphs is unnecessary and disruptive. The exam favors integration over replacement.

Last-Minute Facts

1. Root tasks have CRON schedules. Child tasks have AFTER clauses. Suspend the root task before modifying any task in the graph.

2. EXECUTE TASK <root_task_name> manually triggers a run of the task graph without waiting for the CRON schedule.

3. SUSPEND_TASK_AFTER_NUM_FAILURES: default is 10 consecutive failed runs before automatic suspension.

4. GitHub Actions run in GitHub infrastructure. Snowflake CLI commands execute against Snowflake from GitHub runners.

5. ML Jobs run on Container Runtime (SPCS). Tasks run on virtual warehouses.
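
The failure counter behind SUSPEND_TASK_AFTER_NUM_FAILURES is worth tracing once. The sketch below is a plain-Python simulation with a hypothetical helper name, and it assumes a successful run resets the count, i.e. that only consecutive failures matter.

```python
def run_at_which_suspended(outcomes, limit=10):
    """Return the 1-based run number at which the task is suspended, or None."""
    consecutive_failures = 0
    for run_number, succeeded in enumerate(outcomes, start=1):
        # A success resets the streak; a failure extends it.
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures >= limit:
            return run_number
    return None


always_failing = [False] * 12
interrupted = [False] * 9 + [True] + [False] * 9

print(run_at_which_suspended(always_failing))
print(run_at_which_suspended(interrupted))
```

Nineteen runs with eighteen failures still do not suspend the task in the second case, because the single success in the middle breaks the streak.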

Domain 5: 16% of exam

Governance, Security and Monitoring

Must-Know Facts

ML Observability ONLY supports regression and binary classification models. Multi-class classification, clustering, and other model types are NOT supported for automated monitoring.
Data drift = INPUT feature distribution changed from training baseline. Concept drift = the RELATIONSHIP between inputs and outputs changed. Both degrade accuracy but need different responses.
ML Observability drift detection compares inference-time distributions against a baseline using drift metrics such as Difference of Means.
Model monitors have hard limits: 250 monitors per account maximum, 500 features monitored per model maximum, minimum 1-day aggregation window.
Timestamp columns for model monitors MUST be TIMESTAMP_NTZ type. Prediction and actual columns MUST be NUMBER type. Wrong types cause monitor failure.
RBAC for ML: access roles define object-level permissions (SELECT, USAGE, READ on models and feature tables). Functional roles map to job functions (data_scientist, ml_engineer). Best practice: custom functional roles grant to SYSADMIN, not ACCOUNTADMIN.
Dynamic Data Masking applies to COLUMNS in feature tables and training data. It does NOT mask model weights or prediction outputs — those are governed via RBAC on the model object.
ML Lineage traces from source tables through feature views through training datasets through registered models. If a model is trained EXTERNALLY and imported, lineage within Snowflake is partial.
Snowflake Horizon is the governance suite umbrella: data discovery (catalog), Trust Center (security posture monitoring), compliance center, and data clean rooms.
ML Explainability computes Shapley values (SHAP) to explain which features drive individual predictions. Used for governance and regulatory transparency requirements.
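
The monitor limits and type rules above are easy to drill as a checklist. The sketch below is a plain-Python study aid: the config dictionary keys and `monitor_config_problems` are hypothetical, and only the numeric limits and type names come from these notes.

```python
MAX_MONITORS_PER_ACCOUNT = 250
MAX_FEATURES_PER_MONITOR = 500
MIN_AGGREGATION_DAYS = 1
SUPPORTED_TASKS = {"regression", "binary_classification"}


def monitor_config_problems(cfg, existing_monitors):
    """Return a list of rule violations for a proposed model monitor."""
    problems = []
    if existing_monitors >= MAX_MONITORS_PER_ACCOUNT:
        problems.append("account monitor limit reached")
    if cfg["task"] not in SUPPORTED_TASKS:
        problems.append("unsupported model type")
    if len(cfg["features"]) > MAX_FEATURES_PER_MONITOR:
        problems.append("too many features")
    if cfg["aggregation_days"] < MIN_AGGREGATION_DAYS:
        problems.append("aggregation window below 1 day")
    if cfg["timestamp_type"] != "TIMESTAMP_NTZ":
        problems.append("timestamp column must be TIMESTAMP_NTZ")
    if cfg["prediction_type"] != "NUMBER":
        problems.append("prediction column must be NUMBER")
    return problems


good = {
    "task": "binary_classification",
    "features": ["age", "income"],
    "aggregation_days": 1,
    "timestamp_type": "TIMESTAMP_NTZ",
    "prediction_type": "NUMBER",
}
bad = dict(good, task="multiclass_classification", timestamp_type="TIMESTAMP_LTZ")

print(monitor_config_problems(good, existing_monitors=3))
print(monitor_config_problems(bad, existing_monitors=3))
```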

Common Traps

Trap: ML Observability monitors all model types including multi-class classification.

Reality: ML Observability currently ONLY supports regression and binary classification. This is a tested limitation. Multi-class classification models cannot be automatically monitored with ML Observability.

Trap: Data drift and concept drift mean the same thing: both refer to the model degrading over time.

Reality: Data drift = INPUT distribution changed (features look different than at training time). Concept drift = the RELATIONSHIP between inputs and outputs changed (same inputs now produce different correct outputs). A price prediction model that degrades after an economic shock is concept drift. The same model degrading because customer demographics shifted is data drift.

Trap: Dynamic data masking hides sensitive training data from all users including model owners.

Reality: Masking policies are evaluated per role at query time. Roles that the policy's conditions authorize see unmasked data; every other role, including an administrative role the policy does not list, sees masked values. Masking is column-level on Snowflake tables. It does not mask model weights or prevent model owners from accessing artifacts.

Trap: ML Lineage fully tracks models that were trained externally and imported into Model Registry.

Reality: ML Lineage tracks artifacts created WITHIN Snowflake's ecosystem. When a model is trained externally (e.g., in AWS SageMaker) and then logged into Model Registry, lineage is captured from the point of import forward, not including the external training provenance.

Trap: Snowflake Horizon and ML Observability are the same governance tool.

Reality: Horizon is the data governance suite (catalog, Trust Center, compliance, clean rooms). ML Observability is the model performance and drift monitoring system specifically for production ML models. Horizon governs DATA; ML Observability monitors MODEL BEHAVIOR.

Confusing Pairs

ML Lineage vs. ML Observability

ML Lineage = end-to-end traceability of ARTIFACTS (data source -> feature view -> training dataset -> model version). Answers 'what data was used to train this model?' Used for compliance and reproducibility. ML Observability = real-time monitoring of PRODUCTION MODEL BEHAVIOR (performance metrics, drift detection). Answers 'is this deployed model still performing correctly?' They are complementary but serve different governance needs.

Data Drift vs. Concept Drift

Data drift = the statistical distribution of INPUT FEATURES changed from training distribution. Model may still be correct for what it sees, but it's seeing different inputs than expected. Concept drift = the true relationship between inputs and outputs has changed in the real world. The model's learned mapping is now incorrect even for similar inputs. Exam questions will describe a scenario — look for whether INPUTS changed (data drift) or the CORRECT ANSWER changed (concept drift).

USAGE Privilege (Model) vs. READ Privilege (Model)

USAGE = warehouse-only inference, no metadata, no artifacts. READ = SPCS inference + metadata access. In governance scenarios, READ grants more transparency (auditors can see model metrics/tags) while USAGE limits access to bare inference capability.

ML Explainability (Shapley values) vs. ML Observability (Drift Detection)

Explainability = computes feature importance for INDIVIDUAL PREDICTIONS using SHAP values. Answers 'why did the model make THIS prediction?' Used for regulatory transparency. Observability = monitors AGGREGATE model performance over time. Answers 'is the model still working correctly overall?' Use explainability for per-prediction auditing; use observability for ongoing monitoring.
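
A worked example makes "feature contribution to an individual prediction" concrete. The sketch below computes exact Shapley values for a two-feature toy model by averaging marginal contributions over every feature ordering. It is the textbook definition in plain Python, not the method ML Explainability runs internally, and the model, feature names, and baseline are invented.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    features = list(instance)
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)  # start from the baseline input
        previous = predict(current)
        for f in order:
            current[f] = instance[f]  # switch one feature to its real value
            value = predict(current)
            totals[f] += value - previous
            previous = value
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_model(x):
    # Hypothetical model with an interaction term.
    return 2 * x["income"] + 3 * x["age"] + x["income"] * x["age"]


instance = {"income": 2.0, "age": 1.0}
baseline = {"income": 0.0, "age": 0.0}
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The contributions add up to the gap between this prediction and the baseline prediction (9.0 here), with the interaction term shared equally between the two features. This explains one prediction; it says nothing about whether the model is degrading over time.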

Scenario Tips

If the question asks about:

When a question asks about setting up automated monitoring for a multi-class classification model to detect when predictions degrade...

Answer:

ML Observability does NOT support multi-class classification. The team would need custom monitoring using Snowpark or SQL-based alerting on prediction distributions.

Distractor to avoid:

The exam expects you to know the limitation. Do not select 'configure an ML Observability model monitor' for multi-class classification.

If the question asks about:

When a question asks how to give an auditor read-only access to model metadata (tags, metrics, version history) and the ability to run SPCS inference without exposing model weights...

Answer:

Grant READ privilege on the model to the auditor's role. READ grants SPCS inference + metadata visibility without revealing internal model weights or artifacts.

Distractor to avoid:

OWNERSHIP is too permissive. USAGE grants warehouse inference only and no metadata. READ is the exact privilege for this pattern.

If the question asks about:

When a scenario describes a production model that shows identical input feature distributions to training time, but prediction accuracy has dropped significantly after a major market event...

Answer:

This is CONCEPT DRIFT — the relationship between inputs and outputs changed (the market event changed what the correct prediction should be), not the input distribution. The response is retraining on post-event data, not fixing data pipelines.

Distractor to avoid:

Data drift refers specifically to input distribution changes. If inputs look the same but outputs are wrong, that's concept drift.

Last-Minute Facts

1. ML Observability supported model types: REGRESSION and BINARY CLASSIFICATION only.

2. Model monitor limits: 250 per account, 500 features per monitor, min 1-day aggregation window.

3. Timestamp column for monitor: must be TIMESTAMP_NTZ. Prediction/actual columns: must be NUMBER.

4. ML Observability drift metric to remember: Difference of Means, used for statistical distribution comparison.

5. RBAC best practice: functional roles should roll up to SYSADMIN, not ACCOUNTADMIN.

6. Shapley values: measure feature contribution to individual predictions. Computed by ML Explainability.

7. Horizon Trust Center: monitors security POSTURE, not model performance.
