Snowflake MLA-B01: 83 concepts

MLA-B01 Cheat Sheet

Quick reference for the SnowPro Advanced: MLOps Engineer exam.

Quick Navigation

Snowflake ML Platform Overview
Feature Store — Entities and Feature Views
Snowpark ML — Preprocessing and Model Training
Model Registry — Register, Version, Deploy
SPCS Compute Pools and Model Serving
ML Jobs — Pipeline Orchestration
Task Graphs (DAGs) and CI/CD Automation
Dynamic Tables vs Streams + Tasks
ML Observability — Drift and Performance Monitoring
Governance, RBAC, and ML Lineage
Cortex ML and LLM Functions
Key Exam Distinctions and Common Traps

Snowflake ML Platform Overview

Snowflake ML Stack (top to bottom): Feature Store (feature engineering) → Snowpark ML Modeling (training) → Model Registry (versioning) → Model Serving / Batch Inference (deployment) → ML Observability (monitoring). ML Jobs and Task Graphs orchestrate the full pipeline.
snowflake-ml-python package: Install via pip install snowflake-ml-python; all ML APIs (Feature Store, Model Registry, ML Jobs, preprocessing) are in this package. Minimum version 1.8.0 for Model Registry; 1.8.2+ for ML Jobs.
Snowpark Session (entry point): from snowflake.snowpark import Session; session = Session.builder.configs(connection_params).create() — required before calling any ML API. All operations push down to Snowflake SQL.
Warehouse compute vs SPCS: Warehouse: SQL-based inference, Snowpark ML training for scikit-learn/XGBoost/LightGBM, Cortex ML functions — no container setup needed. SPCS: custom containers, GPU workloads, arbitrary Python packages, real-time HTTP serving endpoints.
Container Runtime vs SPCS: Container Runtime = preconfigured ML environment for Notebooks and ML Jobs (PyTorch, TF, scikit-learn). SPCS = general container runtime for services and long-running jobs. Container Runtime runs ON SPCS but has its own abstractions.
Exam domain weights: Domain 2 MLOps Infrastructure 24% + Domain 4 Pipeline Orchestration 22% = 46% of exam. Domain 1 Feature Engineering 20%, Domain 3 Model Serving 18%, Domain 5 Governance 16%.

Feature Store — Entities and Feature Views

Create FeatureStore object: from snowflake.ml.feature_store import FeatureStore, CreationMode; fs = FeatureStore(session=session, database='ML_DB', name='MY_FS', default_warehouse='ML_WH', creation_mode=CreationMode.CREATE_IF_NOT_EXIST)
Define Entity: from snowflake.ml.feature_store import Entity; entity = Entity(name='CUSTOMER', join_keys=['CUSTOMER_ID']) — an entity represents a business object (user, product, transaction) and its join key columns.
Create Managed FeatureView: from snowflake.ml.feature_store import FeatureView; fv = FeatureView(name='CUSTOMER_FEATURES', entities=[entity], feature_df=my_snowpark_df, timestamp_col='ts', refresh_freq='5 minutes') — refresh_freq creates a Dynamic Table underneath.
Register FeatureView: fs.register_feature_view(feature_view=fv, version='V1', block=True, overwrite=False). block=True waits for the initial data refresh before returning; overwrite=True replaces an existing version.
Retrieve training dataset with point-in-time lookup: fs.retrieve_feature_values(spine_df=label_df, features=[fv], spine_timestamp_col='EVENT_TS') — joins features at each row's EVENT_TS preventing data leakage; use for generating historical training datasets.
External FeatureView (dbt / manual): FeatureView(name='EXT_FV', entities=[entity], feature_df=df, refresh_freq=None) — refresh_freq=None means Snowflake does NOT manage refresh; an external process (dbt, Airflow) maintains the underlying table.
Online vs Offline serving: Offline: fs.retrieve_feature_values() reads from Snowflake tables and is used for batch training and scoring. Online: requires separate online store configuration for low-latency lookup of individual records; it is NOT automatically enabled with standard Feature Store setup.
List and discover feature views: fs.list_feature_views(entity_name='CUSTOMER') — returns a DataFrame of all feature views for that entity. Use for feature discovery across teams.
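
To make the point-in-time lookup entry above concrete, here is a minimal pure-Python sketch of an as-of join. It is not the Feature Store implementation: the `history` and `spine` structures, the integer timestamps, and the inclusive "at or before the event" rule are assumptions chosen for illustration.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

# Hypothetical feature history: join key -> list of (timestamp, value), sorted by timestamp.
FeatureHistory = Dict[str, List[Tuple[int, float]]]


def point_in_time_value(history: FeatureHistory, key: str, event_ts: int) -> Optional[float]:
    """Return the latest feature value with timestamp <= event_ts, or None."""
    rows = history.get(key, [])
    timestamps = [ts for ts, _ in rows]
    idx = bisect_right(timestamps, event_ts)  # count of rows at or before event_ts
    return rows[idx - 1][1] if idx > 0 else None


def build_training_rows(spine, history):
    """Attach the as-of feature value to each (key, event_ts, label) spine row."""
    return [
        (key, event_ts, label, point_in_time_value(history, key, event_ts))
        for key, event_ts, label in spine
    ]


# Made-up data: one customer whose feature value changes at t=10, 20, 30.
history = {"C1": [(10, 100.0), (20, 150.0), (30, 400.0)]}
spine = [("C1", 25, 1), ("C1", 5, 0)]
training_rows = build_training_rows(spine, history)
```

The row with event time 25 receives the value written at t=20 (150.0), not the later value from t=30, which is what prevents leakage. The row at t=5 has no earlier feature value and gets None.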

Snowpark ML — Preprocessing and Model Training

Distributed preprocessing (StandardScaler): from snowflake.ml.modeling.preprocessing import StandardScaler; scaler = StandardScaler(input_cols=['AGE','INCOME'], output_cols=['AGE_SCALED','INCOME_SCALED']); scaler.fit(train_df); scaled_df = scaler.transform(train_df)
OneHotEncoder: from snowflake.ml.modeling.preprocessing import OneHotEncoder; ohe = OneHotEncoder(input_cols=['CATEGORY'], output_cols=['CATEGORY_OHE'], drop_input_cols=True) — distributed across warehouse nodes, not pandas.
Pipeline (chain transformers + model): from snowflake.ml.modeling.pipeline import Pipeline; pipeline = Pipeline(steps=[('scaler', StandardScaler(...)), ('ohe', OneHotEncoder(...)), ('model', XGBClassifier(input_cols=..., label_cols=..., output_cols=...))]) — fit and predict end-to-end.
XGBoost training (Snowpark ML wrapper): from snowflake.ml.modeling.xgboost import XGBClassifier; model = XGBClassifier(input_cols=FEATURE_COLS, label_cols=['LABEL'], output_cols=['PREDICTION']); model.fit(train_df) — runs natively in Snowflake, no UDF creation needed.
Distributed GridSearchCV: from snowflake.ml.modeling.model_selection import GridSearchCV; cv = GridSearchCV(estimator=XGBClassifier(...), param_grid={'n_estimators':[100,200], 'max_depth':[3,5]}, n_jobs=-1); cv.fit(train_df) — parallelizes combinations across warehouse nodes via UDTFs.
RandomizedSearchCV: from snowflake.ml.modeling.model_selection import RandomizedSearchCV; cv = RandomizedSearchCV(estimator=model, param_distributions=param_grid, n_iter=20) — samples random combinations; faster than GridSearchCV for large search spaces.
Key distinction: Snowpark ML vs local scikit-learn: Snowpark ML preprocessing (StandardScaler, OneHotEncoder) runs distributed on warehouse compute — scales to millions of rows. It is NOT pandas; data stays in Snowflake and no rows are pulled to the client.
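
The difference between the GridSearchCV and RandomizedSearchCV entries above comes down to how many candidates are evaluated. The sketch below only enumerates candidates in plain Python, using the same param_grid as the GridSearchCV entry. The helper names are hypothetical, and the random sampler is a simplification that draws each parameter independently, so it may repeat a combination.

```python
import itertools
import random

param_grid = {"n_estimators": [100, 200], "max_depth": [3, 5]}


def grid_candidates(grid):
    """Enumerate every combination, as an exhaustive grid search would."""
    names = sorted(grid)
    return [
        dict(zip(names, values))
        for values in itertools.product(*(grid[name] for name in names))
    ]


def random_candidates(grid, n_iter, seed=0):
    """Draw n_iter combinations, choosing each parameter value independently."""
    rng = random.Random(seed)
    names = sorted(grid)
    return [{name: rng.choice(grid[name]) for name in names} for _ in range(n_iter)]


all_candidates = grid_candidates(param_grid)        # 2 x 2 = 4 combinations
sampled = random_candidates(param_grid, n_iter=3)   # always exactly n_iter candidates
```

The grid grows multiplicatively with every added parameter, while the random search cost is fixed by n_iter. That is why the randomized variant is preferred for large search spaces.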

Model Registry — Register, Version, Deploy

Get Model Registry: from snowflake.ml.registry import Registry; registry = Registry(session=session, database_name='ML_DB', schema_name='MODELS') — Registry is a schema-level object; models are first-class Snowflake objects with full RBAC.
Log a model: model_ver = registry.log_model(my_model, model_name='CHURN_MODEL', version_name='V2', sample_input_data=X_train, metrics={'rmse': 0.12}, comment='XGB v2 with new features') — stores model artifact, metrics, and metadata.
Retrieve model version: mv = registry.get_model('CHURN_MODEL').version('V2') — retrieves a specific version for inference or inspection. mv.show_functions() lists callable methods (predict, predict_proba, etc.).
Set default version: registry.get_model('CHURN_MODEL').default = registry.get_model('CHURN_MODEL').version('V2') — the default version is used when no version is specified at inference time; enables fast rollback.
Batch inference on warehouse: predictions = mv.run(test_df, function_name='predict') — runs batch inference using warehouse compute; most cost-effective for large datasets with standard scikit-learn/XGBoost packages. Returns a Snowpark DataFrame.
Deploy to SPCS for real-time serving: mv.create_service(service_name='CHURN_SERVING', service_compute_pool='GPU_NV_S_POOL', image_repo='my_db.my_schema.my_repo', ingress_enabled=True). Snowflake automates the container image build; no Dockerfile is needed.
Experiment Tracking — log metrics: model_ver = registry.log_model(my_model, model_name='CHURN_MODEL', version_name='V2', metrics={'accuracy': 0.95, 'f1': 0.91}); model_ver.set_metric('auc', 0.98) — metrics dict passed to log_model or set_metric() on the returned ModelVersion; registry.log_model() is NOT a context manager.
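
The default-version entry above describes rollback as re-pointing a default. The toy class below models only that resolution rule in memory. It is not the snowflake.ml.registry API: ToyModel, log_version, and resolve are hypothetical names, the metric values are made up, and the "first version becomes the default" behavior is an assumption.

```python
from typing import Dict, Optional


class ToyModel:
    """In-memory stand-in for a registered model; NOT the Snowflake Registry API."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.versions: Dict[str, dict] = {}
        self.default: Optional[str] = None

    def log_version(self, version_name: str, metrics: dict) -> None:
        if version_name in self.versions:
            raise ValueError(f"version {version_name} already exists")
        self.versions[version_name] = dict(metrics)
        if self.default is None:  # assumption: the first version becomes the default
            self.default = version_name

    def resolve(self, version_name: Optional[str] = None) -> str:
        """Pick the explicit version if given, otherwise the default."""
        chosen = version_name or self.default
        if chosen not in self.versions:
            raise KeyError(f"unknown version: {chosen}")
        return chosen


model = ToyModel("CHURN_MODEL")
model.log_version("V1", {"f1": 0.88})   # made-up metric
model.log_version("V2", {"f1": 0.91})
model.default = "V2"                    # promote V2
promoted = model.resolve()
model.default = "V1"                    # rollback: re-point the default, no redeploy
rolled_back = model.resolve()
```

Callers that never name a version follow the default, so a rollback touches one pointer while both versions stay registered.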

SPCS Compute Pools and Model Serving

Create CPU compute pool: CREATE COMPUTE POOL IF NOT EXISTS MY_CPU_POOL MIN_NODES = 1 MAX_NODES = 5 INSTANCE_FAMILY = CPU_X64_M AUTO_RESUME = TRUE AUTO_SUSPEND_SECS = 300;
Create GPU compute pool (A10G): CREATE COMPUTE POOL IF NOT EXISTS MY_GPU_POOL MIN_NODES = 1 MAX_NODES = 3 INSTANCE_FAMILY = GPU_NV_S AUTO_RESUME = TRUE AUTO_SUSPEND_SECS = 600; — GPU_NV_S = 1x NVIDIA A10G; GPU_NV_M = 4x A10G.
A100 GPU pool (large models): CREATE COMPUTE POOL IF NOT EXISTS MY_A100_POOL MIN_NODES = 1 MAX_NODES = 2 INSTANCE_FAMILY = GPU_NV_L; — GPU_NV_L provides A100 GPUs for large model training, fine-tuning, or memory-intensive inference.
SPCS instance family guide: CPU_X64_XS/S/M/L/XL for CPU workloads (ordered smallest to largest). GPU_NV_S = 1x A10G, GPU_NV_M = 4x A10G, GPU_NV_L = 8x A100. Use SHOW COMPUTE POOL INSTANCE FAMILIES for current list.
SPCS autoscaling behavior: Autoscaling is configured at the compute pool level (MIN_NODES / MAX_NODES). Snowflake scales nodes based on service demand automatically. You configure the range — Snowflake handles the scaling.
Batch inference on SPCS (job vs service): SPCS job = finite execution (runs to completion, used for batch inference). SPCS service = long-running process (used for Model Serving HTTP endpoints). Know the distinction — they differ in lifecycle.
Model Serving endpoint (real-time): Model Serving deploys models from Model Registry as managed HTTPS endpoints on SPCS with autoscaling. Snowflake automates container image building and endpoint setup — no Docker management required.

ML Jobs — Pipeline Orchestration

@remote decorator (function dispatch): from snowflake.ml.jobs import remote; @remote('MY_COMPUTE_POOL', stage_name='payload_stage', session=session) def train_model(table: str): ... — serializes function + deps, uploads to stage, runs on Container Runtime.
Submit script as ML Job: from snowflake.ml.jobs import submit_file; job = submit_file('train.py', 'MY_COMPUTE_POOL', stage_name='payload_stage', args=['--data', 'my_table'], session=session) — runs a Python script on Container Runtime.
Submit directory as ML Job: from snowflake.ml.jobs import submit_directory; job = submit_directory('./ml_project/', 'MY_COMPUTE_POOL', entrypoint='train.py', stage_name='payload_stage', session=session) — uploads entire project directory.
Monitor job status and logs: print(job.status) # PENDING | RUNNING | FAILED | DONE; print(job.get_logs()) — poll status; job.result() blocks until completion and returns the return value.
Pin Container Runtime version: @remote('MY_POOL', stage_name='stage', runtime_environment='2.3.0', session=session) def train(): ... — pins to a specific Container Runtime version for reproducibility; use latest for security updates.
Custom pip packages in ML Jobs: @remote('MY_POOL', stage_name='stage', pip_requirements=['custom-lib==1.0.*'], external_access_integrations=['PYPI_EAI'], session=session) def train(): ... — install packages not in standard Container Runtime.
ML Jobs vs Scheduled Notebooks: ML Jobs: production-grade, multi-step dependencies, external IDE development, Container Runtime execution, integrates with Task Graphs. Scheduled Notebooks: simpler, single-notebook scope, good for exploratory-to-production transitions.
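
The status-polling entry above can be wrapped in a small helper. This sketch assumes only that the job object exposes a status attribute with the values listed there. wait_for_job and FakeJob are hypothetical names, and FakeJob stands in for a real job so the loop runs without a Snowflake connection.

```python
import time

TERMINAL_STATES = {"DONE", "FAILED"}  # from the status list in the entry above


def wait_for_job(job, poll_seconds=5.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll job.status until it is terminal and return the final status."""
    waited = 0.0
    while True:
        status = job.status
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still {status} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


class FakeJob:
    """Test double that walks through a fixed list of statuses."""

    def __init__(self, statuses):
        self._statuses = list(statuses)

    @property
    def status(self):
        # Return the next status; once one is left, keep returning it.
        return self._statuses.pop(0) if len(self._statuses) > 1 else self._statuses[0]


final_status = wait_for_job(FakeJob(["PENDING", "RUNNING", "DONE"]), sleep=lambda _: None)
```

When only the return value matters, the blocking job.result() call mentioned above is simpler. An explicit loop is useful when a pipeline needs its own timeout or wants to log intermediate states.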

Task Graphs (DAGs) and CI/CD Automation

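Create root task (with schedule): CREATE TASK root_task WAREHOUSE = ML_WH SCHEDULE = 'USING CRON 0 2 * * * UTC' AS CALL ingest_data(); Only root tasks have SCHEDULE; child tasks use AFTER instead. Tasks are created suspended: create and resume the child tasks first, then run ALTER TASK root_task RESUME; last, because child tasks cannot be added to a graph whose root task is already running.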
Chain child task (DAG dependency): CREATE TASK train_task WAREHOUSE = ML_WH AFTER root_task AS CALL train_model(); CREATE TASK deploy_task WAREHOUSE = ML_WH AFTER train_task AS CALL deploy_model(); ALTER TASK train_task RESUME; ALTER TASK deploy_task RESUME;
Task with WHEN clause (conditional): CREATE TASK process_task WAREHOUSE = ML_WH SCHEDULE = '5 MINUTES' WHEN SYSTEM$STREAM_HAS_DATA('feature_stream') AS INSERT INTO features SELECT * FROM feature_stream; — runs only when stream has unconsumed data.
Serverless task (auto-scale): CREATE TASK serverless_task USER_TASK_MANAGED_INITIAL_WAREHOUSE_SIZE = 'MEDIUM' SCHEDULE = '10 MINUTES' AS CALL retrain_if_drift(); — Snowflake-managed compute, scales to workload size, no warehouse needed.
Git integration for Notebooks / SQL: CREATE GIT REPOSITORY my_repo API_INTEGRATION = my_github_integration ORIGIN = 'https://github.com/org/repo.git'; This connects the code repository and enables EXECUTE IMMEDIATE FROM @my_repo/branches/main/scripts/deploy.sql in pipelines. Repository stage paths start with branches/, tags/, or commits/.
Snowflake CLI GitHub Actions step: uses: snowflakedb/snowflake-cli-action@v1 — installs Snowflake CLI in GitHub Actions runner; follow with snow sql -f deploy.sql or snow snowpark deploy for automated deployments triggered by Git events.
External orchestrators (Airflow integration): Airflow triggers ML Jobs or Tasks via Snowflake connection (SnowflakeOperator / SnowflakeSqlApiOperator). Airflow runs OUTSIDE Snowflake — it triggers Snowflake operations. It does not replace native Task Graphs.
Task retry configuration: CREATE TASK my_task ... TASK_AUTO_RETRY_ATTEMPTS = 3 SUSPEND_TASK_AFTER_NUM_FAILURES = 5 AS ...; — automatically retries on transient failures before marking as failed.
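
The AFTER dependencies in the root/child task entries above form a DAG. The sketch below uses Python's standard graphlib module to derive the run order and a safe order for the RESUME statements. The task names come from those entries; the dictionary layout is an illustrative assumption, not something Snowflake exposes.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks named in its AFTER clause (its predecessors).
task_graph = {
    "root_task": set(),
    "train_task": {"root_task"},
    "deploy_task": {"train_task"},
}

# static_order() yields predecessors before successors: the run order of the graph.
run_order = list(TopologicalSorter(task_graph).static_order())

# Child tasks are resumed before the root, so reverse the run order for RESUME.
resume_statements = [f"ALTER TASK {name} RESUME;" for name in reversed(run_order)]
```

For this chain the run order is root_task, train_task, deploy_task, and the RESUME statements come out in the opposite order, ending with the root.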

Dynamic Tables vs Streams + Tasks

Dynamic Table for feature pipeline (declarative): CREATE DYNAMIC TABLE customer_features TARGET_LAG = '10 minutes' WAREHOUSE = ML_WH AS SELECT customer_id, AVG(amount) avg_spend, COUNT(*) txn_count FROM orders GROUP BY customer_id; — Snowflake handles refresh timing automatically.
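Dynamic Table chaining (multi-step): CREATE DYNAMIC TABLE enriched_features TARGET_LAG = '10 minutes' WAREHOUSE = ML_WH AS SELECT f.*, c.region FROM customer_features f JOIN customers c ON f.customer_id = c.id; Set TARGET_LAG = DOWNSTREAM on an upstream or intermediate table such as customer_features so that it refreshes only when a dynamic table reading from it needs fresh data. The final table in the chain needs a time-based lag, because a DOWNSTREAM table with no consumers is not refreshed on a schedule.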
Streams + Tasks (imperative pipeline): CREATE STREAM orders_stream ON TABLE orders; CREATE TASK process_stream WAREHOUSE = ML_WH SCHEDULE = '1 MINUTES' WHEN SYSTEM$STREAM_HAS_DATA('orders_stream') AS MERGE INTO features USING orders_stream ON ...; — supports MERGE, stored procedures, custom error handling.
When to choose Dynamic Tables: Use Dynamic Tables when: pipeline is pure SQL SELECT transformations, you want declarative refresh management, and no MERGE/stored procedure logic is needed. Simpler to maintain and monitor.
When to choose Streams + Tasks: Use Streams + Tasks when: MERGE operations are needed (SCD Type 2, upserts), stored procedures with custom logic, external function calls, multi-target writes, or fine-grained retry/error handling is required.
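
The two "when to choose" entries above reduce to a simple rule: any imperative requirement pushes the pipeline to Streams + Tasks. The helper below encodes that rule as a study aid. PipelineNeeds and choose_pipeline are hypothetical names, and the flags are taken from the criteria listed above.

```python
from dataclasses import dataclass


@dataclass
class PipelineNeeds:
    needs_merge: bool = False                  # upserts, SCD Type 2
    needs_stored_procedure: bool = False       # custom procedural logic
    calls_external_function: bool = False
    writes_multiple_targets: bool = False
    needs_custom_error_handling: bool = False  # fine-grained retry / error handling


def choose_pipeline(needs: PipelineNeeds) -> str:
    """Return 'STREAMS_AND_TASKS' if any imperative requirement is set, else 'DYNAMIC_TABLE'."""
    imperative = any(vars(needs).values())
    return "STREAMS_AND_TASKS" if imperative else "DYNAMIC_TABLE"


pure_sql_choice = choose_pipeline(PipelineNeeds())
upsert_choice = choose_pipeline(PipelineNeeds(needs_merge=True))
```

A pure SELECT transformation with no flags set lands on a Dynamic Table; a single MERGE requirement is enough to switch to Streams + Tasks.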

ML Observability — Drift and Performance Monitoring

Create Model Monitor (SQL): CREATE MODEL MONITOR my_monitor WITH MODEL = ML_DB.MODELS.CHURN_MODEL VERSION = 'V2' FUNCTION = 'predict' SOURCE = ML_DB.PROD.PREDICTIONS BASELINE = ML_DB.PROD.TRAINING_DATA TIMESTAMP_COLUMN = SCORED_AT PREDICTION_SCORE_COLUMNS = (PREDICTION) ACTUAL_SCORE_COLUMNS = (ACTUAL) ID_COLUMNS = (CUSTOMER_ID) WAREHOUSE = ML_WH REFRESH_INTERVAL = '1 hour' AGGREGATION_WINDOW = '1 day'; PREDICTION, ACTUAL, and CUSTOMER_ID are placeholder column names in the source table.
Query drift metrics: SELECT * FROM TABLE(MODEL_MONITOR_DRIFT_METRIC('MY_MONITOR', 'DIFFERENCE_OF_MEANS', 'FEATURE_COL', 'DAY', '2026-06-01', '2026-06-13', {})); DIFFERENCE_OF_MEANS is one of the supported drift metrics; it compares the mean of a feature in production against its mean in the baseline.
Query performance metrics: SELECT * FROM TABLE(MODEL_MONITOR_PERFORMANCE_METRIC('MY_MONITOR', 'RMSE', 'DAY', '2026-06-01', '2026-06-13', {})); — tracks model accuracy over time. Supported for regression and binary classification models.
Suspend / Resume monitor: ALTER MODEL MONITOR my_monitor SUSPEND; ALTER MODEL MONITOR my_monitor RESUME; — pause monitoring during planned maintenance windows without deleting the monitor configuration.
Data drift vs concept drift: Data drift = input feature distribution changes (e.g., customer age skews younger). Concept drift = relationship between inputs and outputs changes (e.g., model predictions are wrong even with unchanged inputs). Both degrade model performance differently.
ML Observability supported model types: Currently supports REGRESSION and binary CLASSIFICATION models only. Multi-class classification, ranking, and other model types are NOT yet supported for automated ML Observability monitors.
Data Metric Functions (DMFs) vs ML Observability: DMFs (freshness, completeness, accuracy) monitor DATA QUALITY on feature tables — they check if input data is valid. ML Observability monitors MODEL QUALITY — drift and prediction performance after deployment.
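
As a worked illustration of the difference-of-means drift metric queried above, the sketch below computes the statistic per day in plain Python. It assumes the metric is the raw difference between the production mean and the baseline mean; the exact definition and any normalization Snowflake applies are not taken from the source. All values are made up.

```python
from statistics import fmean
from typing import Dict, List


def difference_of_means(baseline: List[float], current: List[float]) -> float:
    """Mean of the current window minus mean of the baseline."""
    return fmean(current) - fmean(baseline)


def drift_by_day(baseline: List[float], daily_values: Dict[str, List[float]]) -> Dict[str, float]:
    """Compute the drift score for each day's feature values against one baseline."""
    return {day: difference_of_means(baseline, values) for day, values in daily_values.items()}


baseline_age = [40.0, 42.0, 38.0, 40.0]   # made-up training values, mean 40
production_age = {
    "2026-06-01": [41.0, 39.0],            # mean 40: no shift
    "2026-06-02": [30.0, 34.0],            # mean 32: customers skew younger
}
drift = drift_by_day(baseline_age, production_age)
```

Here drift is 0.0 for the first day and -8.0 for the second. This is data drift in the sense defined above: the input distribution moved, regardless of whether predictions got worse.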

Governance, RBAC, and ML Lineage

RBAC pattern for ML artifacts (access + functional roles): CREATE ROLE model_inference_access; GRANT USAGE ON MODEL CHURN_MODEL TO ROLE model_inference_access; GRANT ROLE model_inference_access TO ROLE junior_data_scientist; — access roles hold object permissions; functional roles map to job functions.
Dynamic data masking on feature tables: CREATE MASKING POLICY ssn_mask AS (val STRING) RETURNS STRING -> CASE WHEN IS_ROLE_IN_SESSION('DATA_STEWARD') THEN val ELSE '***-**-' || RIGHT(val,4) END; ALTER TABLE features MODIFY COLUMN SSN SET MASKING POLICY ssn_mask;
ML Lineage overview: ML Lineage provides end-to-end traceability: source tables → Feature Store feature views → training datasets → registered model versions. Enables audit trails, reproducibility, and compliance without manual documentation.
ML Lineage vs ML Observability distinction: ML Lineage = WHERE did data come from (provenance, reproducibility, compliance). ML Observability = HOW is the model performing (production drift, accuracy degradation). They are separate features solving different problems.
ML Explainability (Shapley values): mv.run(test_df, function_name='explain') or SHAP values via snowflake.ml.explainability — returns per-feature contribution scores for individual predictions; required for model interpretability and governance in regulated industries.
Snowflake Horizon — Trust Center: Snowflake Horizon = integrated governance suite: universal data discovery (catalog), Trust Center (security posture / misconfiguration detection), data classification for PII/sensitive data, and Compliance Center for regulatory readiness.
Cost management: SPCS vs warehouse billing: Virtual warehouse costs = Snowflake credits per compute-hour (billed per second after 60s minimum). SPCS compute pool costs = separate node-hour billing for compute pools; GPU pools (GPU_NV_S/M/L) have distinct pricing from CPU pools.
Audit ML access with ACCESS_HISTORY: SELECT query_id, user_name, direct_objects_accessed FROM SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY WHERE query_start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP) ORDER BY query_start_time DESC; — tracks who accessed which model artifacts.
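
The warehouse billing rule in the cost management entry above (per-second billing after a 60-second minimum) is easy to get wrong for short runs. The function below is a small arithmetic sketch of that rule only. warehouse_credits is a hypothetical helper, the credits-per-hour rate is passed in by the caller, and SPCS node-hour billing is not modeled.

```python
def warehouse_credits(run_seconds: float, credits_per_hour: float,
                      minimum_seconds: float = 60.0) -> float:
    """Credits for one warehouse run: per-second billing with a 60-second floor."""
    if run_seconds <= 0:
        return 0.0
    billed_seconds = max(run_seconds, minimum_seconds)
    return billed_seconds / 3600.0 * credits_per_hour


short_run = warehouse_credits(30, credits_per_hour=1.0)    # billed as 60 s
long_run = warehouse_credits(5400, credits_per_hour=4.0)   # 1.5 h at 4 credits/hour
```

The 30-second run is charged as a full minute (1/60 of a credit at 1 credit per hour), while the 90-minute run costs 6 credits at 4 credits per hour.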

Cortex ML and LLM Functions

Cortex ML Forecasting: CREATE SNOWFLAKE.ML.FORECAST sales_forecast (INPUT_DATA => TABLE(sales_data), TIMESTAMP_COLNAME => 'DATE_COL', TARGET_COLNAME => 'SALES_COL'); CALL sales_forecast!FORECAST(FORECASTING_PERIODS => 7); — object-based API under SNOWFLAKE.ML namespace; no Python model training needed.
Cortex ML Anomaly Detection: CREATE SNOWFLAKE.ML.ANOMALY_DETECTION my_detector (INPUT_DATA => TABLE(metrics), TIMESTAMP_COLNAME => 'ts', TARGET_COLNAME => 'value', LABEL_COLNAME => 'label'); CALL my_detector!DETECT_ANOMALIES(TABLE(new_metrics), 'ts', 'value', ...);
Cortex ML Classification: CREATE SNOWFLAKE.ML.CLASSIFICATION my_classifier (INPUT_DATA => TABLE(training_data), TARGET_COLNAME => 'LABEL'); SELECT my_classifier!PREDICT(INPUT_DATA => OBJECT_CONSTRUCT(*)) AS prediction FROM test_data; PREDICT is called once per row inside a SELECT, giving SQL-callable classification without Python model training.
Cortex ML vs Custom Snowpark ML models: Cortex ML = pre-built SQL-callable functions (Forecast, Anomaly, Classification); no training code, limited to supported task types. Custom Snowpark ML = full control with scikit-learn/XGBoost/PyTorch; requires ML engineering effort.
COMPLETE (LLM text generation): SELECT SNOWFLAKE.CORTEX.COMPLETE('mistral-7b', 'Summarize this review: ' || review_text) AS summary FROM product_reviews; — runs inside Snowflake, data never leaves the account.
SENTIMENT and SUMMARIZE: SELECT SNOWFLAKE.CORTEX.SENTIMENT(feedback) AS score, SNOWFLAKE.CORTEX.SUMMARIZE(doc_text) AS summary FROM feedback_table; — SENTIMENT returns FLOAT -1.0 to 1.0; SUMMARIZE returns a text summary. Neither requires a model parameter.

Key Exam Distinctions and Common Traps

Model Serving ONLY runs on SPCS: HTTP inference endpoints require SPCS — they cannot run on standard virtual warehouses. If a question asks about real-time / sub-second predictions with HTTP endpoint, the answer always involves SPCS Model Serving.
Warehouse IS sufficient for many training workloads: scikit-learn, XGBoost, and LightGBM training via Snowpark ML works on standard warehouse compute — you do NOT always need SPCS. SPCS is needed for custom containers, GPU, arbitrary packages, or real-time HTTP endpoints.
Feature Store online serving requires extra setup: Online feature serving (low-latency individual lookups) requires additional online store configuration — it is NOT automatically enabled when you create a feature view. Standard feature views support offline retrieval only.
Snowflake CLI GitHub Actions run in GitHub, not Snowflake: The snowflakedb/snowflake-cli-action@v1 GitHub Action installs and runs Snowflake CLI inside the GitHub Actions runner — it triggers Snowflake operations from GitHub but is NOT a Snowflake Task running inside Snowflake.
ML Jobs run on Container Runtime; Tasks run on warehouses: ML Jobs = Python workloads on SPCS Container Runtime (ML training, batch inference with custom packages). Snowflake Tasks = SQL or Python on virtual warehouse compute, or on Snowflake-managed serverless compute for serverless tasks (data transformation, scheduling). They can be combined in pipelines.
Autoscaling is at compute pool level, not model level: You configure MIN_NODES and MAX_NODES on the SPCS compute pool — not on individual model deployments. One compute pool can serve multiple Model Serving services; scaling applies pool-wide.
Container Runtime for Notebooks vs SPCS for services: Container Runtime = ML-optimized environment for interactive Notebooks and ML Jobs. SPCS = container platform for production services and jobs. Container Runtime notebooks run ON SPCS but use simplified runtime APIs.
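
The warehouse-versus-SPCS traps above can be rehearsed with a small rule function. This is a study aid that encodes only the criteria stated in this section; choose_compute and its flag names are hypothetical.

```python
from typing import List, Tuple


def choose_compute(
    needs_gpu: bool = False,
    needs_http_endpoint: bool = False,
    needs_custom_container: bool = False,
    needs_arbitrary_packages: bool = False,
) -> Tuple[str, List[str]]:
    """Return ('SPCS' or 'WAREHOUSE', reasons) following the rules listed above."""
    reasons = []
    if needs_gpu:
        reasons.append("GPU workload")
    if needs_http_endpoint:
        reasons.append("real-time HTTP endpoint")
    if needs_custom_container:
        reasons.append("custom container")
    if needs_arbitrary_packages:
        reasons.append("arbitrary Python packages")
    if reasons:
        return ("SPCS", reasons)
    return ("WAREHOUSE", ["standard Snowpark ML workload"])


xgb_training = choose_compute()
churn_endpoint = choose_compute(needs_http_endpoint=True)
```

Standard XGBoost training with no special requirements resolves to the warehouse, while any HTTP endpoint requirement resolves to SPCS, which matches the first two traps in this section.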
