Snowflake ML Platform Overview
- Snowflake ML Stack (top to bottom)
- Feature Store (feature engineering) → Snowpark ML Modeling (training) → Model Registry (versioning) → Model Serving / Batch Inference (deployment) → ML Observability (monitoring). ML Jobs and Task Graphs orchestrate the full pipeline.
- snowflake-ml-python package
- Install via pip install snowflake-ml-python; all ML APIs (Feature Store, Model Registry, ML Jobs, preprocessing) are in this package. Minimum version 1.8.0 for Model Registry; 1.8.2+ for ML Jobs.
- Snowpark Session (entry point)
- from snowflake.snowpark import Session; session = Session.builder.configs(connection_params).create() — required before calling any ML API. All operations push down to Snowflake SQL.
- Warehouse compute vs SPCS
- Warehouse: SQL-based inference, Snowpark ML training for scikit-learn/XGBoost/LightGBM, Cortex ML functions — no container setup needed. SPCS: custom containers, GPU workloads, arbitrary Python packages, real-time HTTP serving endpoints.
- Container Runtime vs SPCS
- Container Runtime = preconfigured ML environment for Notebooks and ML Jobs (PyTorch, TF, scikit-learn). SPCS = general container runtime for services and long-running jobs. Container Runtime runs ON SPCS but has its own abstractions.
- Exam domain weights
- Domain 2 MLOps Infrastructure 24% + Domain 4 Pipeline Orchestration 22% = 46% of exam. Domain 1 Feature Engineering 20%, Domain 3 Model Serving 18%, Domain 5 Governance 16%.
Feature Store — Entities and Feature Views
- Create FeatureStore object
- from snowflake.ml.feature_store import FeatureStore, CreationMode; fs = FeatureStore(session=session, database='ML_DB', name='MY_FS', default_warehouse='ML_WH', creation_mode=CreationMode.CREATE_IF_NOT_EXIST)
- Define Entity
- from snowflake.ml.feature_store import Entity; entity = Entity(name='CUSTOMER', join_keys=['CUSTOMER_ID']) — an entity represents a business object (user, product, transaction) and its join key columns.
- Create Managed FeatureView
- from snowflake.ml.feature_store import FeatureView; fv = FeatureView(name='CUSTOMER_FEATURES', entities=[entity], feature_df=my_snowpark_df, timestamp_col='ts', refresh_freq='5 minutes') — refresh_freq creates a Dynamic Table underneath.
- Register FeatureView
- fs.register_feature_view(feature_view=fv, version='V1', block=True, overwrite=False). block=True waits for the initial data refresh before returning; overwrite=True replaces an existing version.
- Retrieve training dataset with point-in-time lookup
- fs.retrieve_feature_values(spine_df=label_df, features=[fv], spine_timestamp_col='EVENT_TS') — joins features at each row's EVENT_TS preventing data leakage; use for generating historical training datasets.
- External FeatureView (dbt / manual)
- FeatureView(name='EXT_FV', entities=[entity], feature_df=df, refresh_freq=None) — refresh_freq=None means Snowflake does NOT manage refresh; an external process (dbt, Airflow) maintains the underlying table.
- Online vs Offline serving
- Offline: fs.retrieve_feature_values() reads from Snowflake tables for batch training and scoring. Online: requires separate online store configuration for low-latency individual record lookup; NOT automatically enabled with standard Feature Store setup.
- List and discover feature views
- fs.list_feature_views(entity_name='CUSTOMER') — returns a DataFrame of all feature views for that entity. Use for feature discovery across teams.
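The point-in-time lookup described above can be illustrated without a Snowflake connection. The following is a minimal sketch in plain Python, not the Feature Store implementation: for each spine row it picks the latest feature value whose timestamp is at or before the row's event timestamp, which is what prevents future values from leaking into training data. The function name point_in_time_join and the sample rows are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_join(spine_rows, feature_rows):
    """Attach to each spine row the latest feature value at or before its timestamp.

    spine_rows:   list of (customer_id, event_ts)
    feature_rows: list of (customer_id, feature_ts, value)
    """
    # Group the feature history per entity key, sorted by timestamp.
    history = {}
    for key, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        timestamps, values = history.setdefault(key, ([], []))
        timestamps.append(ts)
        values.append(value)

    joined = []
    for key, event_ts in spine_rows:
        timestamps, values = history.get(key, ([], []))
        # Number of feature rows with feature_ts <= event_ts.
        idx = bisect_right(timestamps, event_ts)
        # idx == 0 means no feature value existed yet, so none may be used.
        joined.append((key, event_ts, values[idx - 1] if idx else None))
    return joined


# Hypothetical sample data.
features = [
    (1, datetime(2024, 1, 1), 10.0),
    (1, datetime(2024, 1, 10), 25.0),
    (2, datetime(2024, 1, 5), 7.5),
]
spine = [
    (1, datetime(2024, 1, 5)),    # sees only the Jan 1 value
    (1, datetime(2024, 1, 15)),   # sees the Jan 10 value
    (2, datetime(2024, 1, 1)),    # no feature value exists yet
]
result = point_in_time_join(spine, features)
```

A naive join on CUSTOMER_ID alone would give the first spine row the Jan 10 value, which did not exist at event time.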
Snowpark ML — Preprocessing and Model Training
- Distributed preprocessing (StandardScaler)
- from snowflake.ml.modeling.preprocessing import StandardScaler; scaler = StandardScaler(input_cols=['AGE','INCOME'], output_cols=['AGE_SCALED','INCOME_SCALED']); scaler.fit(train_df); scaled_df = scaler.transform(train_df)
- OneHotEncoder
- from snowflake.ml.modeling.preprocessing import OneHotEncoder; ohe = OneHotEncoder(input_cols=['CATEGORY'], output_cols=['CATEGORY_OHE'], drop_input_cols=True) — distributed across warehouse nodes, not pandas.
- Pipeline (chain transformers + model)
- from snowflake.ml.modeling.pipeline import Pipeline; pipeline = Pipeline(steps=[('scaler', StandardScaler(...)), ('ohe', OneHotEncoder(...)), ('model', XGBClassifier(input_cols=..., label_cols=..., output_cols=...))]) — fit and predict end-to-end.
- XGBoost training (Snowpark ML wrapper)
- from snowflake.ml.modeling.xgboost import XGBClassifier; model = XGBClassifier(input_cols=FEATURE_COLS, label_cols=['LABEL'], output_cols=['PREDICTION']); model.fit(train_df) — runs natively in Snowflake, no UDF creation needed.
- Distributed GridSearchCV
- from snowflake.ml.modeling.model_selection import GridSearchCV; cv = GridSearchCV(estimator=XGBClassifier(...), param_grid={'n_estimators':[100,200], 'max_depth':[3,5]}, n_jobs=-1); cv.fit(train_df) — parallelizes combinations across warehouse nodes via UDTFs.
- RandomizedSearchCV
- from snowflake.ml.modeling.model_selection import RandomizedSearchCV; cv = RandomizedSearchCV(estimator=model, param_distributions=param_grid, n_iter=20) — samples random combinations; faster than GridSearchCV for large search spaces.
- Key distinction: Snowpark ML vs local scikit-learn
- Snowpark ML preprocessing (StandardScaler, OneHotEncoder) runs distributed on warehouse compute — scales to millions of rows. It is NOT pandas; data stays in Snowflake and no rows are pulled to the client.
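To make the GridSearchCV versus RandomizedSearchCV trade-off concrete, here is a small standalone sketch that only enumerates candidates; it does not train anything or call Snowpark ML. The grid is the one from the GridSearchCV entry above; grid_candidates and random_candidates are hypothetical helper names, and sampling without replacement is a simplifying assumption.

```python
import itertools
import random

param_grid = {"n_estimators": [100, 200], "max_depth": [3, 5]}


def grid_candidates(grid):
    """Every combination of grid values: what an exhaustive search evaluates."""
    names = sorted(grid)
    combos = itertools.product(*(grid[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def random_candidates(grid, n_iter, seed=0):
    """At most n_iter combinations drawn at random without replacement."""
    candidates = grid_candidates(grid)
    rng = random.Random(seed)
    return rng.sample(candidates, min(n_iter, len(candidates)))


all_combos = grid_candidates(param_grid)             # 2 x 2 = 4 candidates
sampled = random_candidates(param_grid, n_iter=3)    # 3 of those 4
```

The exhaustive count is the product of the list lengths, so it grows multiplicatively with every added parameter, while the randomized budget stays fixed at n_iter.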
Model Registry — Register, Version, Deploy
- Get Model Registry
- from snowflake.ml.registry import Registry; registry = Registry(session=session, database_name='ML_DB', schema_name='MODELS') — Registry is a schema-level object; models are first-class Snowflake objects with full RBAC.
- Log a model
- model_ver = registry.log_model(my_model, model_name='CHURN_MODEL', version_name='V2', sample_input_data=X_train, metrics={'rmse': 0.12}, comment='XGB v2 with new features') — stores model artifact, metrics, and metadata.
- Retrieve model version
- mv = registry.get_model('CHURN_MODEL').version('V2') — retrieves a specific version for inference or inspection. mv.show_functions() lists callable methods (predict, predict_proba, etc.).
- Set default version
- registry.get_model('CHURN_MODEL').default = registry.get_model('CHURN_MODEL').version('V2') — the default version is used when no version is specified at inference time; enables fast rollback.
- Batch inference on warehouse
- predictions = mv.run(test_df, function_name='predict') — runs batch inference using warehouse compute; most cost-effective for large datasets with standard scikit-learn/XGBoost packages. Returns a Snowpark DataFrame.
- Deploy to SPCS for real-time serving
- mv.create_service(service_name='CHURN_SERVING', service_compute_pool='GPU_NV_S_POOL', image_repo='my_db.my_schema.my_repo', ingress_enabled=True). Snowflake automates the container image build; no Dockerfile is needed.
- Experiment Tracking — log metrics
- model_ver = registry.log_model(my_model, model_name='CHURN_MODEL', version_name='V2', metrics={'accuracy': 0.95, 'f1': 0.91}); model_ver.set_metric('auc', 0.98) — metrics dict passed to log_model or set_metric() on the returned ModelVersion; registry.log_model() is NOT a context manager.
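The default-version and metrics entries above combine naturally into a promotion gate. The sketch below is one possible pattern, not a Snowflake-provided helper: promote_if_better is a hypothetical function that expects an object shaped like the Registry from this section (get_model(), .version(), .default, get_metric(), version_name) and assumes both versions were logged with the metric being compared.

```python
def promote_if_better(registry, model_name, candidate_version, metric_name,
                      higher_is_better=True):
    """Make candidate_version the default only if it beats the current default.

    Returns the name of the version that is the default afterwards.
    """
    model = registry.get_model(model_name)
    current = model.default
    candidate = model.version(candidate_version)

    # Assumption: both versions carry the metric (e.g. set via log_model).
    current_score = current.get_metric(metric_name)
    candidate_score = candidate.get_metric(metric_name)

    if higher_is_better:
        improved = candidate_score > current_score
    else:
        improved = candidate_score < current_score

    if improved:
        # Switching the default is also how a rollback would be done.
        model.default = candidate
        return candidate.version_name
    return current.version_name
```

With a live session this might be called as promote_if_better(registry, 'CHURN_MODEL', 'V2', 'f1'); for an error metric such as RMSE, pass higher_is_better=False.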
SPCS Compute Pools and Model Serving
- Create CPU compute pool
- CREATE COMPUTE POOL IF NOT EXISTS MY_CPU_POOL MIN_NODES = 1 MAX_NODES = 5 INSTANCE_FAMILY = CPU_X64_M AUTO_RESUME = TRUE AUTO_SUSPEND_SECS = 300;
- Create GPU compute pool (A10G)
- CREATE COMPUTE POOL IF NOT EXISTS MY_GPU_POOL MIN_NODES = 1 MAX_NODES = 3 INSTANCE_FAMILY = GPU_NV_S AUTO_RESUME = TRUE AUTO_SUSPEND_SECS = 600; — GPU_NV_S = 1x NVIDIA A10G; GPU_NV_M = 4x A10G.
- A100 GPU pool (large models)
- CREATE COMPUTE POOL IF NOT EXISTS MY_A100_POOL MIN_NODES = 1 MAX_NODES = 2 INSTANCE_FAMILY = GPU_NV_L; — GPU_NV_L provides A100 GPUs for large model training, fine-tuning, or memory-intensive inference.
- SPCS instance family guide
- CPU_X64_XS/S/M/L/XL for CPU workloads (ordered smallest to largest). GPU_NV_S = 1x A10G, GPU_NV_M = 4x A10G, GPU_NV_L = 8x A100. Use SHOW COMPUTE POOL INSTANCE FAMILIES for current list.
- SPCS autoscaling behavior
- Autoscaling is configured at the compute pool level (MIN_NODES / MAX_NODES). Snowflake scales nodes based on service demand automatically. You configure the range — Snowflake handles the scaling.
- Batch inference on SPCS (job vs service)
- SPCS job = finite execution (runs to completion, used for batch inference). SPCS service = long-running process (used for Model Serving HTTP endpoints). Know the distinction — they differ in lifecycle.
- Model Serving endpoint (real-time)
- Model Serving deploys models from Model Registry as managed HTTPS endpoints on SPCS with autoscaling. Snowflake automates container image building and endpoint setup — no Docker management required.
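The "you configure the range, Snowflake handles the scaling" idea can be pictured as a clamp. This is only an illustration of what MIN_NODES and MAX_NODES bound; it is not Snowflake's scaling algorithm, and target_nodes is a hypothetical name.

```python
def target_nodes(nodes_needed, min_nodes, max_nodes):
    """Clamp the node count that services demand to the pool's configured range."""
    if min_nodes > max_nodes:
        raise ValueError("MIN_NODES must not exceed MAX_NODES")
    return max(min_nodes, min(nodes_needed, max_nodes))


# Pool configured like MY_CPU_POOL above: MIN_NODES = 1, MAX_NODES = 5.
idle = target_nodes(0, 1, 5)     # never drops below MIN_NODES while running
steady = target_nodes(3, 1, 5)   # demand inside the range is met as-is
burst = target_nodes(9, 1, 5)    # demand above MAX_NODES is capped
```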
ML Jobs — Pipeline Orchestration
- @remote decorator (function dispatch)
- from snowflake.ml.jobs import remote; @remote('MY_COMPUTE_POOL', stage_name='payload_stage', session=session) def train_model(table: str): ... — serializes function + deps, uploads to stage, runs on Container Runtime.
- Submit script as ML Job
- from snowflake.ml.jobs import submit_file; job = submit_file('train.py', 'MY_COMPUTE_POOL', stage_name='payload_stage', args=['--data', 'my_table'], session=session) — runs a Python script on Container Runtime.
- Submit directory as ML Job
- from snowflake.ml.jobs import submit_directory; job = submit_directory('./ml_project/', 'MY_COMPUTE_POOL', entrypoint='train.py', stage_name='payload_stage', session=session) — uploads entire project directory.
- Monitor job status and logs
- print(job.status) # PENDING | RUNNING | FAILED | DONE; print(job.get_logs()) — poll status; job.result() blocks until completion and returns the return value.
- Pin Container Runtime version
- @remote('MY_POOL', stage_name='stage', runtime_environment='2.3.0', session=session) def train(): ... — pins to a specific Container Runtime version for reproducibility; use latest for security updates.
- Custom pip packages in ML Jobs
- @remote('MY_POOL', stage_name='stage', pip_requirements=['custom-lib==1.0.*'], external_access_integrations=['PYPI_EAI'], session=session) def train(): ... — install packages not in standard Container Runtime.
- ML Jobs vs Scheduled Notebooks
- ML Jobs: production-grade, multi-step dependencies, external IDE development, Container Runtime execution, integrates with Task Graphs. Scheduled Notebooks: simpler, single-notebook scope, good for exploratory-to-production transitions.
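The status-polling pattern from the monitoring entry above can be wrapped in a small helper. This is a sketch under assumptions: wait_for_job is a hypothetical function, it relies only on the job.status and job.get_logs() members shown in this section, and it treats just DONE and FAILED as terminal (extend TERMINAL_STATES if your package version reports other end states).

```python
import time

# Assumption: only the terminal states listed in this section.
TERMINAL_STATES = {"DONE", "FAILED"}


def wait_for_job(job, poll_seconds=5.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll job.status until it is terminal, raising on failure or timeout."""
    waited = 0.0
    while True:
        status = job.status
        if status in TERMINAL_STATES:
            break
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still {status} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds

    if status == "FAILED":
        # Surface the container logs so the failure is visible to the caller.
        raise RuntimeError("ML Job failed:\n" + job.get_logs())
    return status
```

The sleep parameter exists only so the loop can be exercised without real waiting; when the return value is all that is needed, job.result() already blocks until completion.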
Task Graphs (DAGs) and CI/CD Automation
- Create root task (with schedule)
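- CREATE TASK root_task WAREHOUSE = ML_WH SCHEDULE = 'USING CRON 0 2 * * * UTC' AS CALL ingest_data(); ALTER TASK root_task RESUME; Only the root task has a SCHEDULE; child tasks use AFTER instead. Resume the root last: it must stay suspended while child tasks are created or modified, so run its RESUME only after the children exist and have been resumed.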
- Chain child task (DAG dependency)
- CREATE TASK train_task WAREHOUSE = ML_WH AFTER root_task AS CALL train_model(); CREATE TASK deploy_task WAREHOUSE = ML_WH AFTER train_task AS CALL deploy_model(); ALTER TASK train_task RESUME; ALTER TASK deploy_task RESUME;
- Task with WHEN clause (conditional)
- CREATE TASK process_task WAREHOUSE = ML_WH SCHEDULE = '5 MINUTES' WHEN SYSTEM$STREAM_HAS_DATA('feature_stream') AS INSERT INTO features SELECT * FROM feature_stream; — runs only when stream has unconsumed data.
- Serverless task (auto-scale)
- CREATE TASK serverless_task USER_TASK_MANAGED_INITIAL_WAREHOUSE_SIZE = 'MEDIUM' SCHEDULE = '10 MINUTES' AS CALL retrain_if_drift(); — Snowflake-managed compute, scales to workload size, no warehouse needed.
- Git integration for Notebooks / SQL
- CREATE GIT REPOSITORY my_repo API_INTEGRATION = my_github_integration ORIGIN = 'https://github.com/org/repo.git'; Connects a code repository and enables EXECUTE IMMEDIATE FROM @my_repo/branches/main/scripts/deploy.sql in pipelines (repository stage paths start with branches/, tags/, or commits/).
- Snowflake CLI GitHub Actions step
- uses: snowflakedb/snowflake-cli-action@v1 — installs Snowflake CLI in GitHub Actions runner; follow with snow sql -f deploy.sql or snow snowpark deploy for automated deployments triggered by Git events.
- External orchestrators (Airflow integration)
- Airflow triggers ML Jobs or Tasks via Snowflake connection (SnowflakeOperator / SnowflakeSqlApiOperator). Airflow runs OUTSIDE Snowflake — it triggers Snowflake operations. It does not replace native Task Graphs.
- Task retry configuration
- CREATE TASK my_task ... TASK_AUTO_RETRY_ATTEMPTS = 3 SUSPEND_TASK_AFTER_NUM_FAILURES = 5 AS ...; — automatically retries on transient failures before marking as failed.
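Because a root task has to be suspended while its children are created or changed, a task graph is normally resumed from the leaves up, with the root last. The sketch below derives that order for the three-task chain used in this section; resume_statements is a hypothetical helper and the output is just SQL text to run through whatever client you use.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it runs AFTER (its predecessors).
task_graph = {
    "root_task": set(),
    "train_task": {"root_task"},
    "deploy_task": {"train_task"},
}


def resume_statements(graph):
    """Return ALTER TASK ... RESUME statements, children first and root last."""
    # static_order() yields predecessors before dependents (root first),
    # so reversing it resumes the most downstream tasks first.
    run_order = list(TopologicalSorter(graph).static_order())
    return [f"ALTER TASK {name} RESUME;" for name in reversed(run_order)]


statements = resume_statements(task_graph)
```

Snowflake also offers SYSTEM$TASK_DEPENDENTS_ENABLE('root_task') to resume a root and all of its dependents in one call; the explicit ordering above mainly helps when generating deployment scripts.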
Dynamic Tables vs Streams + Tasks
- Dynamic Table for feature pipeline (declarative)
- CREATE DYNAMIC TABLE customer_features TARGET_LAG = '10 minutes' WAREHOUSE = ML_WH AS SELECT customer_id, AVG(amount) avg_spend, COUNT(*) txn_count FROM orders GROUP BY customer_id; — Snowflake handles refresh timing automatically.
- Dynamic Table chaining (multi-step)
- CREATE DYNAMIC TABLE enriched_features TARGET_LAG = DOWNSTREAM WAREHOUSE = ML_WH AS SELECT f.*, c.region FROM customer_features f JOIN customers c ON f.customer_id = c.id; — DOWNSTREAM inherits lag from consuming object.
- Streams + Tasks (imperative pipeline)
- CREATE STREAM orders_stream ON TABLE orders; CREATE TASK process_stream WAREHOUSE = ML_WH SCHEDULE = '1 MINUTES' WHEN SYSTEM$STREAM_HAS_DATA('orders_stream') AS MERGE INTO features USING orders_stream ON ...; — supports MERGE, stored procedures, custom error handling.
- When to choose Dynamic Tables
- Use Dynamic Tables when: pipeline is pure SQL SELECT transformations, you want declarative refresh management, and no MERGE/stored procedure logic is needed. Simpler to maintain and monitor.
- When to choose Streams + Tasks
- Use Streams + Tasks when: MERGE operations are needed (SCD Type 2, upserts), stored procedures with custom logic, external function calls, multi-target writes, or fine-grained retry/error handling is required.
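The reason MERGE pushes a pipeline toward Streams + Tasks is that it changes existing rows in place instead of recomputing a SELECT. The following sketch mimics upsert semantics on an in-memory table; it is an analogy rather than Snowflake's MERGE, and apply_changes, the key name, and the sample rows are all hypothetical.

```python
def apply_changes(features, change_rows, key="customer_id"):
    """Upsert change rows into a keyed feature table, MERGE-style.

    Matched keys are updated; unmatched keys are inserted.
    The input table is left untouched and a new one is returned.
    """
    merged = {k: dict(v) for k, v in features.items()}
    for row in change_rows:
        merged.setdefault(row[key], {}).update(row)
    return merged


features = {1: {"customer_id": 1, "txn_count": 3}}
stream_batch = [
    {"customer_id": 1, "txn_count": 4},   # matched key: update
    {"customer_id": 2, "txn_count": 1},   # unmatched key: insert
]
updated = apply_changes(features, stream_batch)
```

A Dynamic Table expresses only the end state as a query, so row-level logic like this (or SCD Type 2 history) has no place to live there.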
ML Observability — Drift and Performance Monitoring
- Create Model Monitor (SQL)
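- CREATE MODEL MONITOR my_monitor WITH MODEL = ML_DB.MODELS.CHURN_MODEL VERSION = 'V2' FUNCTION = 'predict' SOURCE = ML_DB.PROD.PREDICTIONS BASELINE = ML_DB.PROD.TRAINING_DATA TIMESTAMP_COLUMN = SCORED_AT PREDICTION_SCORE_COLUMNS = (PREDICTION) ACTUAL_SCORE_COLUMNS = (LABEL) ID_COLUMNS = (CUSTOMER_ID) WAREHOUSE = ML_WH REFRESH_INTERVAL = '1 hour' AGGREGATION_WINDOW = '1 day'; Score columns apply to regression; for binary classification use PREDICTION_CLASS_COLUMNS and ACTUAL_CLASS_COLUMNS. PREDICTION, LABEL, and CUSTOMER_ID are placeholder column names for the source table.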
- Query drift metrics
- SELECT * FROM TABLE(MODEL_MONITOR_DRIFT_METRIC('MY_MONITOR', 'DIFFERENCE_OF_MEANS', 'FEATURE_COL', 'DAY', '2026-06-01', '2026-06-13', {})); — Difference of Means is the primary statistical method for detecting data drift.
- Query performance metrics
- SELECT * FROM TABLE(MODEL_MONITOR_PERFORMANCE_METRIC('MY_MONITOR', 'RMSE', 'DAY', '2026-06-01', '2026-06-13', {})); — tracks model accuracy over time. Supported for regression and binary classification models.
- Suspend / Resume monitor
- ALTER MODEL MONITOR my_monitor SUSPEND; ALTER MODEL MONITOR my_monitor RESUME; — pause monitoring during planned maintenance windows without deleting the monitor configuration.
- Data drift vs concept drift
- Data drift = input feature distribution changes (e.g., customer age skews younger). Concept drift = relationship between inputs and outputs changes (e.g., model predictions are wrong even with unchanged inputs). Both degrade model performance differently.
- ML Observability supported model types
- Currently supports REGRESSION and binary CLASSIFICATION models only. Multi-class classification, ranking, and other model types are NOT yet supported for automated ML Observability monitors.
- Data Metric Functions (DMFs) vs ML Observability
- DMFs (freshness, completeness, accuracy) monitor DATA QUALITY on feature tables — they check if input data is valid. ML Observability monitors MODEL QUALITY — drift and prediction performance after deployment.
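As a worked example of data drift, the sketch below computes a plain difference of means between a baseline sample and a recent window for one feature. It shows the idea behind the DIFFERENCE_OF_MEANS metric only; any normalization Snowflake applies is not stated in this section and is not reproduced here. The function names, the threshold, and the ages are made up.

```python
from statistics import mean


def difference_of_means(baseline, current):
    """Signed shift of the current window's mean relative to the baseline mean."""
    return mean(current) - mean(baseline)


def is_drifting(baseline, current, threshold):
    """Flag drift when the absolute shift exceeds a caller-chosen threshold."""
    return abs(difference_of_means(baseline, current)) > threshold


# Customer age skewing younger, as in the data drift example above.
baseline_age = [40, 42, 38, 44, 36]   # mean 40
current_age = [30, 32, 28, 34, 26]    # mean 30
shift = difference_of_means(baseline_age, current_age)   # -10
```

This catches a change in the inputs only. Concept drift leaves these means unchanged and shows up instead in performance metrics such as RMSE once actuals arrive.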
Governance, RBAC, and ML Lineage
- RBAC pattern for ML artifacts (access + functional roles)
- CREATE ROLE model_inference_access; GRANT USAGE ON MODEL CHURN_MODEL TO ROLE model_inference_access; GRANT ROLE model_inference_access TO ROLE junior_data_scientist; — access roles hold object permissions; functional roles map to job functions.
- Dynamic data masking on feature tables
- CREATE MASKING POLICY ssn_mask AS (val STRING) RETURNS STRING -> CASE WHEN IS_ROLE_IN_SESSION('DATA_STEWARD') THEN val ELSE '***-**-' || RIGHT(val,4) END; ALTER TABLE features MODIFY COLUMN SSN SET MASKING POLICY ssn_mask;
- ML Lineage overview
- ML Lineage provides end-to-end traceability: source tables → Feature Store feature views → training datasets → registered model versions. Enables audit trails, reproducibility, and compliance without manual documentation.
- ML Lineage vs ML Observability distinction
- ML Lineage = WHERE did data come from (provenance, reproducibility, compliance). ML Observability = HOW is the model performing (production drift, accuracy degradation). They are separate features solving different problems.
- ML Explainability (Shapley values)
- mv.run(test_df, function_name='explain') or SHAP values via snowflake.ml.explainability — returns per-feature contribution scores for individual predictions; required for model interpretability and governance in regulated industries.
- Snowflake Horizon — Trust Center
- Snowflake Horizon = integrated governance suite: universal data discovery (catalog), Trust Center (security posture / misconfiguration detection), data classification for PII/sensitive data, and Compliance Center for regulatory readiness.
- Cost management: SPCS vs warehouse billing
- Virtual warehouse costs = Snowflake credits per compute-hour (billed per second after 60s minimum). SPCS compute pool costs = separate node-hour billing for compute pools; GPU pools (GPU_NV_S/M/L) have distinct pricing from CPU pools.
- Audit ML access with ACCESS_HISTORY
- SELECT query_id, user_name, direct_objects_accessed FROM SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY WHERE query_start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP) ORDER BY query_start_time DESC; — tracks who accessed which model artifacts.
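The CASE expression in the ssn_mask policy above can be read as the following plain Python function. It is a mirror of the policy logic for one value, useful for reasoning about what each role sees; mask_ssn and the role sets are illustrative, and real enforcement happens inside Snowflake at query time.

```python
def mask_ssn(val, roles_in_session):
    """Mirror the CASE expression of the ssn_mask policy for one value."""
    if "DATA_STEWARD" in roles_in_session:
        return val                      # privileged role sees the raw value
    if val is None:
        return None                     # concatenating with NULL yields NULL in SQL
    return "***-**-" + val[-4:]         # same as '***-**-' || RIGHT(val, 4)


steward_view = mask_ssn("123-45-6789", {"DATA_STEWARD"})
analyst_view = mask_ssn("123-45-6789", {"JUNIOR_DATA_SCIENTIST"})
```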
Cortex ML and LLM Functions
- Cortex ML Forecasting
- CREATE SNOWFLAKE.ML.FORECAST sales_forecast (INPUT_DATA => TABLE(sales_data), TIMESTAMP_COLNAME => 'DATE_COL', TARGET_COLNAME => 'SALES_COL'); CALL sales_forecast!FORECAST(FORECASTING_PERIODS => 7); — object-based API under SNOWFLAKE.ML namespace; no Python model training needed.
- Cortex ML Anomaly Detection
- CREATE SNOWFLAKE.ML.ANOMALY_DETECTION my_detector (INPUT_DATA => TABLE(metrics), TIMESTAMP_COLNAME => 'ts', TARGET_COLNAME => 'value', LABEL_COLNAME => 'label'); CALL my_detector!DETECT_ANOMALIES(TABLE(new_metrics), 'ts', 'value', ...);
- Cortex ML Classification
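- CREATE SNOWFLAKE.ML.CLASSIFICATION my_classifier (INPUT_DATA => TABLE(training_data), TARGET_COLNAME => 'LABEL'); SELECT my_classifier!PREDICT(INPUT_DATA => OBJECT_CONSTRUCT(*)) AS prediction FROM test_data; SQL-callable classification without Python model training. PREDICT is evaluated per row inside a SELECT rather than invoked with CALL.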
- Cortex ML vs Custom Snowpark ML models
- Cortex ML = pre-built SQL-callable functions (Forecast, Anomaly, Classification); no training code, limited to supported task types. Custom Snowpark ML = full control with scikit-learn/XGBoost/PyTorch; requires ML engineering effort.
- COMPLETE (LLM text generation)
- SELECT SNOWFLAKE.CORTEX.COMPLETE('mistral-7b', 'Summarize this review: ' || review_text) AS summary FROM product_reviews; — runs inside Snowflake, data never leaves the account.
- SENTIMENT and SUMMARIZE
- SELECT SNOWFLAKE.CORTEX.SENTIMENT(feedback) AS score, SNOWFLAKE.CORTEX.SUMMARIZE(doc_text) AS summary FROM feedback_table; — SENTIMENT returns FLOAT -1.0 to 1.0; SUMMARIZE returns a text summary. Neither requires a model parameter.
Key Exam Distinctions and Common Traps
- Model Serving ONLY runs on SPCS
- HTTP inference endpoints require SPCS — they cannot run on standard virtual warehouses. If a question asks about real-time / sub-second predictions with HTTP endpoint, the answer always involves SPCS Model Serving.
- Warehouse IS sufficient for many training workloads
- scikit-learn, XGBoost, and LightGBM training via Snowpark ML works on standard warehouse compute — you do NOT always need SPCS. SPCS is needed for custom containers, GPU, arbitrary packages, or real-time HTTP endpoints.
- Feature Store online serving requires extra setup
- Online feature serving (low-latency individual lookups) requires additional online store configuration — it is NOT automatically enabled when you create a feature view. Standard feature views support offline retrieval only.
- Snowflake CLI GitHub Actions run in GitHub, not Snowflake
- The snowflakedb/snowflake-cli-action@v1 GitHub Action installs and runs Snowflake CLI inside the GitHub Actions runner — it triggers Snowflake operations from GitHub but is NOT a Snowflake Task running inside Snowflake.
- ML Jobs run on Container Runtime; Tasks run on warehouses
- ML Jobs = Python workloads on SPCS Container Runtime (ML training, batch inference with custom packages). Snowflake Tasks = SQL or Python on virtual warehouse compute (data transformation, scheduling). They can be combined in pipelines.
- Autoscaling is at compute pool level, not model level
- You configure MIN_NODES and MAX_NODES on the SPCS compute pool — not on individual model deployments. One compute pool can serve multiple Model Serving services; scaling applies pool-wide.
- Container Runtime for Notebooks vs SPCS for services
- Container Runtime = ML-optimized environment for interactive Notebooks and ML Jobs. SPCS = container platform for production services and jobs. Container Runtime notebooks run ON SPCS but use simplified runtime APIs.