CertPrepNow
SnowflakeCOF-C0362 concepts

COF-C03 Cheat Sheet

Quick reference for the SnowPro Core Certification exam.

Three-Layer Architecture

Cloud Services Layer: - Authentication & authorization - Query parsing & optimization - Metadata management - Infrastructure management Compute Layer (Virtual Warehouses): - CPU, memory, local SSD cache - Executes queries and DML Storage Layer: - Centralized cloud object storage - Micro-partitions (columnar, compressed)
Snowflake's three layers scale independently — storage costs are separate from compute costs, and multiple warehouses can query the same storage simultaneously.
Micro-partitions: - 50–500 MB compressed columnar units - Immutable (never overwritten in-place) - Metadata tracks min/max values per column - Enable partition pruning without manual indexes
Snowflake automatically creates and manages micro-partitions. The metadata cache in cloud services enables instant COUNT, MIN, MAX without scanning data.
Snowflake Editions and key feature gates: - Standard: 1-day max Time Travel, single-cluster WH - Enterprise: 90-day Time Travel, multi-cluster WH, materialized views, dynamic data masking - Business Critical: customer-managed keys (Tri-Secret Secure), private connectivity (PrivateLink) - VPS: Dedicated Snowflake environment
Edition determines which features are available. Multi-cluster warehouses, 90-day Time Travel, and materialized views all require Enterprise or higher.
Table types and data protection: - Permanent: Time Travel 0–90 days (Enterprise+), 7-day Fail-safe - Transient: Time Travel 0–1 day, 0-day Fail-safe - Temporary: Time Travel 0–1 day, 0-day Fail-safe, session-scoped - External: No Time Travel, no Fail-safe, read-only
Transient and temporary tables have ZERO Fail-safe days — once Time Travel expires, data is permanently lost. Use transient tables for staging or regenerable data.
CREATE TABLE my_table ( id NUMBER, name STRING ) DATA_RETENTION_TIME_IN_DAYS = 30;
Set Time Travel retention at table creation. Enterprise+ is required for values greater than 1 day. Alter any time with ALTER TABLE ... SET DATA_RETENTION_TIME_IN_DAYS.
Data Encryption: - At rest: AES-256 encryption, always-on, not configurable - In transit: TLS 1.2 or higher, always-on - Key management: Standard/Enterprise: Snowflake-managed keys Business Critical: Customer-managed keys (CMK) Tri-Secret Secure: Dual key model — Snowflake key + customer key BOTH required to decrypt; data is inaccessible if customer revokes key
All data in Snowflake is encrypted at rest and in transit by default — no configuration needed. Tri-Secret Secure (Business Critical+) ensures Snowflake itself cannot decrypt data without the customer's key.

Virtual Warehouses

CREATE WAREHOUSE my_wh WITH WAREHOUSE_SIZE = 'MEDIUM' AUTO_SUSPEND = 300 AUTO_RESUME = TRUE INITIALLY_SUSPENDED = TRUE;
Create a warehouse that auto-suspends after 5 minutes of inactivity and auto-resumes on the next query. INITIALLY_SUSPENDED avoids billing before first use.
ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'LARGE'; ALTER WAREHOUSE my_wh SUSPEND; ALTER WAREHOUSE my_wh RESUME;
Resize a warehouse online (queries waiting in queue will use the new size). Manually suspend or resume to control costs.
Warehouse credit consumption (per hour, running): XS = 1 S = 2 M = 4 L = 8 XL = 16 2XL = 32 3XL = 64 4XL = 128 5XL = 256 6XL = 512
Credits double with each size increase. A 6XL costs 512x more than an XS per hour. Billing starts when the warehouse resumes and stops 60 seconds after it suspends.
-- Multi-cluster warehouse (Enterprise+ required) CREATE WAREHOUSE my_wh WITH WAREHOUSE_SIZE = 'MEDIUM' MIN_CLUSTER_COUNT = 1 MAX_CLUSTER_COUNT = 5 SCALING_POLICY = 'STANDARD'; -- or 'ECONOMY'
Multi-cluster warehouses scale OUT (add clusters) for concurrency. STANDARD adds clusters immediately; ECONOMY waits until queue will last 6+ minutes.
Scale UP → bigger warehouse size → complex, slow queries Scale OUT → more clusters (MCW) → many concurrent users Scaling policies: STANDARD: adds clusters immediately (performance priority) ECONOMY: waits ~6 min queue threshold (cost priority)
Choose scale-up to speed up a single heavy query. Choose scale-out to handle more simultaneous users. Multi-cluster warehouses require Enterprise edition.

Time Travel and Fail-safe

-- Query data at a specific timestamp SELECT * FROM orders AT(TIMESTAMP => '2026-06-01 09:00:00'::TIMESTAMP_TZ); -- Query data N seconds ago SELECT * FROM orders AT(OFFSET => -60 * 60); -- 1 hour ago -- Query data before a specific statement ran SELECT * FROM orders BEFORE(STATEMENT => '<query_id>');
Three Time Travel reference modes: TIMESTAMP, OFFSET (seconds from now), and STATEMENT (before a specific query ID ran). Use AT for 'as of', BEFORE for 'just before'.
-- Restore a dropped table, schema, or database UNDROP TABLE my_table; UNDROP SCHEMA my_schema; UNDROP DATABASE my_db; -- Clone a table to its state 1 hour ago CREATE TABLE orders_restored CLONE orders AT(OFFSET => -3600);
UNDROP restores the most recently dropped object if within the Time Travel retention window. Cloning with AT creates an independent copy at a past point in time.
Time Travel → USER-accessible, configurable (0–90 days) → Accessed via AT/BEFORE and UNDROP Fail-safe → SNOWFLAKE-ONLY, non-configurable 7 days → Only Snowflake Support can recover data → Begins AFTER Time Travel period ends → NOT available for transient or temporary tables
The most-tested distinction on the exam: if YOU need to recover data, use Time Travel. Fail-safe is a last-resort Snowflake-internal recovery mechanism.
ALTER TABLE my_table SET DATA_RETENTION_TIME_IN_DAYS = 0;
Setting retention to 0 disables Time Travel for that table. This also means 0-day Fail-safe for transient/temporary tables — data is permanently lost on drop.

Zero-Copy Cloning

-- Clone a database, schema, or table CREATE DATABASE dev_db CLONE prod_db; CREATE SCHEMA dev_schema CLONE prod_schema; CREATE TABLE dev_orders CLONE prod_orders; -- Clone at a past point in time CREATE TABLE orders_backup CLONE orders BEFORE(STATEMENT => '<query_id>');
Zero-copy cloning creates a metadata-only pointer to existing micro-partitions. No data is duplicated — storage cost is zero until the clone diverges from the source.
Cloning rules: - Initial storage cost = 0 bytes (metadata only) - Storage grows only as clone data diverges from source - Clones are independent objects — DML on clone does NOT affect source and vice versa - Clones inherit Time Travel history of the source - Can clone: databases, schemas, tables, streams, tasks, sequences, file formats, stages
Cloning is commonly used for dev/test environments, protecting data before bulk DML, or restoring dropped tables. The exam tests that initial storage cost is zero.
Cloning and data protection interaction: - A clone of a PERMANENT table inherits 7-day Fail-safe - A clone of a TRANSIENT table has 0-day Fail-safe - Cloning does NOT reset the Time Travel window — the clone can travel back to the source's history - TRUNCATE TABLE on source does NOT affect the clone - DROP TABLE on source does NOT affect the clone
Clones are fully independent after creation. The exam tests that cloning a transient table still yields a transient clone with 0-day Fail-safe, not a protected permanent copy.

Data Loading: Stages and COPY INTO

Stage types: @~ User stage (per-user, cannot be dropped) @%my_table Table stage (per-table, cannot be dropped) @my_stage Named internal stage (configurable) @my_ext_stage External stage (S3 / GCS / Azure Blob)
User and table stages cannot be altered or dropped. Named internal stages are the most flexible. External stages reference your own cloud storage.
-- Create a named internal stage CREATE STAGE my_stage FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = ',' SKIP_HEADER = 1); -- Create an external stage (S3) CREATE STAGE my_s3_stage URL = 's3://my-bucket/path/' STORAGE_INTEGRATION = my_s3_integration;
Use STORAGE_INTEGRATION objects instead of embedding credentials in stage definitions. Storage integrations store the cloud provider IAM configuration securely.
-- Bulk load from a stage COPY INTO orders FROM @my_stage/orders/ FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1) ON_ERROR = 'CONTINUE' PURGE = TRUE;
COPY INTO is the primary bulk load command. PURGE removes files from the stage after successful load. ON_ERROR = CONTINUE loads valid rows and skips error rows.
-- Validate files without loading any data COPY INTO orders FROM @my_stage/ FILE_FORMAT = (TYPE = CSV) VALIDATION_MODE = 'RETURN_ALL_ERRORS';
VALIDATION_MODE performs a dry run — returns errors but loads ZERO rows. Use before a full load to catch format mismatches early.
-- ON_ERROR options: -- CONTINUE → skip bad rows, load rest -- SKIP_FILE → skip whole file on any error -- ABORT_STATEMENT → halt entire load (default) -- SKIP_FILE_<n> → skip file after n errors -- SKIP_FILE_<n>% → skip file after n% error rate -- Force reload of previously loaded files COPY INTO orders FROM @my_stage FORCE = TRUE;
COPY INTO skips files already loaded (tracked by load history for 64 days). Use FORCE = TRUE to reload files regardless of prior load status.
-- PUT: upload local file to internal stage (SnowSQL only) PUT file:///tmp/orders.csv @my_stage; -- GET: download file from internal stage to local (SnowSQL only) GET @my_stage/orders.csv file:///tmp/;
PUT and GET only work from SnowSQL CLI or JDBC/ODBC connections. They DO NOT work in the Snowsight web UI or via the REST API.
-- Unload data to a stage COPY INTO @my_stage/exports/orders_ FROM (SELECT order_id, amount, region FROM orders) FILE_FORMAT = (TYPE = PARQUET) HEADER = TRUE OVERWRITE = TRUE;
COPY INTO can also unload table data to a stage. Use a SELECT subquery to filter or transform. Output files are automatically partitioned into 250 MB chunks.
-- Transformation during COPY INTO load COPY INTO orders_clean (order_id, region, amount_usd) FROM ( SELECT $1::INTEGER, -- reorder columns UPPER($3::STRING), -- transform on load $2::FLOAT * 1.1 -- compute new value FROM @my_stage/orders.csv ) FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
COPY INTO supports inline transformation using a SELECT subquery. Supports column reordering, casting, and simple expressions. Complex joins and aggregations are NOT supported during load.

Snowpipe, Connectivity, and Load History

-- Create a Snowpipe (auto-ingest from S3) CREATE PIPE my_pipe AUTO_INGEST = TRUE AS COPY INTO orders FROM @my_s3_stage FILE_FORMAT = (TYPE = JSON);
Snowpipe with AUTO_INGEST uses S3 event notifications (SNS/SQS) or Azure Event Grid or GCS Pub/Sub to trigger loads when new files arrive — no warehouse required.
Snowpipe vs COPY INTO: COPY INTO → requires active warehouse, warehouse credits, batch-triggered, best for large scheduled loads Snowpipe → serverless compute, per-GB billing (since Dec 2025), event-triggered, best for near-real-time ingestion
Snowpipe billing changed to simplified per-GB pricing (0.0037 credits per GB ingested) in December 2025. COPY INTO still uses virtual warehouse credits billed per credit-hour.
-- Snowpipe REST API call (when AUTO_INGEST is not used) SELECT SYSTEM$PIPE_STATUS('my_pipe'); -- Check Snowpipe load history (available 14 days) SELECT * FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY( TABLE_NAME => 'ORDERS', START_TIME => DATEADD(HOUR, -1, CURRENT_TIMESTAMP) ));
Use SYSTEM$PIPE_STATUS to check if a pipe is running and its queue depth. COPY_HISTORY in INFORMATION_SCHEMA tracks all loads (both COPY INTO and Snowpipe) for the last 14 days.
Snowflake connectivity options: Python Connector → native Snowflake Python library Spark Connector → read/write Snowflake from Apache Spark Kafka Connector → stream Kafka topics into Snowflake JDBC Driver → Java/JVM applications ODBC Driver → BI tools (Tableau, Power BI, Excel) Node.js Driver → JavaScript/Node applications .NET Driver → .NET/C# applications SnowSQL CLI → interactive SQL + scripting REST API → programmatic access (no client library)
Know which connector to use for each scenario: Spark for large-scale ETL, Kafka for streaming ingestion, JDBC/ODBC for BI tools, and Python Connector for data science workflows.

Semi-Structured Data (VARIANT)

-- VARIANT column stores JSON, Avro, Parquet, ORC, XML CREATE TABLE events ( raw_data VARIANT ); -- Insert JSON INSERT INTO events SELECT PARSE_JSON('{"user":"alice","clicks":5}'); -- Dot notation to access nested fields SELECT raw_data:user::STRING, raw_data:clicks::INTEGER FROM events;
VARIANT stores up to 16 MB of semi-structured data per value. Access nested keys with colon notation (raw_data:key) and cast with :: to convert to a SQL type.
-- Flatten a JSON array into rows SELECT f.value:name::STRING AS product_name FROM orders, LATERAL FLATTEN(input => orders.items) f;
LATERAL FLATTEN explodes arrays or objects within a VARIANT column into individual rows. Each element becomes a row with a VALUE column containing the array element.
-- Load JSON with schema inference COPY INTO events FROM @my_stage/events.json FILE_FORMAT = (TYPE = JSON) MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
MATCH_BY_COLUMN_NAME maps source JSON fields to table columns by name (case-insensitive). Without it, JSON is loaded into a VARIANT column by default.

Performance Optimization and Caching

Three caching layers: 1. Metadata cache (cloud services) → Instant COUNT(*), MIN, MAX, DISTINCT without warehouse → Always on, no configuration needed 2. Result cache (cloud services) → Reuses exact query result for up to 24 hours → FREE — zero warehouse compute cost → Invalidated by: data change, role change, query text change, certain param changes 3. Warehouse (data) cache (local SSD on compute nodes) → Caches raw micro-partition data on warehouse nodes → Requires a RUNNING warehouse → Lost when warehouse SUSPENDS
The three caches serve different scenarios. Result cache is the most impactful cost saver for repeated identical queries. Warehouse cache is lost on suspension — balance cost vs warm cache.
-- Clustering key for large tables with known filter patterns ALTER TABLE orders CLUSTER BY (region, order_date); -- Check clustering quality SELECT SYSTEM$CLUSTERING_INFORMATION('orders', '(region, order_date)');
Clustering keys co-locate data by column values across micro-partitions, enabling pruning. Automatic Clustering (Enterprise+) maintains clustering over time for an ongoing compute cost.
When clustering HELPS: - Multi-TB tables with consistent filter predicates - Date range queries on high-cardinality date columns When clustering HURTS (avoid): - Small tables (micro-partition pruning is already fast) - Tables with frequent random writes or full truncate+reload - No consistent filter pattern → clustering cost with no benefit
Clustering is NOT always beneficial. It adds ongoing Automatic Clustering maintenance credits. Only cluster large tables with predictable, repetitive filter patterns.
-- Diagnose a slow query with Query Profile -- Available in Snowsight: Activity → Query History → Profile Spilling indicators (fix by upsizing warehouse): Bytes spilled to local storage → memory insufficient Bytes spilled to remote storage → severely undersized WH Pruning indicator (good = fewer partitions scanned): Partitions scanned vs Partitions total
Query Profile visualizes the execution plan with operators, data flow, and statistics. Spilling to remote storage is the worst symptom — a larger warehouse fixes it.
-- Materialized view (Enterprise+ only) CREATE MATERIALIZED VIEW mv_daily_sales AS SELECT region, order_date, SUM(amount) AS total FROM orders GROUP BY region, order_date; -- Limitations: no JOINs, no UDFs, single base table only
Materialized views are pre-computed and auto-refreshed by Snowflake. They incur background maintenance compute charges. Best for expensive aggregations on large, slowly changing tables.
-- Search Optimization Service (Enterprise+ only) ALTER TABLE events ADD SEARCH OPTIMIZATION; -- Check search optimization status SELECT * FROM INFORMATION_SCHEMA.SEARCH_OPTIMIZATION_HISTORY WHERE TABLE_NAME = 'EVENTS'; Best for: - Point lookup queries (WHERE id = 12345) - High-cardinality column searches - Substring/regex searches on large text columns - Queries on VARIANT paths (semi-structured data)
Search Optimization Service builds a persistent search access path that accelerates equality and IN predicates on large tables. It incurs a one-time build cost and ongoing storage for the search index.

Streams and Tasks (ETL Automation)

-- Standard stream: captures INSERT, UPDATE, DELETE CREATE STREAM orders_stream ON TABLE orders; -- Append-only stream: captures INSERT only CREATE STREAM orders_append_stream ON TABLE orders APPEND_ONLY = TRUE; -- Query stream change records SELECT *, METADATA$ACTION, METADATA$ISUPDATE FROM orders_stream;
Streams record a table's DML changes as a Change Data Capture (CDC) offset. METADATA$ACTION is INSERT or DELETE. Updates appear as a DELETE row (pre-image) and INSERT row (post-image).
-- Task with warehouse compute CREATE TASK process_orders_task WAREHOUSE = my_wh SCHEDULE = '15 MINUTE' WHEN SYSTEM$STREAM_HAS_DATA('orders_stream') AS INSERT INTO orders_target SELECT * FROM orders_stream WHERE METADATA$ACTION = 'INSERT'; -- Start the task (tasks are suspended by default) ALTER TASK process_orders_task RESUME;
WHEN SYSTEM$STREAM_HAS_DATA() skips the task run if the stream is empty, avoiding unnecessary warehouse usage. Tasks are created in SUSPENDED state and must be explicitly resumed.
-- Serverless task (no warehouse needed) CREATE TASK serverless_task USER_TASK_MANAGED_INITIAL_WAREHOUSE_SIZE = 'XSMALL' SCHEDULE = 'USING CRON 0 2 * * * UTC' AS CALL my_stored_procedure(); -- Task DAG: child task runs after parent CREATE TASK child_task AFTER parent_task AS MERGE INTO target USING source ON ...
Serverless tasks use Snowflake-managed compute billed per second. AFTER creates DAG dependencies — child tasks execute only after parent completes successfully.

Account Management and RBAC

System-defined role hierarchy: ACCOUNTADMIN ├── SECURITYADMIN │ └── USERADMIN └── SYSADMIN └── (custom roles should be granted here) PUBLIC → auto-granted to all users
ACCOUNTADMIN is the top-level role combining SYSADMIN and SECURITYADMIN. Custom roles must be granted to SYSADMIN (or its hierarchy) so SYSADMIN can manage the objects they own.
Role responsibilities: ACCOUNTADMIN → top-level, manages account settings, sees all grants, creates resource monitors. Limit to 2-3 users with MFA enabled. SECURITYADMIN → manages grants (GRANT/REVOKE on any object) USERADMIN → creates and manages users and roles SYSADMIN → creates and manages objects (warehouses, databases, schemas, tables) PUBLIC → default role for every user, minimal privs
USERADMIN creates roles but cannot grant object privileges. SECURITYADMIN manages grants but typically delegates user creation. ACCOUNTADMIN should NOT be used for daily operations.
-- Create a custom role and grant it to SYSADMIN CREATE ROLE analyst_role; GRANT ROLE analyst_role TO ROLE sysadmin; -- Grant object privileges to the role GRANT USAGE ON DATABASE analytics TO ROLE analyst_role; GRANT USAGE ON SCHEMA analytics.public TO ROLE analyst_role; GRANT SELECT ON ALL TABLES IN SCHEMA analytics.public TO ROLE analyst_role; -- Assign role to a user GRANT ROLE analyst_role TO USER alice;
Always grant custom roles to SYSADMIN to prevent orphan roles. Object access requires cascading USAGE grants on database and schema before granting table privileges.
-- Resource monitor with credit quota and actions CREATE RESOURCE MONITOR monthly_limit WITH CREDIT_QUOTA = 1000 TRIGGERS ON 75 PERCENT DO NOTIFY ON 100 PERCENT DO SUSPEND ON 110 PERCENT DO SUSPEND_IMMEDIATE; -- Assign to a specific warehouse ALTER WAREHOUSE my_wh SET RESOURCE_MONITOR = monthly_limit;
Resource monitors track CREDIT usage (not storage). SUSPEND waits for current queries to finish. SUSPEND_IMMEDIATE cancels running queries. Only ACCOUNTADMIN can create resource monitors.
-- Network policy: restrict by IP CREATE NETWORK POLICY corp_only ALLOWED_IP_LIST = ('192.168.1.0/24', '10.0.0.0/8') BLOCKED_IP_LIST = ('192.168.1.99'); -- Apply at account level (ACCOUNTADMIN) ALTER ACCOUNT SET NETWORK_POLICY = corp_only; -- Apply at user level ALTER USER alice SET NETWORK_POLICY = corp_only;
Network policies control which IP addresses can connect to Snowflake. Applied at account level or per-user level. Blocked IPs override allowed IPs.
-- Object tags: attach metadata labels to any Snowflake object CREATE TAG sensitivity_level ALLOWED_VALUES 'PUBLIC', 'CONFIDENTIAL', 'RESTRICTED'; ALTER TABLE customers SET TAG sensitivity_level = 'CONFIDENTIAL'; ALTER TABLE customers ALTER COLUMN email SET TAG sensitivity_level = 'RESTRICTED'; -- Query tag assignments SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES WHERE TAG_NAME = 'SENSITIVITY_LEVEL'; -- Data Classification: automatically tag columns with sensitivity categories SELECT EXTRACT_SEMANTIC_CATEGORIES('CUSTOMERS'); CALL ASSOCIATE_SEMANTIC_CATEGORY_TAGS('CUSTOMERS', EXTRACT_SEMANTIC_CATEGORIES('CUSTOMERS'));
Object tags label any Snowflake object for governance tracking. Data classification uses built-in functions to automatically detect and tag PII columns (email, SSN, phone, etc.) — key COF-C03 governance topic.
-- Secondary roles: use privileges from multiple roles simultaneously ALTER USER alice SET DEFAULT_SECONDARY_ROLES = ('ALL'); -- In a session, activate secondary roles USE SECONDARY ROLES ALL; -- MFA (multi-factor authentication) best practices: -- Required best practice for ACCOUNTADMIN role users -- Uses Duo Mobile app for TOTP-based 2FA -- Enforce MFA at account level with account policies
Secondary roles let a session combine privileges from all roles assigned to a user without switching roles. MFA is required for ACCOUNTADMIN users and is enforced via Snowflake account policies.

Data Governance: Masking and Row Access

-- Create a column masking policy CREATE MASKING POLICY email_mask AS (val STRING) RETURNS STRING -> CASE WHEN current_role() IN ('ANALYST') THEN val ELSE '****@****.com' END; -- Apply to a table column ALTER TABLE customers ALTER COLUMN email SET MASKING POLICY email_mask;
Masking policies apply at query time based on the current role. The column is visible in schema but values are conditionally hidden. Requires Enterprise edition or higher.
-- Create a row access policy CREATE ROW ACCESS POLICY region_policy AS (region STRING) RETURNS BOOLEAN -> CASE WHEN current_role() = 'GLOBAL_ANALYST' THEN TRUE ELSE region = current_user() END; -- Apply to a table ALTER TABLE sales ADD ROW ACCESS POLICY region_policy ON (region);
Row access policies filter entire rows based on role or user context. Rows returning FALSE from the policy body are invisible to the querying user.
Masking policy → COLUMN-level → hides/transforms values → Column still visible, data is masked Row access policy → ROW-level → filters out entire rows → Matching rows invisible in results Both apply at QUERY TIME based on current role. Both can coexist on the same table.
Key exam distinction: masking policies operate on column values (present but transformed), row access policies remove entire rows from results. Both are Enterprise+ features.

Secure Data Sharing and Collaboration

-- Provider: create a share CREATE SHARE my_share; GRANT USAGE ON DATABASE analytics TO SHARE my_share; GRANT USAGE ON SCHEMA analytics.public TO SHARE my_share; GRANT SELECT ON TABLE analytics.public.sales TO SHARE my_share; -- Add consumer account to the share ALTER SHARE my_share ADD ACCOUNTS = consumer_account_name; -- Consumer: create a database from the share CREATE DATABASE shared_analytics FROM SHARE provider.my_share;
Secure Data Sharing requires no data movement — data remains in the provider's account. Consumers pay only for their own compute. Only ACCOUNTADMIN can create shares by default.
Secure Data Sharing rules: - Shared data is READ-ONLY for consumers - No data is copied — consumers query provider's storage - Consumer pays their own warehouse compute costs - Only SECURE views, SECURE materialized views, and SECURE UDFs can be added to shares - Regular (non-secure) views CANNOT be shared - Cross-region sharing requires database replication
The top exam traps: consumers cannot modify shared data, regular views cannot be shared, and cross-region sharing requires replication setup.
Reader accounts: CREATE MANAGED ACCOUNT reader_corp ADMIN_NAME = 'admin' ADMIN_PASSWORD = '...'; - Created BY provider FOR non-Snowflake customers - Provider pays compute costs for the reader account - Consumer does NOT need their own Snowflake account - Data shared via standard share mechanism
Reader accounts let providers share data with partners who have no Snowflake account. The provider bears all compute costs for the reader account's queries.
Collaboration features: Marketplace → public/paid data listings (powered by sharing) Data Exchange → private, curated sharing hub for partners Data Clean Rooms → multi-party analysis without raw data exposure Native Apps → deploy code + data as installable apps Replication → copy databases across regions/cloud providers
Snowflake Marketplace, Data Exchange, and Clean Rooms all use Secure Data Sharing under the hood. Native Apps extend this to bundled applications.

COF-C03 New Topics: Cortex AI, Iceberg, Notebooks

-- Cortex AI SQL functions (no model hosting needed) SELECT SNOWFLAKE.CORTEX.SENTIMENT(review_text) FROM customer_reviews; SELECT SNOWFLAKE.CORTEX.SUMMARIZE(article_body) FROM news_articles; SELECT SNOWFLAKE.CORTEX.TRANSLATE(text, 'en', 'fr') FROM messages;
Cortex AI provides SQL-callable LLM functions (SENTIMENT, SUMMARIZE, TRANSLATE, COMPLETE, etc.). Data stays in Snowflake — no external API calls or model management required.
Cortex AI components (know at concept level): Cortex Functions → SQL-callable AI: sentiment, summarize, translate, complete, classify, embed_text Cortex Search → semantic/hybrid search over documents Cortex Analyst → natural language to SQL (BI on your data) Cortex Fine-tuning → fine-tune LLMs on your own data
COF-C03 tests conceptual knowledge of Cortex AI capabilities, not API syntax. Know what each component does and that all processing occurs within Snowflake.
Apache Iceberg Tables in Snowflake: Native Iceberg → Snowflake manages metadata + storage, full read/write with Snowflake performance External Iceberg → files in your cloud storage, Snowflake provides query layer only Key use case: multi-engine interoperability (Snowflake + Spark + Trino + other engines on same data)
Iceberg tables enable open data lake access. Unlike native Snowflake tables, Iceberg tables use open formats other engines can read. The exam tests the WHY, not the operational details.
Snowflake Notebooks: - Interactive cell-based environment (SQL, Python, Markdown) - Runs inside Snowflake (compute stays in your account) - Supports Snowpark for Python transformations - Version control via Git integration - Collaborative: share notebooks across team members
Snowflake Notebooks replace external Jupyter/Databricks notebooks for Snowflake-native development. Know the use cases and that they run on Snowflake compute — no data leaves the platform.
-- Git integration: link a Snowflake Git repository object CREATE GIT REPOSITORY my_repo API_INTEGRATION = github_integration ORIGIN = 'https://github.com/org/repo.git'; -- Fetch latest changes ALTER GIT REPOSITORY my_repo FETCH; -- Execute a file from the repo EXECUTE IMMEDIATE FROM @my_repo/branches/main/scripts/etl.sql;
Git integration enables CI/CD workflows within Snowflake. Stored procedures, notebooks, and SQL scripts can be version-controlled and executed directly from a linked Git repository.
Snowpark (Python / Java / Scala): - Developer framework to write data pipelines in code - Execution happens INSIDE Snowflake — data never leaves - Uses DataFrame API (similar to Spark/Pandas) - Runs on virtual warehouse compute or serverless - Supports Stored Procedures, UDTFs, and vectorized UDFs -- Simple Snowpark Python example from snowflake.snowpark import Session session = Session.builder.configs(conn_params).create() df = session.table('orders') df.filter(col('region') == 'US').write.save_as_table('us_orders')
Snowpark keeps data transformation logic inside Snowflake rather than pulling data out to a client. The exam tests positioning (when to use Snowpark vs SQL), not the API syntax.

Ready to test yourself?

Start a timed COF-C03 mock exam or review practice questions by domain.