CertPrepNow
SnowflakeCOF-C035 domains

COF-C03 Exam Notes

Last-minute traps, must-know facts, and scenario tips for the SnowPro Core Certification exam.

General Exam Tips

  • 1.Read ALL answer options before selecting — many SnowPro Core questions have two answers that seem correct, and the distinction is in a single qualifier like 'user-accessible' vs 'Snowflake-only'
  • 2.There is no penalty for wrong answers — flag difficult questions and return to them, but never leave any blank
  • 3.Multiple-select questions explicitly state how many answers to choose (e.g., 'Choose 2'). If you select the wrong number, the entire question is marked wrong
  • 4.At 69 seconds per question (115 minutes / 100 questions), you cannot spend more than 90 seconds on any single question without creating time pressure — flag and move on
  • 5.Domain 1 (31%) and Domain 4 (21%) together cover more than half the exam — if you ran out of study time, these two domains are your triage priority
  • 6.When stuck between two answers, ask: does the scenario involve ONE user's query performance (scale up) or MANY concurrent users queuing (scale out)?
  • 7.Questions about recovery almost always want you to distinguish between Time Travel (user can do it) and Fail-safe (only Snowflake support can do it)
  • 8.When a question mentions 'no data movement' or 'no data copying,' the answer is almost certainly related to Secure Data Sharing or Zero-Copy Cloning
  • 9.Questions about billing traps: Snowpipe uses serverless credits per GB, COPY INTO uses warehouse credits, result cache costs NOTHING
  • 10.After creating a TASK, remember it must be explicitly RESUMED with ALTER TASK ... RESUME before it executes — a common hands-on trap
Domain 131% of exam

Snowflake AI Data Cloud Features and Architecture

Must-Know Facts

  • Three-layer architecture: cloud services layer (query optimization, metadata, auth, access control), compute layer (virtual warehouses — CPU/memory/temp storage), storage layer (micro-partitions in cloud object storage). Each layer scales INDEPENDENTLY — this is Snowflake's core differentiator
  • Virtual warehouse credit consumption doubles with each size step: XS=1, S=2, M=4, L=8, XL=16, 2XL=32, 3XL=64, 4XL=128, 5XL=256, 6XL=512 credits per hour
  • Multi-cluster warehouses require Enterprise edition minimum. Standard scaling policy starts additional clusters immediately when a query queues. Economy scaling policy waits until the estimated queue time exceeds 6 minutes
  • Micro-partitions: 50–500 MB immutable, compressed, columnar storage units. Snowflake stores min/max metadata per partition per column — this metadata enables partition pruning to skip irrelevant micro-partitions at query time
  • Table type data protection matrix: Permanent = Time Travel (0–90 days on Enterprise) + 7-day Fail-safe. Transient = Time Travel (0–1 day max) + 0-day Fail-safe. Temporary = Time Travel (0–1 day max) + 0-day Fail-safe + session-scoped. External = no Time Travel, no Fail-safe
  • Time Travel AT and BEFORE clauses: AT queries AT the specified timestamp, BEFORE queries just before. UNDROP restores dropped objects within the Time Travel window
  • Fail-safe: fixed 7-day period that begins AFTER Time Travel expires. Snowflake-internal only — support engineers can recover data, but YOU cannot access Fail-safe directly
  • Zero-copy cloning: creates a metadata pointer, not a data copy. Storage cost is ZERO at clone creation. New costs only accrue when rows are modified in the clone (new micro-partitions branch off)
  • Snowflake editions in ascending order: Standard, Enterprise, Business Critical, Virtual Private Snowflake (VPS). Enterprise adds multi-cluster warehouses, 90-day Time Travel, materialized views, dynamic data masking, search optimization. Business Critical adds customer-managed keys (Tri-Secret Secure), HIPAA/PCI compliance, private connectivity
  • Cortex AI: built-in LLM-powered SQL functions (SENTIMENT, SUMMARIZE, TRANSLATE, COMPLETE), Cortex Search (semantic search on unstructured data), Cortex Analyst (natural language to SQL). Know the use cases, not API syntax
  • Apache Iceberg Tables: open table format for interoperability with Spark, Trino, etc. Snowflake can act as the Iceberg catalog and read/write files in object storage. Key difference from native tables: data lives in your cloud storage, not Snowflake-managed storage
  • Encryption is always on: AES-256 at rest, TLS 1.2 in transit. No option to disable encryption

Common Traps

TrapFail-safe is just an extended version of Time Travel that I can query
RealityFail-safe is ENTIRELY different. You cannot query it, run UNDROP against it, or access it in any way. Only Snowflake support can recover data from Fail-safe, and only in emergencies. Time Travel is the user-accessible recovery feature
TrapTransient tables have 7-day Fail-safe just like permanent tables
RealityTransient tables have ZERO days of Fail-safe. This is intentional — they are designed for staging data you can regenerate. After their 1-day Time Travel window, dropped or purged data is gone permanently
TrapZero-copy cloning a 5 TB table will immediately cost me 5 TB of additional storage
RealityClone creation costs zero storage. The clone references the same underlying micro-partitions. You only pay for additional storage as rows in the clone diverge from the source through DML operations
TrapA larger virtual warehouse always means faster query results
RealityLarger warehouses help with complex queries that need more memory or parallelism. Simple queries return in the same time regardless of warehouse size. You will just spend more credits for the same result
TrapAuto-suspend at 5 minutes means the warehouse shuts down 5 minutes after it starts
RealityAuto-suspend counts 5 minutes of INACTIVITY. Each query resets the timer. A warehouse receiving queries every 4 minutes will never auto-suspend regardless of the 5-minute setting
TrapEconomy scaling mode means Snowflake reduces warehouse size to save money
RealityEconomy scaling is a multi-cluster warehouse property that controls when additional CLUSTERS are added. It delays adding a second cluster until the queue is estimated to take over 6 minutes. It has nothing to do with reducing warehouse size

Confusing Pairs

Time TravelFail-safe

Time Travel = YOU can recover data using AT/BEFORE/UNDROP. Configurable retention (0–1 day on Standard, 0–90 days on Enterprise). Fail-safe = Snowflake-ONLY. Fixed 7 days. Starts AFTER Time Travel expires. Not available for transient/temporary tables. If the question involves user-initiated recovery, it is Time Travel

Scaling Up (Resize)Scaling Out (Multi-Cluster)

Scale UP = increase warehouse SIZE (XS → S → M) = more resources per query = fixes slow complex queries. Scale OUT = add more CLUSTERS to a multi-cluster warehouse = more warehouses running in parallel = fixes concurrency (many users queuing). Scale UP for complexity, Scale OUT for concurrency. Multi-cluster requires Enterprise edition

Standard Scaling PolicyEconomy Scaling Policy

Standard = adds a cluster the MOMENT any query queues = minimum wait time = higher credit cost. Economy = waits until queue is estimated to last 6+ minutes before adding a cluster = users may wait = lower credit cost. Standard is the default and appropriate for performance-sensitive workloads

Permanent TablesTransient TablesTemporary Tables

Permanent = full protection: Time Travel up to 90 days (Enterprise) + 7-day Fail-safe. Transient = reduced protection: Time Travel up to 1 day + ZERO Fail-safe. Temporary = same as Transient PLUS session-scoped (auto-dropped when session ends). External tables = read-only, no Time Travel, no Fail-safe

Snowflake Enterprise EditionSnowflake Business Critical Edition

Enterprise adds: multi-cluster warehouses, 90-day Time Travel, materialized views, dynamic data masking, row access policies, Automatic Clustering, search optimization. Business Critical adds ON TOP: customer-managed keys (Tri-Secret Secure), dedicated metadata store, HIPAA/PCI/SOC1 compliance, private connectivity (AWS PrivateLink etc.). If a question asks about CMK or dedicated infrastructure, it requires Business Critical

Scenario Tips

If the question asks about:

Question describes a data team where 50 analysts run queries simultaneously and experience long wait times. Should they resize to a 4XL or enable multi-cluster?

Answer:

Enable multi-cluster warehouse (scaling out). The problem is concurrency — many queries are queuing. Multi-cluster adds additional warehouse clusters to handle parallel query demand

Distractor to avoid:

Resizing to a larger single warehouse (scaling up) helps with complex individual queries, not concurrent load. 50 simultaneous users is a concurrency problem, not a query complexity problem

If the question asks about:

A company drops a critical permanent table. 12 days have passed. Can they recover it?

Answer:

No, unless Enterprise edition with 90-day Time Travel was configured AND they set the retention period to >12 days. Standard edition allows only 1 day. After Time Travel expires, Fail-safe runs for 7 days (days 2–8), then data is permanently gone by day 9 on Standard edition

Distractor to avoid:

Candidates often forget to check the edition AND the configured retention period. Always note: (1) which edition? (2) what was the TIME_TRAVEL_IN_DAYS setting? (3) how many days have passed?

If the question asks about:

A table must be accessible to Spark, Trino, AND Snowflake without data duplication. What table type?

Answer:

Apache Iceberg Table. Iceberg is an open table format stored in object storage that multiple engines can read/write natively

Distractor to avoid:

External tables in Snowflake are read-only. Native Snowflake tables cannot be read by Spark/Trino natively. Iceberg is the correct answer for multi-engine interoperability

If the question asks about:

A business analyst needs to run a natural language question against their Snowflake sales data and get a SQL result — without writing SQL themselves. Which Cortex AI component fits?

Answer:

Cortex Analyst. It takes natural language input and generates SQL queries against structured Snowflake data, functioning as a conversational BI layer. All processing occurs inside Snowflake

Distractor to avoid:

Cortex Search is for semantic/hybrid search over documents. Cortex COMPLETE is a general LLM completion function. Neither is purpose-built for natural language to SQL over structured tables — that is Cortex Analyst's specific use case

Last-Minute Facts

1Credit doubling pattern: XS=1, S=2, M=4, L=8, XL=16, 2XL=32 (continues doubling per size step)
2Micro-partition size range: 50 MB to 500 MB of uncompressed data
3Fail-safe: always 7 days, always non-configurable, always Snowflake-only access
4Standard edition max Time Travel: 1 day. Enterprise+ max: 90 days
5Transient and temporary tables: 0-day Fail-safe (not 7 days)
6Temporary tables: session-scoped, disappear when session ends
7Multi-cluster warehouses: minimum Enterprise edition
8Materialized views: minimum Enterprise edition
9Customer-managed keys (Tri-Secret Secure): minimum Business Critical edition
10Economy scaling policy waits: estimated queue duration > 6 minutes before adding a cluster
Domain 220% of exam

Account Management and Data Governance

Must-Know Facts

  • System-defined role hierarchy from top: ACCOUNTADMIN (holds SYSADMIN + SECURITYADMIN). SECURITYADMIN manages grants globally. USERADMIN creates users and custom roles. SYSADMIN creates and owns database objects. PUBLIC is the base role every user automatically receives
  • ACCOUNTADMIN best practices: limited to a small number of users, must use MFA, should NOT be used for daily work. It does not automatically see all objects — it still requires proper role hierarchy for object access
  • Custom role best practice: always grant custom roles UP to SYSADMIN (directly or via hierarchy). Roles that are not connected to SYSADMIN become orphaned — no one in SYSADMIN's hierarchy can manage those objects
  • SECURITYADMIN can manage grants on ANY securable object, even objects it does not own. USERADMIN creates users/roles but cannot grant object-level privileges
  • PUBLIC role: every user in the account has this role automatically. Any privilege granted to PUBLIC is accessible by ALL users — be extremely careful
  • Resource monitors: created by ACCOUNTADMIN. Track credit usage (NOT storage). Can be assigned to the account level or individual warehouses. Actions: NOTIFY (alert only), SUSPEND (finish current queries then stop), SUSPEND_IMMEDIATELY (kill queries instantly). Each warehouse can have one resource monitor
  • Dynamic data masking: column-level security. The masking policy applies at QUERY TIME based on the executing role — not at write time. The same column returns different values to different roles
  • Row access policies: row-level security. Entire rows are hidden from roles that do not meet the policy condition. Multiple policies can be applied but only one per table/view
  • Object tags: key-value metadata labels applied to Snowflake objects (databases, schemas, tables, columns). Used for governance, data classification, and lineage tracking
  • Network policies: IP allowlists/blocklists applied at account level or per user. Applied at connection time. Can override account-level policies with user-level policies
  • Secondary roles: a session can use multiple roles simultaneously (primary + secondary). Secondary roles provide additional privileges without explicitly switching roles. COF-C03 new topic
  • Data classification: automated scanning to detect and tag sensitive data categories (PII, financial) using CLASSIFY_SCHEMA or similar functions. Tags are applied based on classification results
  • ACCOUNT_USAGE schema: historical data, up to 1-year retention, 45-minute to 3-hour latency. INFORMATION_SCHEMA: real-time data, shorter retention (7 days to 6 months depending on the view)

Common Traps

TrapACCOUNTADMIN can see and control all objects in the account
RealityACCOUNTADMIN has account-level privileges (billing, security settings, resource monitors), but object visibility still depends on the role hierarchy and grants. ACCOUNTADMIN does NOT automatically see every table in every database unless it has been granted access through the hierarchy
TrapResource monitors control and limit storage costs
RealityResource monitors track and limit CREDIT consumption (compute costs from virtual warehouses). They have no ability to monitor or cap storage usage
TrapA masking policy hides rows that contain sensitive data
RealityMasking policies operate at the COLUMN level — they transform or obscure the value returned for a specific column. Row access policies are what filter (hide) entire rows. These are two distinct governance features
TrapUSERADMIN can grant privileges on database objects to users
RealityUSERADMIN's scope is limited to creating and managing users and roles. Granting privileges on objects (tables, warehouses, databases) is SECURITYADMIN's domain, not USERADMIN's
TrapAfter creating a custom role, SYSADMIN automatically manages it
RealityCustom roles must be EXPLICITLY granted to SYSADMIN (or to a role in SYSADMIN's hierarchy). If you create a custom role and only assign users to it without connecting it to SYSADMIN, it becomes an orphaned role that SYSADMIN cannot manage
TrapOne account can have multiple resource monitors assigned to it
RealityOne account has ONE account-level resource monitor. Individual warehouses can also each have their own resource monitor. But you cannot assign multiple account-level monitors to a single account

Confusing Pairs

SECURITYADMINUSERADMIN

SECURITYADMIN = manages grants and privilege assignments across ALL objects in the account. USERADMIN = creates and manages users and custom role definitions. USERADMIN cannot grant object privileges. SECURITYADMIN does not typically create users. They have distinct responsibilities

Dynamic Data MaskingRow Access Policies

Masking = COLUMN-level. The column is visible, but values are transformed (e.g., '****' instead of real email). Row Access = ROW-level. Rows that fail the policy condition are completely invisible to the querying role. Masking hides VALUES within a visible column. Row access hides ENTIRE ROWS from the result set

ACCOUNT_USAGE SchemaINFORMATION_SCHEMA

ACCOUNT_USAGE: latency of 45 min to 3 hours, up to 1 year of retention, requires no active warehouse, account-wide visibility. INFORMATION_SCHEMA: real-time data, shorter retention (7 days to 6 months), requires an active warehouse to query, only shows objects the current role can access. Use ACCOUNT_USAGE for billing audits and historical analysis; use INFORMATION_SCHEMA for real-time object inspection

Network Policy at Account LevelNetwork Policy at User Level

Account-level network policy applies to ALL users unless a user-level policy overrides it. User-level network policy takes precedence over the account policy for that specific user. You can use user-level policies to grant exceptions (e.g., allow an admin to connect from any IP while restricting the rest of the account)

Scenario Tips

If the question asks about:

An analyst role should see customer phone numbers as '***-***-1234' (last 4 digits only), while the data steward role sees the full number. What do you implement?

Answer:

Dynamic data masking policy on the phone_number column. The policy function checks the current role and returns the full number for data stewards, masked value for all others

Distractor to avoid:

Row access policies would hide the entire row, not mask a column value. Secure views could achieve this but masking policies are the purpose-built Snowflake feature and the exam answer

If the question asks about:

A company needs to automatically stop all warehouse activity if monthly credit spend exceeds 10,000 credits to avoid budget overrun. What object and setting?

Answer:

Create a resource monitor with a monthly quota of 10,000 credits and set the action to SUSPEND_IMMEDIATELY. Assign it at the account level so all warehouses are covered

Distractor to avoid:

Auto-suspend on a warehouse stops the warehouse after inactivity — it does not enforce a credit budget. Auto-suspend does not know anything about cumulative monthly spend

If the question asks about:

A data engineer creates a custom role ANALYST_ROLE. Users have the role but SYSADMIN cannot transfer ownership of the tables owned by ANALYST_ROLE. Why?

Answer:

ANALYST_ROLE was not granted to SYSADMIN (or to a role in SYSADMIN's hierarchy). Orphaned roles create objects that fall outside the standard management hierarchy

Distractor to avoid:

Candidates assume ACCOUNTADMIN handles this automatically. Object ownership management goes through SYSADMIN's hierarchy, and that requires the custom role to be explicitly connected

Last-Minute Facts

1Role hierarchy top to bottom: ACCOUNTADMIN > (SECURITYADMIN, SYSADMIN) > USERADMIN, PUBLIC
2ACCOUNTADMIN = SYSADMIN + SECURITYADMIN combined
3Resource monitors: track CREDITS not storage. One account-level monitor max per account
4Resource monitor suspend actions: NOTIFY (alert) → SUSPEND (finish current) → SUSPEND_IMMEDIATELY (kill queries)
5MFA is required as a best practice for ACCOUNTADMIN (uses Duo Mobile)
6ACCOUNT_USAGE latency: up to 3 hours. INFORMATION_SCHEMA: real-time
7ACCOUNT_USAGE retention: up to 1 year. INFORMATION_SCHEMA: 7 days to 6 months
8Secondary roles: allow privileges from multiple roles in a single session without role switching (COF-C03 feature)
9PUBLIC role: auto-granted to every user — privileges on PUBLIC affect ALL users in the account
Domain 318% of exam

Data Loading, Unloading, and Connectivity

Must-Know Facts

  • COPY INTO <table>: bulk load from stages into tables. Requires an active virtual warehouse. Key options: ON_ERROR (CONTINUE/SKIP_FILE/ABORT_STATEMENT), VALIDATION_MODE (RETURN_ERRORS/RETURN_N_ROWS/RETURN_ALL_ERRORS — dry run only, no data loaded), PURGE (delete staged files after successful load), FORCE (reload already-loaded files)
  • COPY INTO <location>: bulk unload data from a table to a stage. Supports partitioning output files, max file size, and all supported file formats
  • Stage types: user stage (@~) — each user has one, cannot be altered or dropped, tied to the user. Table stage (@%tablename) — each table has one, cannot be altered or dropped, files not copied between stages. Named internal stage (@stagename) — the most flexible, can have file formats, encryption settings, can be managed. External stage — points to S3/GCS/Azure Blob, requires storage integration or credentials
  • Snowpipe: serverless continuous ingestion. Does not use virtual warehouse compute. Triggered by cloud event notifications (S3 SQS, Azure Event Grid, GCS Pub/Sub) in auto-ingest mode, or by REST API calls. Billed on a simplified per-GB ingested model (as of December 2025)
  • Snowpipe does NOT automatically delete staged files after loading. You must run REMOVE @stage/path to clean up loaded files
  • PUT command: uploads local files to an INTERNAL stage. ONLY works from SnowSQL CLI, JDBC, or ODBC drivers. Does NOT work in the Snowsight web UI
  • GET command: downloads files from an INTERNAL stage to a local machine. Same connectivity requirements as PUT — SnowSQL/JDBC/ODBC only
  • VARIANT data type: stores semi-structured data (JSON, Avro, ORC, Parquet, XML). Access nested fields with colon notation (src:address:city) or bracket notation. VARIANT column holds the entire document
  • FLATTEN function: explodes arrays or objects in VARIANT into individual rows. Use with LATERAL for correlated expansion. The output columns include KEY, VALUE, INDEX, PATH, SEQ
  • Storage integrations: named objects that hold cloud provider credentials and trust relationships for external stage access. Preferred over embedding credentials directly in stage DDL
  • File formats supported: CSV, JSON, Avro, ORC, Parquet, XML. File format objects can be created as reusable objects or defined inline in COPY statements
  • Transformation during COPY INTO load: SELECT statement can reorder columns, cast types, omit columns, and apply simple expressions during the load process — no separate transformation step needed

Common Traps

TrapVALIDATION_MODE in COPY INTO loads valid rows and only rejects the bad ones
RealityVALIDATION_MODE is a DRY RUN. It validates the file against the target table schema and returns error information, but loads ZERO rows. No data enters the table when VALIDATION_MODE is specified
TrapI can use PUT and GET commands directly in the Snowsight web interface
RealityPUT and GET ONLY work through SnowSQL CLI, JDBC drivers, or ODBC drivers. The web UI does not support these commands. Candidates who primarily use the UI get this wrong
TrapSnowpipe automatically cleans up staged files once they are loaded
RealitySnowpipe loads the data but leaves the source files in place. You must explicitly run REMOVE @stage commands to delete files after loading. Unmanaged file accumulation causes storage costs
TrapUser stages and table stages can be configured like named stages
RealityUser stages (@~) and table stages (@%table) cannot be altered or dropped. You cannot set file formats, encryption, or URL on them. Only named stages support full configuration
TrapSnowpipe's billing is the same as using COPY INTO with a warehouse
RealitySnowpipe uses serverless compute billed per GB of data ingested (simplified pricing since December 2025). COPY INTO uses virtual warehouse credits per hour. They are completely different billing models
TrapFLATTEN alone on a VARIANT column explodes arrays into rows
RealityYou typically need LATERAL FLATTEN to properly explode arrays in a VARIANT column into individual rows within a query. LATERAL allows the FLATTEN to reference the column from the same FROM clause
TrapSnowpipe auto-ingest from S3 is triggered via Amazon SNS directly
RealitySnowpipe auto-ingest from S3 uses an SQS queue that Snowflake provisions per pipe. S3 event notifications are delivered to that SQS queue. SNS can optionally be used as a fan-out intermediary to broadcast to the SQS queue, but the direct trigger mechanism is SQS, not SNS

Confusing Pairs

COPY INTO (Bulk Loading)Snowpipe (Continuous Loading)

COPY INTO = user-triggered batch loading, requires active warehouse, billed as warehouse credits per hour, best for large scheduled batches. Snowpipe = event-triggered continuous loading, serverless (no warehouse), billed per GB ingested, best for real-time or near-real-time file arrival. If the scenario says 'as files arrive' or 'real-time' — choose Snowpipe

Internal StagesExternal Stages

Internal stages (user, table, named) = files stored WITHIN Snowflake-managed cloud storage. Snowflake controls security and access. External stages = files stored in YOUR own cloud bucket (S3/GCS/Azure). You own the storage and manage IAM/permissions via storage integrations. Internal = Snowflake manages it. External = you manage it

User Stage (@~)Table Stage (@%table)Named Stage (@name)

User stage: one per user, tied to a user, cannot be altered/dropped. Table stage: one per table, cannot be altered/dropped, cannot copy files between stages. Named stage: created explicitly, fully configurable (file format, URL, encryption), can be altered and dropped, most flexible and recommended for production pipelines

ON_ERROR = ABORT_STATEMENTON_ERROR = SKIP_FILEON_ERROR = CONTINUE

ABORT_STATEMENT (default): rolls back the entire COPY and loads nothing if any error is found. SKIP_FILE: skips the entire errored file but loads all other files successfully. CONTINUE: loads all valid rows and skips only the individual bad rows. CONTINUE gives the most partial data; ABORT_STATEMENT is all-or-nothing

Scenario Tips

If the question asks about:

Files arrive in S3 every 30 seconds and must be loaded into Snowflake within a minute. Which approach?

Answer:

Snowpipe with auto-ingest using S3 event notifications (SQS). Snowflake provisions an SQS queue per pipe and S3 delivers object-created events to it. Snowpipe is triggered by file arrival events and uses serverless compute — no warehouse needs to be running

Distractor to avoid:

A task with COPY INTO running every minute would work but introduces warehouse costs and 1-minute maximum latency. Snowpipe gives true near-real-time loading without a warehouse

If the question asks about:

A COPY INTO command with VALIDATION_MODE='RETURN_ALL_ERRORS' runs. The engineer then queries the target table and finds it empty. Is this a bug?

Answer:

No — this is expected behavior. VALIDATION_MODE is a dry run. The command validates files but loads ZERO rows regardless of validation results. To actually load data, run COPY INTO without VALIDATION_MODE

Distractor to avoid:

Candidates expect partial loading (valid rows in, bad rows rejected). That behavior requires ON_ERROR = CONTINUE, not VALIDATION_MODE

If the question asks about:

A developer wants to upload a 500 MB CSV file from their laptop to a named internal stage. They log into Snowsight and try to use the PUT command. What happens?

Answer:

The PUT command fails — it is not supported in the Snowsight web UI. The developer must use SnowSQL CLI, JDBC, or ODBC driver to execute PUT

Distractor to avoid:

Snowsight supports many SQL commands, so candidates assume PUT works there too. This is one of the more reliably tested traps in the loading domain

Last-Minute Facts

1PUT and GET: SnowSQL/JDBC/ODBC ONLY — not available in Snowsight web UI
2VALIDATION_MODE: dry run — validates files, loads ZERO rows
3Snowpipe: serverless, per-GB billing (since December 2025), does NOT auto-delete staged files
4COPY INTO ON_ERROR default: ABORT_STATEMENT (all-or-nothing)
5User stage: @~ (each user has exactly one, cannot alter/drop)
6Table stage: @%tablename (each table has exactly one, cannot alter/drop)
7VARIANT stores semi-structured data; access nested fields with colon notation
8LATERAL FLATTEN explodes arrays in VARIANT into individual rows
9Storage integrations: preferred over embedding credentials in stage DDL
10Snowpipe does not delete staged files — you must run REMOVE commands manually
Domain 421% of exam

Performance Optimization, Querying, and Transformation

Must-Know Facts

  • Three caching tiers: (1) Metadata cache — cloud services layer, instant responses for COUNT/MIN/MAX/DISTINCT on micro-partition metadata, no warehouse needed. (2) Result cache — cloud services layer, stores exact query results for 24 hours of inactivity (extended up to 31 days with repeated execution), no warehouse needed, free. (3) Warehouse cache — local SSD on virtual warehouse nodes, caches raw data from previous scans, requires a RUNNING warehouse
  • Result cache conditions: exact query text match (whitespace matters), same session role, no underlying table data changes, no relevant session parameter changes. ANY change breaks the cache
  • Warehouse cache is lost when the warehouse SUSPENDS. Auto-suspend that is too aggressive eliminates cache warmth and degrades repeated query performance
  • Query Profile: visual DAG of query execution operators. Key things to look for: large table scans with no partition pruning (fix: clustering key), exploding joins, spilling to local or remote storage (fix: larger warehouse), inefficient remote I/O
  • Spilling to disk: occurs when warehouse memory is exhausted. First spills to local disk (SSD), then to remote storage (much slower). Visible in Query Profile under SPILL columns. Fix: upsize warehouse or reduce data volume per query
  • Partition pruning: Snowflake uses micro-partition metadata (min/max per column per partition) to skip irrelevant partitions at query time. Effective pruning is the single biggest driver of fast query performance on large tables
  • Clustering keys: user-defined sort columns that organize data within micro-partitions for better pruning. Best for large tables (multi-TB+) with well-known, consistent filter patterns (e.g., always filtering on date_column). Automatic Clustering (Enterprise+) maintains cluster order over time — this incurs ongoing background compute costs
  • Poor candidates for clustering: small tables, tables with random write patterns, tables with many equally-frequent query patterns. Clustering ALWAYS costs maintenance credits — only cluster when the query savings outweigh the cost
  • Materialized views: pre-computed, Snowflake-auto-refreshed query results. Enterprise+ only. Limitations: base table must be a SINGLE table (no joins), no UDFs, no subqueries in the view, no nesting of materialized views. Auto-refresh incurs background compute costs
  • Streams: CDC (change data capture) objects on tables. Standard stream: captures inserts, updates, and deletes. Append-only stream: captures inserts only (more efficient for append-only workloads). Streams include metadata columns: METADATA$ACTION, METADATA$ISUPDATE, METADATA$ROW_ID
  • Tasks: scheduled SQL execution. Must be RESUMED with ALTER TASK ... RESUME after creation — tasks do not start automatically. Use SYSTEM$STREAM_HAS_DATA() in the WHEN clause to skip task runs when the stream is empty
  • Search Optimization Service: Enterprise+ background service that accelerates point lookups and equality/substring searches on VARIANT fields. Applied per-table as a managed service — adds storage cost for the search access path
  • Automatic Query Acceleration (AQA): automatically offloads eligible portions of a long-running query to serverless compute, helping warehouses that are occasionally overwhelmed by outlier queries

Common Traps

TrapThe result cache works as long as I use the same query
RealityResult cache requires (1) exact query text match including whitespace, (2) same role, AND (3) no changes to underlying data. Even a single space difference invalidates the cache. Role switching also invalidates it
TrapA task I just created will run automatically on its next scheduled time
RealityNewly created tasks are in SUSPENDED state by default. You must explicitly run ALTER TASK <name> RESUME before the task will execute on its schedule. This is a very commonly missed operational detail
TrapKeeping auto-suspend disabled saves money because the warehouse cache stays warm
RealityA perpetually running idle warehouse consumes credits even when no queries run. Balance: short auto-suspend saves credits but destroys cache warmth. Longer auto-suspend preserves cache but burns credits. The right setting depends on workload patterns
TrapMaterialized views can be created on joins between multiple tables
RealityMaterialized views in Snowflake must be based on a SINGLE table. No joins allowed in the materialized view definition. This is a significant limitation compared to other platforms
TrapAdding a clustering key on a low-cardinality column like a boolean will greatly improve performance
RealityClustering works best on columns with MODERATE cardinality where data can be meaningfully co-located. A boolean column (2 distinct values) or a unique key (every value different) are both poor clustering choices. Partition pruning is most effective when the filter column has a moderate number of distinct values spread across time-ordered data
TrapSpilling to remote storage is the same speed as spilling to local disk
RealityLocal disk (SSD) spilling is much faster than remote storage spilling. Remote storage spilling severely degrades query performance and is visible in the Query Profile. The fix is a larger warehouse with more memory to avoid spilling at all

Confusing Pairs

Result CacheWarehouse CacheMetadata Cache

Metadata cache: cloud services layer, no warehouse needed, serves only aggregate queries that can be answered from partition metadata. Result cache: cloud services layer, stores full query output for 24 hours, no warehouse needed, completely FREE. Warehouse cache: SSD on compute nodes, stores raw data blocks, requires RUNNING warehouse, survives between queries but dies on warehouse suspend. All three save cost — only Result cache is truly free and requires no warehouse

Streams (Standard)Streams (Append-Only)

Standard stream: tracks ALL DML changes — inserts, updates, AND deletes. Use when you need to propagate updates and deletes downstream. Append-only stream: tracks INSERTS only. More efficient and lower overhead. Use when source table is append-only (logs, events) and you do not need to track updates/deletes

Clustering KeysSearch Optimization Service

Clustering keys: physical reorganization of micro-partitions by specified columns, helps range scans and filter queries on those columns. Search Optimization Service: index-like access paths for POINT LOOKUPS and substring/equality searches, especially on VARIANT fields. Use clustering for range queries on large tables. Use search optimization for finding specific records by exact value or substring match

Scenario Tips

If the question asks about:

A query runs in 2 seconds on Monday but 40 seconds on Tuesday with identical query text and unchanged data. Most likely cause?

Answer:

The result cache expired (24-hour TTL). On Monday, the query hit the cache. On Tuesday, after >24 hours of inactivity, the cache expired and the full query re-executed with warehouse compute

Distractor to avoid:

Candidates may blame warehouse resizing or data changes, but the scenario specifies unchanged data — result cache expiration is the designed explanation

If the question asks about:

The Query Profile shows a table scan of 800M rows where only 50K rows match the WHERE clause filter. No partition pruning is occurring. What is the best fix?

Answer:

Add a clustering key on the filter column. This co-locates related data within micro-partitions so the pruning metadata can skip irrelevant partitions, dramatically reducing rows scanned

Distractor to avoid:

Upsizing the warehouse processes the same 800M rows faster but does not reduce the data scanned. The root problem is no pruning — the solution is better data organization, not more compute

If the question asks about:

A pipeline uses a stream to capture changes and a task to process them on a 10-minute schedule. The task runs even when no changes have occurred. What should be added to avoid unnecessary runs?

Answer:

Add WHEN SYSTEM$STREAM_HAS_DATA('<stream_name>') to the task definition. This causes the task to skip execution when the stream has no unconsumed change records

Distractor to avoid:

Increasing the task interval from 10 to 60 minutes reduces runs but still executes unnecessarily on empty streams. SYSTEM$STREAM_HAS_DATA is the correct mechanism to skip empty-stream runs

If the question asks about:

A MERGE operation on a 10TB table spills to remote storage in the Query Profile. The current warehouse is Large. What is the recommended fix?

Answer:

Upsize to an XL or 2XL warehouse to provide more memory. Remote storage spilling indicates the warehouse memory is exhausted, severely degrading performance. More memory eliminates the spill

Distractor to avoid:

Enabling multi-cluster (scaling out) would help with concurrent queries, not with a single memory-hungry MERGE operation. The right fix for memory-exhausted single queries is scaling UP, not OUT

Last-Minute Facts

1Result cache TTL: 24 hours of inactivity (can extend to 31 days with repeated identical queries)
2Result cache is FREE — no warehouse required
3Warehouse cache dies when the warehouse suspends
4Tasks are SUSPENDED after creation — must run ALTER TASK ... RESUME
5SYSTEM$STREAM_HAS_DATA() checks if a stream has unconsumed records
6Materialized views: single-table only, no joins, Enterprise+ required
7Clustering keys: best for large tables (multi-TB) with consistent filter patterns
8Poor clustering candidates: boolean columns, unique-key columns
9Spill to remote storage = severe performance degradation — fix by upsizing warehouse
10Search Optimization Service: designed for point lookups and VARIANT field searches (Enterprise+)
11Automatic Clustering = ongoing background compute cost — only cluster when query savings > maintenance cost
Domain 510% of exam

Data Collaboration

Must-Know Facts

  • Secure Data Sharing: data stays in the PROVIDER's account — no data is copied or moved. Consumers access a live, read-only view of the provider's data. Consumers pay only for their own compute (their warehouse) to query shared data. No storage cost for the consumer
  • Shared objects are ALWAYS read-only for consumers. There is no mechanism for consumers to write back to shared data
  • Only SECURE views, SECURE materialized views, and SECURE UDFs can be added to a share. Regular (non-secure) views cannot be shared because their underlying query logic would be exposed
  • Sharing within the SAME region works natively. Cross-region or cross-cloud sharing requires database replication to move data to a secondary account first
  • Reader accounts: Snowflake accounts created and paid for by the data provider, for consumers who do not have their own Snowflake account. The provider pays ALL compute costs for reader accounts
  • Snowflake Marketplace: public data marketplace where providers publish listings (datasets, apps). Powered by Secure Data Sharing under the hood. Listings can be free or paid. Consumers do not always get data for free
  • Data Exchange: private, invitation-only data sharing hub for controlled sharing within an organization or with selected partners. More controlled than the public Marketplace
  • Data clean rooms: privacy-preserving multi-party collaboration. Multiple organizations can analyze overlapping data without exposing raw records to each other. Uses cryptographic and access control techniques
  • Native Apps Framework: build and distribute applications packaged with Snowflake data and logic. Consumers install the app into their own account. Apps run in the consumer's account and can access the provider's shared data
  • Secure views hide the underlying DDL — consumers cannot see the view definition or base table structure. Essential when sharing derived data to protect intellectual property or source schema
  • Database replication: replicate a database across Snowflake accounts (same or different clouds/regions). Used for disaster recovery, data locality compliance, and enabling cross-region sharing

Common Traps

TrapA data consumer can modify or insert rows into shared data
RealitySecure Data Sharing is strictly READ-ONLY for consumers. Consumers can query shared objects but have no write access whatsoever. This is a fundamental property of the sharing model
TrapI can share a regular view with consumers through Secure Data Sharing
RealityOnly SECURE views can be added to shares. Regular views expose their DDL definition to consumers, which is a security risk. Secure views hide the query definition. This applies equally to materialized views and UDFs — they must be the SECURE variant
TrapCross-region sharing works the same as same-region sharing
RealityNative Secure Data Sharing only works within the SAME cloud region. To share data across regions or cloud providers, you must first replicate the database to a secondary account in the target region, then share from that secondary account
TrapWhen a consumer uses a reader account, the consumer pays for the compute
RealityReader accounts are created AND paid for by the PROVIDER. The provider bears all compute costs for queries run by reader account consumers. This is a key distinguishing feature of reader accounts vs regular consumer accounts
TrapThe Snowflake Marketplace always provides data for free
RealityMarketplace listings can be free OR paid. Data providers choose their pricing model. Consumers may need to agree to pricing terms and pay for premium datasets. Do not assume Marketplace data is always free

Confusing Pairs

Secure Data Sharing (Direct)Snowflake Marketplace

Direct sharing: point-to-point, provider explicitly adds a specific consumer account to a share. Private and controlled. Marketplace: public storefront, anyone can discover and request access, provider can set pricing. Both use the same underlying technology (Secure Data Sharing), but Marketplace is for broad distribution and discovery while direct sharing is for specific partner relationships

Data ExchangeSnowflake Marketplace

Data Exchange: private, invitation-only hub for sharing within a defined community (e.g., a company and its vendors). Controlled membership. Marketplace: public, open to all Snowflake customers. If the scenario involves a company sharing internally or with specific vetted partners, the answer is Data Exchange. For public data monetization, it is Marketplace

Reader AccountsRegular Consumer Accounts

Reader accounts: created by the provider, consumer has no Snowflake subscription, PROVIDER pays for all compute. Regular consumer accounts: consumer has their own Snowflake account and subscription, CONSUMER pays for their own compute. Use reader accounts when the data recipient is not a Snowflake customer and you want to share data with them without requiring them to purchase Snowflake

Secure Views (in Shares)Regular Views (in Shares)

Secure views: hide the view DDL from consumers, can be added to shares. Regular views: expose their DDL to all users who can query them, CANNOT be added to shares. If a question asks what must be created to share derived data without revealing base table structure, the answer is a secure view

Scenario Tips

If the question asks about:

A hospital wants to share patient data with a research institution that does not have Snowflake. The hospital wants the institution to be able to query the data. What do they create?

Answer:

Create a reader account for the research institution. The hospital (provider) creates and pays for the reader account. The institution queries shared data through this account without needing their own Snowflake subscription

Distractor to avoid:

Snowflake Marketplace is for public data distribution, not for sharing sensitive patient data with a specific partner. Direct share requires the consumer to have their own Snowflake account

If the question asks about:

A company shares a view of a sales table with a partner account. The partner reports seeing the full underlying SQL of the view. What went wrong?

Answer:

A regular (non-secure) view was shared instead of a secure view. Only secure views hide their DDL from consumers. The fix is to recreate the view as a SECURE view and add it to the share instead

Distractor to avoid:

Network policies or masking policies would not prevent DDL exposure. The issue is the view type — only SECURE views can be safely shared without exposing the query logic

If the question asks about:

A European company wants to share data with a US partner's Snowflake account (different region). What extra step is required?

Answer:

Database replication must be set up first — replicate the source database to a Snowflake account in the US region, then create a share from that US-region account

Distractor to avoid:

Creating a direct share from EU to US will fail — cross-region sharing is not supported without replication. The exam tests whether candidates know this limitation

Last-Minute Facts

1Shared data: always READ-ONLY for consumers. No exceptions
2Only SECURE views, SECURE materialized views, SECURE UDFs can be added to shares
3Regular views CANNOT be added to shares
4Same-region sharing: native. Cross-region: requires database replication first
5Reader accounts: created by provider, PROVIDER pays for compute
6Regular consumer accounts: consumer pays for their own compute
7Snowflake Marketplace: public, free OR paid listings
8Data Exchange: private, invitation-only, for controlled communities
9Data clean rooms: multi-party analysis without exposing raw data
10Native Apps Framework: providers package apps, consumers install in their own account

Feeling confident?

Put your knowledge to the test with a timed COF-C03 mock exam.