Quick Navigation
Snowflake Architecture FundamentalsData Loading and StagesSnowpipe and Streaming IngestionTasks, Streams, and Dynamic TablesData Transformation and UDFsSnowpark DataFramesSnowflake Cortex LLM FunctionsPerformance Optimization and WarehousesMicro-partitions, Clustering, and Search OptimizationTime Travel and Data ProtectionData Governance and Security
Snowflake Architecture Fundamentals
- Three-Layer Architecture
- Cloud Services layer (auth, metadata, optimizer), Virtual Warehouse layer (compute — MPP query execution), Storage layer (S3/Blob/GCS micro-partitions). Each layer scales independently.
- Micro-partitions
- Snowflake automatically divides all table data into 50-500MB compressed columnar micro-partitions. Each micro-partition stores min/max metadata per column enabling partition pruning without a user-defined index.
- Virtual Warehouse Sizes
- X-Small (1 node) → Small (2) → Medium (4) → Large (8) → XL (16) → 2XL (32) → 3XL (64) → 4XL (128) → 5XL (256) → 6XL (512). Credits double with each size.
- Auto-suspend / Auto-resume
- ALTER WAREHOUSE my_wh SET AUTO_SUSPEND = 60 AUTO_RESUME = TRUE; — suspends after 60 seconds of inactivity. Auto-resume starts the warehouse instantly on the next query without user action.
- Multi-cluster Warehouse
- ALTER WAREHOUSE my_wh SET MIN_CLUSTER_COUNT = 1 MAX_CLUSTER_COUNT = 5 SCALING_POLICY = STANDARD; — auto-scales clusters for concurrency. Use for many concurrent users, not for single complex queries.
- Warehouse Minimum Billing
- ALL warehouse sizes (X-Small through 6XL) bill a minimum of 60 seconds per run. After 60 seconds, billing is per-second. Credits double with each size tier.
- Snowflake Editions
- Standard: 1-day Time Travel max. Enterprise: 90-day Time Travel, multi-cluster WH, materialized views, column-level security, row-level security. Business Critical: HIPAA, PCI, HITRUST compliance, PrivateLink.
Data Loading and Stages
- User Stage
- PUT file://local/file.csv @~/my_dir; — uploads to the current user's personal stage. Accessible only by the owning user. Path prefix: @~.
- Table Stage
- PUT file://local/file.csv @%my_table; — uploads to a table's dedicated stage. Only loads into that specific table. Path prefix: @%table_name.
- Named Stage (Internal)
- CREATE STAGE my_stage FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"') ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE'); — reusable, shareable, configurable.
- Named Stage (External S3)
- CREATE STAGE ext_stage URL = 's3://my-bucket/data/' STORAGE_INTEGRATION = my_s3_integration FILE_FORMAT = (TYPE = 'PARQUET'); — uses storage integration for credential management.
- COPY INTO (Basic)
- COPY INTO my_table FROM @my_stage/data/ FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1) ON_ERROR = 'SKIP_FILE'; — loads all staged files. ON_ERROR options: CONTINUE, SKIP_FILE, ABORT_STATEMENT.
- COPY INTO with Transformation
- COPY INTO target(col1, col2) FROM (SELECT $1, $3 FROM @my_stage/file.csv); — select and transform columns from staged files during load. Avoids an intermediate staging table.
- COPY INTO Options
- PURGE = TRUE removes files after successful load. FORCE = TRUE reloads previously loaded files (bypasses load history). PATTERN = '.*\.csv' filters files by regex.
- Data Unloading
- COPY INTO @ext_stage/export/ FROM (SELECT * FROM orders WHERE status = 'COMPLETE') FILE_FORMAT = (TYPE = 'PARQUET') HEADER = TRUE MAX_FILE_SIZE = 524288000; — exports to external stage.
- List and Remove Stage Files
- LIST @my_stage; — shows all files in stage. REMOVE @my_stage/file.csv; — deletes specific file. REMOVE @my_stage/path/ PATTERN = '.*old.*'; — removes matching files.
Snowpipe and Streaming Ingestion
- Create Snowpipe
- CREATE PIPE my_pipe AUTO_INGEST = TRUE AS COPY INTO my_table FROM @ext_stage FILE_FORMAT = (TYPE = 'JSON'); — AUTO_INGEST uses SQS notification for S3. Copy the SQS ARN from SHOW PIPES for S3 event config.
- Manual Snowpipe Trigger (REST)
- POST to https://<account>.snowflakecomputing.com/v1/data/pipes/<pipe>/insertFiles with a list of staged file paths. Use for non-cloud-native sources or when auto_ingest is not available.
- Snowpipe Status and History
- SELECT SYSTEM$PIPE_STATUS('my_pipe'); — returns JSON with pendingFileCount, lastIngestedEventTime. Use COPY_HISTORY view in INFORMATION_SCHEMA for per-file load status.
- Snowpipe Streaming (Java SDK)
- SnowflakeStreamingIngestClient client = ... — channel.insertRow(rowData, offsetToken) — rows go directly to Snowflake table without staging files. Sub-second latency. No SQS needed.
- Snowpipe vs Snowpipe Streaming
- Snowpipe: file-based, SQS/Event Grid trigger, 1-5 min latency, serverless. Streaming: SDK row-level, channel-based offsets, <1 sec latency, no files. Use Streaming for CDC or IoT real-time ingestion.
- Snowpipe Delivery Semantics
- Snowpipe provides AT-LEAST-ONCE delivery — duplicate rows are possible if a file is retried. Implement deduplication logic downstream (unique constraints or QUALIFY ROW_NUMBER() = 1) if exactly-once is required.
- Refresh Snowpipe (backfill)
- ALTER PIPE my_pipe REFRESH MODIFIED_AFTER = '2026-01-01'; — re-queues files in the stage for ingestion. Use to backfill files that arrived before the pipe was created.
Tasks, Streams, and Dynamic Tables
- Create Task (Scheduled)
- CREATE TASK daily_transform WAREHOUSE = my_wh SCHEDULE = 'USING CRON 0 2 * * * UTC' AS INSERT INTO target SELECT * FROM source WHERE date = CURRENT_DATE - 1; ALTER TASK daily_transform RESUME;
- Create Task (Serverless)
- CREATE TASK serverless_task USER_TASK_MANAGED_INITIAL_WAREHOUSE_SIZE = 'MEDIUM' SCHEDULE = '5 MINUTES' AS CALL my_procedure(); — uses Snowflake-managed compute, auto-scales to query size.
- Task DAG (Predecessor)
- CREATE TASK child_task WAREHOUSE = my_wh AFTER parent_task AS ...; -- child_task runs after parent_task completes. Root task must have SCHEDULE; child tasks must NOT. Resume root last.
- Stream on Table (Standard)
- CREATE STREAM orders_stream ON TABLE orders; -- captures all DML changes: INSERT, UPDATE, DELETE. Metadata columns: METADATA$ACTION (INSERT/DELETE), METADATA$ISUPDATE, METADATA$ROW_ID.
- Stream on Table (Append-only)
- CREATE STREAM orders_append_stream ON TABLE orders APPEND_ONLY = TRUE; -- captures INSERT operations only. More efficient than standard streams for insert-only pipelines.
- Check Stream for Data
- SELECT SYSTEM$STREAM_HAS_DATA('orders_stream'); -- returns TRUE if stream has unconsumed changes. Use in task WHEN clause: CREATE TASK ... WHEN SYSTEM$STREAM_HAS_DATA('orders_stream') AS ...
- Consume Stream in Task
- CREATE TASK process_orders WAREHOUSE = my_wh SCHEDULE = '1 MINUTES' WHEN SYSTEM$STREAM_HAS_DATA('orders_stream') AS INSERT INTO orders_processed SELECT order_id, amount FROM orders_stream WHERE METADATA$ACTION = 'INSERT';
- Dynamic Table (Basic)
- CREATE DYNAMIC TABLE customer_summary TARGET_LAG = '1 minute' WAREHOUSE = my_wh AS SELECT customer_id, SUM(amount) total FROM orders GROUP BY customer_id; -- auto-refreshes to stay within 1-minute lag.
- Dynamic Table (Downstream Lag)
- CREATE DYNAMIC TABLE final_report TARGET_LAG = DOWNSTREAM WAREHOUSE = my_wh AS SELECT * FROM customer_summary JOIN products ...; -- DOWNSTREAM: lag is determined by any consuming dynamic table.
- Dynamic Table Refresh
- ALTER DYNAMIC TABLE customer_summary REFRESH; -- triggers immediate manual refresh. ALTER DYNAMIC TABLE customer_summary SUSPEND / RESUME; -- pause/resume automatic refresh.
- Streams vs Dynamic Tables
- Streams+Tasks: procedural, supports stored procedures + Snowpark, custom error handling, multi-target writes. Dynamic Tables: declarative SQL, simpler to maintain, automatic incremental refresh. Use DTs for pure SQL pipelines.
Data Transformation and UDFs
- SQL Scalar UDF
- CREATE FUNCTION usd_to_eur(amount FLOAT) RETURNS FLOAT AS $$ amount * 0.92 $$; -- called in SELECT: SELECT usd_to_eur(price) FROM products;
- JavaScript UDF
- CREATE FUNCTION parse_json_field(v VARIANT, key STRING) RETURNS STRING LANGUAGE JAVASCRIPT AS $$ return V[KEY] ? V[KEY].toString() : null; $$;
- Python UDF
- CREATE FUNCTION detect_language(text STRING) RETURNS STRING LANGUAGE PYTHON RUNTIME_VERSION='3.9' PACKAGES=('langdetect') HANDLER='detect' AS $$ from langdetect import detect def detect(text): return detect(text) $$;
- Vectorized (Batch) Python UDF
- CREATE FUNCTION clean_text_batch(text STRING) RETURNS STRING LANGUAGE PYTHON RUNTIME_VERSION='3.9' PACKAGES=('pandas') HANDLER='BatchUDF.compute' AS $$ import pandas as pd class BatchUDF: @staticmethod def compute(s: pd.Series) -> pd.Series: return s.str.strip().str.lower() $$;
- SQL UDTF (Table Function)
- CREATE FUNCTION split_tags(tags STRING) RETURNS TABLE (tag STRING) AS $$ SELECT value FROM TABLE(SPLIT_TO_TABLE(tags, ',')) $$; -- returns multiple rows per input row.
- External Function
- CREATE EXTERNAL FUNCTION score_customer(customer_id NUMBER) RETURNS VARIANT API_INTEGRATION = my_api_integration AS 'https://api.example.com/score'; -- calls HTTPS endpoint. Requires API integration object.
- Stored Procedure (Python)
- CREATE PROCEDURE load_and_validate() RETURNS STRING LANGUAGE PYTHON RUNTIME_VERSION='3.9' PACKAGES=('snowflake-snowpark-python') HANDLER='run' EXECUTE AS OWNER AS $$ def run(session): session.sql('COPY INTO ...').collect(); return 'done' $$;
- EXECUTE AS CALLER vs OWNER
- EXECUTE AS OWNER: procedure runs with owner's privileges (enables privilege escalation for controlled access). EXECUTE AS CALLER: runs with caller's privileges (caller must have direct object access).
- Semi-structured: FLATTEN
- SELECT f.value:product_id::STRING, f.value:qty::INT FROM orders o, LATERAL FLATTEN(input => o.items) f; -- explodes JSON array into rows. LATERAL FLATTEN is the standard way to unnest arrays.
- Semi-structured: Dot / Bracket Notation
- SELECT v:address.city::STRING, v['phone']::STRING FROM contacts; -- v:field casts VARIANT field. Use :: to cast to target type. Bracket notation for fields with special characters or dynamic keys.
- PARSE_JSON and Object Construction
- SELECT PARSE_JSON('{"id":1}') AS obj; SELECT OBJECT_CONSTRUCT('id', 1, 'name', 'Alice') AS obj; SELECT ARRAY_CONSTRUCT(1, 2, 3) AS arr; -- build VARIANT structures from SQL values.
Snowpark DataFrames
- Snowpark Session (Python)
- from snowflake.snowpark import Session; session = Session.builder.configs(connection_params).create() -- creates a Snowpark session. All DataFrame operations push down to Snowflake SQL.
- Load DataFrame
- df = session.table('ORDERS') -- reads from Snowflake table (lazy). df = session.sql('SELECT * FROM orders WHERE status = \'OPEN\'') -- from SQL string.
- DataFrame Transformations (Lazy)
- df2 = df.filter(col('STATUS') == 'OPEN').select('ORDER_ID', 'AMOUNT').sort('AMOUNT', ascending=False) -- builds a logical plan. NO execution until collect() or write().
- Execute DataFrame
- df.collect() -- returns list of Row objects (pulls data to client). df.show() -- prints first 10 rows. df.count() -- returns row count. df.to_pandas() -- converts to pandas DataFrame.
- Write DataFrame
- df.write.mode('overwrite').save_as_table('MY_RESULTS') -- writes to Snowflake table. df.write.copy_into_location('@my_stage/output/') -- writes to stage as files.
- Snowpark UDF Registration
- from snowflake.snowpark.functions import udf; @udf(return_type=StringType(), input_types=[StringType()]) def upper_udf(s): return s.upper() -- registers Python function as UDF in Snowflake.
- Pushdown Optimization
- Snowpark DataFrame operations (filter, select, join, groupBy) compile to Snowflake SQL and run inside Snowflake. Calling a plain Python function (not Snowpark API) on a DataFrame breaks pushdown — runs client-side.
- Snowpark Join
- orders.join(customers, orders['CUSTOMER_ID'] == customers['ID'], join_type='inner').select(orders['ORDER_ID'], customers['NAME']).show()
- Snowpark GroupBy / Agg
- from snowflake.snowpark.functions import sum, count; df.groupBy('REGION').agg(sum('AMOUNT').alias('total'), count('ORDER_ID').alias('cnt')).show()
Snowflake Cortex LLM Functions
- COMPLETE (Text Generation)
- SELECT SNOWFLAKE.CORTEX.COMPLETE('mistral-7b', 'Write a product description for: ' || product_name) FROM products; -- generates text using a specified LLM model.
- SUMMARIZE (Text Summarization)
- SELECT SNOWFLAKE.CORTEX.SUMMARIZE(ticket_text) AS summary FROM support_tickets; -- produces a concise summary of long-form text. No model parameter required.
- SENTIMENT (Sentiment Analysis)
- SELECT SNOWFLAKE.CORTEX.SENTIMENT(review_text) AS sentiment FROM reviews; -- returns a FLOAT between -1.0 (most negative) and 1.0 (most positive). No model parameter required. Not a string label.
- TRANSLATE (Language Translation)
- SELECT SNOWFLAKE.CORTEX.TRANSLATE(description, 'en', 'fr') AS french_desc FROM products; -- translates text from source language to target language using ISO 639-1 codes.
- EXTRACT_ANSWER (QA)
- SELECT SNOWFLAKE.CORTEX.EXTRACT_ANSWER(document_text, 'What is the contract end date?') AS answer FROM contracts; -- extracts specific answers from a passage of text.
- EMBED_TEXT (Vector Embeddings)
- SELECT SNOWFLAKE.CORTEX.EMBED_TEXT('snowflake-arctic-embed-m', description) AS embedding FROM products; -- returns a VECTOR type (not ARRAY) for semantic similarity search. Store in a VECTOR(N, FLOAT) column for similarity queries.
- CLASSIFY_TEXT (Classification)
- SELECT SNOWFLAKE.CORTEX.CLASSIFY_TEXT(ticket_text, ARRAY_CONSTRUCT('billing', 'technical', 'cancellation', 'other')) AS category FROM tickets; -- classifies text into provided categories.
- Cortex — Data Stays in Snowflake
- All Cortex LLM functions execute INSIDE Snowflake. Data does NOT leave your Snowflake account. Cortex uses serverless compute credits (separate from virtual warehouse credits).
Performance Optimization and Warehouses
- Query Profile Navigation
- Snowsight: Query History → click query → Profile tab. Look for: Partitions Scanned vs Total (pruning ratio), Bytes Spilled to Local/Remote Storage (memory pressure), Join type and row estimate accuracy.
- Spill to Disk Fix
- Spill to local storage = warehouse memory exhausted. Spill to remote storage = local SSD exhausted too. Fix: scale up warehouse (more memory per node). Also check for cartesian products or inefficient joins.
- Scale Up vs Scale Out Decision
- Scale UP (larger size): single complex query runs slowly, spills to disk, large sorts/joins. Scale OUT (multi-cluster): many concurrent users queue, warehouse busy with multiple queries simultaneously.
- Result Cache Hit
- SHOW TRANSACTIONS; / QUERY_HISTORY: look for 'cache result' in query type. Result cache is FREE (no warehouse credits), account-level, 24-hour TTL. Invalidated by: DML on source table, 24hr expiry, or non-deterministic functions in query. Warehouse suspend/resize does NOT clear result cache.
- Local Disk Cache (Data Cache)
- Warehouse-local SSD cache for recently scanned micro-partitions. Warm cache speeds up repeated queries on same data. Cleared when warehouse SUSPENDS. Re-warms on first query after resume.
- Cache Tier Order
- Query execution checks: 1) Result Cache (free, 24hr, account-level) → 2) Local Disk Cache (warehouse SSD, cleared on suspend) → 3) Remote Disk Cache (Snowflake Foundation/FDN layer, persists across suspends) → 4) Cloud Storage (S3/Blob/GCS).
- SHOW WAREHOUSES / ALTER WAREHOUSE
- SHOW WAREHOUSES; -- lists all warehouses with size, state, credit usage. ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'LARGE' MAX_CONCURRENCY_LEVEL = 16;
- Query Acceleration Service
- ALTER WAREHOUSE my_wh SET ENABLE_QUERY_ACCELERATION = TRUE QUERY_ACCELERATION_MAX_SCALE_FACTOR = 8; -- offloads large scan portions to serverless compute. Best for ad-hoc large scans.
- Credit Usage Monitoring
- SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY WHERE START_TIME >= DATEADD('day', -7, CURRENT_TIMESTAMP) ORDER BY CREDITS_USED DESC; -- identify high-credit warehouses.
Micro-partitions, Clustering, and Search Optimization
- Check Clustering Quality
- SELECT SYSTEM$CLUSTERING_INFORMATION('my_table', '(transaction_date, region)'); -- returns JSON with clustering_ratio (1.0 = perfect), clustering_depth (1 = ideal), total_partition_count.
- Define Clustering Key
- ALTER TABLE large_orders CLUSTER BY (transaction_date, region); -- organizes micro-partitions by these columns for better pruning. Best for high-cardinality columns frequently used in WHERE filters.
- Automatic Clustering
- Once a CLUSTER BY key is defined, Snowflake's automatic clustering service continuously reorganizes micro-partitions. This runs in the background and consumes credits. Monitor with CLUSTERING_HISTORY view.
- Drop Clustering Key
- ALTER TABLE my_table DROP CLUSTERING KEY; -- stops automatic clustering and removes the clustering key. Existing data layout is preserved but no longer maintained.
- Good vs Bad Clustering Keys
- Good: date columns (ORDER_DATE), status columns with 5-50 distinct values used in WHERE. Bad: UUID/surrogate keys (too many distinct values, tiny pruning benefit), columns never in WHERE filters.
- Enable Search Optimization
- ALTER TABLE customers ADD SEARCH OPTIMIZATION ON EQUALITY(email, phone), SUBSTRING(name); -- creates access path for equality, IN, and substring searches. Check eligibility with SEARCH_OPTIMIZATION_HISTORY.
- Search Optimization vs Clustering
- Search Optimization: equality/IN/substring point lookups — few rows returned. Clustering Key: range scans, date-range filters, GROUP BY on clustered column — many rows returned. Solve different problems.
- Materialized View Create
- CREATE MATERIALIZED VIEW daily_sales_mv AS SELECT order_date, SUM(amount) total FROM orders GROUP BY order_date; -- Snowflake refreshes automatically when ORDERS changes using serverless compute.
- Materialized View Restrictions
- MV limitations: no non-deterministic functions (RANDOM, CURRENT_TIMESTAMP), no CTEs (WITH clause), no LATERAL joins, no UDFs, no UNION, no ORDER BY. Dynamic Tables have fewer restrictions.
Time Travel and Data Protection
- Time Travel — Timestamp
- SELECT * FROM orders AT(TIMESTAMP => '2026-01-15 09:00:00'::TIMESTAMP_LTZ); -- queries the table as it existed at a specific timestamp. Requires data within retention period.
- Time Travel — Offset (Seconds)
- SELECT * FROM orders AT(OFFSET => -3600); -- queries state 1 hour (3600 seconds) before current time. OFFSET must be negative.
- Time Travel — Statement ID
- SELECT * FROM orders BEFORE(STATEMENT => '01b12345-0000-...'); -- queries data BEFORE a specific statement (e.g., before a DELETE). Get query ID from QUERY_HISTORY.
- Restore Deleted Rows
- INSERT INTO orders SELECT * FROM orders BEFORE(STATEMENT => '<delete_query_id>') EXCEPT SELECT * FROM orders; -- reinserts only the rows that were deleted, avoiding duplicates.
- UNDROP TABLE
- UNDROP TABLE orders; -- restores a dropped table within its Time Travel retention period. Must be run in the same database/schema context. Restores data, metadata, and grants.
- UNDROP DATABASE / SCHEMA
- UNDROP DATABASE my_db; UNDROP SCHEMA my_schema; -- same behavior as UNDROP TABLE. Restores all contained objects if dropped within their retention periods.
- Set Time Travel Retention
- ALTER TABLE large_staging SET DATA_RETENTION_TIME_IN_DAYS = 0; -- disables Time Travel for cost savings on staging tables. Standard: max 1 day. Enterprise: max 90 days. Fail-safe still runs 7 days.
- Transient Tables (No Fail-safe)
- CREATE TRANSIENT TABLE staging_load (id NUMBER, data VARIANT); -- no Fail-safe period, max 1-day Time Travel. Lower storage cost. Use for ephemeral staging data that does not need recovery.
- Zero-copy Clone
- CREATE TABLE orders_dev CLONE orders; -- instant, shares physical micro-partitions via metadata. Near-zero initial storage. Clone diverges as data changes (copy-on-write). Grants NOT inherited.
- Clone at Point in Time
- CREATE DATABASE dev_db CLONE prod_db AT(TIMESTAMP => '2026-06-01 00:00:00'::TIMESTAMP_LTZ); -- clones the database as it existed at a specific historical moment. Full database/schema cloning supported.
- Time Travel vs Fail-safe
- Time Travel: user-accessible via SQL, 0-90 days configurable, for user error recovery. Fail-safe: 7 days fixed, Snowflake-managed, not SQL-accessible, only for catastrophic platform failures.
Data Governance and Security
- Create Object Tag
- CREATE TAG pii_sensitivity ALLOWED_VALUES 'high', 'medium', 'low'; ALTER TABLE customers ALTER COLUMN email SET TAG pii_sensitivity = 'high'; -- tags are key-value pairs on columns, tables, or databases.
- Automated Data Classification
- CALL SYSTEM$CLASSIFY('my_database.my_schema.customers', OBJECT_CONSTRUCT('auto_tag', true)); -- SYSTEM$CLASSIFY is a stored procedure (use CALL, not SELECT). Snowflake ML scans columns and suggests privacy_category and semantic_category tags.
- Create Masking Policy
- CREATE MASKING POLICY email_mask AS (val STRING) RETURNS STRING -> CASE WHEN IS_ROLE_IN_SESSION('DATA_STEWARD') THEN val ELSE REGEXP_REPLACE(val, '.+@', '***@') END;
- Apply Masking Policy to Column
- ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY email_mask; -- policy evaluated at query time. Unmasked data is stored; masking is applied when data is returned.
- Tag-based Masking Policy
- ALTER TAG pii_sensitivity SET MASKING POLICY email_mask; -- any column with tag pii_sensitivity = 'high' automatically gets email_mask applied. Scales governance across new columns automatically.
- Create Row Access Policy
- CREATE ROW ACCESS POLICY region_access AS (region STRING) RETURNS BOOLEAN -> IS_ROLE_IN_SESSION('ADMIN') OR region = (SELECT region FROM user_region_map WHERE username = CURRENT_USER());
- Apply Row Access Policy
- ALTER TABLE sales ADD ROW ACCESS POLICY region_access ON (region); -- policy evaluated per row at query time. Returns FALSE = row not returned (no error). One policy per table.
- Query ACCESS_HISTORY
- SELECT query_id, user_name, direct_objects_accessed, base_objects_accessed FROM SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY WHERE query_start_time >= DATEADD('day', -1, CURRENT_TIMESTAMP); -- 2-3 hour lag.
- Database Replication
- ALTER DATABASE prod_db ENABLE REPLICATION TO ACCOUNTS myorg.us-west-2-account; -- on secondary: CREATE DATABASE prod_db_replica AS REPLICA OF myorg.us-east-1-account.prod_db; ALTER DATABASE prod_db_replica REFRESH;
- Masking and Result Cache Interaction
- Masking policies bypass result cache sharing across different roles. User A (unmasked) and User B (masked) running the same query do NOT share cached results — each query computes separately to avoid data leakage.
- Data Clean Room (Provider Side)
- Data Clean Rooms use Snowflake Native App Framework. Provider packages allowed analyses as a Native App. Consumer installs the app and runs analyses without seeing raw data — only aggregate results exit the clean room.