
GCP-PMLE Cheat Sheet

Quick reference for the Google Professional Machine Learning Engineer exam.

Vertex AI Core Platform

Vertex AI
Unified ML platform combining AutoML, custom training, pipelines, feature store, and endpoints under one API and SDK.
Vertex AI Workbench
Managed JupyterLab environment; managed notebooks auto-idle for cost savings while user-managed notebooks give full VM control.
Vertex AI Model Registry
Central repository for versioning model artifacts, metadata, and evaluation metrics before deployment to an endpoint.
Vertex AI Experiments
Tracks and compares metrics, parameters, and artifacts across training runs; not used for production A/B traffic testing.
Vertex AI Feature Store
Centralized feature repository with low-latency online serving and point-in-time offline serving to prevent training-serving skew.
Vertex AI Vizier
Black-box hyperparameter optimization service using Bayesian optimization; can tune any system, not only Vertex AI training jobs.
Vertex AI Metadata / ML Metadata (MLMD)
Automatically records lineage of datasets, models, and pipeline executions for artifact provenance and reproducibility.
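
The point-in-time lookup mentioned under Feature Store can be illustrated with a small, library-free sketch. This is not the Feature Store API; the feature history, the timestamps, and the function name point_in_time_value are all hypothetical. The idea shown is only that a training row reads the latest value recorded at or before its own timestamp.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature history for one entity: (timestamp, value), sorted by time.
FEATURE_HISTORY = [
    (datetime(2024, 1, 1), 10.0),
    (datetime(2024, 1, 5), 12.5),
    (datetime(2024, 1, 9), 20.0),
]


def point_in_time_value(history, as_of):
    """Return the latest feature value recorded at or before `as_of`.

    Values written after `as_of` are ignored, so a training row never
    sees information from its own future.
    """
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, as_of)
    if idx == 0:
        return None  # no value existed yet at that time
    return history[idx - 1][1]


label_time = datetime(2024, 1, 6)
print(point_in_time_value(FEATURE_HISTORY, label_time))  # prints 12.5
```

A row labeled on January 6 gets the January 5 value, not the later January 9 value, which is the behavior that keeps offline training data consistent with what online serving would have returned at that moment.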

Vertex AI Training — gcloud CLI

gcloud ai custom-jobs create --region=REGION --display-name=NAME --worker-pool-spec=machine-type=n1-standard-4,replica-count=1,container-image-uri=IMAGE_URI
Launches a custom training job with a specified container image, machine type, and worker pool configuration.
gcloud ai models upload --region=REGION --display-name=NAME --container-image-uri=URI --artifact-uri=gs://BUCKET/model
Registers a trained model artifact from Cloud Storage into the Model Registry for later deployment.
gcloud ai hp-tuning-jobs create --region=REGION --config=study_config.yaml --display-name=NAME
Submits a Vertex AI Vizier hyperparameter tuning job defined by a YAML study configuration.
gcloud ai endpoints create --region=REGION --display-name=NAME
Creates an empty prediction endpoint that one or more models can later be deployed to.
gcloud ai endpoints deploy-model ENDPOINT_ID --region=REGION --model=MODEL_ID --display-name=NAME --machine-type=n1-standard-4 --min-replica-count=1 --max-replica-count=5
Deploys a registered model to an endpoint with defined autoscaling replica bounds.
gcloud ai custom-jobs stream-logs JOB_ID --region=REGION
Streams live training logs from a running custom job for real-time debugging.
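
For scripting, the custom-job command above can be assembled from Python. The sketch below only builds the argument list using the flags already shown in this section; the helper name custom_job_command and the example region, job name, and image URI are placeholders, not values from any real project.

```python
import shlex


def custom_job_command(region, display_name, image_uri,
                       machine_type="n1-standard-4", replica_count=1):
    """Build the argv list for `gcloud ai custom-jobs create`."""
    # --worker-pool-spec takes one comma-separated list of key=value pairs.
    worker_pool_spec = ",".join([
        f"machine-type={machine_type}",
        f"replica-count={replica_count}",
        f"container-image-uri={image_uri}",
    ])
    return [
        "gcloud", "ai", "custom-jobs", "create",
        f"--region={region}",
        f"--display-name={display_name}",
        f"--worker-pool-spec={worker_pool_spec}",
    ]


# Placeholder values; substitute your own region, name, and image.
argv = custom_job_command("us-central1", "demo-job",
                          "us-docker.pkg.dev/my-proj/repo/trainer:latest")
print(shlex.join(argv))
```

Passing the list to subprocess.run(argv, check=True) would execute it; building a list rather than a single string avoids shell-quoting problems in automation.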

BigQuery ML

CREATE OR REPLACE MODEL `dataset.model_name` OPTIONS(model_type='logistic_reg', input_label_cols=['label']) AS SELECT * FROM `dataset.training_table`;
Trains a model directly on BigQuery data using standard SQL with no data export required.
SELECT * FROM ML.PREDICT(MODEL `dataset.model_name`, TABLE `dataset.new_data`)
Generates predictions on new rows using a trained BigQuery ML model.
SELECT * FROM ML.EVALUATE(MODEL `dataset.model_name`)
Returns evaluation metrics for a trained BigQuery ML model; the metrics depend on the model type (for classifiers: precision, recall, accuracy, F1, log loss, and ROC AUC).
SELECT * FROM ML.EXPLAIN_PREDICT(MODEL `dataset.model_name`, TABLE `dataset.new_data`, STRUCT(3 AS top_k_features))
Returns feature attributions explaining each individual prediction (Shapley-based for linear and tree models), limited here to the 3 most influential features per row.
BQML model_type options
linear_reg, logistic_reg, kmeans, matrix_factorization, boosted_tree_classifier/regressor, dnn_classifier/regressor, arima_plus, autoencoder, pca — pick per business problem.
bq query --use_legacy_sql=false "SELECT * FROM ML.PREDICT(MODEL \`ds.m\`, TABLE \`ds.t\`)"
Runs a BigQuery ML SQL statement from the command line for scripting and automation.
TRANSFORM() clause in CREATE MODEL
Bakes feature preprocessing into the model definition so the same transforms apply automatically at both training and prediction time.
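
To make the TRANSFORM() clause concrete, here is a hedged sketch that renders a CREATE MODEL statement from Python. The helper name create_model_sql and the column names age and income are hypothetical; ML.STANDARD_SCALER is used as one example of a BigQuery ML preprocessing function, and the label column is passed through unchanged because only columns listed in TRANSFORM reach the model.

```python
def create_model_sql(model, table, label, numeric_cols, model_type="logistic_reg"):
    """Render a CREATE MODEL statement whose TRANSFORM clause scales numeric columns."""
    transforms = [f"ML.STANDARD_SCALER({col}) OVER () AS {col}_scaled"
                  for col in numeric_cols]
    transforms.append(label)  # pass the label through unchanged
    transform_body = ",\n    ".join(transforms)
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"TRANSFORM(\n    {transform_body}\n)\n"
        f"OPTIONS(model_type='{model_type}', input_label_cols=['{label}']) AS\n"
        f"SELECT * FROM `{table}`;"
    )


# Hypothetical column names, for illustration only.
sql = create_model_sql("dataset.model_name", "dataset.training_table",
                       "label", ["age", "income"])
print(sql)
```

Because the scaling lives inside the model definition, a later ML.PREDICT call can pass raw age and income values and the same scaling is applied automatically.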

AutoML and Pre-built APIs

AutoML Tabular / Image / Text / Video
No-code training with automated feature engineering, architecture search, and hyperparameter tuning; requires your own labeled data.
Vision AI
Pre-trained API for label detection, OCR, face detection, and explicit content moderation with zero training data required.
Natural Language AI
Pre-trained API for entity extraction, sentiment analysis, and syntax parsing on unstructured text.
Translation AI
Pre-trained and custom-glossary API for real-time and batch text translation across 100+ languages.
Speech-to-Text / Text-to-Speech
Pre-trained APIs for streaming and batch audio transcription and natural-sounding speech synthesis.
Data Labeling Service
Managed human labeling workforce that produces training labels needed for AutoML or custom-trained models.

Data Processing and Ingestion

Dataflow
Serverless, autoscaling batch/stream processing built on Apache Beam; default choice for new ETL and feature engineering pipelines.
Dataproc
Managed Spark/Hadoop clusters; choose only when reusing existing Spark ML code or the broader Hadoop ecosystem.
Pub/Sub
Serverless messaging that decouples event producers from consumers in streaming ML feature pipelines.
Cloud Storage storage classes
Standard (frequent access, no minimum storage duration), Nearline (30-day minimum), Coldline (90-day minimum), Archive (365-day minimum); use Standard for active training data and model artifacts.
TFRecord format
Binary, protobuf-based storage format optimized for high-throughput TensorFlow input pipelines.
tf.Transform
Applies the exact same preprocessing graph at both training and serving time to eliminate training-serving skew in TFX pipelines.
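
The analyze-then-transform idea behind tf.Transform can be sketched without TensorFlow. The code below is not the tf.Transform API; the function names analyze and transform and the sample values are assumptions chosen for illustration. Statistics are computed once over the training data, saved, and then reused unchanged at serving time.

```python
from statistics import mean, pstdev


def analyze(training_values):
    """Full-pass 'analyze' step: compute statistics once, on training data only."""
    return {"mean": mean(training_values), "std": pstdev(training_values)}


def transform(value, stats):
    """Per-example 'transform' step: reused verbatim at training and serving time."""
    return (value - stats["mean"]) / stats["std"]


train = [2.0, 4.0, 6.0, 8.0]
stats = analyze(train)                      # saved alongside the model
train_scaled = [transform(v, stats) for v in train]
served = transform(5.0, stats)              # serving reuses the saved stats
print(served)  # prints 0.0, because 5.0 is the training mean
```

Skew appears when serving code recomputes statistics from live traffic or reimplements the scaling by hand; sharing one function and one saved set of statistics removes that failure mode.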

Distributed Training and Hyperparameter Tuning

Data parallelism
Replicates the full model across workers, each processing a data shard, with gradients synchronized each step; the most common strategy.
Model parallelism
Splits model layers across multiple devices; use only when a single model does not fit in one device's memory.
Reduction Server
Vertex AI feature that accelerates all-reduce gradient synchronization for large-scale distributed GPU training.
GPU vs TPU selection
GPUs broadly support PyTorch and custom CUDA ops; TPUs are optimized for large-scale TensorFlow/JAX training with maximum throughput.
Vizier search space definition
Define each hyperparameter's type (DOUBLE, INTEGER, CATEGORICAL, DISCRETE) and scale (linear or log) in the study configuration.
Automated early stopping
Vizier can terminate underperforming hyperparameter trials early to save compute without exhausting the full trial budget.
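
Data parallelism as described above can be illustrated with a toy, single-process sketch. It is not a distributed framework: the two "workers" are just list slices, all_reduce_mean is a stand-in for a real all-reduce, and the one-parameter linear model, dataset, and learning rate are assumptions chosen so the arithmetic is easy to follow.

```python
def gradient(w, shard):
    """Mean-squared-error gradient d/dw of (w*x - y)^2, averaged over one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Stand-in for all-reduce: every worker ends up with the mean gradient."""
    return sum(values) / len(values)


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]   # y = 2x
shards = [data[:2], data[2:]]                              # one shard per worker
w, lr = 0.0, 0.05

for _ in range(50):
    local_grads = [gradient(w, s) for s in shards]         # computed in parallel
    w -= lr * all_reduce_mean(local_grads)                 # identical update everywhere

print(round(w, 3))  # prints 2.0
```

With equal-sized shards, the mean of the per-shard gradients equals the full-batch gradient, so every replica applies the same update and the model copies stay in sync.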

Vertex AI Pipelines and MLOps

from kfp import dsl
@dsl.pipeline(name="training-pipeline", pipeline_root="gs://bucket/root")
def my_pipeline(project: str):
    ...
Defines a Kubeflow Pipelines DAG using the KFP SDK, the standard way to author Vertex AI Pipelines.
from google.cloud import aiplatform
job = aiplatform.PipelineJob(
    display_name="my-run",
    template_path="pipeline.json",
    parameter_values={"project": "my-proj"},
)
job.run()
Submits an already compiled pipeline template (pipeline.json) as a Vertex AI Pipelines run from Python; compiling the pipeline function into that template is a separate step done with the KFP compiler.
@dsl.component decorator
Packages a Python function as a standalone, containerized, reusable pipeline component with typed inputs and outputs.
Cloud Build
CI/CD engine that builds and tests pipeline component containers and pushes them to Artifact Registry before deployment.
Artifact Registry
Stores versioned container images and pipeline artifacts referenced by Vertex AI Pipelines components.
Pipeline trigger types
Scheduled (Cloud Scheduler cron), event-driven (Pub/Sub), or data-driven (Cloud Storage/BigQuery change) automation for retraining pipelines.
TFX standard component order
ExampleGen to StatisticsGen to SchemaGen to ExampleValidator to Transform to Trainer to Evaluator to Pusher, chained in sequence for production ML pipelines.
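
A pipeline is a dependency graph, and the runner derives execution order from it. The sketch below uses Python's standard graphlib module (3.9+) rather than TFX or KFP; the dependency map is a simplified assumption about which TFX component consumes which upstream output, and ExampleValidator is included as an assumption since it is a standard TFX component that reads statistics and schema.

```python
from graphlib import TopologicalSorter

# Each component maps to the upstream components it consumes (simplified).
DEPENDENCIES = {
    "ExampleGen": set(),
    "StatisticsGen": {"ExampleGen"},
    "SchemaGen": {"StatisticsGen"},
    "ExampleValidator": {"StatisticsGen", "SchemaGen"},
    "Transform": {"ExampleGen", "SchemaGen"},
    "Trainer": {"Transform", "SchemaGen"},
    "Evaluator": {"ExampleGen", "Trainer"},
    "Pusher": {"Trainer", "Evaluator"},
}

# static_order() yields one valid execution order for the DAG.
order = list(TopologicalSorter(DEPENDENCIES).static_order())
print(" -> ".join(order))
```

Components with no dependency on each other, such as ExampleValidator and Transform in this map, may run in either order or in parallel, so the printed sequence is one valid ordering rather than the only one.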

Serving and Deployment Patterns

Online prediction (Vertex AI Endpoints)
Real-time, low-latency inference with autoscaling; scaling is reactive and cannot instantly absorb sudden traffic spikes.
Batch prediction
Spins up temporary compute, scores a full dataset, then shuts down with no persistent endpoint cost; ideal for periodic scoring jobs.
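gcloud ai endpoints deploy-model ENDPOINT_ID --region=REGION --model=MODEL_ID --display-name=NAME --traffic-split=0=20,DEPLOYED_MODEL_ID=80
Deploys a new model version and splits production traffic by percentage for canary or blue-green rollout; key 0 stands for the model being deployed and DEPLOYED_MODEL_ID for a model already on the endpoint, and the percentages must sum to 100.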
Model optimization for serving
Quantization (lower numeric precision), pruning (remove low-impact weights), and distillation (train a smaller student model) trade some accuracy for lower latency and cost.
Private Endpoints (VPC Peering)
Serve predictions over a private network connection instead of the public internet for lower latency and improved security.
Pre-built vs custom serving containers
Pre-built containers cover common frameworks (TensorFlow, scikit-learn, XGBoost, PyTorch); custom containers are needed for unsupported runtimes or dependencies.
NVIDIA Triton on Vertex AI
Supports multi-framework model serving with dynamic batching for higher throughput on a single endpoint.
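
The traffic split used for canary rollout can be reasoned about with a small simulation. This is a sketch under stated assumptions, not how Vertex AI routes requests internally: the model IDs stable and canary are hypothetical, and routing by request number modulo 100 is a deterministic stand-in chosen so the counts are exact.

```python
def validate_split(split):
    """Traffic percentages must be non-negative and sum to 100."""
    if any(p < 0 for p in split.values()) or sum(split.values()) != 100:
        raise ValueError(f"traffic split must sum to 100, got {split}")
    return split


def route(request_number, split):
    """Map a request to a deployed model by its bucket in [0, 100)."""
    bucket = request_number % 100
    upper = 0
    for model_id, percent in split.items():
        upper += percent
        if bucket < upper:
            return model_id
    raise AssertionError("unreachable when split sums to 100")


# Hypothetical IDs: 'stable' is the existing model, 'canary' the new one.
split = validate_split({"stable": 80, "canary": 20})
counts = {"stable": 0, "canary": 0}
for n in range(1000):
    counts[route(n, split)] += 1
print(counts)  # prints {'stable': 800, 'canary': 200}
```

A rollout would typically raise the canary share in steps while watching error rates and latency, then shift all traffic once the new version looks healthy.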

Monitoring, Explainability, and Responsible AI

Vertex AI Model Monitoring
Detects training-serving skew by comparing live feature statistics against the training data baseline, and prediction drift by comparing them against earlier production traffic.
Data drift vs concept drift vs prediction drift
Data drift is a shift in the input feature distribution; concept drift is a shift in the input-output relationship; prediction drift is a shift in the model output distribution.
Feature attribution methods
Shapley values (game-theoretic, tabular/AutoML), Integrated Gradients (differentiable models), and XRAI (region-based saliency for images).
Continuous evaluation
Compares live predictions against ground-truth labels as they become available to track real-world accuracy over time.
Fairness metrics
Demographic parity requires equal positive prediction rates across groups; equalized odds requires both equal true positive rates and equal false positive rates across groups.
Vertex Explainable AI
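Returns per-prediction feature attributions alongside predictions; online requests use the endpoint's explain method, which takes the same instance format as predict, and batch prediction jobs can also generate explanations.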
What-If Tool and Model Cards
The What-If Tool interactively probes model behavior across data slices and thresholds; Model Cards document intended use, limitations, and evaluation results.
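
The two fairness metrics defined above can be computed by hand. The sketch below uses made-up labels and predictions for two hypothetical groups and assumes each group contains both positive and negative examples (otherwise a rate would divide by zero).

```python
def rates(y_true, y_pred):
    """Return (positive prediction rate, true positive rate, false positive rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return (tp + fp) / len(y_true), tp / pos, fp / neg


# Hypothetical labels and predictions for two groups.
group_a = rates(y_true=[1, 1, 0, 0], y_pred=[1, 0, 1, 0])
group_b = rates(y_true=[1, 1, 0, 0], y_pred=[1, 1, 0, 0])

parity_gap = abs(group_a[0] - group_b[0])   # demographic parity
tpr_gap = abs(group_a[1] - group_b[1])      # equalized odds, part 1
fpr_gap = abs(group_a[2] - group_b[2])      # equalized odds, part 2
print(parity_gap, tpr_gap, fpr_gap)         # prints 0.0 0.5 0.5
```

Both groups receive positive predictions at the same rate, so demographic parity holds, yet the true and false positive rates differ, so equalized odds does not. The two metrics can disagree on the same model.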

IAM Roles for ML

roles/aiplatform.user
Grants permission to create and run jobs, deploy models, and manage most Vertex AI resources without full administrative control.
roles/aiplatform.admin
Full control over all Vertex AI resources in the project; reserve it for platform administrators rather than day-to-day ML work.
roles/aiplatform.viewer
Read-only access to view Vertex AI resources, jobs, endpoints, and models.
roles/bigquery.dataEditor + roles/bigquery.jobUser
Minimum role combination needed to create datasets, train BigQuery ML models, and run queries.
roles/storage.objectAdmin
Grants read, write, and delete access on objects in Cloud Storage buckets used for training data and model artifacts; it does not grant control over the buckets themselves.
Service accounts for training and pipeline jobs
Vertex AI jobs execute as a service account; grant it only the minimum roles required, following least-privilege principles.

Architecture Decision Patterns — Quick Rules

Choose BigQuery ML when...
Data is already in BigQuery, the team is SQL-skilled, and the model type is supported (regression, classification, clustering, forecasting).
Choose AutoML when...
The data type is standard (tabular, image, text, video), you have your own labeled data, and time-to-production outweighs full customization.
Choose Pre-built APIs when...
The task is general-purpose (vision, text, translation, speech) and no labeled training data is available at all.
Choose custom training when...
The scenario requires a novel architecture, specialized preprocessing, or maximum achievable model performance.
Choose Dataflow over Dataproc when...
Building a new pipeline with no existing Spark/Hadoop dependency; reserve Dataproc for migrating existing Spark ML workloads.
Choose online over batch prediction when...
The scenario mentions real-time or low-latency requirements; choose batch when scoring large datasets on a schedule with no latency constraint.
Choose Vertex AI Pipelines over Cloud Composer when...
The workflow is ML-specific; Composer is the better exam answer only when the scenario mentions existing Airflow DAGs or non-ML orchestration needs.
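
The quick rules above can be read as an ordered series of checks. The sketch below is a deliberate simplification: the function names, the parameter names, and the order in which conditions are tested are assumptions, and real exam scenarios usually include extra constraints such as cost, latency, and team skills.

```python
def choose_training_approach(*, has_labeled_data, data_in_bigquery=False,
                             sql_team=False, needs_custom_architecture=False):
    """Encode the quick rules as an ordered series of checks (simplified)."""
    if not has_labeled_data:
        return "Pre-built API"
    if needs_custom_architecture:
        return "Custom training"
    if data_in_bigquery and sql_team:
        return "BigQuery ML"
    return "AutoML"


def choose_prediction_mode(*, low_latency_required):
    """Online endpoints for real-time needs; batch prediction otherwise."""
    return "Online prediction" if low_latency_required else "Batch prediction"


print(choose_training_approach(has_labeled_data=True,
                               data_in_bigquery=True, sql_team=True))
print(choose_prediction_mode(low_latency_required=False))
```

The first call prints BigQuery ML and the second prints Batch prediction, matching the corresponding rules in this section.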
