CertPrepNow
Google Cloud | GCP-PMLE | Updated 2026-06-17

GCP-PMLE Study Guide

Everything you need to pass the Google Professional Machine Learning Engineer exam. Structured study plans, key services, common traps, and practice questions.

You Can Pass This Exam For Free

The PMLE exam is passable with free resources if you study consistently for 8-10 weeks and have prior ML experience:

  • Google Cloud official exam guide PDF (free download)
  • Google Cloud Skills Boost free tier labs for Vertex AI and BigQuery ML
  • Google Cloud Architecture Center ML reference architectures (free)
  • TensorFlow official documentation and tutorials (free)
  • Google Cloud product documentation for all in-scope services (free)
  • 500+ free practice questions on this site

Google Cloud provides extensive free documentation and hands-on labs. The exam tests practical knowledge of Google Cloud services, so hands-on experience with a free trial account is essential. The official exam guide combined with product documentation covers the majority of exam content.

Choose Your Study Path

You have general programming experience but limited ML or Google Cloud knowledge. You need to build foundational skills in both areas.

Week 1: Learn ML fundamentals, including supervised vs unsupervised vs reinforcement learning, classification vs regression, and common algorithms (linear regression, logistic regression, decision trees, neural networks). Set up a Google Cloud free trial account.
Week 2: Study the Google Cloud ML ecosystem. Understand Vertex AI as the unified platform, BigQuery ML for SQL-based modeling, AutoML for low-code training, and pre-built ML APIs (Vision, Natural Language, Translation).
Week 3: Deep dive into data preparation, covering Cloud Storage, Dataflow for batch/streaming ETL, BigQuery for data warehousing, feature engineering techniques, and handling training-serving skew.
Week 4: Learn model training, covering Vertex AI custom training with TensorFlow and PyTorch, distributed training on GPUs/TPUs, hyperparameter tuning with Vertex AI Vizier, and experiment tracking.
Week 5: Study model serving, covering Vertex AI Endpoints for online prediction, batch prediction jobs, autoscaling configuration, latency optimization, and model versioning with Model Registry.
Week 6: Learn MLOps and pipelines, covering Vertex AI Pipelines with Kubeflow, CI/CD for ML, automated retraining triggers, artifact management, and pipeline orchestration patterns.
Week 7: Study generative AI topics, covering Model Garden, foundation models, prompt engineering, RAG patterns with Vertex AI Agent Builder, and responsible AI evaluation for generative outputs.
Week 8: Cover monitoring and optimization, including model drift detection, feature attribution (Shapley values), continuous evaluation, data skew monitoring, and retraining strategies.
Week 9: Complete hands-on labs on Cloud Skills Boost by building an end-to-end pipeline from data ingestion through deployment. Take your first full practice exam.
Week 10: Review all incorrect answers and re-study weak domains. Take a second practice exam aiming for 85%+. Focus on Vertex AI Pipelines and serving, which carry the most exam weight.

Exam Overview

Format

50-60 questions, 120 minutes. Multiple choice and multiple select questions.

Scoring

Pass/fail only: Google does not disclose numeric scores. A passing threshold of roughly 70% is a widely cited community estimate, not an official figure. There is no penalty for wrong answers.

Domains & Weights

  • Architecting Low-Code AI Solutions: 13%
  • Collaborating Within and Across Teams to Manage Data and Models: 14%
  • Scaling Prototypes into ML Models: 18%
  • Serving and Scaling Models: 20%
  • Automating and Orchestrating ML Pipelines: 22%
  • Monitoring AI Solutions: 13%
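
One way to turn these weights into a study budget is to split your available hours in proportion to each domain's share. The sketch below assumes a hypothetical total of 80 study hours; the function name and the total are illustrative, not part of any official guidance.

```python
# Domain weights (percent) copied from the list above.
DOMAIN_WEIGHTS = {
    "Architecting Low-Code AI Solutions": 13,
    "Collaborating Within and Across Teams to Manage Data and Models": 14,
    "Scaling Prototypes into ML Models": 18,
    "Serving and Scaling Models": 20,
    "Automating and Orchestrating ML Pipelines": 22,
    "Monitoring AI Solutions": 13,
}


def allocate_hours(total_hours, weights=DOMAIN_WEIGHTS):
    """Split total_hours across domains in proportion to exam weight."""
    total_weight = sum(weights.values())
    return {name: total_hours * w / total_weight for name, w in weights.items()}


if __name__ == "__main__":
    # 80 hours is an assumed budget; substitute your own.
    for name, hours in allocate_hours(80).items():
        print(f"{hours:5.1f} h  {name}")
```

Treat the result as a starting point and shift hours toward whichever domains your practice exams show to be weakest.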

Registration

$200 USD plus applicable taxes. Available online-proctored (remote) or at Kryterion testing centers.

Topic Priority Table

Not all topics are tested equally. Focus your study time on Tier 1 first, then Tier 2. Tier 3 topics rarely appear — just recognize what they do.

Tier 1: Must Know. You must understand these services deeply, know when to use each, and be able to select the right one in scenario-based questions. These appear across multiple questions.
Tier 2: Should Know. Understand what these services do and when to use them. May appear in 2-5 questions each.
Tier 3: Recognize Only. Know what these are at a high level. Rarely more than 1-2 questions each.

Domain 1 (13% of exam)

Architecting Low-Code AI Solutions

This domain covers building ML solutions with minimal custom code using BigQuery ML, AutoML, pre-built ML APIs, and Model Garden. You need to know when each approach is appropriate, which BigQuery ML model types to select for different business problems, and how to leverage RAG patterns with Vertex AI Agent Builder for generative AI applications.

Key Topics

BigQuery ML, AutoML, Pre-built ML APIs, Model Garden, Vertex AI Agent Builder, Vertex AI Studio

Must-Know Concepts

  • BigQuery ML model types: linear regression, logistic regression (binary/multiclass), K-means clustering, matrix factorization (recommendations), boosted trees (XGBoost), DNN, ARIMA (time series), and autoencoders — know which to use for each problem type
  • Decision criteria for choosing BigQuery ML vs AutoML vs custom training: data location, team expertise, model complexity, and time-to-production
  • Pre-built ML APIs (Vision, Natural Language, Translation, Speech-to-Text, Video Intelligence) require zero training data and work out of the box for common use cases
  • Model Garden provides access to foundation models (Gemini, PaLM, open-source models) for fine-tuning or direct deployment
  • RAG (Retrieval-Augmented Generation) patterns using Vertex AI Agent Builder to ground model responses with enterprise data
  • When to use low-code solutions vs custom training: low-code for standard problems and fast time-to-production, custom for specialized requirements
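
To make the model-type selection concrete, here is a minimal sketch that maps a problem type to a BigQuery ML `model_type` option and builds a `CREATE MODEL` statement as a string. The helper name, the dataset, table, and column names are all hypothetical, and the sketch covers only the simple labeled case; time series and other model types need additional options that are not shown.

```python
# Problem type -> BigQuery ML model_type option (a partial mapping).
BQML_MODEL_TYPES = {
    "regression": "LINEAR_REG",
    "classification": "LOGISTIC_REG",
    "clustering": "KMEANS",
    "recommendation": "MATRIX_FACTORIZATION",
    "time_series": "ARIMA_PLUS",
}


def build_create_model_sql(problem, model_name, source_table, label_column=None):
    """Return a CREATE MODEL statement for the given problem type.

    Only model_type and (optionally) input_label_cols are set here;
    real models usually need more options than this sketch emits.
    """
    model_type = BQML_MODEL_TYPES[problem]
    options = [f"model_type = '{model_type}'"]
    if label_column is not None:
        options.append(f"input_label_cols = ['{label_column}']")
    return (
        f"CREATE OR REPLACE MODEL `{model_name}`\n"
        f"OPTIONS ({', '.join(options)}) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


if __name__ == "__main__":
    # Hypothetical churn example: names below are placeholders.
    print(build_create_model_sql(
        "classification", "demo.churn_model", "demo.customers", "churned"
    ))
```

A SQL-skilled team with data already in BigQuery can run a statement like this directly, which is why BigQuery ML tends to be the expected answer in that scenario.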

Common Exam Traps

  • BigQuery ML supports only specific model types. Do not choose it for custom neural network architectures or image/video classification.
  • Pre-built APIs need NO training data. AutoML needs YOUR labeled data. If the question says no labeled data is available, pre-built APIs are the answer.
  • Matrix factorization in BigQuery ML is for recommendation systems, not for dimensionality reduction. Know the specific use case.
  • ARIMA in BigQuery ML is for univariate time series forecasting. For multivariate or complex time series, consider custom training.

Quick Check: Architecting Low-Code AI Solutions

A retail company has transaction data in BigQuery and wants to predict which customers will churn next month. The data science team is small and primarily skilled in SQL. Which approach should they use?

Domain 2 (14% of exam)

Collaborating Within and Across Teams to Manage Data and Models

This domain covers how ML engineers work with data engineers, data scientists, and application developers to manage data and models. Key topics include data exploration and processing, prototyping in notebooks, experiment tracking, feature engineering, and model governance including versioning, access control, and data quality.

Key Topics

Vertex AI Workbench, Vertex AI Experiments, Vertex AI Feature Store, BigQuery, Dataflow, Cloud Storage, Vertex AI Model Registry

Must-Know Concepts

  • Vertex AI Workbench provides managed Jupyter notebooks for prototyping and collaboration. Know notebook instance types and when to use managed vs user-managed instances
  • Vertex AI Experiments tracks and compares ML experiments including metrics, parameters, and artifacts across training runs
  • Vertex AI Feature Store enables feature sharing across teams, prevents training-serving skew with consistent feature computation, and supports point-in-time lookups
  • Data exploration at scale using BigQuery for SQL-based analysis and Vertex AI Workbench for interactive exploration
  • Model governance: versioning with Model Registry, access control with IAM, lineage tracking, and metadata management
  • Data quality and integrity: validation, schema enforcement, anomaly detection in data pipelines, and handling data drift
  • Cross-team collaboration patterns: sharing notebooks, features, models, and datasets across organizational boundaries
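
The point-in-time lookup idea mentioned above can be illustrated without any cloud service: for a training example with a given timestamp, you want the most recent feature value recorded at or before that timestamp, never a later one. The sketch below is a plain-Python illustration of the concept, not the Feature Store API; the function name and data layout are assumptions.

```python
from bisect import bisect_right


def point_in_time_lookup(history, as_of):
    """Return the latest feature value recorded at or before as_of.

    history: list of (timestamp, value) pairs sorted by timestamp.
    Returns None when no value existed yet at as_of.
    """
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, as_of)
    return history[idx - 1][1] if idx > 0 else None


if __name__ == "__main__":
    # Hypothetical feature history: (timestamp, average_basket_value).
    history = [(1, 10.0), (5, 12.5), (9, 20.0)]
    print(point_in_time_lookup(history, 6))  # value known at time 6
    print(point_in_time_lookup(history, 0))  # nothing recorded yet
```

Using the value as it was at training time, rather than the latest value, avoids leaking future information into training data.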

Common Exam Traps

  • Vertex AI Workbench managed notebooks auto-idle and are cost-efficient. User-managed notebooks give more control but require manual management. Know when each is appropriate.
  • Feature Store solves training-serving skew by serving the SAME features for both training and prediction. If the question mentions inconsistent features between training and serving, Feature Store is the answer.
  • Vertex AI Experiments is for tracking ML experiments, not A/B testing in production. Production A/B testing uses traffic splitting on Vertex AI Endpoints.
  • Model Registry stores model metadata and artifacts, but the actual model binary can be in Cloud Storage. Registry provides governance, not storage.

Quick Check: Collaborating Within and Across Teams to Manage Data and Models

Multiple data science teams in a company are independently engineering the same features for different models, leading to inconsistent feature values between training and serving. Which service should they adopt?

Domain 3 (18% of exam)

Scaling Prototypes into ML Models

This domain covers taking ML prototypes from notebooks to production-grade models. Key topics include choosing frameworks and model architectures, training at scale with distributed training on GPUs and TPUs, hyperparameter optimization, transfer learning, handling overfitting and underfitting, and model evaluation and explainability.

Key Topics

Vertex AI Custom Training, Vertex AI Vizier, TensorFlow, PyTorch, Vertex AI Workbench, Vertex AI Experiments

Must-Know Concepts

  • Choosing ML frameworks: TensorFlow for production-ready models with TFX ecosystem, PyTorch for research flexibility, scikit-learn for classical ML, XGBoost for tabular data
  • Distributed training strategies: data parallelism (split data across workers), model parallelism (split model across devices), and when to use each
  • GPU vs TPU selection: GPUs for general deep learning and PyTorch, TPUs for large-scale TensorFlow/JAX workloads requiring maximum throughput
  • Hyperparameter tuning with Vertex AI Vizier: defining search spaces, optimization objectives, early stopping criteria, and parallel trial configuration
  • Transfer learning: using pre-trained models and fine-tuning on domain-specific data to reduce training time and data requirements
  • Model evaluation metrics: accuracy, precision, recall, F1, AUC-ROC for classification; RMSE, MAE, R-squared for regression; confusion matrices
  • Overfitting prevention: regularization (L1, L2, dropout), cross-validation, early stopping, data augmentation, and reducing model complexity
  • Model explainability: feature attribution methods (Shapley values, XRAI, integrated gradients) for understanding model predictions
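
The classification metrics listed above all derive from the four cells of a binary confusion matrix. The following is a minimal sketch in plain Python, assuming labels are encoded as 0 and 1; the function name and sample data are illustrative.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    print(classification_metrics(y_true, y_pred))
```

Precision answers "of the predicted positives, how many were right", while recall answers "of the actual positives, how many were found". Scenario questions often hinge on which of the two errors is more costly.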

Common Exam Traps

  • TPUs are optimized for TensorFlow and JAX. For PyTorch or custom CUDA operations, GPUs are the correct choice.
  • Data parallelism is more common and easier to implement than model parallelism. Use model parallelism only when the model is too large to fit on a single device.
  • Vertex AI Vizier uses Bayesian optimization by default, which typically needs fewer trials than grid search or random search to find good hyperparameters.
  • Transfer learning does NOT mean training from scratch. It means starting with a pre-trained model and adapting it to a new task with less data.
  • High training accuracy with low validation accuracy indicates overfitting. The answer is regularization or more data, not a larger model.
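
To see why data parallelism is the simpler strategy, note that each worker can compute a gradient on its own shard and the results are then averaged. The sketch below simulates this in plain Python for a one-parameter linear model with mean squared error, assuming equal-sized shards; it is a conceptual illustration, not how any particular framework implements it.

```python
def gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_gradient(w, data, num_workers):
    """Average per-worker gradients computed on equal-sized shards."""
    if len(data) % num_workers != 0:
        raise ValueError("this sketch assumes equal-sized shards")
    shard_size = len(data) // num_workers
    shards = [data[i * shard_size:(i + 1) * shard_size]
              for i in range(num_workers)]
    # Each element of worker_grads would be computed on a separate device.
    worker_grads = [gradient(w, shard) for shard in shards]
    return sum(worker_grads) / num_workers


if __name__ == "__main__":
    data = [(1, 2), (2, 4), (3, 6), (4, 8)]  # points on y = 2x
    print(gradient(0.5, data))                   # single-device gradient
    print(data_parallel_gradient(0.5, data, 2))  # averaged over 2 workers
```

With equal shard sizes the averaged gradient matches the single-device gradient, so every worker holds a full copy of the model and only gradients are exchanged. Model parallelism, by contrast, splits the model itself and is needed only when that full copy does not fit on one device.
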
Quick Check: Scaling Prototypes into ML Models

An ML engineer is training a large language model using TensorFlow. The model does not fit in the memory of a single GPU. What distributed training strategy should they use?

Domain 4 (20% of exam)

Serving and Scaling Models

The second-heaviest domain at 20%. Covers deploying models for online and batch prediction, configuring autoscaling, managing traffic between model versions, optimizing inference latency and cost, and choosing appropriate serving infrastructure. You must know when to use online vs batch prediction and how to handle scaling challenges.

Key Topics

Vertex AI Endpoints, Vertex AI Batch Prediction, Cloud Run, Vertex AI Model Registry, TensorFlow Serving

Must-Know Concepts

  • Online prediction via Vertex AI Endpoints: real-time, low-latency inference with autoscaling, load balancing, and traffic splitting between model versions
  • Batch prediction: high-throughput scoring of large datasets without latency requirements, using Vertex AI batch prediction jobs
  • Traffic splitting for A/B testing and canary deployments: gradually shifting traffic from an old model version to a new one
  • Autoscaling configuration: minimum/maximum replicas, target CPU utilization, and scaling policies. Know that autoscaling is reactive and may not handle sudden traffic spikes
  • Model optimization for serving: quantization (reducing precision), pruning (removing unnecessary weights), distillation (training smaller models to mimic larger ones)
  • Serving infrastructure choices: Vertex AI Endpoints (managed), Cloud Run (containerized), GKE (Kubernetes) — know when each is appropriate
  • Model versioning and rollback: deploying new model versions alongside existing ones and rolling back if performance degrades
  • Cost optimization: choosing appropriate machine types, using preemptible/spot VMs for batch prediction, and right-sizing endpoints
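
Traffic splitting for a canary deployment amounts to weighted random routing between model versions. The sketch below simulates that behavior locally with the standard library; it does not call Vertex AI, and the version names, the 90/10 split, and the function name are assumptions for illustration.

```python
import random
from collections import Counter


def route_requests(traffic_split, num_requests, seed=0):
    """Simulate routing requests to model versions by percentage weight.

    traffic_split: dict of version name -> integer percentage (must sum to 100).
    Returns a Counter of how many requests each version received.
    """
    if sum(traffic_split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    versions = list(traffic_split)
    weights = [traffic_split[v] for v in versions]
    return Counter(rng.choices(versions, weights=weights, k=num_requests))


if __name__ == "__main__":
    # Hypothetical canary: 10% of traffic goes to the new version.
    counts = route_requests({"champion": 90, "challenger": 10}, 10_000)
    for version, n in counts.items():
        print(f"{version}: {n} requests")
```

In a canary rollout you would raise the challenger's percentage step by step while watching its metrics, and set it back to zero to roll back.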

Common Exam Traps

  • Autoscaling is REACTIVE, not predictive. It cannot handle sudden traffic spikes instantly. For predictable spikes, pre-scale endpoints before the expected load increase.
  • Batch prediction does not maintain a persistent endpoint. It creates compute resources, processes data, and shuts down, which makes it cost-effective for periodic scoring.
  • Traffic splitting on Vertex AI Endpoints enables canary deployments, but it is NOT the same as Vertex AI Experiments. Endpoints handle production traffic; Experiments track training runs.
  • Quantization reduces model size and inference latency but may decrease accuracy. The exam tests whether you understand this trade-off.
  • Cloud Run is a valid serving option for ML models in containers, but Vertex AI Endpoints is the preferred answer for ML-specific serving questions on this exam.
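
The quantization trade-off can be seen with a few numbers. The sketch below applies simple symmetric 8-bit quantization to a list of weights and measures the rounding error after converting back to floats. It is one basic scheme chosen for illustration; production toolchains use more elaborate variants, and the sample weights are made up.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.003, 1.0]  # made-up example weights
    quantized, scale = quantize_int8(weights)
    restored = dequantize(quantized, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print("quantized:", quantized)
    print("largest rounding error:", worst)
```

Each weight now fits in one byte instead of four, which is where the size and latency gains come from. The rounding error, bounded by half the scale, is the source of the possible accuracy loss; small weights such as 0.003 can collapse to zero.
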
Quick Check: Serving and Scaling Models

A fraud detection model needs to score transactions in real time with sub-100ms latency. The traffic volume varies from 100 to 10,000 requests per second throughout the day. Which serving approach should be used?

Domain 5 (22% of exam)

Automating and Orchestrating ML Pipelines

The heaviest domain at 22%. Covers designing and implementing end-to-end ML pipelines using Vertex AI Pipelines and Kubeflow, CI/CD for ML, pipeline component design, trigger-based automation, artifact management, metadata tracking, and A/B testing in production. Master this domain or you will not pass.

Key Topics

Vertex AI Pipelines, Kubeflow Pipelines, Cloud Build, Artifact Registry, Vertex AI Model Registry, Vertex AI Metadata

Must-Know Concepts

  • Vertex AI Pipelines design: component identification, parameter configuration, trigger setup (scheduled, event-driven, data-driven), and compute selection
  • Kubeflow Pipelines SDK: building pipeline components, passing artifacts between components, and defining pipeline DAGs
  • Pipeline component decoupling: each component should be independently testable, reusable, and containerized with Cloud Build
  • CI/CD for ML: automated testing of pipeline components, model validation gates, staged deployments, and integration with Cloud Build
  • A/B testing and canary deployments in ML: traffic splitting, champion/challenger patterns, and automated rollback based on performance metrics
  • Artifact management: storing and versioning training data, model binaries, evaluation metrics, and pipeline outputs in Artifact Registry and Cloud Storage
  • Metadata and lineage tracking: recording experiment parameters, data versions, model versions, and pipeline execution history
  • Automated retraining triggers: scheduling, data drift detection, performance degradation thresholds, and new data arrival events
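
A pipeline DAG is just a set of components plus the dependencies between them, and the orchestrator runs each component only after its upstream components finish. The sketch below models that with the standard library (`graphlib`, Python 3.9+) rather than the Kubeflow Pipelines SDK; the component names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each component maps to the set of components it depends on.
PIPELINE = {
    "ingest_data": set(),
    "validate_data": {"ingest_data"},
    "compute_baseline_stats": {"validate_data"},
    "train_model": {"validate_data"},
    "evaluate_model": {"train_model"},
    "deploy_model": {"evaluate_model", "compute_baseline_stats"},
}


def execution_order(dag):
    """Return one valid order in which the components can run."""
    return list(TopologicalSorter(dag).static_order())


if __name__ == "__main__":
    for step, name in enumerate(execution_order(PIPELINE), start=1):
        print(step, name)
```

Because `train_model` and `compute_baseline_stats` depend only on `validate_data`, an orchestrator is free to run them in parallel. Keeping components decoupled like this is also what makes them independently testable.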

Common Exam Traps

  • Vertex AI Pipelines (Kubeflow) is almost always preferred over Cloud Composer (Airflow) for ML pipeline orchestration on this exam.
  • Pipeline components must be DECOUPLED and independently testable. If the question describes tightly coupled components, the answer involves refactoring into independent containers.
  • CI/CD for ML is different from CI/CD for software: it includes data validation, model evaluation, and model deployment as pipeline stages, not just code testing.
  • A/B testing in production (traffic splitting) is different from experiment tracking (Vertex AI Experiments). Know which one the question is asking about.
  • Scheduled retraining is not always the answer. If the question mentions data drift detection, use monitoring-triggered retraining instead of fixed schedules.

Quick Check: Automating and Orchestrating ML Pipelines

An ML team wants their model to automatically retrain when data drift exceeds a predefined threshold, rather than on a fixed schedule. Which combination of services should they use?

Domain 6 (13% of exam)

Monitoring AI Solutions

This domain covers monitoring deployed ML models for performance degradation, data drift, prediction skew, and bias. Also includes troubleshooting production ML systems, optimizing model performance, establishing retraining policies, and implementing responsible AI practices for fairness and explainability.

Key Topics

Vertex AI Model Monitoring, Cloud Monitoring, Cloud Logging, Vertex AI Explainability, Responsible AI Toolkit

Must-Know Concepts

  • Vertex AI Model Monitoring: detecting data drift (training vs serving data distribution changes), prediction drift, and feature attribution drift
  • Types of drift: data drift (input distribution changes), concept drift (relationship between input and output changes), and prediction drift (model output distribution changes)
  • Continuous evaluation: comparing model predictions against ground truth labels as they become available to track accuracy over time
  • Feature attribution methods: Shapley values, XRAI, and integrated gradients — know when to use each for model explainability
  • When to retrain: data drift exceeds threshold, model accuracy degrades below acceptable level, new data sources become available, or business requirements change
  • Responsible AI: fairness metrics (demographic parity, equalized odds), bias detection in training data and model outputs, and mitigation strategies
  • Troubleshooting production ML systems: IAM permission issues, resource quota limits, training failures, serving errors, and performance bottlenecks
  • Logging strategies: what to log (predictions, input features, latency, errors), log sanitization (removing PII), and log analysis for debugging
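
Drift detection boils down to measuring the distance between two distributions and comparing it to a threshold. The sketch below uses Jensen-Shannon divergence over histogram counts as one common choice of distance; the 0.1 threshold, the function names, and the sample counts are assumptions for illustration, not defaults of any monitoring service.

```python
import math


def normalize(counts):
    """Turn raw histogram counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and between 0 and 1 in bits."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def should_retrain(train_counts, serving_counts, threshold=0.1):
    """Flag drift when the divergence exceeds an assumed threshold."""
    distance = js_divergence(normalize(train_counts), normalize(serving_counts))
    return distance > threshold


if __name__ == "__main__":
    # Made-up histograms of one categorical feature with three buckets.
    print(should_retrain([50, 30, 20], [48, 32, 20]))  # near-identical
    print(should_retrain([90, 10, 0], [10, 10, 80]))   # heavily shifted
```

A check like this only signals that retraining may be needed. Something still has to act on the signal, for example a pipeline trigger.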

Common Exam Traps

  • Data drift and concept drift are DIFFERENT. Data drift means input distributions changed. Concept drift means the relationship between inputs and outputs changed. Both may require retraining but for different reasons.
  • Model monitoring detects WHEN to retrain, but it does not retrain automatically unless connected to a pipeline trigger.
  • Feature attribution explains WHY a model made a specific prediction, not HOW accurate the prediction is. Do not confuse explainability with accuracy.
  • Responsible AI is not just about bias detection. It includes fairness, transparency, accountability, privacy, and safety. The exam tests multiple responsible AI dimensions.
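
To see what a feature attribution actually is, it helps to compute exact Shapley values for a tiny model. The sketch below averages each feature's marginal contribution over every ordering in which features can be switched from a baseline to the instance's values. This brute-force approach is only practical for a handful of features, and the toy models and function name are illustrative; managed explainability services rely on approximations instead.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating all feature orderings.

    model: function taking a list of feature values and returning a number.
    instance: feature values being explained.
    baseline: reference values representing an "absent" feature.
    """
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for i in order:
            current[i] = instance[i]  # switch feature i on
            output = model(current)
            totals[i] += output - previous_output  # marginal contribution
            previous_output = output
    return [t / len(orderings) for t in totals]


if __name__ == "__main__":
    def linear(x):
        return 2 * x[0] + 3 * x[1] - x[2]

    print(shapley_values(linear, [1, 2, 3], [0, 0, 0]))
```

The attributions always sum to the difference between the model output for the instance and for the baseline. They describe how the prediction was assembled and say nothing about whether the prediction is correct.
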
Quick Check: Monitoring AI Solutions

A production model's accuracy has dropped from 92% to 78% over three months, but the input data distribution appears unchanged. Which type of drift is most likely occurring?

Services and Concepts You Must Not Confuse

These pairs appear on nearly every exam. Learn the difference and you'll avoid the most common traps.

AutoML vs Custom Training (Vertex AI)

Use AutoML when…

Automated model training requiring minimal ML expertise. Best when time-to-production is critical, the dataset is standard (tabular, image, text), and a good-enough model is acceptable.

Use Custom Training (Vertex AI) when…

Full control over model architecture, training process, and hyperparameters. Best when you need custom architectures, specialized preprocessing, or maximum model performance.

Exam trap

The exam often presents scenarios where AutoML is sufficient but custom training seems appealing. If the question emphasizes speed, limited ML expertise, or standard data types, AutoML is usually correct. Custom training is correct when the question mentions specific framework requirements or custom architectures.

BigQuery ML vs Vertex AI Custom Training

Use BigQuery ML when…

Train models using SQL directly in BigQuery. Best for analysts who know SQL, structured tabular data already in BigQuery, and standard model types (regression, classification, clustering, time series).

Use Vertex AI Custom Training when…

Train models using Python frameworks (TensorFlow, PyTorch, scikit-learn) with full control over architecture and training. Best for complex models, unstructured data, or custom requirements.

Exam trap

BigQuery ML is the right answer when data is already in BigQuery and the model type is supported. Do not choose BigQuery ML for image classification, custom neural architectures, or when the question specifies TensorFlow/PyTorch code.

Dataflow vs Dataproc

Use Dataflow when…

Fully managed, serverless stream and batch data processing based on Apache Beam. Best for new pipelines, real-time processing, and when you want zero cluster management.

Use Dataproc when…

Managed Spark/Hadoop clusters. Best when you have existing Spark or Hadoop workloads, need Spark ML libraries, or require the broader Hadoop ecosystem.

Exam trap

Default to Dataflow for new data processing pipelines. Choose Dataproc only when the question explicitly mentions existing Spark/Hadoop code, Spark ML, or the Hadoop ecosystem. Dataflow is serverless; Dataproc requires cluster management.

Online Prediction vs Batch Prediction

Use Online Prediction when…

Real-time, low-latency predictions via Vertex AI Endpoints. Best for user-facing applications requiring immediate responses (recommendations, fraud detection, chatbots).

Use Batch Prediction when…

High-throughput predictions on large datasets without latency requirements. Best for periodic scoring jobs (nightly churn predictions, monthly risk assessments).

Exam trap

If the question mentions latency requirements or real-time responses, choose online prediction. If it mentions processing large datasets periodically or when latency is not critical, choose batch prediction. Autoscaling for online endpoints does not handle sudden spikes well — pre-scaling may be needed.

Vertex AI Feature Store vs Inline Feature Engineering

Use Vertex AI Feature Store when…

Centralized feature repository with online/offline serving. Best when features are shared across teams, reused across multiple models, or when training-serving skew must be eliminated.

Use Inline Feature Engineering when…

Features computed within the training or serving pipeline. Best for model-specific features that are not reused, or when feature computation is tightly coupled to the model.

Exam trap

Feature Store is the correct answer when the question mentions training-serving skew, feature sharing across teams, or feature reuse. If features are simple and model-specific, inline engineering may be sufficient.

Pre-built ML APIs vs AutoML

Use Pre-built ML APIs when…

Ready-to-use APIs for common tasks (Vision, NLP, Translation, Speech) requiring zero training data. Best when the use case matches a standard API capability exactly.

Use AutoML when…

Custom model training with your own labeled data. Best when pre-built APIs do not meet accuracy requirements or when you need domain-specific predictions.

Exam trap

Pre-built APIs need NO training data and work immediately. AutoML requires YOUR labeled data. If the question says 'no labeled data available' or 'general purpose,' pre-built APIs are correct. If it says 'domain-specific accuracy needed,' AutoML is correct.

GPU vs TPU

Use GPU when…

General-purpose accelerators for ML training and inference. Best for most deep learning workloads, custom operations, and frameworks with broad GPU support.

Use TPU when…

Google-designed accelerators optimized for large-scale TensorFlow and JAX workloads. Best for very large models, massive batch sizes, and when maximum throughput is needed.

Exam trap

TPUs are optimized for TensorFlow and JAX, not PyTorch (though support is improving). If the question mentions PyTorch or custom CUDA operations, GPU is correct. If it mentions large-scale TensorFlow training with massive datasets, TPU is likely correct.

Vertex AI Pipelines vs Cloud Composer (Airflow)

Use Vertex AI Pipelines when…

Serverless ML pipeline orchestration built on Kubeflow Pipelines. Best for ML-specific workflows with native Vertex AI integration.

Use Cloud Composer (Airflow) when…

Managed Apache Airflow for general workflow orchestration. Best for complex multi-system orchestration beyond ML, or when you have existing Airflow DAGs.

Exam trap

For ML pipeline questions, Vertex AI Pipelines is almost always the correct answer. Cloud Composer/Airflow is rarely correct on this exam. Choose Composer only when the question explicitly mentions existing Airflow infrastructure or non-ML orchestration needs.

Top Mistakes to Avoid

  • Choosing custom training when BigQuery ML or AutoML would meet the requirement faster. The exam tests whether you pick the simplest effective solution.
  • Confusing Vertex AI Pipelines (Kubeflow, for ML workflows) with Cloud Composer (Airflow, for general orchestration). Pipelines is almost always correct on this exam.
  • Mixing up data drift (input distribution changes) with concept drift (input-output relationship changes). Both degrade performance but for different reasons.
  • Selecting GPUs for every training job when TPUs are more efficient for large-scale TensorFlow workloads, or choosing TPUs when the framework is PyTorch.
  • Choosing online prediction for batch scoring jobs or batch prediction for real-time requirements. Match the serving pattern to latency needs.
  • Confusing Feature Store (centralized feature management) with Model Registry (model version management). They manage different artifacts.
  • Forgetting that autoscaling is reactive and cannot handle sudden traffic spikes. Pre-scaling is needed for predictable load increases.
  • Using Cloud Composer/Airflow for ML pipeline orchestration when Vertex AI Pipelines is the Google-recommended solution.
  • Thinking transfer learning retrains a model from scratch. It starts with a pre-trained model and fine-tunes on new data, requiring less data and compute.
  • Confusing Vertex AI Experiments (tracking training runs) with production A/B testing (traffic splitting on Endpoints). They serve different purposes.

Exam-Ready Checklist

  • Can explain all 6 exam domains and their relative weights (13%, 14%, 18%, 20%, 22%, 13%)
  • Know when to choose BigQuery ML vs AutoML vs custom training vs pre-built APIs for any given scenario
  • Can design an end-to-end Vertex AI Pipeline with proper component decoupling, triggers, and artifact management
  • Understand online vs batch prediction trade-offs and can configure autoscaling for Vertex AI Endpoints
  • Know GPU vs TPU selection criteria and when to use distributed training (data parallelism vs model parallelism)
  • Can explain training-serving skew and how Vertex AI Feature Store prevents it
  • Understand all three types of drift (data, concept, prediction) and when each triggers retraining
  • Know Vertex AI Model Monitoring capabilities: data drift detection, prediction drift, and feature attribution drift
  • Can explain model explainability methods (Shapley values, XRAI, integrated gradients) and when to use each
  • Understand responsible AI: fairness metrics, bias detection and mitigation, and Google's responsible AI principles
  • Know CI/CD for ML: Cloud Build for pipeline testing, model validation gates, and staged deployment patterns
  • Can distinguish between Vertex AI Experiments (training tracking) and Endpoint traffic splitting (production A/B testing)
  • Understand RAG patterns using Vertex AI Agent Builder and when to use RAG vs fine-tuning
  • Scored 85%+ on at least two full practice exams before scheduling the real exam

Recommended Resources

Free & Official Resources

Paid Courses & Practice Exams

These are recommended if you prefer a structured learning path. They can save time but are not required to pass.

Frequently Asked Questions