How long should I study for the CNPA exam?

It depends on your background. Experienced Kubernetes or DevOps engineers with GitOps and observability experience can prepare in 4-6 weeks. Platform engineers actively building IDPs may need only 2-3 weeks of focused review. Those new to cloud native concepts should budget 8-10 weeks.

Do I need hands-on Kubernetes experience for the CNPA?

The CNPA is a multiple-choice knowledge exam, not a hands-on performance exam like the CKA or CKS. However, theoretical knowledge is much easier to absorb with practical experience. At minimum, run some labs on Killercoda or Minikube to ground abstract concepts.

What is the passing score for CNPA?

75% (45 out of 60 questions). There is no penalty for wrong answers, so always answer every question. Aim for 85%+ on practice exams to have a comfortable margin on exam day.

How is CNPA different from the CKA or CKAD?

CKA (Certified Kubernetes Administrator) and CKAD (Certified Kubernetes Application Developer) are hands-on performance exams where you work in a live cluster. CNPA is a multiple-choice knowledge exam focused on platform engineering concepts, tooling decisions, and practices rather than CLI proficiency. CNPA is broader in scope (GitOps, IDPs, observability, policy) while CKA/CKAD focus on Kubernetes operations and development.

Is the CNPA exam open book?

No. The CNPA is a closed-book online proctored exam. No external resources, documentation, or notes are permitted during the exam. You must know the content from memory.

What domains should I focus on most?

Platform Engineering Core Fundamentals at 36% is the most critical domain — it covers more than a third of the exam. Platform Observability, Security, and Conformance at 20% is second. Continuous Delivery and Platform Engineering at 16% is third. Together these three domains cover 72% of the exam.

What CNCF projects are covered on the CNPA exam?

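Key cloud native projects, most of them hosted by the CNCF, include: Argo CD (GitOps), Flux (GitOps), Backstage (developer portal), Crossplane (infrastructure provisioning), Prometheus/Grafana (metrics), Loki (logs), Jaeger/Tempo (traces), Istio/Linkerd (service mesh), OPA/Gatekeeper (policy), Kyverno (policy), Tekton (CI), Helm (packaging), and Kustomize (configuration management). Note that Grafana, Loki, and Tempo are Grafana Labs projects and Tekton belongs to the Continuous Delivery Foundation, so not every tool on this list is a CNCF project.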

How long is the CNPA certification valid?

CNPA certification is valid for 2 years. The exam includes one free retake if you do not pass on the first attempt, which must be used within 12 months of your original exam purchase.

Can I take the CNPA online or do I need a testing center?

The CNPA is an online proctored exam delivered through PSI. You take it from your own computer with a webcam and microphone. You need a quiet, private space and a stable internet connection. No physical testing center is required.

Certified Cloud Native Platform Engineering Associate (CNPA) Free Study Guide 2026

What is the difference between platform engineering and DevOps?

DevOps is a culture and set of practices for breaking down silos between development and operations. Platform engineering is a discipline that emerges from DevOps — platform engineers build and maintain the internal platforms (IDPs) that enable development teams to practice DevOps efficiently at scale. Platform engineering is often described as 'DevOps as a product' — treating the platform as a product with developers as customers.

You Can Pass This Exam For Free

The CNPA exam is passable with free resources if you have hands-on experience with Kubernetes and cloud native tooling and study consistently for 6-8 weeks:

CNCF CNPA official exam curriculum and candidate handbook (free on Linux Foundation website)
CNCF landscape documentation and project READMEs (free on cncf.io)
Kubernetes official documentation at kubernetes.io (free, comprehensive)
Platform Engineering community resources at platformengineering.org (free)
Argo CD, Flux, Crossplane, Backstage official docs (free)
CNCF YouTube channel — KubeCon presentations on platform engineering topics (free)
Killercoda and Play with Kubernetes interactive browser-based labs (free tier)
Free practice questions on this site

The CNPA is knowledge-based with multiple-choice questions. Hands-on Kubernetes experience and familiarity with CNCF ecosystem projects are strongly recommended. Free official documentation and community content cover all exam domains thoroughly.

Choose Your Study Path

Limited Kubernetes or cloud native experience. You need to build foundational knowledge of cloud native concepts, Kubernetes, and the CNCF ecosystem before tackling platform engineering specifics.

Week 1: Learn Kubernetes fundamentals: Pods, Deployments, Services, Namespaces, ConfigMaps, and Secrets. Use the official kubernetes.io docs and Killercoda labs. Understand the declarative resource model, that is, what it means to define desired state in YAML.

Week 2: Study DevOps principles and how they map to platform engineering. Understand the difference between platform engineering and traditional ops. Read the CNCF Platforms white paper (free download). Learn what an Internal Developer Platform (IDP) is and why organizations build them.

Week 3: Dive into Continuous Integration: understand CI pipelines, what they automate (build, test, lint, scan), and popular CI tools in the cloud native ecosystem (Tekton, Jenkins X, GitHub Actions). Learn the pipeline stages and their purpose in a platform context.

Week 4: Study Continuous Delivery and GitOps basics. Understand the GitOps principles: Git as the single source of truth, declarative configuration, reconciliation. Learn Argo CD and Flux conceptually: what they do, push vs pull deployment models, and how they enforce desired state.

Week 5: Cover observability fundamentals: the three pillars (metrics, logs, traces). Learn Prometheus for metrics, Grafana for dashboards, Loki for logs, Jaeger/Tempo for distributed tracing. Understand what SLIs, SLOs, and SLAs mean for platform reliability.

Week 6: Study Kubernetes security basics: RBAC, network policies, pod security standards, secrets management. Learn about admission controllers and policy engines (OPA/Gatekeeper, Kyverno). Understand mTLS and service mesh concepts (Istio, Linkerd).

Week 7: Learn platform APIs and infrastructure provisioning: Custom Resource Definitions (CRDs), the Kubernetes reconciliation loop, and the operator pattern. Study Crossplane for infrastructure provisioning from Kubernetes. Understand what infrastructure as code means in a cloud native context.

Week 8: Study IDPs and developer experience: what service catalogs are (Backstage), how developer portals abstract platform complexity, and golden paths. Learn about DORA metrics (deployment frequency, lead time, MTTR, change failure rate) and how they measure platform effectiveness.

Week 9: Practice questions across all domains. Focus on Platform Engineering Core Fundamentals (36% of exam), since this single domain is more than a third of the exam. Review GitOps, CI/CD, and observability concepts.

Week 10: Take full mock exams, targeting 85%+. Review all incorrect answers. Re-study any domains where you score below 70%. The passing score is 75%, so scoring 85%+ in practice gives you a comfortable margin.

Exam Overview

Format

60 multiple-choice questions, 120 minutes. Online proctored exam delivered through PSI. Closed-book with no external resources allowed.

Scoring

Percentage-based scoring. Passing: 75%. No penalty for wrong answers — always answer every question. Score report provided immediately after exam.

Domains & Weights

Platform Engineering Core Fundamentals: 36%
Platform Observability, Security, and Conformance: 20%
Continuous Delivery and Platform Engineering: 16%
Platform APIs and Provisioning Infrastructure: 12%
IDPs and Developer Experience: 8%
Measuring your Platform: 8%
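
As a rough planning aid, the weights above can be turned into approximate question counts. The sketch below assumes the 60 questions are distributed exactly in proportion to the published weights, which the exam does not guarantee; the function names are illustrative only.

```python
import math

# Domain weights as listed above (percent of exam).
DOMAIN_WEIGHTS = {
    "Platform Engineering Core Fundamentals": 36,
    "Platform Observability, Security, and Conformance": 20,
    "Continuous Delivery and Platform Engineering": 16,
    "Platform APIs and Provisioning Infrastructure": 12,
    "IDPs and Developer Experience": 8,
    "Measuring your Platform": 8,
}

TOTAL_QUESTIONS = 60
PASSING_PERCENT = 75


def estimated_questions(weights, total_questions):
    """Estimate questions per domain, ASSUMING counts follow the weights exactly."""
    return {name: total_questions * pct / 100 for name, pct in weights.items()}


def passing_correct_answers(total_questions, passing_percent):
    """Smallest number of correct answers that reaches the passing percentage."""
    return math.ceil(total_questions * passing_percent / 100)


if __name__ == "__main__":
    for name, count in estimated_questions(DOMAIN_WEIGHTS, TOTAL_QUESTIONS).items():
        print(f"{name}: ~{count:.1f} questions")
    print("Correct answers needed:", passing_correct_answers(TOTAL_QUESTIONS, PASSING_PERCENT))
```

Under that assumption the first domain alone accounts for roughly 21 to 22 questions, which is why it deserves the largest share of study time.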

Registration

$250 USD. Register at training.linuxfoundation.org; a Linux Foundation account is required. Certification is valid for 2 years, and the exam fee includes one free retake.

Topic Priority Table

Not all topics are tested equally. Focus your study time on Tier 1 first, then Tier 2. Tier 3 topics rarely appear — just recognize what they do.

Tier 1: Must Know. You must understand these concepts deeply, know how they work, and apply them in scenario-based questions. These appear across multiple questions and multiple domains.

Tier 2: Should Know. Understand what these are, their key characteristics, and how they fit into a cloud native platform. May appear in 2-5 questions each.

Tier 3: Recognize Only. Know what these are at a high level and their role in the cloud native platform ecosystem. Rarely more than 1-2 questions each.

Domain 1: 36% of exam

Platform Engineering Core Fundamentals

The largest domain at 36% of the exam. Covers the foundational concepts: declarative resource management, DevOps practices, application environments, platform architecture, and the full CI/CD and GitOps lifecycle. This is the theoretical foundation that all other domains build on.

Key Topics

Declarative Configuration, Kubernetes, GitOps, CI Pipelines, CD Pipelines, DevOps Culture, Platform Architecture

Must-Know Concepts

Declarative vs imperative resource management: declarative defines desired state in files; imperative issues commands. Kubernetes is declarative — you apply YAML manifests
The four OpenGitOps principles: declarative, versioned and immutable, pulled automatically, and continuously reconciled
DevOps as both cultural and technical practices: breaking silos between dev and ops, shared responsibility for reliability, fast feedback loops
Platform engineering as the discipline of building self-service platforms (IDPs) that enable developers to be productive without deep infrastructure knowledge
Application environments: development, staging/pre-production, and production environments and the promotion patterns between them
Continuous Integration: automated build, test, lint, scan, and artifact creation on every commit. Fast feedback for developers
Continuous Delivery: automated deployment pipeline that can deploy to production at any time. Every commit should be deployable
Continuous Deployment: every commit that passes CI is automatically deployed to production without manual approval (a subset of organizations practice this)
GitOps workflow: developer commits to Git > CI pipeline builds and tests > GitOps operator detects change > reconciles cluster to new desired state
Push vs pull deployment models and why pull-based (GitOps) provides better security, auditability, and drift prevention
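
The declarative, pull-based model described in the list above can be sketched as a diff between desired state (what Git says) and actual state (what the cluster runs). This is a toy model, not how Argo CD or Flux are implemented: state is a plain dictionary, the function names are hypothetical, and deleting resources that are absent from Git is assumed to be enabled.

```python
def plan_actions(desired, actual):
    """Compare desired state (from Git) with actual state (in the cluster).

    Both arguments map a resource name to its spec. Returns the actions
    needed to make the actual state match the desired state.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def reconcile(desired, actual):
    """Apply the planned actions to a copy of the actual state and return it."""
    new_state = dict(actual)
    for action, name in plan_actions(desired, actual):
        if action == "delete":
            del new_state[name]
        else:
            new_state[name] = desired[name]
    return new_state


if __name__ == "__main__":
    desired = {"web": {"image": "web:1.2", "replicas": 3}}
    # Drift: someone scaled "web" by hand and created "debug" out of band.
    actual = {"web": {"image": "web:1.2", "replicas": 5}, "debug": {"image": "busybox"}}
    print(plan_actions(desired, actual))
    converged = reconcile(desired, actual)
    # A second pass finds nothing to do: reconciliation is idempotent.
    print(plan_actions(desired, converged))
```

Nobody scripts the individual steps here. The caller only states the end state, and running the loop again after convergence produces no further actions.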

Common Exam Traps

Continuous Delivery means you CAN deploy at any time; Continuous Deployment means you DO deploy automatically on every passing commit. Many organizations practice Delivery but not Deployment

GitOps is NOT a tool — it is a set of practices. Argo CD and Flux are tools that implement GitOps

Declarative management means the SYSTEM handles reconciliation. You do not script the steps — you describe the end state

Platform engineering is NOT the same as DevOps or SRE, though it borrows from both. Platform engineers build PLATFORMS for other teams to use, not operational runbooks or services directly

Quick Check: Platform Engineering Core Fundamentals

A platform team is designing a deployment workflow where developers commit application changes to a Git repository and the changes are automatically reflected in the Kubernetes cluster. The cluster continuously checks for drift and corrects it. Which deployment model does this describe?

Domain 2: 20% of exam

Platform Observability, Security, and Conformance

The second-largest domain at 20%. Covers the three pillars of observability (metrics, logs, traces), SLI/SLO/SLA frameworks, Kubernetes security (RBAC, pod security, network policies), mTLS and service meshes, and policy engines for conformance enforcement.

Key Topics

Prometheus, Grafana, Loki, Jaeger, Kubernetes RBAC, Network Policies, mTLS, Service Mesh, OPA/Gatekeeper, Kyverno

Must-Know Concepts

Three pillars of observability: metrics (quantitative measurements), logs (discrete events), and traces (distributed request paths)
SLI: the specific metric measured (e.g., request latency p99). SLO: the reliability target for that SLI (e.g., p99 < 200ms, 99.9% of the time). SLA: the contractual commitment, typically looser than the SLO
Error budget: the acceptable amount of unreliability implied by an SLO. If SLO is 99.9%, the error budget is 0.1% — the budget guides release velocity decisions
Prometheus: pull-based metrics collection, PromQL for queries, Alertmanager for routing alerts, PodMonitor/ServiceMonitor CRDs (provided by the Prometheus Operator) for Kubernetes scrape configuration
Kubernetes RBAC: Roles and ClusterRoles define permissions, RoleBindings and ClusterRoleBindings assign them to subjects (users, groups, service accounts). Principle of least privilege
Network Policies: Kubernetes objects that define allowed ingress and egress traffic for pods. By default, all traffic is allowed; network policies add restrictions
Pod Security Standards: Privileged, Baseline, and Restricted pod security profiles enforced via Pod Security Admission controller in modern Kubernetes
mTLS (mutual TLS): both client and server authenticate each other and encrypt communication. Service meshes (Istio, Linkerd) implement mTLS transparently for all service-to-service communication
Admission webhooks: Kubernetes API intercept points that validate or mutate resources before persistence. Mutating webhooks run before validating webhooks
Policy engines: OPA/Gatekeeper (Rego-based) and Kyverno (YAML-based) enforce organizational policies via admission webhooks. Both support audit and enforce modes
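
The error budget arithmetic from the list above is easy to check by hand or in a few lines of code. The sketch below assumes a 30-day window and an availability SLO expressed as a percentage; the function names are illustrative.

```python
def error_budget_fraction(slo_percent):
    """Fraction of the window that may be unreliable, e.g. 99.9 -> about 0.001."""
    return (100.0 - slo_percent) / 100.0


def error_budget_minutes(slo_percent, window_days=30):
    """Allowed downtime in minutes over the window (30 days is an assumption)."""
    return error_budget_fraction(slo_percent) * window_days * 24 * 60


def budget_remaining_minutes(slo_percent, downtime_minutes, window_days=30):
    """Budget left after the downtime already spent; negative means exhausted."""
    return error_budget_minutes(slo_percent, window_days) - downtime_minutes


if __name__ == "__main__":
    print(f"99.9% SLO over 30 days: {error_budget_minutes(99.9):.1f} minutes of budget")
    print(f"After a 30-minute outage: {budget_remaining_minutes(99.9, 30):.1f} minutes left")
```

A 99.9% SLO over 30 days leaves about 43 minutes of budget, so a single half-hour outage consumes most of it and argues for slowing down releases.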

Common Exam Traps

Network Policies are DEFAULT ALLOW — if no NetworkPolicy selects a pod, all traffic is allowed. Adding a NetworkPolicy creates restrictions, it does not add permissions
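
The default-allow rule can be modeled in a few lines. This is a deliberately simplified sketch covering ingress only: the dictionary shape (`pod_selector`, `allow_from`) is hypothetical and is not the real NetworkPolicy schema, and namespaces, ports, and CIDR blocks are ignored.

```python
def selects(selector, labels):
    """True when every key/value in the selector appears in the labels.

    An empty selector matches every pod.
    """
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(policies, source_labels, target_labels):
    """Decide whether traffic from the source pod may reach the target pod."""
    selecting = [p for p in policies if selects(p["pod_selector"], target_labels)]
    if not selecting:
        # No policy selects the target pod: Kubernetes allows all traffic.
        return True
    # Once any policy selects the pod, only explicitly listed sources get in.
    return any(
        selects(source_selector, source_labels)
        for policy in selecting
        for source_selector in policy["allow_from"]
    )


if __name__ == "__main__":
    policies = [{"pod_selector": {"app": "db"}, "allow_from": [{"app": "api"}]}]
    print(ingress_allowed(policies, {"app": "web"}, {"app": "cache"}))  # unselected pod
    print(ingress_allowed(policies, {"app": "api"}, {"app": "db"}))     # listed source
    print(ingress_allowed(policies, {"app": "web"}, {"app": "db"}))     # unlisted source
```

In this model a policy whose `allow_from` list is empty denies all ingress to the pods it selects, which mirrors the common "default deny" pattern.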

SLO should be STRICTER than SLA. Your SLO is your internal target; your SLA is the external commitment. Violate your SLO first so you can fix it before violating the SLA

Pod Security Admission replaced PodSecurityPolicy (PSP). PSP was deprecated in Kubernetes 1.21 and removed entirely in 1.25, the release in which Pod Security Admission became stable. Do not reference PSP in modern platform engineering answers

mTLS in a service mesh is transparent to application code — applications do NOT implement TLS themselves. The sidecar proxy (Envoy in Istio) handles TLS termination and origination

OPA/Gatekeeper audit mode reports on policy violations for EXISTING resources; enforce mode blocks NEW or UPDATED resources. Existing violations are not automatically deleted in enforce mode

Quick Check: Platform Observability, Security, and Conformance

A platform team has set an SLO of 99.9% availability for their API gateway. Their SLA with enterprise customers guarantees 99.5% availability. An incident reduces availability to 99.7% for a month. What is the impact?

Domain 3: 16% of exam

Continuous Delivery and Platform Engineering

This domain covers CI pipeline architecture, advanced GitOps workflows, incident response practices, and how continuous delivery integrates with platform engineering. Expect questions on pipeline stages, deployment strategies, and GitOps branching models.

Key Topics

CI Pipelines, GitOps Workflows, Argo CD, Flux, Deployment Strategies, Incident Response, Canary / Blue-Green Deployments

Must-Know Concepts

CI pipeline stages in cloud native: code commit > trigger > checkout > lint > unit test > build container image > scan image for vulnerabilities > push to registry > update GitOps manifests
Deployment strategies: rolling update (gradual pod replacement), blue-green (two identical environments, switch traffic), canary (route small % traffic to new version, gradually increase)
GitOps branching models: environment-per-branch (main=prod, staging branch, dev branch) vs directory-based (one branch, environments in subdirectories)
Progressive delivery: combining canary deployments with automated analysis to automatically promote or rollback based on metrics (Argo Rollouts supports this)
Incident response in platform engineering: alert triggers > on-call paged > triage (impact, scope) > communicate status > mitigate (rollback, scale, fix) > post-incident review > action items
GitOps-based incident response: rollback by reverting the Git commit — the GitOps operator automatically restores the previous state
Supply chain security in CI: image signing (cosign/Sigstore), SBOM generation, vulnerability scanning with Trivy or Grype, SLSA levels for build provenance
Separation of concerns: CI is responsible for producing a validated artifact (container image); CD/GitOps is responsible for deploying it. They should not overlap
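
Progressive delivery, as described in the list above, boils down to a loop: shift a little more traffic, check a metric, then continue or roll back. The sketch below is a toy model and not the Argo Rollouts API; the `error_rate_at` callable stands in for a real metrics query and is an assumption.

```python
def run_canary(steps, error_rate_at, max_error_rate):
    """Walk through increasing traffic weights, aborting on the first unhealthy step.

    steps: traffic percentages for the new version, e.g. [5, 25, 50, 100]
    error_rate_at: callable returning the observed error rate (0.0 to 1.0)
                   while the new version receives the given traffic percentage
    max_error_rate: threshold above which the rollout is aborted
    Returns ("promoted", final_weight) or ("rolled_back", weight_at_failure).
    """
    for weight in steps:
        if error_rate_at(weight) > max_error_rate:
            return ("rolled_back", weight)
    return ("promoted", steps[-1])


if __name__ == "__main__":
    steps = [5, 25, 50, 100]
    healthy = lambda weight: 0.001
    degrades_under_load = lambda weight: 0.05 if weight >= 25 else 0.001
    print(run_canary(steps, healthy, max_error_rate=0.01))
    print(run_canary(steps, degrades_under_load, max_error_rate=0.01))
```

In the second run only the traffic share reached before the abort was exposed to the bad version, which is the risk reduction canary releases are meant to provide.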

Common Exam Traps

Canary deployments route a small PERCENTAGE of traffic to the new version — not a separate environment. Blue-green is two full environments with traffic switching. Canary uses real production traffic incrementally

Rolling update is the Kubernetes default — pods are gradually replaced. It does NOT provide instant rollback like blue-green. Rolling back a rolling update takes time

GitOps rollback = reverting the Git commit. The operator detects the Git revert and reconciles the cluster to the previous desired state. This is faster and more reliable than running kubectl commands

CI pipelines should build the image ONCE and promote that exact image through environments. Never rebuild the image for each environment — rebuild breaks the immutability guarantee

Vulnerability scanning in CI should be a GATE — the pipeline should fail if critical vulnerabilities are found, not just report them
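
A gate differs from a report only in its exit code. The sketch below assumes scan findings have already been parsed into dictionaries with hypothetical `id`, `package`, and `severity` keys; this is not the output format of Trivy or Grype.

```python
def gate_exit_code(findings, blocking=frozenset({"CRITICAL"})):
    """Return 1 (fail the pipeline) if any finding has a blocking severity, else 0."""
    blockers = [f for f in findings if f["severity"] in blocking]
    for finding in blockers:
        print(f"BLOCKING {finding['id']} ({finding['severity']}) in {finding['package']}")
    return 1 if blockers else 0


if __name__ == "__main__":
    # Placeholder findings; the IDs and package names are made up for illustration.
    findings = [
        {"id": "VULN-0001", "package": "libexample", "severity": "CRITICAL"},
        {"id": "VULN-0002", "package": "otherlib", "severity": "LOW"},
    ]
    # In a real pipeline step this value would be passed to sys.exit().
    print("exit code:", gate_exit_code(findings))
```

A CI step that exits non-zero stops the pipeline, so the image is never pushed to the registry or promoted.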

Quick Check: Continuous Delivery and Platform Engineering

A platform team wants to deploy a new API version to production while minimizing risk. They want to route 5% of production traffic to the new version, monitor error rates, and gradually increase traffic if metrics look healthy. Which deployment strategy should they use?

Domain 4: 12% of exam

Platform APIs and Provisioning Infrastructure

This domain covers how platforms expose APIs through Kubernetes extension mechanisms — CRDs, custom controllers, operators — and how infrastructure is provisioned through cloud native tooling like Crossplane and Terraform. The focus is on infrastructure-as-code in a Kubernetes-native context.

Key Topics

Custom Resource Definitions, Kubernetes Operators, Crossplane, Terraform, Kubernetes API Machinery, Reconciliation Loop

Must-Know Concepts

CRD lifecycle: define schema in YAML > apply to cluster > Kubernetes API server accepts instances > custom controller watches and acts on instances
The operator pattern: a software extension that uses CRDs and a custom controller to manage the complete lifecycle of a complex application (install, configure, backup, upgrade, recover)
Reconciliation loop: watch for changes > observe current state > compare to desired state > take actions to reconcile > repeat continuously
Infrastructure provisioning approaches: declarative (Terraform, Crossplane) vs imperative (scripts, manual). Platform teams should use declarative for reproducibility
Crossplane architecture: providers (connect to cloud APIs), managed resources (map to cloud resources like RDS instances), composite resources (abstract multiple managed resources), and compositions (templates for composite resources)
The difference between infrastructure provisioning (creating the resource) and infrastructure configuration (managing settings after creation) — both should be handled declaratively
Kubernetes API groups and versioning: how API resources are organized (apiVersion: apps/v1, batch/v1, etc.) and why versioning matters for CRD stability
Webhook admission patterns: how mutating and validating webhooks extend the Kubernetes API admission chain for platform governance

Common Exam Traps

A CRD defines the schema; a custom controller provides the behavior. Without a controller, CRD instances are just stored — nothing happens to them

Crossplane composite resources abstract underlying cloud resources from developers. A developer creates a DatabaseClaim; Crossplane creates the actual RDS instance. The abstraction is the key design principle
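
The abstraction can be pictured as a function from a small developer-facing claim to the provider-specific resources behind it. The sketch below only illustrates the idea: the field names, resource kinds, and size mapping are hypothetical and do not follow the actual Crossplane schema.

```python
# Hypothetical mapping owned by the platform team, hidden from developers.
SIZE_TO_INSTANCE_CLASS = {"small": "db.t3.micro", "large": "db.m5.large"}


def compose(claim):
    """Expand a developer-facing claim into provider-specific managed resources."""
    name = claim["name"]
    return [
        {"kind": "SubnetGroup", "name": f"{name}-subnets"},
        {
            "kind": "RDSInstance",
            "name": f"{name}-db",
            "engine": claim["engine"],
            "instance_class": SIZE_TO_INSTANCE_CLASS[claim["size"]],
        },
    ]


if __name__ == "__main__":
    # The developer states intent only: which engine and roughly how big.
    claim = {"name": "orders", "engine": "postgres", "size": "small"}
    for resource in compose(claim):
        print(resource)
```

The developer never sees instance classes or subnet groups, and the platform team can change the mapping without touching any claim.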

Operators are for complex STATEFUL applications that have operational knowledge built in (backup, failover, upgrades). For simple stateless apps, Deployments and Helm charts are sufficient — operators are overkill

Terraform is NOT Kubernetes-native. It uses its own state file and CLI. Crossplane runs inside Kubernetes and uses the reconciliation loop — no external state file needed for Kubernetes-managed resources

When a custom controller crashes or is restarted, it should be able to RECONCILE from current state in the cluster — it must be idempotent. Reconciliation must be safe to run repeatedly

Quick Check: Platform APIs and Provisioning Infrastructure

A platform team wants developers to request a managed PostgreSQL database by creating a YAML manifest in Kubernetes, without knowing the underlying cloud provider details. What is the cloud native approach?

Domain 5: 8% of exam

IDPs and Developer Experience

This domain covers Internal Developer Platforms — how platform teams build self-service environments that improve developer productivity. Topics include service catalogs, developer portals (Backstage), golden paths, and how AI/ML automation is emerging in platform tooling.

Key Topics

Backstage, Service Catalogs, Golden Paths, Software Templates, Developer Portals, Self-Service Infrastructure

Must-Know Concepts

Platform engineering goal: reduce cognitive load on developers by providing self-service, paved roads, and golden paths for common tasks
Golden paths: the recommended, pre-built, opinionated workflows that platform teams provide for common developer tasks (create a service, provision a database, set up CI). Golden paths, not golden cages — developers can deviate but it costs more
Backstage core components: Software Catalog (register and discover all services, APIs, teams), Software Templates (scaffolding for new services), TechDocs (documentation as code), and the Plugin ecosystem
Service catalog: a registry of all software components, services, APIs, and resources in the organization. Enables discoverability, ownership tracking, and dependency mapping
Developer portals vs IDPs: the portal (Backstage) is the UI layer; the IDP is the entire platform. The portal surfaces capabilities the IDP provides
Self-service provisioning: developers should be able to create standard resources (namespaces, databases, CI pipelines) without filing tickets or waiting for ops teams
AI/ML in platform tooling: AI-assisted code review, AI-generated runbooks, ML-based anomaly detection in observability, and LLM-powered developer assistants integrated into portals

Common Exam Traps

Backstage is a framework, not a finished product. Organizations must install, configure, and maintain it. It requires ongoing investment — it is not plug-and-play

Golden paths should be maintained as first-class products, not one-time setups. If the golden path becomes outdated, developers bypass it and the productivity benefit is lost

Service catalogs without ownership data are incomplete. The catalog is most valuable when it shows WHO owns each service — essential for incident response and change management
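
To make the ownership point concrete, here is a minimal in-memory catalog. The entry shape is hypothetical and is not the Backstage catalog format; the service and team names are placeholders.

```python
CATALOG = [
    {"name": "checkout-api", "owner": "team-payments", "depends_on": ["orders-db"]},
    {"name": "orders-db", "owner": "team-data", "depends_on": []},
    {"name": "legacy-batch", "owner": None, "depends_on": ["orders-db"]},
]


def owner_of(catalog, name):
    """Look up who owns a component; raises KeyError if it is not registered."""
    for entry in catalog:
        if entry["name"] == name:
            return entry["owner"]
    raise KeyError(name)


def unowned(catalog):
    """Components with no owner recorded: gaps to close before the next incident."""
    return [entry["name"] for entry in catalog if not entry["owner"]]


def owners_to_notify(catalog, name):
    """Owners of every component that depends on the given one."""
    return sorted(
        {e["owner"] for e in catalog if name in e["depends_on"] and e["owner"]}
    )


if __name__ == "__main__":
    print(owner_of(CATALOG, "orders-db"))
    print(unowned(CATALOG))
    print(owners_to_notify(CATALOG, "orders-db"))
```

When `orders-db` has an incident, the catalog can name one dependent team but has nobody to contact for `legacy-batch`, which is exactly the gap the trap above warns about.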

Developer portals are for DEVELOPERS — they should reduce friction, not add it. If onboarding to the portal is harder than the manual process, adoption will be zero

Quick Check: IDPs and Developer Experience

A platform team has built golden paths for creating new microservices. A developer needs to create a new service type not covered by any existing golden path. What should the developer do?

Domain 6: 8% of exam

Measuring your Platform

The smallest domain at 8%. Covers how platform engineering teams measure the effectiveness and efficiency of their platforms using DORA metrics, developer experience metrics, and platform-specific KPIs. Understanding what to measure and why is key.

Key Topics

DORA Metrics, Platform KPIs, DevEx Metrics, Error Budgets, Adoption Metrics

Must-Know Concepts

DORA four key metrics: Deployment Frequency, Lead Time for Changes, Change Failure Rate, and Mean Time to Restore (MTTR)
DORA performance buckets: Elite performers have deployment frequency multiple times per day, lead time under one hour, change failure rate < 5%, MTTR under one hour
Deployment Frequency: how often code is deployed to production. Higher frequency (with stability) indicates mature delivery pipeline
Lead Time for Changes: time from code commit to running in production. Measures the efficiency of the entire delivery pipeline
Change Failure Rate: percentage of deployments that cause an incident or require rollback. Measures deployment quality
MTTR (Mean Time to Restore): average time to restore service after a production incident. Measures platform resilience and incident response effectiveness
Platform adoption metrics: golden path adoption rate, developer portal active users, self-service request volume, ticket volume reduction
Developer experience (DevEx) metrics: developer satisfaction scores (surveys), onboarding time for new developers, time to first deployment, cognitive load measures
Relationship between DORA metrics: teams should aim to improve ALL four simultaneously. High deployment frequency with high change failure rate is not mature — stability matters as much as speed
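
The four metrics defined above can be computed from a simple deployment log. The sketch below assumes each record carries hypothetical `committed`, `deployed`, `failed`, and (for failures) `restored` fields, and that the log is non-empty; real tooling would pull these from the CI/CD and incident systems.

```python
from datetime import datetime, timedelta
from statistics import mean


def dora_metrics(deployments, window_days):
    """Compute the four DORA metrics from a non-empty list of deployment records."""
    failures = [d for d in deployments if d["failed"]]
    lead_hours = [
        (d["deployed"] - d["committed"]).total_seconds() / 3600 for d in deployments
    ]
    restore_hours = [
        (d["restored"] - d["deployed"]).total_seconds() / 3600 for d in failures
    ]
    return {
        "deployment_frequency_per_day": len(deployments) / window_days,
        "lead_time_hours": mean(lead_hours),
        "change_failure_rate": len(failures) / len(deployments),
        "time_to_restore_hours": mean(restore_hours) if restore_hours else 0.0,
    }


if __name__ == "__main__":
    t0 = datetime(2026, 1, 1, 9, 0)
    hours = lambda n: timedelta(hours=n)
    log = [
        {"committed": t0, "deployed": t0 + hours(1), "failed": False},
        {"committed": t0 + hours(2), "deployed": t0 + hours(4), "failed": True,
         "restored": t0 + hours(4.5)},
        {"committed": t0 + hours(24), "deployed": t0 + hours(27), "failed": False},
        {"committed": t0 + hours(30), "deployed": t0 + hours(32), "failed": False},
    ]
    for metric, value in dora_metrics(log, window_days=2).items():
        print(f"{metric}: {value:.2f}")
```

Reading the four numbers together matters: a team could double its deployment frequency here while its change failure rate also climbs, and the result would not count as an improvement.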

Common Exam Traps

MTTR is Mean Time to RESTORE — not Mean Time to REPAIR or RESOLVE (though these terms are sometimes used interchangeably in practice). The exam uses 'restore' as recovering service, not fixing root cause

A high deployment frequency with a high change failure rate is WORSE than moderate frequency with low failure rate. DORA metrics must improve together, not independently

Change Failure Rate measures deployments that CAUSE incidents — it does not measure all failures. A deployment that succeeds but later causes a subtle performance issue may not be captured immediately

Error budgets are derived from SLOs — they represent how much unreliability you can afford. Exhausting the error budget signals it is time to prioritize reliability work over new features

Measuring platform adoption is as important as technical metrics. A technically excellent platform with zero developer adoption provides no business value

Quick Check: Measuring your Platform

A platform team deploys to production 15 times per day but 30% of those deployments cause incidents requiring rollback. Using DORA metrics, how should this be characterized?

Concepts You Must Not Confuse

These pairs appear on nearly every exam. Learn the difference and you'll avoid the most common traps.

GitOps (Pull-based CD) vs Traditional Push-based CD

Use GitOps (Pull-based CD) when…

The cluster agent (Argo CD, Flux) continuously pulls from Git and reconciles cluster state. Git is the single source of truth. Drift is automatically detected and corrected.

Use Traditional Push-based CD when…

CI/CD pipeline pushes deployments directly to the cluster using kubectl or Helm commands. Cluster state can drift from what was deployed without detection.

Exam trap

GitOps is PULL-based — the cluster pulls from Git. Push-based CD pipelines push to the cluster. GitOps provides continuous reconciliation and drift correction that push models lack.

Argo CD vs Flux

Use Argo CD when…

Ships as a single application with a rich web UI for managing GitOps deployments. App-of-apps pattern for managing multiple applications. Strong multi-tenancy through Projects.

Use Flux when…

Modular GitOps Toolkit approach with composable controllers. Stronger Helm and Kustomize native integration. More flexible multi-cluster bootstrap approach.

Exam trap

Both Argo CD and Flux implement the same GitOps principles (pull-based, Git as source of truth, continuous reconciliation). Choose based on team preference and tooling ecosystem, not principles — the principles are identical.

OPA/Gatekeeper vs Kyverno

Use OPA/Gatekeeper when…

Uses Rego policy language — very powerful and general-purpose. Can enforce complex multi-resource policies. Steeper learning curve. OPA can also be used outside Kubernetes.

Use Kyverno when…

Uses Kubernetes-native YAML policies — easier learning curve for Kubernetes operators. Native support for generating resources and mutating resources alongside validation.

Exam trap

Both are Kubernetes admission webhook policy engines. OPA/Gatekeeper uses Rego (a dedicated policy language), while Kyverno uses YAML (familiar to Kubernetes users). Kyverno can also GENERATE and MUTATE resources, not just validate.

SLO (Service Level Objective) vs SLA (Service Level Agreement)

Use SLO (Service Level Objective) when…

Internal reliability target set by the engineering team. A goal, not a contract. Defines the acceptable reliability threshold and drives error budget calculations. Violation has no contractual consequences.

Use SLA (Service Level Agreement) when…

External contractual commitment with customers or stakeholders. Violation results in penalties (credits, refunds). SLAs are typically set looser than internal SLOs (a lower target), which leaves a buffer between the two.

Exam trap

SLOs are INTERNAL targets that should be stricter than your SLAs. If your SLA is 99.9% uptime, your SLO should be 99.95% so that SLO violations trigger internal action before the SLA is breached.
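
The buffer between the two targets can be expressed as a three-way check. This is a small sketch using the example numbers above (SLA 99.9%, SLO 99.95%); the function name and status strings are illustrative.

```python
def reliability_status(measured, slo, sla):
    """Classify measured availability (percent) against the SLO and the SLA."""
    if slo < sla:
        raise ValueError("the SLO should be at least as strict as the SLA")
    if measured < sla:
        return "SLA breached"
    if measured < slo:
        return "SLO missed, SLA still met"
    return "within SLO"


if __name__ == "__main__":
    for measured in (99.97, 99.92, 99.80):
        print(measured, "->", reliability_status(measured, slo=99.95, sla=99.9))
```

The middle outcome is the useful one: the team gets an internal signal to act while customers are still inside their contractual guarantee.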

Custom Resource Definition (CRD) vs Kubernetes Operator

Use Custom Resource Definition (CRD) when…

Defines the schema for a new Kubernetes resource type. Tells Kubernetes what fields the new resource accepts. Provides no behavior on its own.

Use Kubernetes Operator when…

A controller that watches a specific CRD and implements the business logic to manage the lifecycle of the custom resource. The operator gives behavior to the CRD.

Exam trap

A CRD without an operator is just a data store — Kubernetes accepts the resource but nothing acts on it. The operator watches for CRD instances and reconciles them to the desired state. You need BOTH the CRD and the operator together.

Metrics (Prometheus) vs Traces (Jaeger/Tempo)

Use Metrics (Prometheus) when…

Aggregated numerical measurements over time. Answer questions like 'what is the error rate?' and 'how many requests per second?'. Efficient storage via time-series. Best for alerting and dashboards.

Use Traces (Jaeger/Tempo) when…

Distributed end-to-end request paths through microservices. Answer questions like 'which service is causing latency?' and 'where did this request fail?'. Essential for debugging distributed systems.

Exam trap

Metrics tell you THAT something is wrong (error rate spiked). Traces tell you WHERE and WHY (which microservice in the call chain failed). Both are needed for effective platform observability — metrics for alerting, traces for root cause analysis.

IDP (Internal Developer Platform) vs Developer Portal (Backstage)

Use IDP (Internal Developer Platform) when…

The entire self-service platform including all infrastructure, services, pipelines, and tooling that developers use to build and deploy applications. Encompasses multiple systems and tools.

Use Developer Portal (Backstage) when…

The UI layer of an IDP — a centralized web interface where developers discover services, create projects, view documentation, and interact with the platform. Backstage is the most common framework.

Exam trap

A developer portal is ONE component of an IDP — the front-end interface. The IDP itself includes the CI/CD system, secret management, monitoring, and all platform services. Backstage provides the portal UI; it does not replace the entire platform.

Mutating Admission Webhook vs Validating Admission Webhook

Use Mutating Admission Webhook when…

Intercepts API requests and can MODIFY the resource before it is persisted (e.g., inject sidecar containers, set default values, add labels). Runs BEFORE validating webhooks.

Use Validating Admission Webhook when…

Intercepts API requests and can ALLOW or DENY them based on policy, but CANNOT modify the resource. Runs AFTER mutating webhooks.

Exam trap

Execution order matters: Mutating webhooks run FIRST, then Validating webhooks. Kyverno registers BOTH types of webhooks (MutatingWebhookConfiguration for mutate rules, ValidatingWebhookConfiguration for validate and generate rules). OPA/Gatekeeper primarily validates but also supports mutation via assign/assignMetadata. A validating webhook sees the resource AFTER any mutations have been applied.
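To make the ordering concrete, here is a small Python simulation of the two phases. It is a sketch only and does not use the Kubernetes API: the `admit` function, the `team` label rule, and the dictionary-shaped resources are hypothetical stand-ins for real webhooks and objects.

```python
import copy


def add_default_team_label(resource):
    """Mutating step: returns a modified copy of the resource."""
    mutated = copy.deepcopy(resource)
    mutated.setdefault("labels", {}).setdefault("team", "unassigned")
    return mutated


def require_team_label(resource):
    """Validating step: returns (allowed, message) and never modifies the resource."""
    if "team" in resource.get("labels", {}):
        return True, "ok"
    return False, "label 'team' is required"


def admit(resource, mutators, validators):
    """Run every mutator first, then every validator on the mutated result."""
    for mutate in mutators:
        resource = mutate(resource)
    for validate in validators:
        allowed, message = validate(resource)
        if not allowed:
            return None, message  # request denied, nothing is persisted
    return resource, "admitted"


if __name__ == "__main__":
    pod = {"name": "web"}
    # With the mutator in place, the validator sees the defaulted label.
    print(admit(pod, [add_default_team_label], [require_team_label]))
    # Without the mutator, the same validator rejects the same pod.
    print(admit(pod, [], [require_team_label]))
```

The same unlabeled pod is admitted or rejected depending on whether a mutation ran first, which is why a question about what a validating policy "sees" hinges on the execution order.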

Top Mistakes to Avoid

Confusing GitOps (pull-based, cluster pulls from Git) with traditional push-based CI/CD (pipeline pushes to cluster) — the pull model and drift correction are what define GitOps

Treating CRDs and operators as the same thing — a CRD defines the schema and a controller/operator implements the behavior; you need both

Mixing up SLO and SLA: the SLO is the internal target (stricter), the SLA is the external commitment (looser). Because the SLO is stricter, you miss it first, which gives you time to fix issues before the SLA is breached

Thinking DORA metrics only measure speed — change failure rate and MTTR measure stability. Elite performers are fast AND stable simultaneously

Confusing Backstage (a developer portal framework) with an IDP (the entire self-service platform) — Backstage is one UI layer component of a broader IDP

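Assuming network policies grant access on top of a locked-down default: without any network policy, Kubernetes allows all pod-to-pod traffic, so policies add RESTRICTIONS. Once a policy selects a pod, only the traffic that policy explicitly allows is permitted in that direction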

Forgetting that OPA/Gatekeeper and Kyverno policies in enforce mode only affect NEW or UPDATED resources — existing violations require audit mode to surface and separate cleanup

Thinking continuous deployment and continuous delivery are the same — delivery means you CAN deploy manually; deployment means you DO deploy automatically on every passing commit

Confusing canary (traffic percentage splitting) with blue-green (two environments, instant traffic switch) — they are different risk management strategies with different rollback characteristics

Assuming Crossplane replaces Terraform — Crossplane is Kubernetes-native for cloud resource management via the reconciliation loop; Terraform is a standalone IaC tool. Both have valid use cases
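Several items above rely on the reconciliation loop (GitOps drift correction, operators, Crossplane). The following Python sketch shows the idea under simplifying assumptions: state is a plain dictionary, the `reconcile` and `apply` helpers are hypothetical, and deletion of unknown resources is always on, whereas real tools often make pruning opt-in.

```python
def reconcile(desired, actual):
    """Compare desired and actual state and return the actions needed to converge."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))  # drift detected
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))  # assumes pruning is enabled
    return actions


def apply(actions, desired, actual):
    """Return a new actual state with the given actions applied."""
    result = dict(actual)
    for verb, name in actions:
        if verb == "delete":
            del result[name]
        else:
            result[name] = desired[name]
    return result


if __name__ == "__main__":
    # Desired state as it would be declared in Git (hypothetical resources).
    desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
    # Actual state in the cluster after someone changed it by hand.
    actual = {"web": {"replicas": 1}, "old": {"replicas": 1}}

    actions = reconcile(desired, actual)
    print(actions)
    actual = apply(actions, desired, actual)
    print(reconcile(desired, actual))  # nothing left to do once converged
```

A push-based pipeline runs once per commit and then stops; a loop like this runs continuously inside the cluster, so manual changes are detected and reverted without a new commit.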

Exam-Ready Checklist

Can explain all 6 exam domains and their relative weights: 36%, 20%, 16%, 12%, 8%, 8%

Know the four OpenGitOps principles and the difference between push-based and pull-based deployment models

Can explain declarative resource management vs imperative and why declarative is the cloud native standard

Understand the Kubernetes reconciliation loop and how it applies to GitOps, operators, and Crossplane

Know all three observability pillars (metrics, logs, traces), the tools for each (Prometheus, Loki, Jaeger), and the SLI/SLO/SLA/error budget framework

Can explain Kubernetes RBAC, network policies, pod security standards, and when to use each

Understand mTLS in service meshes and why it provides authentication + encryption without code changes

Know OPA/Gatekeeper vs Kyverno — policy language, modes (audit/enforce), and when each runs in the admission chain

Can explain CRDs + operators and why both are needed (schema vs behavior)

Understand the IDP concept, what Backstage provides (Software Catalog, Templates, TechDocs, Plugins), and what golden paths mean

Know all four DORA metrics by name, what each measures, and the elite performer benchmarks

Understand deployment strategies: rolling update, blue-green, canary — when to use each and rollback characteristics

Can explain Crossplane's role in cloud native infrastructure provisioning and how it differs from Terraform

Scored 75%+ on at least two full mock exams (the passing score is 75%). Aim for 85%+ for a comfortable margin
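For the SLI/SLO/SLA/error budget item in this checklist, a short worked calculation can help fix the relationship in memory. This is a minimal Python sketch: the 99.9% SLO, the 99.5% SLA, the 30-day window, and the 60 minutes of downtime are illustrative assumptions, not figures from the exam.

```python
def error_budget_minutes(target, window_days=30):
    """Allowed downtime in minutes for an availability target over a window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - target)


def breached(target, downtime_minutes, window_days=30):
    """True if observed downtime exceeds the budget implied by the target."""
    return downtime_minutes > error_budget_minutes(target, window_days)


if __name__ == "__main__":
    slo = 0.999  # internal target (stricter), assumed value
    sla = 0.995  # external commitment (looser), assumed value
    downtime = 60  # minutes of downtime observed this window, assumed value

    print(f"SLO budget: {error_budget_minutes(slo):.1f} min")
    print(f"SLA budget: {error_budget_minutes(sla):.1f} min")
    print("SLO breached:", breached(slo, downtime))
    print("SLA breached:", breached(sla, downtime))
```

With these numbers the stricter SLO is exceeded while the SLA still holds, which is the early warning the gap between the two is meant to provide.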

Recommended Resources

Free & Official Resources

CNCF CNPA Exam Curriculum

Official CNPA exam page with curriculum outline, candidate handbook, and registration information from the Linux Foundation.

Official

Kubernetes Official Documentation

The authoritative reference for all Kubernetes concepts. Required reading for CRDs, RBAC, network policies, operators, and the Kubernetes API.

Free

CNCF Platforms White Paper

The CNCF TAG App Delivery white paper on platform engineering — defines IDPs, platform capabilities, and platform maturity model. Essential reading.

Free

OpenGitOps Principles

The four core GitOps principles defined by the OpenGitOps working group. Authoritative source for GitOps concepts tested on the exam.

Free

CNCF Landscape

Interactive map of all CNCF projects organized by category. Essential for understanding which tools fit which platform engineering use cases.

Free

DORA State of DevOps Report

The source research for DORA metrics with definitions, benchmarks, and elite performer thresholds. Free annual reports available.

Free

Argo CD Documentation

Official Argo CD docs covering GitOps concepts, sync policies, app-of-apps, and multi-cluster management.

Free

Backstage Documentation

Official Backstage docs covering Software Catalog, Software Templates, TechDocs, and the plugin system.

Free

Killercoda CNCF Labs

Free browser-based interactive Kubernetes and CNCF tool labs. Hands-on practice without any local setup required.

Free

Free CNPA Practice Questions

Free practice questions on this site covering all CNPA exam domains.

Free

Paid Courses & Practice Exams

These are recommended if you prefer a structured learning path. They can save time but are not required to pass.

Linux Foundation LFS263: Platform Engineering Fundamentals

Official Linux Foundation training course designed to align with the CNPA exam curriculum.

Paid

Udemy: Cloud Native Platform Engineering

Search for CNPA or cloud native platform engineering courses with practice exams and video content.

Paid
