Compute Engine vs GKE vs Cloud Run — Which GCP Compute Service to Choose (2026)

The hardest single ACE topic is not memorising facts. It is making the right compute choice when two options both could work. Google's exam writers love giving you scenarios where Cloud Run, GKE, and Compute Engine could all technically run the workload — and expecting you to pick the one that minimises operational overhead while meeting the constraints.

This guide walks through each option, explains when each is the right answer, and lists the phrases in exam scenarios that signal which one to pick.

The five GCP compute options

From most managed to least managed:

Cloud Functions (Gen 2) — event-driven, single-function, fully managed.
Cloud Run — containerised, request-driven (or job-driven), fully managed.
App Engine — managed runtime for specific languages (standard) or containers (flex). Largely superseded by Cloud Run.
GKE (Autopilot or Standard) — managed Kubernetes. With Autopilot, Google manages the nodes. With Standard, you manage the nodes.
Compute Engine — raw VMs. Full OS control, you manage everything inside.

The decision tree

Run through these questions in order:

Is the workload a single event-triggered function (one HTTP endpoint, one Pub/Sub trigger, one Cloud Storage event)? → Cloud Functions Gen 2.
Is it a stateless request/response container that should scale to zero? → Cloud Run service.
Is it a finite batch job that runs to completion? → Cloud Run jobs (or Batch for very large parallel workloads).
Does the workload need Kubernetes-specific features (StatefulSets, Operators, complex service meshes, GitOps-driven multi-team clusters)? → GKE. Default to Autopilot.
Does it need node-level customisation (DaemonSets, privileged containers, custom CNI, specific node OS, node-local storage)? → GKE Standard.
Does it need a full VM (lift-and-shift, GPU/TPU, large memory beyond container limits, specialised software that does not containerise)? → Compute Engine.
None of the above clearly apply? → Cloud Run is the safe default.
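
The questions above can be sketched as a small function. This is only an illustration of the ordering: the Workload class and its field names are hypothetical, and the sketch assumes that a node-level requirement overrides the Autopilot default, so that check runs before the general Kubernetes one.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical flags, one per question in the decision tree.
    single_event_function: bool = False
    stateless_scale_to_zero: bool = False
    finite_batch_job: bool = False
    needs_k8s_features: bool = False
    needs_node_customisation: bool = False
    needs_full_vm: bool = False


def choose_compute(w: Workload) -> str:
    """Return the first option whose question is answered 'yes'."""
    if w.single_event_function:
        return "Cloud Functions Gen 2"
    if w.stateless_scale_to_zero:
        return "Cloud Run service"
    if w.finite_batch_job:
        return "Cloud Run jobs"
    # Assumption: node-level needs override the Autopilot default,
    # so this is checked before the general Kubernetes question.
    if w.needs_node_customisation:
        return "GKE Standard"
    if w.needs_k8s_features:
        return "GKE Autopilot"
    if w.needs_full_vm:
        return "Compute Engine"
    # Nothing clearly applies: fall back to the safe default.
    return "Cloud Run service"


print(choose_compute(Workload(finite_batch_job=True)))
print(choose_compute(Workload(needs_k8s_features=True, needs_node_customisation=True)))
```

Because the function returns on the first match, a workload that is both a stateless container and a candidate for a VM still lands on Cloud Run, which mirrors the "least operational overhead" bias of the exam.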

Cloud Run in depth

Cloud Run runs containers and bills you by request and per-container CPU/memory time. Two flavours:

Cloud Run services — request/response. HTTP traffic comes in, container responds, scales 0 to N based on traffic.
Cloud Run jobs — finite batch. You define a job, run it manually or on a schedule, it executes to completion, exits, you pay for execution time only.

Cloud Run service essentials

Revisions: every deploy creates a new revision. You can split traffic between revisions (e.g., 90/10 canary).
Min/max instances: min=0 scales to zero (cheapest, has cold starts). min=1+ keeps warm instances (no cold start, costs more).
Concurrency: how many concurrent requests one container handles. Default 80. Can be raised up to 1000 or lowered to 1 (for workloads that cannot handle concurrency).
CPU allocation: “CPU always allocated” (good for background tasks) vs “CPU only during request” (cheaper, default).
Memory/CPU limits: configurable per service. Up to 32 GiB memory and 8 vCPUs per container.
VPC connector or Direct VPC egress: for accessing private resources inside a VPC.
Public ingress requires IAM binding: allUsers → roles/run.invoker. Without it, returns 403.
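
To see how the concurrency and min/max instance settings interact, here is a rough back-of-the-envelope estimate. It assumes steady traffic and uses Little's law (requests in flight = arrival rate × average latency). The function name, its parameters, and the max_instances default of 100 are illustrative, not Cloud Run's actual autoscaling algorithm.

```python
import math


def estimate_instances(rps: float, avg_latency_s: float, concurrency: int = 80,
                       min_instances: int = 0, max_instances: int = 100) -> int:
    """Rough instance count for a steady request rate (a sketch, not the real autoscaler)."""
    if not 1 <= concurrency <= 1000:
        raise ValueError("concurrency must be between 1 and 1000")
    # Little's law: average requests in flight = arrival rate x time in system.
    in_flight = rps * avg_latency_s
    needed = math.ceil(in_flight / concurrency)
    # Clamp to the configured floor and ceiling.
    return max(min_instances, min(needed, max_instances))


# 100 RPS at 0.5 s average latency means about 50 requests in flight.
print(estimate_instances(100, 0.5))                 # default concurrency of 80
print(estimate_instances(100, 0.5, concurrency=1))  # one request per container
print(estimate_instances(0, 0.5))                   # idle with min_instances=0
```

The same traffic needs far more containers at concurrency 1 than at the default of 80, which is why lowering concurrency is reserved for workloads that genuinely cannot handle parallel requests.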

Where Cloud Run wins

HTTP APIs and microservices with variable traffic.
Public-facing web apps and websites (rendered in container).
Webhooks and integration endpoints.
Internal tools serving HTTP traffic to authenticated users.
Scheduled batch jobs (Cloud Run jobs triggered by Cloud Scheduler).

Where Cloud Run loses

Long-running, always-on workloads where scale-to-zero is not useful and the warm-instance cost exceeds GKE node cost at scale.
Workloads requiring StatefulSets, persistent volumes, or persistent identity within a cluster.
Workloads needing more than 32 GiB memory or 8 vCPUs per instance.
Workloads that take more than 60 minutes per request (Cloud Run service has a 60-minute request timeout max).
Complex multi-container pod patterns (Cloud Run supports sidecar containers alongside a single ingress container, but not full Kubernetes pod specs).

GKE Autopilot

Autopilot is Google's “managed Kubernetes” mode. Google manages the nodes; you only see and manage the pods. Bills per pod by CPU/memory/storage.

Where Autopilot wins

Standard web/API workloads on Kubernetes.
Teams adopting GKE without dedicated Kubernetes-ops expertise.
Multi-team clusters where you want consistent operational guardrails.
Workloads that need Kubernetes primitives (Deployments, Services, Ingress, ConfigMaps, Secrets) but not node-level features.

Where Autopilot loses

DaemonSets (limited support — some are allowed, custom ones often blocked).
Privileged containers (not allowed).
Custom CNI plugins (not allowed).
HostNetwork or hostPath volumes (restricted).
Workloads requiring specific node OS images or kernel parameters.
Cost-sensitive high-scale workloads where per-pod pricing exceeds per-node pricing at your utilisation.

GKE Standard

You manage the nodes. Bills per VM (node). Full Kubernetes feature set with no Autopilot restrictions.

Where Standard wins

Workloads needing DaemonSets, privileged containers, or custom CNI.
Cost optimisation at scale — at high utilisation, packing pods densely onto nodes is cheaper than per-pod pricing.
Workloads needing specific node configurations (machine types not in Autopilot's catalogue, custom OS images, node taints/tolerations).
Multi-tenant clusters with custom isolation needs.
Existing Kubernetes operators (Istio, Knative, Strimzi, OperatorHub) that assume node access.

Where Standard loses

Small teams without Kubernetes-ops capacity — node management becomes the bottleneck.
Low-traffic workloads — paying for nodes when there is no work to do is wasteful.
Workloads that have no Kubernetes-specific requirements — using Standard for a basic web app is overkill.

Compute Engine

Raw VMs. You pick the machine type, the OS, the disk. You install software, manage patches, configure networking. Bills per VM-second.

Where Compute Engine wins

Lift-and-shift migrations of existing VM workloads.
Specialised hardware needs (GPUs for ML, TPUs for TensorFlow, large memory or high-CPU instances).
Software that does not containerise cleanly (legacy enterprise software, licensed software bound to MAC addresses, complex stateful services).
Single-tenant workloads where you want a sole-tenant node.
Spot/preemptible VMs for cost-sensitive batch workloads (up to 91% cheaper than on-demand).
Custom OS images or kernel configurations.

Where Compute Engine loses

New stateless web workloads — Cloud Run is simpler and cheaper for typical traffic patterns.
Auto-scaling workloads where you want true zero-cost-at-idle — even MIGs do not scale to zero.
Teams optimising for operational simplicity — Compute Engine is the most ops-heavy option.

Compute Engine features to know for ACE

Machine families: e2 (cost-optimised general), n2/n2d (balanced general), c2/c3 (compute-optimised), m1/m2/m3 (memory-optimised), a2/a3 (GPU/accelerator).
Spot VMs: up to 91% cheaper, can be terminated by Google with 30-second notice. Legacy preemptible VMs have a maximum lifetime of 24 hours; Spot VMs have no maximum lifetime.
Sustained use discounts: automatic discount for VMs running >25% of a month.
Committed use discounts (CUDs): 1- or 3-year commit for up to 57% discount on regular VMs.
Managed Instance Groups (MIGs): deploy N identical VMs from an instance template, autoscale on metrics, autoheal on health checks. Regional MIGs spread across zones for HA.
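
The discount figures above are easier to remember with a little arithmetic. The sketch below applies the "up to" percentages from this list to a hypothetical on-demand price of $100 per month. Real discounts vary by machine type and region, and the 730-hour month is an assumption.

```python
SPOT_MAX_DISCOUNT = 0.91  # "up to 91%" for Spot VMs
CUD_MAX_DISCOUNT = 0.57   # "up to 57%" for committed use discounts
HOURS_PER_MONTH = 730     # assumption: average month length in hours


def discounted_monthly_cost(on_demand_monthly: float, discount: float) -> float:
    """Apply a fractional discount (0.0 to 1.0) to a monthly on-demand price."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0.0 and 1.0")
    return round(on_demand_monthly * (1.0 - discount), 2)


def sustained_use_applies(hours_run: float) -> bool:
    """Sustained use discounts start once a VM runs more than 25% of the month."""
    return hours_run / HOURS_PER_MONTH > 0.25


on_demand = 100.0  # hypothetical on-demand price in dollars per month
print(discounted_monthly_cost(on_demand, SPOT_MAX_DISCOUNT))  # best-case Spot price
print(discounted_monthly_cost(on_demand, CUD_MAX_DISCOUNT))   # best-case CUD price
print(sustained_use_applies(200))                             # about 27% of the month
```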

Cloud Functions (Gen 2)

Cloud Functions Gen 2 runs on Cloud Run infrastructure under the hood. The difference from Cloud Run is the developer experience: deploy a function, not a container.

Where Cloud Functions wins

Single-purpose event handlers (Pub/Sub → process message, Cloud Storage object created → process file, Eventarc event → react).
Webhooks where the simplest possible deployment is the goal.
Glue code between GCP services.

When to graduate to Cloud Run

You need more than one endpoint per deployment.
You want explicit container control (Dockerfile, custom base image).
You need longer execution times or more memory than Cloud Functions provides.

App Engine (legacy default)

App Engine standard runs Python, Java, Go, Node.js, Ruby, and PHP apps in a managed sandbox. Auto-scales (close to zero, not always actually zero), no infrastructure to manage. App Engine flex runs containers in managed VMs.

For new workloads, Cloud Run is the modern default. App Engine still works and is still on the ACE exam, but most decisions that historically went to App Engine now go to Cloud Run.

The exam will sometimes give scenarios where App Engine standard is the right answer (typically: a specific runtime language, no container experience on the team, want zero infrastructure). Most of the time, Cloud Run is the answer when both could work.

Side-by-side comparison

Feature	Compute Engine	GKE Standard	GKE Autopilot	Cloud Run	Cloud Functions
Container required?	No	Yes	Yes	Yes	No (function code)
Scales to zero?	No	No	No (pod-level yes, cluster no)	Yes	Yes
Node management	You	You	Google	Google	Google
OS management	You	Google (node OS)	Google	Google	Google
Billing model	Per VM-second	Per node + cluster fee	Per pod (CPU/mem)	Per request + CPU/mem time	Per invocation + CPU/mem time
Max instance memory	12 TB+	Node-dependent	128 GiB per pod	32 GiB	32 GiB (gen 2)
Cold start	None (warm)	Pod start time	Pod start time	~1-10 sec (mitigatable)	~1-10 sec
Best for	Lift-shift, GPU, custom OS	Full K8s features	Managed K8s	Modern stateless	Event glue

Pricing model differences

The pricing model often drives the right answer in “cost-sensitive” scenarios:

Compute Engine: pay for the VM 24/7 whether it does work or not (unless you stop it). Discounts: sustained-use, committed-use, spot.
GKE Standard: pay for nodes 24/7, plus the cluster management fee ($0.10/hour per cluster after the first free zonal cluster).
GKE Autopilot: pay per pod by CPU/memory/storage, plus the cluster management fee. Cheaper than Standard at low utilisation, more expensive at high utilisation.
Cloud Run: pay per request + per CPU/memory-second of actual processing. Scale-to-zero means zero cost at idle.
Cloud Functions: similar to Cloud Run — pay per invocation + per CPU/memory-time.
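
The trade-off between always-on and pay-per-use billing can be made concrete with a break-even calculation. In this sketch only the $0.10/hour cluster fee comes from the list above. The other hourly rates are hypothetical placeholders, the 730-hour month is an assumption, and per-request fees are ignored for simplicity.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def cluster_fee_monthly(hourly_fee: float = 0.10) -> float:
    """Monthly GKE cluster management fee at the given hourly rate."""
    return round(hourly_fee * HOURS_PER_MONTH, 2)


def always_on_cost(hourly_rate: float) -> float:
    """VM or node billed for every hour of the month, busy or idle."""
    return hourly_rate * HOURS_PER_MONTH


def pay_per_use_cost(busy_hourly_rate: float, busy_fraction: float) -> float:
    """Serverless-style billing: pay only for the fraction of time spent working."""
    return busy_hourly_rate * HOURS_PER_MONTH * busy_fraction


def break_even_busy_fraction(always_on_hourly: float, busy_hourly: float) -> float:
    """Busy fraction above which the always-on option becomes cheaper."""
    return always_on_hourly / busy_hourly


# Hypothetical rates: the always-on VM costs half as much per hour
# as the pay-per-use service does while it is processing.
vm_rate, serverless_rate = 0.05, 0.10
print(cluster_fee_monthly())
print(break_even_busy_fraction(vm_rate, serverless_rate))
print(pay_per_use_cost(serverless_rate, 0.1) < always_on_cost(vm_rate))
```

Under these made-up rates, pay-per-use wins while the service is busy less than half the time. This is the reasoning behind picking Cloud Run for bursty traffic and GKE or Compute Engine for steady 24/7 load.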

Exam scenario tells

Phrases that signal a specific answer:

Scenario phrase	Likely answer
“Pay nothing when idle” / “scale to zero”	Cloud Run or Cloud Functions
“Minimise operational overhead” (with container)	Cloud Run (then Autopilot if K8s-specific)
“No Kubernetes operations experience”	Cloud Run or GKE Autopilot
“Single event handler” / “Pub/Sub trigger”	Cloud Functions Gen 2
“DaemonSet” / “privileged container”	GKE Standard
“GPU” / “TPU” / “specialised hardware”	Compute Engine (sometimes GKE Standard with GPU nodes)
“Lift and shift” / “migrate existing VMs”	Compute Engine
“Batch job runs to completion”	Cloud Run jobs (or Batch for very large parallel)
“Stateful Kubernetes” / “StatefulSet”	GKE (Autopilot by default; Standard if node-level control is needed)
“Long-running 24/7 always-on workload”	GKE or Compute Engine (not Cloud Run)
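
As a study aid, a few rows of the table above can be turned into a simple phrase lookup. This is a toy sketch: the function name is hypothetical, it covers only a subset of the rows, and plain substring matching is no substitute for reading the full scenario and its constraints.

```python
# A subset of the scenario phrases from the table, in lower case.
TELLS = [
    ("scale to zero", "Cloud Run or Cloud Functions"),
    ("pub/sub trigger", "Cloud Functions Gen 2"),
    ("daemonset", "GKE Standard"),
    ("privileged container", "GKE Standard"),
    ("lift and shift", "Compute Engine"),
    ("runs to completion", "Cloud Run jobs"),
]


def likely_answers(scenario: str) -> list[str]:
    """Return the answers whose tell phrase appears in the scenario text."""
    text = scenario.lower()
    found: list[str] = []
    for phrase, answer in TELLS:
        # Keep table order and skip duplicate answers.
        if phrase in text and answer not in found:
            found.append(answer)
    return found


print(likely_answers("The API must scale to zero overnight."))
print(likely_answers("We need a DaemonSet and privileged container access."))
```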

Worked examples

Example 1: REST API for a SaaS product, bursty traffic, 5-100 RPS

Pick Cloud Run. Scale-to-zero is fine as long as the API can tolerate an occasional cold start; if it cannot, set min instances to 1 or more. Per-request billing matches the bursty pattern. No Kubernetes features needed.

Example 2: Internal data-processing pipeline, runs every hour, processes ~10GB

Pick Cloud Run jobs triggered by Cloud Scheduler. Finite batch work, runs to completion, no need for a persistent compute resource.

Example 3: Multi-team microservices platform, 50+ services, sidecar service mesh, GitOps

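Pick GKE. A sidecar service mesh and GitOps-driven multi-team clusters are Kubernetes-specific requirements, and Cloud Run is too constrained for the deployment patterns the team will use. Choose Standard if the mesh or operators need node-level access (privileged containers, custom CNI); otherwise Autopilot is the lower-overhead default.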

Example 4: ML training pipeline using GPUs

Pick Compute Engine (or GKE Standard with GPU node pools, depending on whether the team prefers Kubernetes orchestration). Specialised hardware rules out Cloud Run and Cloud Functions.

Example 5: Webhook receiver that processes Stripe events

Pick Cloud Functions Gen 2. Single endpoint, event-triggered, simplest possible deployment. Cloud Run would also work; Cloud Functions is preferred for the developer experience.

Example 6: Legacy Windows ERP system being migrated to GCP

Pick Compute Engine. Lift-and-shift of an existing VM workload. The system does not containerise cleanly and you need full OS control.

Frequently asked questions

Should I use Cloud Run or GKE for a containerised web app?

Default to Cloud Run unless you need something it does not give you. Cloud Run is simpler, scales to zero, and bills per request. Choose GKE when you need persistent state in cluster (StatefulSets), node-level customisation, complex workflow orchestration, or operational features Kubernetes provides that Cloud Run does not.

When should I use Compute Engine instead of GKE or Cloud Run?

Use Compute Engine when you need full OS control, specialised hardware (GPUs, TPUs, large memory), software that does not containerise cleanly, lift-and-shift VM migrations, or persistent stateful workloads where managing the OS is part of the workload. Most modern stateless workloads should not start with Compute Engine.

What is the difference between Cloud Run services and Cloud Run jobs?

Cloud Run services handle request/response traffic — they scale up to serve HTTP requests and scale to zero when idle. Cloud Run jobs run finite batch work to completion — they execute a task, exit, and bill only for execution time. Use services for APIs and websites; use jobs for scheduled batch processing or one-off tasks.

When should I use Cloud Functions instead of Cloud Run?

Use Cloud Functions for single-purpose, event-triggered functions where you want the absolute simplest deployment (just a function file, no container). Use Cloud Run when you need a full container, longer execution time, more memory, or HTTP routing. Cloud Functions Gen 2 actually runs on Cloud Run infrastructure under the hood — the difference is the developer experience.

Is App Engine still relevant in 2026?

It exists and works, but for new workloads Cloud Run has largely superseded it. App Engine standard is still on the ACE exam and remains good for Python, Java, Go, Node.js, Ruby, and PHP apps that fit its model. App Engine flex is mostly legacy — Cloud Run does what flex does, with less operational overhead.

Can Cloud Run reach private resources inside a VPC?

Yes, via a Serverless VPC Access connector or Direct VPC egress. With a connector, you configure the Cloud Run service to route egress traffic through the connector, which sits in your VPC and forwards traffic to private subnets. With Direct VPC egress, the service sends traffic into a VPC subnet without a connector. Either way, this is how Cloud Run reaches private Cloud SQL instances, internal load balancers, or on-prem resources over Cloud Interconnect or VPN.

Drill compute scenarios on ACE

Compute-selection scenarios are the most-tested topic on Domain 3 (25% of the exam). Practise with realistic questions. 30 free, no signup.
