Your team picked a serverless platform recently. Now you're staring at a cost anomaly in your billing dashboard, or a cold start that adds noticeable latency to every agent interaction, and you're wondering whether the other side of the fence looks any different.
This guide breaks down the architectural and operational differences between AWS Lambda and Google Cloud Functions (GCF) that matter for production decisions, with cited specifications so you can model the impact on your workloads.
AWS Lambda vs Google Cloud Functions: comparison table
| Dimension | AWS Lambda | Google Cloud Functions (2nd gen) |
|---|---|---|
| Concurrency model | One request per execution environment in standard compute | Up to 1,000 concurrent requests per instance |
| Cold start mitigation | SnapStart (Java, Python, .NET) plus Provisioned Concurrency | Startup CPU Boost plus Minimum Instances |
| Billing granularity | Per GB-second at 1ms granularity | vCPU-seconds and GiB-seconds, request-based or instance-based |
| Free tier | 1 million requests and 400,000 GB-seconds per month | 2 million invocations per month |
| Max execution timeout | 900 seconds (15 minutes), extendable via published durable workflow pattern | 3,600 seconds (60 minutes) for HTTP |
| Max memory | Up to 10,240 MB | Up to 32 GiB, configuration-dependent |
| Isolation technology | Firecracker microVM | Cloud Run sandbox (2nd gen), gVisor (1st gen) |
| Supported runtimes | Node.js, Python, Java, .NET, Ruby, Go, Rust, custom | Node.js, Python, Java, Go, Ruby, PHP, .NET |
| Event sources | Deep native AWS integrations (DynamoDB, Kinesis, SQS, EventBridge, SNS) | Eventarc with CloudEvents (Cloud Storage, Firestore, Pub/Sub) |
| Traffic splitting | CodeDeploy weighted aliases | Cloud Run traffic splitting |
What is AWS Lambda?
AWS Lambda is Amazon's function-as-a-service platform. You upload code as a ZIP or container image, configure a trigger, and Lambda handles execution in Firecracker microVM-isolated environments. Memory allocations run from 128 MB to 10,240 MB, with CPU allocated proportionally. One full vCPU kicks in at 1,769 MB of memory, functions can run for up to 15 minutes per invocation, and container images can reach 10 GB.
Recent additions have expanded Lambda beyond short-lived event processing. AWS has published a durable workflow pattern that lets Lambda-based workflows run for up to one year. SnapStart covers the Java, Python, and .NET runtimes for faster cold starts.
Lambda's event source ecosystem is its strongest pull. It natively integrates with DynamoDB, Kinesis, SQS, EventBridge, and SNS, among other AWS services. For teams deep in the AWS ecosystem, these integrations remove custom polling infrastructure.
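A minimal sketch of what this looks like in practice: an SQS-triggered handler where the event source mapping delivers a batch and retries only the message IDs reported back as failures (the processing step is elided; this is an illustrative shape, not AWS's reference code):

```python
import json

def handler(event, context):
    """Sketch of an SQS batch handler with partial-batch failure reporting."""
    failures = []
    for record in event.get("Records", []):
        try:
            body = json.loads(record["body"])
            # ... process `body` here; an exception marks the message failed ...
        except Exception:
            # Report this message ID back so only it is retried, not the
            # whole batch (requires ReportBatchItemFailures on the
            # event source mapping).
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

The event source mapping handles polling, batching, and retry bookkeeping; the function only declares which messages failed.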
What is Google Cloud Functions?
Google Cloud Functions 2nd gen is Google Cloud's serverless function offering, unified under Cloud Run. Documentation now refers to these as "Cloud Run functions." That unification gives functions access to Cloud Run capabilities: per-instance concurrency, HTTP timeouts up to 60 minutes, memory up to 32 GiB, and two CPU billing modes (request-based and instance-based).
The 1st generation still exists and runs single-concurrency instances with lower timeout and memory limits than 2nd gen. New projects should default to 2nd gen unless specific 1st gen constraints apply.
GCF 2nd gen routes events through Eventarc, which follows the CloudEvents specification. Trigger sources include Cloud Storage, Firestore, and Pub/Sub. The runtime list covers Node.js, Python, Java, Go, Ruby, PHP, and .NET, with newer versions like Python 3.14 and Java 25 available on 2nd gen.
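The CloudEvents shape is worth seeing concretely. Below is a sketch of a Cloud Storage finalized-object handler, modeling the event as a plain dict so the logic stands alone (a real deployment would register this with the Functions Framework's cloud_event decorator):

```python
def on_object_finalized(cloud_event: dict) -> str:
    # Eventarc wraps the Cloud Storage payload in a CloudEvents
    # envelope: `type` names the trigger, `data` carries the object.
    if cloud_event["type"] != "google.cloud.storage.object.v1.finalized":
        raise ValueError(f"unexpected event type: {cloud_event['type']}")
    data = cloud_event["data"]
    return f"processed gs://{data['bucket']}/{data['name']}"
```

Because every Eventarc source uses the same envelope, handlers for Firestore or Pub/Sub events differ only in the `type` string and the shape of `data`.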
How do their concurrency models differ?
Concurrency is the most important architectural difference between the two platforms.
AWS Lambda processes one invocation at a time per execution environment in standard compute. Each environment handles many requests sequentially over its lifetime but only one at any moment. If many requests arrive simultaneously, Lambda creates new execution environments until existing ones become available.
Each environment is an isolated Firecracker microVM with its own memory, CPU, and filesystem. The default limit is 1,000 concurrent executions per account per region, and quota increases expand this further.
GCF 2nd gen takes the opposite approach. A single instance can handle up to 1,000 concurrent requests. The same burst can be served by fewer instances rather than spinning up a separate environment for each request.
The practical impact shows up in two areas:
- Cold start frequency: Lambda's model means every new concurrent request beyond existing warm instances triggers a cold start. GCF's multi-request instances reduce cold start frequency under bursty traffic because existing instances absorb concurrent requests without spawning new environments.
- Resource sharing: GCF instances share memory and CPU across concurrent requests within a single instance. CPU-intensive workloads on GCF need careful concurrency tuning to avoid resource contention. Lambda's isolation-per-request model avoids this tradeoff.
For agent workloads making bursty, concurrent tool calls, GCF's per-instance concurrency structurally reduces cold start exposure. Lambda's model provides stronger per-request resource isolation. Which one matters more depends on whether your agents share state across concurrent operations or need hard isolation boundaries between them.
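The environment-count arithmetic above can be sketched directly. This is a rough model assuming worst-case packing on Lambda and the 1,000-request maximum on GCF; real GCF deployments often tune concurrency lower:

```python
import math

def lambda_environments(concurrent_requests: int) -> int:
    # Standard Lambda compute: one execution environment per in-flight
    # request, so environments scale 1:1 with concurrency.
    return concurrent_requests

def gcf_instances(concurrent_requests: int, concurrency: int = 1000) -> int:
    # GCF 2nd gen: each instance serves up to `concurrency` requests
    # at once (1,000 is the platform maximum).
    return math.ceil(concurrent_requests / concurrency)

# A burst of 2,500 simultaneous requests:
print(lambda_environments(2500))  # 2500 environments (2500 potential cold starts)
print(gcf_instances(2500))        # 3 instances
```

Lowering the GCF concurrency setting for CPU-bound work shifts the numbers back toward Lambda's, which is why the tuning step in the bullet above matters.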
How does cold start performance compare?
Lambda cold start duration varies by runtime, package size, and memory allocation. AWS notes cold starts occur in under 1% of invocations for typical production workloads. Exact figures depend heavily on test setup, but Go tends to cold-start faster than Java, and Java benefits substantially from SnapStart.
SnapStart dramatically improves Java cold starts. An AWS benchmark showed SnapStart cutting Spring Boot P50 cold starts from 5,047 ms to 1,178 ms, with invoke priming reaching a P99.9 cold-start latency of 781.68 ms. SnapStart covers Java 11+, Python 3.12+, and .NET 8+. When SnapStart isn't enough, Provisioned Concurrency keeps environments pre-initialized and eliminates cold starts entirely.
GCF uses Startup CPU Boost, which temporarily allocates extra CPU during initialization. Google reports up to 50% faster startup for Spring PetClinic (Java) and up to 30% faster for Node.js, though it doesn't publish the absolute baseline times behind these percentages.
GCF 2nd gen also reduces cold start frequency through per-instance concurrency: a warm instance serving many concurrent requests means fewer instances need to cold-start under load. Minimum instance configuration keeps a baseline of warm instances active, similar in role to Lambda's Provisioned Concurrency.
Teams that need cold start SLAs with specific millisecond targets will find Lambda's published examples more actionable for capacity planning. GCF's structural concurrency advantage reduces how often cold starts happen, even if absolute duration data is harder to pin down.
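On either platform, the standard mitigation in your own code is the same: hoist expensive initialization to module scope so it runs once per environment, not once per request. A minimal sketch, where `load_model` is a hypothetical stand-in for real setup work:

```python
import time

def load_model():
    # Hypothetical expensive initialization: SDK clients, connection
    # pools, model weights. On Lambda this runs during the INIT phase
    # (and is what SnapStart snapshots); on GCF it runs once per
    # instance and is what Startup CPU Boost accelerates.
    time.sleep(0.05)  # stand-in for real work
    return {"ready": True}

# Module scope executes once per execution environment / instance,
# not once per request.
MODEL = load_model()

def handler(event, context=None):
    # Per-request work reuses the already-initialized state.
    return {"model_ready": MODEL["ready"], "input": event}
```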
How do their pricing models work?
Lambda bills per GB-second at 1ms granularity. The first tier rate for x86 is $0.0000166667 per GB-second, with volume discounts starting at 6 billion GB-seconds. Arm (Graviton2) pricing runs roughly 20% lower at equivalent tiers.
Requests cost $0.20 per million, and the free tier includes 1 million requests and 400,000 GB-seconds per month. One recent billing change matters for cold-start-heavy workloads: Lambda now bills the INIT phase for on-demand ZIP-packaged functions on managed runtimes, where previously only the execution phase was billed.
Provisioned Concurrency adds a reservation charge on top of execution charges. Billing runs from the time it's configured until it's disabled, with durations rounded up rather than billed to the exact second.
GCF 1st gen bills per GB-second and GHz-second with 2 million free invocations per month. GCF 2nd gen uses Cloud Run pricing: $0.000024 per vCPU-second and $0.0000025 per GiB-second for request-based billing, with instance-based billing charging for the entire container lifetime at lower per-unit rates ($0.000018 per vCPU-second).
Google Cloud offers committed use discounts on Cloud Run via flexible CUDs, while AWS Lambda's closest equivalent is Compute Savings Plans with savings of up to 17%.
Three structural differences affect cost modeling:
- Billing granularity: Lambda's finer-grained billing versus GCF's coarser rounding means very short functions can cost less on Lambda.
- Unit mismatch: GCF 1st gen uses GB (decimal) while 2nd gen uses GiB (binary). The same dollar figure applies to different absolute memory quantities across generations.
- Concurrency economics: GCF 2nd gen's per-instance concurrency means a single instance bill covers many concurrent requests. Lambda bills a separate execution environment for each concurrent request.
Model these differences against your actual invocation patterns before committing to either platform. Short, high-volume workloads favor Lambda. High-concurrency workloads with steady utilization favor GCF 2nd gen.
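A back-of-the-envelope model of the rates cited above can make that comparison concrete. Free tiers and volume discounts are ignored, and the concurrency divide is a rough approximation that assumes requests overlap perfectly on GCF:

```python
# First-tier rates from the sections above.
LAMBDA_GB_SECOND = 0.0000166667
LAMBDA_PER_MILLION_REQUESTS = 0.20
GCF_VCPU_SECOND = 0.000024   # request-based billing
GCF_GIB_SECOND = 0.0000025

def lambda_monthly_cost(requests: int, duration_s: float, memory_gb: float) -> float:
    compute = requests * duration_s * memory_gb * LAMBDA_GB_SECOND
    return compute + requests / 1_000_000 * LAMBDA_PER_MILLION_REQUESTS

def gcf_monthly_cost(requests: int, duration_s: float, vcpus: float,
                     memory_gib: float, concurrency: int = 1) -> float:
    # Concurrent requests on one instance share the same vCPU/GiB
    # clock, so billed seconds divide by the effective concurrency --
    # the structural difference described above, under the simplifying
    # assumption of perfect overlap.
    billed_s = requests * duration_s / concurrency
    return billed_s * (vcpus * GCF_VCPU_SECOND + memory_gib * GCF_GIB_SECOND)

# 1M requests/month, 200 ms each, ~0.5 GB of memory:
print(round(lambda_monthly_cost(1_000_000, 0.2, 0.5), 2))            # ~1.87
print(round(gcf_monthly_cost(1_000_000, 0.2, 1, 0.5, concurrency=10), 3))  # ~0.505
```

Swapping in your own request counts, durations, and a realistic concurrency figure is the fastest way to see which side of the structural differences your workload lands on.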
How do execution limits and compute resources compare?
Lambda caps execution at 15 minutes per invocation. GCF 2nd gen allows longer HTTP executions, with specific limits depending on trigger type, so long-running HTTP workloads get more headroom on GCF. Lambda compensates with an AWS-published durable workflow pattern: a Lambda-based architecture for stateful workflows that can run up to one year. This guide does not identify a comparable published pattern on the GCF side.
On memory and CPU, Lambda allocates up to 10,240 MB with CPU proportional to memory. GCF 2nd gen supports up to 32 GiB of memory depending on configuration, and decouples CPU and memory allocation for more flexibility.
For payloads, Lambda supports 6 MB synchronous payloads and up to 200 MB streamed responses, with ephemeral storage up to 10 GB. GCF 2nd gen handles larger standard HTTP request and response sizes than Lambda's synchronous limit, while Lambda's response streaming covers larger streamed payloads. If your workload needs synchronous responses larger than 6 MB without streaming, GCF is the cleaner fit.
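Those limits suggest a simple routing rule for response payloads. A sketch using the figures above (MB treated as MiB for simplicity; the thresholds are the published Lambda limits, and "offload" is a hypothetical fallback):

```python
LAMBDA_SYNC_LIMIT = 6 * 1024 * 1024       # 6 MB buffered response
LAMBDA_STREAM_LIMIT = 200 * 1024 * 1024   # 200 MB via response streaming

def lambda_response_mode(payload_bytes: int) -> str:
    # Decide how a Lambda function can return a payload of this size.
    if payload_bytes <= LAMBDA_SYNC_LIMIT:
        return "buffered"
    if payload_bytes <= LAMBDA_STREAM_LIMIT:
        return "streaming"
    return "offload"  # e.g. write to object storage and return a URL
```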
How do their security and isolation models compare?
Both platforms meet enterprise compliance requirements. The architectural approaches to workload isolation differ.
Lambda runs each execution environment inside a Firecracker microVM. Firecracker uses Linux KVM to create lightweight virtual machines with a minimal device model, and it's written in Rust. Each function gets a virtual machine boundary that prevents cross-tenant access, even on shared physical hosts. GCF 2nd gen uses a sandboxed runtime model on Cloud Run infrastructure. GCF 1st gen uses gVisor, a user-space kernel that intercepts system calls. While gVisor provides improved isolation over traditional containers, microVMs offer hardware-enforced boundaries that are stronger by design.
On compliance, both cover SOC 2, ISO/IEC 27001:2022, ISO/IEC 27017, ISO/IEC 27018, HIPAA, FedRAMP, and PCI DSS. Lambda adds DoD SRG Impact Levels 2, 4, and 5 on GovCloud. FIPS 140-3 is offered for certain GovCloud services through FIPS endpoints, though AWS documentation doesn't show Lambda specifically validated for FIPS 140-3. GCF adds HITRUST CSF.
Google Cloud cites ISO/IEC 27701, ISO/IEC 42001, and StateRAMP at the broader Google Cloud level, but available documentation doesn't specifically confirm Cloud Functions coverage for those programs. For most production workloads, compliance parity isn't a differentiator. The distinction matters for government workloads needing FedRAMP High or FIPS 140-3.
For networking, Lambda supports Hyperplane ENI-based VPC integration, with an elastic network interface quota of 500 per VPC. IPv6 support removes the NAT Gateway requirement. GCF offers Serverless VPC Access connectors and Direct VPC egress.
How do their event sources and ecosystem integrations compare?
Lambda's integration breadth is a primary ecosystem advantage. It natively supports DynamoDB streams, Kinesis, SQS, Amazon MSK, EventBridge, SNS, SES, Application Load Balancer, and IoT. These integrations provide built-in polling and batching, with retry behavior handled by the event source mapping rather than custom code.
GCF 2nd gen routes events through Eventarc using the CloudEvents specification. Trigger sources include Cloud Storage, Firestore, and Pub/Sub. The event surface is narrower than Lambda's but covers the core GCP services. If your architecture leans heavily on managed AWS data services, Lambda's integration depth will save meaningful glue code. If you're standardized on GCP data products, GCF's Eventarc layer is sufficient without requiring external trigger infrastructure.
How do deployment and observability compare?
Lambda supports deployment workflows that integrate with GitHub Actions and S3-based packaging for large artifacts. CodeDeploy weighted aliases support canary and linear deployment strategies for gradual rollouts. On observability, Lambda emits invocation metrics to CloudWatch, X-Ray provides distributed tracing, and the Telemetry API extends observability to extensions.
GCF 2nd gen inherits Cloud Run's deployment infrastructure. Automatic base image updates keep underlying images patched without manual intervention. Traffic splitting is available through Cloud Run for gradual rollouts without external tooling. On observability, GCF integrates with Cloud Monitoring, and Error Reporting analyzes properly formatted exceptions sent through Cloud Logging.
The deployment surfaces feel comparable in practice. Teams that need fine-grained deployment control often prefer Lambda's CodeDeploy primitives. Teams that want patching handled automatically lean toward GCF's inherited Cloud Run behavior.
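The "properly formatted" caveat on Error Reporting matters: it groups ERROR-severity log entries whose message contains a stack trace. A sketch of emitting that shape as structured JSON on stdout, assuming Cloud Run-style stdout log ingestion:

```python
import json
import traceback

def report_exception(exc: Exception) -> str:
    # Cloud Logging parses a JSON line on stdout as a structured
    # entry; Error Reporting groups ERROR-severity entries whose
    # message contains a stack trace.
    entry = {
        "severity": "ERROR",
        "message": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }
    line = json.dumps(entry)
    print(line)
    return line

try:
    raise RuntimeError("tool call failed")  # stand-in for real agent work
except RuntimeError as exc:
    logged = report_exception(exc)
```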
What to choose and when
The right choice depends on your existing cloud investment, workload characteristics, and team expertise.
Choose AWS Lambda when:
- Your infrastructure runs on AWS: Native integrations with DynamoDB, Kinesis, SQS, and EventBridge remove custom polling code.
- You need Java, Python, or .NET cold start optimization: SnapStart is available across these runtimes with documented improvements.
- You need stateful, long-running workflows: AWS has published a durable workflow pattern with execution spans up to one year.
- Strict per-request isolation matters: One execution environment per request means no resource contention in the standard model.
Choose Google Cloud Functions when:
- Your infrastructure runs on GCP: Eventarc triggers for Cloud Storage, Firestore, and Pub/Sub work without extra configuration.
- Your workloads are bursty with high concurrency: Per-instance concurrency up to 1,000 requests reduces cold start frequency and can lower costs.
- You need long-running HTTP functions: The 60-minute timeout provides more headroom than Lambda's 15-minute cap.
- You want committed use discounts: Cloud Run CUDs provide commitment pricing.
Where both platforms fall short:
Both Lambda and GCF are general-purpose serverless platforms built for stateless, request-response workloads. Agents that execute untrusted code in production, maintain persistent state between sessions, or need to resume from idle in under 25ms run into limits on both.
Perpetual sandbox platforms like Blaxel are designed for this category. Blaxel sandboxes resume from standby in under 25ms, stay in standby indefinitely at zero compute cost while idle, and return to standby automatically after 15 seconds of network inactivity. The full filesystem and memory state restores on resume, with Volumes available for guaranteed long-term persistence. Hardware-enforced microVM isolation provides stronger tenant boundaries than container-based sandboxes.
Evaluate AWS Lambda vs Google Cloud Functions for your agent workloads
Understanding the differences between Lambda and GCF helps you make informed infrastructure decisions, but the comparison reveals a gap both platforms share. General-purpose serverless functions optimize for stateless, short-lived invocations and handle API backends, event processing, and data pipelines well. They struggle with workloads that need state persistence across sessions, instant resume from idle, and sub-second responsiveness after periods of inactivity.
Perpetual sandbox platforms like Blaxel address this gap directly. Sandboxes resume in under 25ms and stay in standby indefinitely at zero compute cost. Agents Hosting co-locates agent logic with sandboxes to remove network roundtrip latency between the agent and its execution environment. MCP Servers Hosting deploys custom Model Context Protocol servers as serverless endpoints.
The Model Gateway routes LLM requests across providers with token cost controls, and Batch Jobs handle parallel fan-out workloads. All of this runs on hardware-enforced microVM isolation with SOC 2 Type II, HIPAA, and ISO 27001 compliance. Blaxel complements Lambda or GCF rather than replacing them, handling the persistent sandbox workloads that general-purpose FaaS platforms weren't designed for while your existing serverless functions keep handling event-driven processing.
Sign up free to test agent workloads on perpetual sandboxes, or book a demo to discuss your architecture with the Blaxel team.
The persistent sandbox layer both platforms are missing
Sub-25ms resume, perpetual standby at zero compute cost, microVM isolation, and co-located Agents Hosting. Up to $200 in free credits.
Frequently asked questions
Is AWS Lambda cheaper than Google Cloud Functions?
It depends on your workload pattern. Lambda bills at finer granularity while GCF uses coarser rounding, so short-duration functions often cost less on Lambda. GCF 2nd gen's per-instance concurrency means fewer instances serve the same concurrent traffic, which reduces total compute cost for high-concurrency workloads. Model both against your actual invocation pattern before committing.
Can Google Cloud Functions replace AWS Lambda?
GCF handles many common serverless use cases when your infrastructure runs on GCP. The 2nd gen supports comparable memory, longer HTTP timeouts, and broader per-instance concurrency. The main gaps are Lambda's deeper event source integrations with AWS services and AWS's published durable workflow pattern for long-running stateful workflows.
Which platform has better cold start performance?
AWS publishes specific benchmark examples but not a complete cross-runtime figure set for SLA planning on its own. GCF publishes percentage improvements without absolute baselines. SnapStart covers Java, Python, and .NET runtimes with documented improvements for heavyweight frameworks. GCF's per-instance concurrency reduces cold start frequency under bursty load, even if per-start duration data is less transparent.
What is the maximum execution time for Lambda vs Cloud Functions?
Lambda caps at 15 minutes per invocation. AWS has published a Lambda-based durable workflow pattern that extends this to one year for stateful workflows. GCF 2nd gen allows longer execution windows depending on trigger type, with HTTP functions getting the most headroom at 60 minutes. For single-invocation HTTP workloads, GCF offers more headroom.
How do Lambda and Cloud Functions handle scaling differently?
Lambda scales by adding execution environments, with each environment handling one request in the standard model. GCF 2nd gen scales by adding instances that each handle up to 1,000 concurrent requests, so it needs fewer instances to absorb the same traffic volume. Lambda provides stronger per-request isolation. Both offer warm-instance mechanisms (Provisioned Concurrency on Lambda, minimum instances on GCF) to reduce cold starts for latency-sensitive workloads.



