# Cold Start Reduction
How we reduced our serverless cold start times from 3 seconds to under 200ms by optimizing Lambda functions.
## The Cold Start Problem
AWS Lambda functions experience “cold starts” when they haven’t been invoked recently. The Lambda service must:
- Provision a new execution environment
- Download your deployment package
- Initialize the runtime
- Run your initialization code
This can take several seconds, creating poor user experiences.
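Much of that time is your own initialization code. A quick way to see it is to time module initialization yourself; a minimal sketch (the `initDuration` field name is our own):

```javascript
// Module scope runs once per execution environment, i.e. on cold start
const initStart = Date.now()

// ...requires and client setup would happen here...

const initDuration = Date.now() - initStart

exports.handler = async (event) => {
  // Logged on every invocation; only the cold start paid the init cost
  console.log(JSON.stringify({ initDuration }))
  return { statusCode: 200, body: 'ok' }
}
```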
## Our Starting Point
Initial metrics:
- Cold start: 3200ms
- Warm start: 45ms
- Deployment size: 48MB
- Dependencies: 127 packages
Clearly unacceptable for user-facing APIs.
## Optimization Strategy

### 1. Reduce Package Size
Our biggest win came from shrinking the deployment package.
**Before: 48MB Bundle**

```json
{
  "dependencies": {
    "aws-sdk": "^2.1000.0",
    "lodash": "^4.17.21",
    "moment": "^2.29.4",
    "axios": "^1.4.0",
    "express": "^4.18.2",
    "body-parser": "^1.20.2",
    "cors": "^2.8.5",
    "uuid": "^9.0.0",
    "dotenv": "^16.0.3",
    "joi": "^17.9.2",
    "jsonwebtoken": "^9.0.0",
    "bcrypt": "^5.1.0"
  }
}
```
**After: 8MB Bundle**

```json
{
  "dependencies": {
    "uuid": "^9.0.0",
    "jsonwebtoken": "^9.0.0"
  }
}
```
Changes made:
- ❌ Removed `aws-sdk` (pre-installed in the Lambda runtime: v2 on Node.js 16 and earlier, v3 on Node.js 18+)
- ❌ Removed `lodash` (used native JS methods)
- ❌ Removed `moment` (used native `Date`)
- ❌ Removed `express` (used Lambda proxy integration)
- ❌ Removed `axios` (used `fetch` or AWS SDK clients)
- ✅ Kept only essential dependencies
Result: 48MB → 8MB (83% reduction)
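For reference, the `lodash` and `moment` calls we dropped map onto natives roughly like this (illustrative examples, not our exact code):

```javascript
// lodash -> native JS
const uniq = arr => [...new Set(arr)]               // _.uniq(arr)
const first2 = arr => arr.slice(0, 2)               // _.take(arr, 2)

// moment -> native Date (note: toISOString is UTC, moment defaults to local time)
const today = new Date().toISOString().slice(0, 10)      // moment.utc().format('YYYY-MM-DD')
const inOneHour = new Date(Date.now() + 60 * 60 * 1000)  // moment().add(1, 'hour')
```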
### 2. Use Lambda Layers
Extract common dependencies into Lambda layers:
```bash
# Create layer structure
mkdir -p layer/nodejs/node_modules

# Install dependencies
cd layer/nodejs
npm install uuid jsonwebtoken --production

# Create layer
cd ..
zip -r layer.zip nodejs

# Deploy layer
aws lambda publish-layer-version \
  --layer-name common-deps \
  --zip-file fileb://layer.zip \
  --compatible-runtimes nodejs18.x
```
Result: Deployment package reduced to 2MB
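To consume the layer, reference its version ARN from each function; a sketch for the Serverless Framework (the ARN is a placeholder for the one returned by `publish-layer-version`):

```yaml
functions:
  api:
    handler: handler.main
    layers:
      - arn:aws:lambda:us-east-1:123456789012:layer:common-deps:1 # placeholder ARN
```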
### 3. Lazy Load Dependencies
Don’t import everything at the top:
**Before: Eager Loading**

```javascript
// ALL modules loaded on cold start
const jwt = require('jsonwebtoken')
const bcrypt = require('bcrypt')
const { S3Client } = require('@aws-sdk/client-s3')
const { DynamoDBClient } = require('@aws-sdk/client-dynamodb')
const { SNSClient } = require('@aws-sdk/client-sns')

exports.handler = async (event) => {
  // Only use jwt for this request
  const token = jwt.sign({ userId: 123 }, process.env.JWT_SECRET)
  return { statusCode: 200, body: token }
}
```
**After: Lazy Loading**

```javascript
exports.handler = async (event) => {
  // Load only what's needed
  const jwt = require('jsonwebtoken')
  const token = jwt.sign({ userId: 123 }, process.env.JWT_SECRET)
  return { statusCode: 200, body: token }
}
```
**Better approach - Cache imports:**

```javascript
let jwt
let s3Client

exports.handler = async (event) => {
  // Load once per container
  if (!jwt) {
    jwt = require('jsonwebtoken')
  }
  if (!s3Client && event.needsS3) {
    const { S3Client } = require('@aws-sdk/client-s3')
    s3Client = new S3Client({})
  }
  // Use cached imports...
}
```
Result: Cold start initialization time reduced by 400ms
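If you ship ES modules instead of CommonJS, dynamic `import()` gives the same lazy, once-per-container behavior; a minimal sketch:

```javascript
// ESM: import() is async, and the resolved module is cached for the container's lifetime
let jwtPromise

export const handler = async (event) => {
  jwtPromise ??= import('jsonwebtoken')
  const { default: jwt } = await jwtPromise
  const token = jwt.sign({ userId: 123 }, process.env.JWT_SECRET)
  return { statusCode: 200, body: token }
}
```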
### 4. Enable SnapStart
AWS Lambda SnapStart takes a snapshot of the initialized execution environment and restores it on cold start:
```yaml
# serverless.yml
functions:
  api:
    handler: handler.main
    runtime: java11 # SnapStart works with Java
    snapStart: true
```
For Node.js (no SnapStart yet), use provisioned concurrency:
```yaml
functions:
  api:
    handler: handler.main
    provisionedConcurrency: 2 # Keep 2 warm instances
```
Result: Eliminates cold starts for ~$10/month
### 5. Connection Pooling
Don’t create new DB connections per invocation:
**Before: New Connection Every Time**

```javascript
const { Client } = require('pg')

exports.handler = async (event) => {
  const client = new Client({
    host: process.env.DB_HOST,
    database: process.env.DB_NAME,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD
  })
  await client.connect() // 200-500ms!
  const result = await client.query('SELECT * FROM users WHERE id = $1', [event.userId])
  await client.end()
  return { statusCode: 200, body: JSON.stringify(result.rows) }
}
```
**After: Reuse Connection**

```javascript
const { Client } = require('pg')

let client

async function getClient() {
  if (!client) {
    client = new Client({
      host: process.env.DB_HOST,
      database: process.env.DB_NAME,
      user: process.env.DB_USER,
      password: process.env.DB_PASSWORD
    })
    await client.connect()
  }
  return client
}

exports.handler = async (event) => {
  const db = await getClient()
  const result = await db.query(
    'SELECT * FROM users WHERE id = $1',
    [event.userId]
  )
  return { statusCode: 200, body: JSON.stringify(result.rows) }
}
```
**Even better - Use RDS Proxy:**

```javascript
const { Client } = require('pg')

const client = new Client({
  host: process.env.RDS_PROXY_ENDPOINT, // RDS Proxy handles pooling
  database: process.env.DB_NAME,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD
})
// Kick off the connection during init; await the same promise on each request
const connecting = client.connect()

exports.handler = async (event) => {
  await connecting
  const result = await client.query(
    'SELECT * FROM users WHERE id = $1',
    [event.userId]
  )
  return { statusCode: 200, body: JSON.stringify(result.rows) }
}
```
Result: Connection setup happens during init, so no invocation pays the 200-500ms connect cost.
### 6. Switch to AWS SDK v3
Use modular AWS SDK v3 instead of v2:
**Before: AWS SDK v2 (59MB)**

```javascript
const AWS = require('aws-sdk')
const s3 = new AWS.S3()
const dynamodb = new AWS.DynamoDB.DocumentClient()

exports.handler = async (event) => {
  const obj = await s3.getObject({
    Bucket: 'my-bucket',
    Key: 'file.txt'
  }).promise()
  return { statusCode: 200, body: obj.Body.toString() }
}
```
**After: AWS SDK v3 (modular)**

```javascript
const { S3Client, GetObjectCommand } = require('@aws-sdk/client-s3')

// Create client outside handler (reused across invocations)
const s3Client = new S3Client({})

exports.handler = async (event) => {
  const response = await s3Client.send(new GetObjectCommand({
    Bucket: 'my-bucket',
    Key: 'file.txt'
  }))
  const body = await streamToString(response.Body)
  return { statusCode: 200, body }
}

// v3 returns the object body as a stream rather than a buffer
function streamToString(stream) {
  return new Promise((resolve, reject) => {
    const chunks = []
    stream.on('data', chunk => chunks.push(chunk))
    stream.on('error', reject)
    stream.on('end', () => resolve(Buffer.concat(chunks).toString('utf8')))
  })
}
```
Result: Only import needed clients, reducing package size
### 7. Use Native Fetch Instead of Axios
Node.js 18+ has native fetch:
**Before: Axios (316KB)**

```javascript
const axios = require('axios')

exports.handler = async (event) => {
  const response = await axios.get('https://api.example.com/data')
  return { statusCode: 200, body: JSON.stringify(response.data) }
}
```
**After: Native Fetch (0KB)**

```javascript
exports.handler = async (event) => {
  const response = await fetch('https://api.example.com/data')
  const data = await response.json()
  return { statusCode: 200, body: JSON.stringify(data) }
}
```
Result: 316KB saved, one less dependency
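One behavioral difference to watch: axios rejects on HTTP error statuses, while `fetch` resolves and only sets `response.ok` to false, so check it yourself:

```javascript
exports.handler = async (event) => {
  const response = await fetch('https://api.example.com/data')
  // fetch resolves even on 4xx/5xx responses; axios would have thrown here
  if (!response.ok) {
    return { statusCode: 502, body: `Upstream error: ${response.status}` }
  }
  return { statusCode: 200, body: JSON.stringify(await response.json()) }
}
```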
### 8. Bundle with esbuild
Tree-shake unused code:
```javascript
// build.js
const esbuild = require('esbuild')

esbuild.build({
  entryPoints: ['src/handler.js'],
  bundle: true,
  platform: 'node',
  target: 'node18',
  outfile: 'dist/handler.js',
  external: ['@aws-sdk/*'], // Don't bundle the AWS SDK (v3 ships with the nodejs18.x runtime)
  minify: true,
  sourcemap: true
}).catch(() => process.exit(1))
```
Before bundling: 8MB. After bundling: 1.2MB.
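To make sure the bundle step runs before every deploy, we wired it into npm scripts; a sketch with hypothetical script names:

```json
{
  "scripts": {
    "build": "node build.js",
    "deploy": "npm run build && serverless deploy"
  }
}
```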
### 9. Use Arm64 Architecture
Graviton2 processors are faster and cheaper:
```yaml
# serverless.yml
functions:
  api:
    handler: handler.main
    architecture: arm64 # Instead of x86_64
```
Result: 20% faster execution, 20% cheaper
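One caveat: native modules (like the `bcrypt` we used to ship) must be compiled for arm64. Building inside an arm64 container is one way to do that; a sketch assuming Docker and the public AWS SAM build image:

```bash
# Install production dependencies inside an arm64 Node 18 build environment
docker run --rm --platform linux/arm64 \
  -v "$PWD":/var/task -w /var/task \
  public.ecr.aws/sam/build-nodejs18.x \
  npm ci --production
```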
### 10. Warm-Up Strategy
Keep functions warm with scheduled pings:
```yaml
functions:
  api:
    handler: handler.main
    events:
      - http:
          path: /api
          method: get
      - schedule:
          rate: rate(5 minutes)
          input:
            warmup: true
```
```javascript
exports.handler = async (event) => {
  // Ignore warmup events
  if (event.warmup) {
    return { statusCode: 200, body: 'warmup' }
  }
  // Real logic...
}
```
Cost: ~$1/month per function. Benefit: no cold starts during business hours (the ping keeps one container warm; concurrent bursts can still hit cold starts).
## Results
| Metric | Before | After | Improvement |
|---|---|---|---|
| Cold start | 3200ms | 180ms | 94% faster |
| Package size | 48MB | 1.2MB | 97% smaller |
| Memory usage | 512MB | 256MB | 50% less |
| Monthly cost | $87 | $24 | 72% cheaper |
## Advanced: Container Image Lambdas
For very large functions, use container images:
```dockerfile
FROM public.ecr.aws/lambda/nodejs:18

# Copy package files
COPY package*.json ./
RUN npm ci --production

# Copy source code
COPY src/ ./src/

# Set handler
CMD ["src/handler.main"]
```
Build and deploy:
```bash
# Build image
docker build -t my-lambda .

# Push to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com
docker tag my-lambda:latest 123456789.dkr.ecr.us-east-1.amazonaws.com/my-lambda:latest
docker push 123456789.dkr.ecr.us-east-1.amazonaws.com/my-lambda:latest

# Deploy Lambda
aws lambda create-function \
  --function-name my-lambda \
  --package-type Image \
  --code ImageUri=123456789.dkr.ecr.us-east-1.amazonaws.com/my-lambda:latest \
  --role arn:aws:iam::123456789:role/lambda-execution
```
Benefits:
- Up to 10GB image size
- Use any dependencies (including native binaries)
- Better for ML models or complex apps
Drawbacks:
- Slower cold starts (500ms - 2s)
- More complex deployment
## Monitoring Cold Starts
Track cold starts with CloudWatch:
```javascript
// Module scope runs once per container, so this flag marks the first invocation
let coldStart = true

exports.handler = async (event) => {
  console.log(JSON.stringify({
    coldStart,
    requestId: event.requestContext.requestId
  }))
  coldStart = false

  // Your logic...
}
```
Query in CloudWatch Logs Insights (filtering on the serialized flag so warm starts aren't counted):

```
fields @timestamp, @message
| filter @message like /"coldStart":true/
| stats count(*) as coldStarts by bin(5m)
```
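Alternatively, Lambda's own `REPORT` log lines expose an `@initDuration` field in Logs Insights (present only on cold starts), so you can count them without custom logging:

```
filter @type = "REPORT" and ispresent(@initDuration)
| stats count(*) as coldStarts, avg(@initDuration) as avgInitMs by bin(5m)
```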
## Cost Analysis

### Provisioned Concurrency vs Cold Starts

**Cold starts (no provisioned concurrency):**
- 1M requests/month
- 10% cold starts = 100K cold starts
- Avg cold start: 200ms
- User experience: Poor for 10% of requests
- Cost: $20/month
**Provisioned concurrency (2 instances):**
- 1M requests/month
- 0% cold starts
- User experience: Consistent
- Cost: $30/month ($10 extra)
Verdict: Worth it for user-facing APIs
## Best Practices Summary

- **Keep it small** - Minimize dependencies and bundle size
- **Use layers** - Share common code across functions
- **Lazy load** - Import only what you need, when you need it
- **Reuse connections** - DB, HTTP, AWS SDK clients
- **Use AWS SDK v3** - Modular imports reduce size
- **Bundle with esbuild** - Tree-shake unused code
- **Use Arm64** - 20% faster and cheaper
- **Consider provisioned concurrency** - For critical APIs
- **Monitor cold starts** - Track and optimize
- **Test in production** - Staging won’t show cold starts
## When to Accept Cold Starts
Not every function needs optimization:
- **Background jobs** - Users don’t wait for these
- **Infrequent tasks** - < 1 request/hour
- **Internal tools** - Developers are patient
- **Webhooks** - Can retry on timeout
Save optimization effort for user-facing APIs.
## Alternative Approaches

### Use API Gateway HTTP APIs

HTTP APIs add less per-request overhead and latency than REST APIs:
```yaml
functions:
  api:
    handler: handler.main
    events:
      - httpApi: # Instead of 'http'
          path: /api
          method: get
```
### Use Lambda@Edge

Run at CloudFront edge locations, closer to your users (note that Lambda@Edge functions can still cold-start; the starts are just distributed across regions):
```javascript
exports.handler = async (event) => {
  const request = event.Records[0].cf.request
  // Modify request or generate response
  return {
    status: '200',
    body: 'Hello from edge!'
  }
}
```
### Consider Fargate
For long-running processes:
```yaml
# serverless.yml
service: my-service

plugins:
  - serverless-fargate

custom:
  fargate:
    clusterName: my-cluster
    containerInsights: true

functions:
  api:
    type: fargate
    image: 123456789.dkr.ecr.us-east-1.amazonaws.com/my-api:latest
    memory: 512
    cpu: 256
```
Benefits:
- No cold starts
- Long execution time (no 15-minute limit)
- More memory options
Drawbacks:
- More expensive
- Always running (even with no traffic)
## Lessons Learned

- **Size matters most** - Biggest impact on cold starts
- **SDK v3 is essential** - Don’t use v2 anymore
- **Layers are underutilized** - Share code across functions
- **Provisioned concurrency works** - But costs money
- **Measure everything** - Can’t optimize what you don’t measure