Cold Start Reduction

How we reduced our serverless cold start times from 3 seconds to under 200ms by optimizing Lambda functions.

The Cold Start Problem

AWS Lambda functions experience “cold starts” when they haven’t been invoked recently. The Lambda service must:

  1. Provision a new execution environment
  2. Download your deployment package
  3. Initialize the runtime
  4. Run your initialization code

This can take several seconds, creating poor user experiences.
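Step 4 is the part you control directly. Everything outside the handler runs once per execution environment, during the cold start; the handler body runs on every invocation. A minimal sketch to make the split concrete (using the SDK v3 client that ships with the nodejs18.x runtime):

// Runs once, during the cold start (step 4 above)
const { S3Client } = require('@aws-sdk/client-s3') // module load happens at init
const s3 = new S3Client({}) // client construction happens at init

exports.handler = async (event) => {
  // Runs on every invocation, warm or cold
  return { statusCode: 200, body: 'ok' }
}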

Our Starting Point

Initial metrics:

  • Cold start: 3200ms
  • Warm start: 45ms
  • Deployment size: 48MB
  • Dependencies: 127 packages

Clearly unacceptable for user-facing APIs.

Optimization Strategy

1. Reduce Package Size

Our biggest win came from shrinking the deployment package.

Before: 48MB Bundle

{
  "dependencies": {
    "aws-sdk": "^2.1000.0",
    "lodash": "^4.17.21",
    "moment": "^2.29.4",
    "axios": "^1.4.0",
    "express": "^4.18.2",
    "body-parser": "^1.20.2",
    "cors": "^2.8.5",
    "uuid": "^9.0.0",
    "dotenv": "^16.0.3",
    "joi": "^17.9.2",
    "jsonwebtoken": "^9.0.0",
    "bcrypt": "^5.1.0"
  }
}

After: 8MB Bundle

{
  "dependencies": {
    "uuid": "^9.0.0",
    "jsonwebtoken": "^9.0.0"
  }
}

Changes made:

  • ❌ Removed aws-sdk (the runtime ships an AWS SDK: v2 on nodejs16.x and earlier, v3 on nodejs18.x and later)
  • ❌ Removed lodash (replaced with native JS methods; see the sketch after this list)
  • ❌ Removed moment (replaced with native Date)
  • ❌ Removed express (used Lambda proxy integration)
  • ❌ Removed axios (used fetch or aws-sdk clients)
  • ✅ Kept only essential dependencies

Result: 48MB → 8MB (83% reduction)
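The lodash and moment swaps were mostly mechanical. A sketch of the kind of replacements involved (these helpers are illustrative, not our production code):

// lodash's _.groupBy as a plain reduce
const groupBy = (items, key) =>
  items.reduce((acc, item) => {
    (acc[item[key]] ??= []).push(item)
    return acc
  }, {})

// moment().format('YYYY-MM-DD') with the native Date API
const isoDay = (date = new Date()) => date.toISOString().slice(0, 10)

console.log(groupBy([{ type: 'a' }, { type: 'b' }, { type: 'a' }], 'type'))
// { a: [ ..., ... ], b: [ ... ] }
console.log(isoDay()) // e.g. '2024-06-01'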

2. Use Lambda Layers

Extract common dependencies into Lambda layers:

# Create layer structure
mkdir -p layer/nodejs/node_modules

# Install dependencies
cd layer/nodejs
npm install uuid jsonwebtoken --production

# Create layer
cd ..
zip -r layer.zip nodejs

# Deploy layer
aws lambda publish-layer-version \
  --layer-name common-deps \
  --zip-file fileb://layer.zip \
  --compatible-runtimes nodejs18.x

Result: Deployment package reduced to 2MB
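Functions then require these modules exactly as before: Lambda mounts layer contents under /opt, and for Node.js layers /opt/nodejs/node_modules is on the module resolution path. Attach the published layer's ARN to the function and the deployment package no longer needs to carry the dependency:

// uuid resolves from the layer, not from the function's own package
const { v4: uuidv4 } = require('uuid')

exports.handler = async () => ({
  statusCode: 200,
  body: uuidv4()
})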

3. Lazy Load Dependencies

Don’t import everything at the top:

Before: Eager Loading

// ALL modules loaded on cold start
const jwt = require('jsonwebtoken')
const bcrypt = require('bcrypt')
const { S3Client } = require('@aws-sdk/client-s3')
const { DynamoDBClient } = require('@aws-sdk/client-dynamodb')
const { SNSClient } = require('@aws-sdk/client-sns')

exports.handler = async (event) => {
  // Only use jwt for this request
  const token = jwt.sign({ userId: 123 }, process.env.JWT_SECRET)
  return { statusCode: 200, body: token }
}

After: Lazy Loading

exports.handler = async (event) => {
  // Load only what's needed
  const jwt = require('jsonwebtoken')
  const token = jwt.sign({ userId: 123 }, process.env.JWT_SECRET)
  return { statusCode: 200, body: token }
}

Better approach - Cache imports:

let jwt
let s3Client

exports.handler = async (event) => {
  // Load once per container
  if (!jwt) {
    jwt = require('jsonwebtoken')
  }

  if (!s3Client && event.needsS3) {
    const { S3Client } = require('@aws-sdk/client-s3')
    s3Client = new S3Client({})
  }

  // Use cached imports...
}

Result: Cold start initialization time reduced by 400ms
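A complementary option on recent Node.js runtimes is an ES module handler with top-level await, which moves unavoidable async setup into the init phase so the first request never waits on it. A sketch, assuming the handler file is named handler.mjs:

// handler.mjs - ES module handlers may use top-level await during init
import { S3Client, ListBucketsCommand } from '@aws-sdk/client-s3'

const s3 = new S3Client({})
// This await completes during init, before the first invocation is served
const buckets = await s3.send(new ListBucketsCommand({}))

export const handler = async () => ({
  statusCode: 200,
  body: JSON.stringify((buckets.Buckets ?? []).map(b => b.Name))
})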

4. Enable SnapStart

AWS Lambda SnapStart takes a snapshot after initialization:

# serverless.yml
functions:
  api:
    handler: handler.main
    runtime: java11 # SnapStart works with Java
    snapStart: true

For Node.js (no SnapStart yet), use provisioned concurrency:

functions:
  api:
    handler: handler.main
    provisionedConcurrency: 2 # Keep 2 warm instances

Result: Eliminates cold starts for ~$10/month
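To confirm which path served a request, Lambda exposes the AWS_LAMBDA_INITIALIZATION_TYPE environment variable, set to "provisioned-concurrency" or "on-demand":

exports.handler = async () => ({
  statusCode: 200,
  // "provisioned-concurrency" when served by a pre-initialized instance
  body: process.env.AWS_LAMBDA_INITIALIZATION_TYPE
})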

5. Connection Pooling

Don’t create new DB connections per invocation:

Before: New Connection Every Time

const { Client } = require('pg')

exports.handler = async (event) => {
  const client = new Client({
    host: process.env.DB_HOST,
    database: process.env.DB_NAME,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD
  })

  await client.connect() // 200-500ms!

  const result = await client.query('SELECT * FROM users WHERE id = $1', [event.userId])

  await client.end()

  return { statusCode: 200, body: JSON.stringify(result.rows) }
}

After: Reuse Connection

const { Client } = require('pg')

let client

async function getClient() {
  if (!client) {
    client = new Client({
      host: process.env.DB_HOST,
      database: process.env.DB_NAME,
      user: process.env.DB_USER,
      password: process.env.DB_PASSWORD
    })
    await client.connect()
  }
  return client
}

exports.handler = async (event) => {
  const db = await getClient()

  const result = await db.query(
    'SELECT * FROM users WHERE id = $1',
    [event.userId]
  )

  return { statusCode: 200, body: JSON.stringify(result.rows) }
}

Even better - Use RDS Proxy:

const { Client } = require('pg')

const client = new Client({
  host: process.env.RDS_PROXY_ENDPOINT, // RDS Proxy handles pooling
  database: process.env.DB_NAME,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD
})

// Kick off the connection during init; await it before the first query
const connected = client.connect()

exports.handler = async (event) => {
  await connected
  const result = await client.query(
    'SELECT * FROM users WHERE id = $1',
    [event.userId]
  )

  return { statusCode: 200, body: JSON.stringify(result.rows) }
}

Result: The connection handshake happens once, during init; every invocation after that reuses the open connection instead of paying the 200-500ms connect cost.

6. Optimize AWS SDK v3

Use modular AWS SDK v3 instead of v2:

Before: AWS SDK v2 (59MB)

const AWS = require('aws-sdk')
const s3 = new AWS.S3()
const dynamodb = new AWS.DynamoDB.DocumentClient()

exports.handler = async (event) => {
  const obj = await s3.getObject({
    Bucket: 'my-bucket',
    Key: 'file.txt'
  }).promise()

  return { statusCode: 200, body: obj.Body.toString() }
}

After: AWS SDK v3 (modular)

const { S3Client, GetObjectCommand } = require('@aws-sdk/client-s3')

// Create client outside handler (reused across invocations)
const s3Client = new S3Client({})

exports.handler = async (event) => {
  const response = await s3Client.send(new GetObjectCommand({
    Bucket: 'my-bucket',
    Key: 'file.txt'
  }))

  const body = await streamToString(response.Body)
  return { statusCode: 200, body }
}

function streamToString(stream) {
  return new Promise((resolve, reject) => {
    const chunks = []
    stream.on('data', chunk => chunks.push(chunk))
    stream.on('error', reject)
    stream.on('end', () => resolve(Buffer.concat(chunks).toString('utf8')))
  })
}

Result: Only the clients you actually import get bundled, keeping the package small. (Recent SDK v3 releases also expose response.Body.transformToString(), which can replace the streamToString helper above.)

7. Use Native Fetch Instead of Axios

Node.js 18+ has native fetch:

Before: Axios (316KB)

const axios = require('axios')

exports.handler = async (event) => {
  const response = await axios.get('https://api.example.com/data')
  return { statusCode: 200, body: JSON.stringify(response.data) }
}

After: Native Fetch (0KB)

exports.handler = async (event) => {
  const response = await fetch('https://api.example.com/data')
  // Unlike axios, fetch does not throw on HTTP error statuses
  if (!response.ok) {
    return { statusCode: response.status, body: 'Upstream request failed' }
  }
  const data = await response.json()
  return { statusCode: 200, body: JSON.stringify(data) }
}

Result: 316KB saved, one less dependency

8. Bundle with esbuild

Tree-shake unused code:

// build.js
const esbuild = require('esbuild')

esbuild.build({
  entryPoints: ['src/handler.js'],
  bundle: true,
  platform: 'node',
  target: 'node18',
  outfile: 'dist/handler.js',
  external: ['@aws-sdk/*'], // SDK v3 ships with the nodejs18.x runtime, so don't bundle it
  minify: true,
  sourcemap: true
}).catch(() => process.exit(1))

Before bundling: 8MB. After bundling: 1.2MB.

9. Use Arm64 Architecture

Graviton2 (Arm) processors are cheaper and, for many workloads, faster:

# serverless.yml
functions:
  api:
    handler: handler.main
    architecture: arm64 # Instead of x86_64

Result: ~20% faster execution for our workload and 20% lower per-GB-second pricing

10. Warm-Up Strategy

Keep functions warm with scheduled pings:

functions:
  api:
    handler: handler.main
    events:
      - http:
          path: /api
          method: get
      - schedule:
          rate: rate(5 minutes)
          input:
            warmup: true
And short-circuit warmup events in the handler:

exports.handler = async (event) => {
  // Ignore warmup events
  if (event.warmup) {
    return { statusCode: 200, body: 'warmup' }
  }

  // Real logic...
}

Cost: ~$1/month per function. Benefit: far fewer cold starts during business hours (note that a scheduled ping keeps only one instance warm, so concurrent spikes can still hit cold ones).

Results

Metric          Before    After     Improvement
Cold start      3200ms    180ms     94% faster
Package size    48MB      1.2MB     97% smaller
Memory usage    512MB     256MB     50% less
Monthly cost    $87       $24       72% cheaper

Advanced: Container Image Lambdas

For very large functions, use container images:

FROM public.ecr.aws/lambda/nodejs:18

# Copy package files
COPY package*.json ./
RUN npm ci --production

# Copy source code
COPY src/ ./src/

# Set handler
CMD ["src/handler.main"]

Build and deploy:

# Build image
docker build -t my-lambda .

# Push to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com

docker tag my-lambda:latest 123456789.dkr.ecr.us-east-1.amazonaws.com/my-lambda:latest
docker push 123456789.dkr.ecr.us-east-1.amazonaws.com/my-lambda:latest

# Deploy Lambda
aws lambda create-function \
  --function-name my-lambda \
  --package-type Image \
  --code ImageUri=123456789.dkr.ecr.us-east-1.amazonaws.com/my-lambda:latest \
  --role arn:aws:iam::123456789:role/lambda-execution

Benefits:

  • Up to 10GB image size
  • Use any dependencies (including native binaries)
  • Better for ML models or complex apps

Drawbacks:

  • Slower cold starts (500ms - 2s)
  • More complex deployment

Monitoring Cold Starts

Track cold starts with CloudWatch:

// True only for the first invocation in this execution environment
let coldStart = true

exports.handler = async (event) => {
  console.log(JSON.stringify({
    coldStart,
    requestId: event.requestContext?.requestId
  }))
  coldStart = false

  // Your logic...
}

Query in CloudWatch Logs Insights (filtering on the parsed JSON field, so warm-start log lines aren't counted):

fields @timestamp, @message
| filter coldStart = 1
| stats count(*) as coldStarts by bin(5m)
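To chart cold starts as a proper CloudWatch metric rather than a log query, you can log in the Embedded Metric Format. A sketch (the MyApp namespace and ColdStart metric name are our own choices):

// Emit a ColdStart count via CloudWatch Embedded Metric Format
let firstInvocation = true

exports.handler = async (event) => {
  console.log(JSON.stringify({
    _aws: {
      Timestamp: Date.now(),
      CloudWatchMetrics: [{
        Namespace: 'MyApp', // hypothetical namespace
        Dimensions: [['FunctionName']],
        Metrics: [{ Name: 'ColdStart', Unit: 'Count' }]
      }]
    },
    FunctionName: process.env.AWS_LAMBDA_FUNCTION_NAME,
    ColdStart: firstInvocation ? 1 : 0
  }))
  firstInvocation = false

  // Your logic...
}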

Cost Analysis

Provisioned Concurrency vs Cold Starts

Cold starts (no provisioned concurrency):

  • 1M requests/month
  • 10% cold starts = 100K cold starts
  • Avg cold start: 200ms
  • User experience: Poor for 10% of requests
  • Cost: $20/month

Provisioned concurrency (2 instances):

  • 1M requests/month
  • 0% cold starts
  • User experience: Consistent
  • Cost: $30/month ($10 extra)

Verdict: Worth it for user-facing APIs

Best Practices Summary

  1. Keep it small - Minimize dependencies and bundle size
  2. Use layers - Share common code across functions
  3. Lazy load - Import only what you need, when you need it
  4. Reuse connections - DB, HTTP, AWS SDK clients
  5. Use AWS SDK v3 - Modular imports reduce size
  6. Bundle with esbuild - Tree-shake unused code
  7. Use Arm64 - 20% faster and cheaper
  8. Consider provisioned concurrency - For critical APIs
  9. Monitor cold starts - Track and optimize
  10. Test in production - Staging traffic won't reproduce real cold-start patterns

When to Accept Cold Starts

Not every function needs optimization:

  • Background jobs - Users don’t wait for these
  • Infrequent tasks - < 1 request/hour
  • Internal tools - Developers are patient
  • Webhooks - Can retry on timeout

Save optimization effort for user-facing APIs.

Alternative Approaches

Use API Gateway HTTP APIs

Lower per-request latency and cost than REST APIs (the Lambda cold start itself is unchanged):

functions:
  api:
    handler: handler.main
    events:
      - httpApi: # Instead of 'http'
          path: /api
          method: get

Use Lambda@Edge

Run at CloudFront edge locations, closer to your users (Lambda@Edge still has cold starts, but the lower network latency offsets some of the wait):

exports.handler = async (event) => {
  const request = event.Records[0].cf.request

  // Modify request or generate response
  return {
    status: '200',
    body: 'Hello from edge!'
  }
}

Consider Fargate

For long-running processes:

# serverless.yml
service: my-service

plugins:
  - serverless-fargate

custom:
  fargate:
    clusterName: my-cluster
    containerInsights: true

functions:
  api:
    type: fargate
    image: 123456789.dkr.ecr.us-east-1.amazonaws.com/my-api:latest
    memory: 512
    cpu: 256

Benefits:

  • No cold starts
  • Long execution time (no 15min limit)
  • More memory options

Drawbacks:

  • More expensive
  • Always running (even with no traffic)

Lessons Learned

  1. Size matters most - Biggest impact on cold starts
  2. SDK v3 is essential - Don’t use v2 anymore
  3. Layers are underutilized - Share code across functions
  4. Provisioned concurrency works - But costs money
  5. Measure everything - Can’t optimize what you don’t measure

Tools and Resources