INFRASTRUCTURE · 2025-10-28 · 5 min read

AWS Lambda Practical Guide: Complete Techniques for Cost Reduction and Performance Optimization

Cut infrastructure costs by 90% with serverless architecture while achieving true scalability. A comprehensive guide to Lambda in production — from cold start mitigation to real-world deployment strategies.

髙木 晃宏

CEO / Engineer

TL;DR

  • Achieved 90% cost reduction compared to traditional EC2 deployments using AWS Lambda
  • Cold starts can be kept under 100ms with the right configuration
  • Optimizing memory settings dramatically improves cost-performance ratio
  • Provisioned Concurrency delivers predictable, consistent response times

Introduction: Why Lambda Now?

"I wish I could stop managing servers."

Every engineer has thought this at some point. Dealing with late-night outages, working weekends to apply patches, scrambling to handle sudden traffic spikes — there's a way to escape all of that. It's called AWS Lambda.

Our team migrated from an EC2-based API server to Lambda back in 2023. The result: infrastructure costs dropped from ¥150,000/month to ¥15,000/month. Operational overhead fell to nearly zero.

In this article, I'll share everything we learned from that migration — the techniques and hard-won insights for running Lambda confidently in production.

Understanding How AWS Lambda Works

Execution Model

Lambda uses an event-driven execution model. Functions spin up only when a request arrives, execute, then shut down. This eliminates charges for idle time.

```
Request → Lambda starts → Executes → Response → Lambda stops
   ↑                                                  ↓
   └──────────── Restarts on next request ←───────────┘
```

Pricing Model

Lambda pricing comes down to two factors:

  1. Request count: $0.20 per 1 million requests
  2. Execution duration: $0.0000166667 per GB-second

For example, with 128MB memory, an average 200ms execution time, and 1 million requests per month:

```
Request cost:  1M × $0.20/1M = $0.20
Duration cost: 1M × 0.2s × 0.125GB × $0.0000166667 = $0.42
Total: $0.62/month (roughly ¥90)
```

An equivalent EC2 instance (t3.micro) runs about $8.50/month. That's a cost difference of more than 13x.
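The arithmetic above generalizes to a small helper. A minimal sketch, assuming the public us-east-1 rates quoted above (prices vary slightly by region, and the free tier is ignored here):

```javascript
// Estimate monthly Lambda cost from the two billable dimensions.
// Rates are the us-east-1 prices quoted above; check your region's pricing.
const REQUEST_RATE = 0.20 / 1_000_000;  // $ per request
const DURATION_RATE = 0.0000166667;     // $ per GB-second

function monthlyLambdaCost({ requests, avgDurationSec, memoryMB }) {
  const requestCost = requests * REQUEST_RATE;
  const gbSeconds = requests * avgDurationSec * (memoryMB / 1024);
  const durationCost = gbSeconds * DURATION_RATE;
  return { requestCost, durationCost, total: requestCost + durationCost };
}

// The example from the text: 128MB, 200ms average, 1M requests/month
const cost = monthlyLambdaCost({ requests: 1_000_000, avgDurationSec: 0.2, memoryMB: 128 });
console.log(cost.total.toFixed(2)); // "0.62"
```

Plugging in your own traffic numbers makes the Lambda-vs-EC2 comparison concrete before you commit to a migration.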

Lambda Function Implementation Best Practices

Basic Structure

Here's the recommended structure for an efficient Lambda function:

```javascript
// Initialization runs outside the handler — executes only on cold start
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();

// Heavy setup like DB connections also lives outside the handler
let dbConnection = null;
const initializeConnection = async () => {
  if (!dbConnection) {
    dbConnection = await createDatabaseConnection();
  }
  return dbConnection;
};

exports.handler = async (event, context) => {
  // Return as soon as the handler resolves instead of waiting for the event
  // loop to drain, so open connections survive for the next invocation
  context.callbackWaitsForEmptyEventLoop = false;

  try {
    // Reuse the existing connection if available
    const db = await initializeConnection();

    // Business logic
    const result = await processRequest(event, db);

    return {
      statusCode: 200,
      headers: {
        'Content-Type': 'application/json',
        'Access-Control-Allow-Origin': '*'
      },
      body: JSON.stringify(result)
    };
  } catch (error) {
    console.error('Error:', error);
    return {
      statusCode: 500,
      body: JSON.stringify({ error: 'Internal Server Error' })
    };
  }
};
```

Error Handling Design

In production, solid error handling is non-negotiable:

```javascript
class LambdaError extends Error {
  constructor(message, statusCode, errorCode) {
    super(message);
    this.statusCode = statusCode;
    this.errorCode = errorCode;
  }
}

const errorHandler = (error) => {
  // Known errors — return structured response
  if (error instanceof LambdaError) {
    return {
      statusCode: error.statusCode,
      body: JSON.stringify({ error: error.message, code: error.errorCode })
    };
  }

  // Unexpected errors — log details internally, return generic message to client
  console.error('Unexpected error:', error);
  return {
    statusCode: 500,
    body: JSON.stringify({ error: 'Internal Server Error', code: 'INTERNAL_ERROR' })
  };
};
```

The Definitive Guide to Cold Start Mitigation

Cold starts are Lambda's biggest challenge — but with the right strategies, you can reduce them to an acceptable level for production use.

When Cold Starts Occur

  1. When a function is invoked for the first time
  2. After an idle period of approximately 15 minutes
  3. When concurrency scales up and new instances are needed
  4. Immediately after a deployment

Strategy 1: Minimize Package Size

```bash
# Strip dev dependencies
npm prune --production
```

Exclude unnecessary files via `.lambdaignore`:

```
node_modules/**/*.md
node_modules/**/*.ts
node_modules/**/test/**
```

Result: Reduced package size from 50MB to 15MB, cutting cold start time from 800ms to 300ms.

Strategy 2: Provisioned Concurrency

For predictable traffic patterns, Provisioned Concurrency keeps instances warm and ready:

```yaml
# serverless.yml
functions:
  api:
    handler: handler.main
    provisionedConcurrency: 5  # Keep 5 instances warm at all times
```

Cost: ~$0.015/hour per 1GB instance (Provisioned Concurrency is billed per GB-second, so it scales with memory). That's $0.36/day, or roughly $11/month per instance.

Strategy 3: Warmup Strategy

Periodically invoke Lambda to keep instances warm:

```javascript
// warmup.js — triggered every 5 minutes via CloudWatch Events
const AWS = require('aws-sdk');

exports.handler = async (event) => {
  const lambda = new AWS.Lambda();
  const functions = [
    'production-api-users',
    'production-api-orders',
    'production-api-products'
  ];

  await Promise.all(functions.map(name =>
    lambda.invoke({
      FunctionName: name,
      InvocationType: 'Event', // Async invocation
      Payload: JSON.stringify({ warmup: true })
    }).promise()
  ));
};
```

Handle warmup requests in each function:

```javascript
exports.handler = async (event) => {
  // Short-circuit warmup requests immediately
  if (event.warmup) {
    console.log('Warmup request');
    return { statusCode: 200, body: 'Warmed up' };
  }

  // Normal request handling
  // ...
};
```

Memory Configuration Optimization

Lambda's memory setting directly affects both performance and cost.

How CPU Allocation Works

CPU capacity scales proportionally with memory:

| Memory | CPU | Use Case |
|--------|-----|----------|
| 128MB | 0.08 vCPU | Lightweight processing |
| 512MB | 0.33 vCPU | General-purpose APIs |
| 1024MB | 0.58 vCPU | Compute-intensive tasks |
| 1769MB | 1 vCPU | CPU-bound workloads |
| 3008MB | 1.75 vCPU | Heavy processing |

Finding the Optimal Memory Size

Use AWS Lambda Power Tuning to find the sweet spot:

```bash
# Deploy from the Serverless Application Repository (SAR)
aws serverlessrepo create-cloud-formation-change-set \
  --application-id arn:aws:serverlessrepo:us-east-1:451282441545:applications/aws-lambda-power-tuning \
  --stack-name lambda-power-tuning
```

Sample output:

```json
{
  "power": 512,
  "cost": 0.0000042,
  "duration": 215,
  "stateMachine": {
    "executionCost": 0.00045,
    "lambdaCost": 0.0042,
    "visualization": "https://lambda-power-tuning.show/..."
  }
}
```

In our case, increasing from 256MB to 512MB delivered:

  • Execution time: 450ms → 180ms (60% faster)
  • Cost: $0.0000047 → $0.0000030 (36% cheaper)

Increasing memory often reduces execution time enough to lower the total cost — counterintuitive, but true.
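The duration-cost formula shows why: the 512MB setting bills twice the GB-seconds per second of runtime, but in our case it finished far more than twice as fast. A quick sanity check, using only the per-invocation duration cost (request fees and billed-duration rounding are ignored for simplicity):

```javascript
// Duration cost per invocation: seconds × GB × rate
const RATE = 0.0000166667; // $ per GB-second (the public rate quoted earlier)

const durationCost = (seconds, memoryMB) => seconds * (memoryMB / 1024) * RATE;

const before = durationCost(0.45, 256); // measured 450ms at 256MB
const after  = durationCost(0.18, 512); // measured 180ms at 512MB

console.log(before > after); // true — doubling memory still lowered the bill
```

The general rule: as long as the speedup factor exceeds the memory increase factor, more memory is cheaper per invocation.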

Practical Use Cases and Implementations

Use Case 1: REST API

A typical implementation paired with API Gateway:

```javascript
// api/users.js
const { DynamoDB } = require('@aws-sdk/client-dynamodb');
const { DynamoDBDocument } = require('@aws-sdk/lib-dynamodb');

const client = new DynamoDB({});
const docClient = DynamoDBDocument.from(client);

exports.handler = async (event) => {
  const { httpMethod, pathParameters, body } = event;

  switch (httpMethod) {
    case 'GET':
      if (pathParameters?.id) {
        return await getUser(pathParameters.id);
      }
      return await listUsers();
    case 'POST':
      return await createUser(JSON.parse(body));
    case 'PUT':
      return await updateUser(pathParameters.id, JSON.parse(body));
    case 'DELETE':
      return await deleteUser(pathParameters.id);
    default:
      return { statusCode: 405, body: 'Method Not Allowed' };
  }
};

async function getUser(id) {
  const result = await docClient.get({
    TableName: process.env.USERS_TABLE,
    Key: { id }
  });

  if (!result.Item) {
    return { statusCode: 404, body: JSON.stringify({ error: 'User not found' }) };
  }
  return { statusCode: 200, body: JSON.stringify(result.Item) };
}
```

Use Case 2: Image Processing Pipeline

Automatic image processing triggered by S3 events:

```javascript
const sharp = require('sharp');
const { S3Client, GetObjectCommand, PutObjectCommand } = require('@aws-sdk/client-s3');

const s3 = new S3Client({});

// Collect a readable stream into a single Buffer
const streamToBuffer = async (stream) => {
  const chunks = [];
  for await (const chunk of stream) chunks.push(chunk);
  return Buffer.concat(chunks);
};

exports.handler = async (event) => {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    const key = decodeURIComponent(record.s3.object.key);

    // Fetch the original image
    const { Body } = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
    const imageBuffer = await streamToBuffer(Body);

    // Resize to each target size
    const sizes = [
      { name: 'thumbnail', width: 150, height: 150 },
      { name: 'medium', width: 800, height: 600 },
      { name: 'large', width: 1920, height: 1080 }
    ];

    await Promise.all(sizes.map(async (size) => {
      const resized = await sharp(imageBuffer)
        .resize(size.width, size.height, { fit: 'inside' })
        .webp({ quality: 80 })
        .toBuffer();

      const newKey = key
        .replace('uploads/', `processed/${size.name}/`)
        .replace(/\.[^.]+$/, '.webp');

      await s3.send(new PutObjectCommand({
        Bucket: bucket,
        Key: newKey,
        Body: resized,
        ContentType: 'image/webp'
      }));
    }));
  }

  return { statusCode: 200, body: 'Processing complete' };
};
```

Use Case 3: Scheduled Batch Processing

Recurring jobs triggered by CloudWatch Events:

```javascript
// Daily report generation — runs at midnight
exports.handler = async (event) => {
  const yesterday = new Date();
  yesterday.setDate(yesterday.getDate() - 1);

  // Aggregate previous day's data
  const metrics = await aggregateMetrics(yesterday);

  // Generate report
  const report = generateReport(metrics);

  // Save to S3
  await saveToS3(report, `reports/${formatDate(yesterday)}.json`);

  // Send Slack notification
  await notifySlack({
    channel: '#daily-reports',
    text: `📊 Daily report generated: ${metrics.totalOrders} orders, $${metrics.revenue} revenue`
  });

  return { success: true };
};
```

VPC Integration and Security

Configuring Lambda Inside a VPC

When your function needs access to RDS or ElastiCache, it must run inside a VPC:

```yaml
# serverless.yml
provider:
  vpc:
    securityGroupIds:
      - sg-xxxxxxxxx
    subnetIds:
      - subnet-xxxxxxxx
      - subnet-yyyyyyyy  # Spread across multiple AZs

functions:
  api:
    handler: handler.main
    # Provider-level vpc applies to all functions; override per function if needed
    vpc:
      securityGroupIds: ${self:provider.vpc.securityGroupIds}
      subnetIds: ${self:provider.vpc.subnetIds}
```

Cold Start Mitigation for VPC Lambda

VPC Lambda used to suffer from significantly longer cold starts, but AWS dramatically improved this in 2019. These strategies still help:

  1. Use Provisioned Concurrency (covered above)
  2. Use RDS Proxy for efficient connection management
const { Signer } = require('@aws-sdk/rds-signer'); const mysql = require('mysql2/promise'); const signer = new Signer({ hostname: process.env.RDS_PROXY_ENDPOINT, port: 3306, username: process.env.DB_USER }); let connection; async function getConnection() { if (!connection) { const token = await signer.getAuthToken(); connection = await mysql.createConnection({ host: process.env.RDS_PROXY_ENDPOINT, user: process.env.DB_USER, password: token, database: process.env.DB_NAME, ssl: { rejectUnauthorized: true } }); } return connection; }

Monitoring and Debugging

Structured Logging with CloudWatch Logs

Structured logs make debugging far more efficient:

```javascript
const log = (level, message, data = {}) => {
  console.log(JSON.stringify({
    level,
    message,
    timestamp: new Date().toISOString(),
    requestId: global.requestId,
    ...data
  }));
};

exports.handler = async (event, context) => {
  const startTime = Date.now();
  global.requestId = context.awsRequestId;

  log('INFO', 'Request received', { path: event.path, method: event.httpMethod });

  try {
    const result = await processRequest(event);
    log('INFO', 'Request completed', { statusCode: 200, duration: Date.now() - startTime });
    return result;
  } catch (error) {
    log('ERROR', 'Request failed', { error: error.message, stack: error.stack });
    throw error;
  }
};
```

CloudWatch Insights Query Examples

```
# Check error rate over time
fields @timestamp, @message
| filter level = 'ERROR'
| stats count() as errors by bin(1h)

# Identify slow requests
fields @timestamp, duration, path
| filter duration > 1000
| sort duration desc
| limit 20

# Detect cold starts
fields @timestamp, @duration, @billedDuration
| filter @initDuration > 0
| stats count() as coldStarts, avg(@initDuration) as avgInitTime by bin(1h)
```

Distributed Tracing with X-Ray

```javascript
const AWSXRay = require('aws-xray-sdk-core');
const AWS = AWSXRay.captureAWS(require('aws-sdk'));

exports.handler = async (event) => {
  const segment = AWSXRay.getSegment();

  // Custom subsegment for granular tracing
  const subsegment = segment.addNewSubsegment('ProcessOrder');
  try {
    const result = await processOrder(event);
    subsegment.close();
    return result;
  } catch (error) {
    subsegment.addError(error);
    subsegment.close();
    throw error;
  }
};
```

Practical Cost Optimization Strategies

1. Set Appropriate Timeouts

The default 3-second timeout is often too short, but setting it too high means runaway costs during failures:

```yaml
functions:
  api:
    timeout: 10   # 10 seconds for API handlers
  batch:
    timeout: 900  # 15 minutes for batch jobs (maximum)
```

2. Reduce Unnecessary Executions

Perform validation checks early to avoid running expensive logic needlessly:

```javascript
exports.handler = async (event) => {
  // Bail out early to minimize billable execution time
  if (!event.body) {
    return { statusCode: 400, body: 'Bad Request' };
  }

  const data = JSON.parse(event.body);
  if (!data.userId) {
    return { statusCode: 400, body: 'userId is required' };
  }

  // Core logic starts here — minimize time spent in this section
  // ...
};
```

3. Use Graviton2 (ARM Architecture)

Switch to ARM for a 20% cost reduction:

```yaml
provider:
  architecture: arm64  # Changed from x86_64
```

Note: Some native modules may require recompilation for ARM compatibility.

Summary: When to Choose Lambda

Lambda is a great fit for:

  • Event-driven processing: Webhooks, file handling, notifications
  • Variable traffic: APIs with significant request spikes
  • Batch jobs: Periodic data processing, report generation
  • Microservices: Small, independent units of functionality

Lambda is a poor fit for:

  • Long-running tasks: Processing that exceeds 15 minutes
  • Stateful workloads: Maintaining persistent WebSocket connections
  • High-frequency, steady-state traffic: When requests are constant, EC2 is often cheaper
  • Specialized runtimes: Unsupported languages or runtime versions
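The steady-traffic point can be quantified: at a constant request rate, Lambda's per-request billing eventually crosses the flat price of an always-on instance. A rough break-even sketch, assuming 512MB functions, 100ms average duration, and the ~$8.50/month t3.micro figure used earlier (real numbers depend on region, free tier, and reserved pricing):

```javascript
const REQUEST_RATE = 0.20 / 1_000_000; // $ per request
const DURATION_RATE = 0.0000166667;    // $ per GB-second
const EC2_MONTHLY = 8.5;               // t3.micro figure from earlier

// Monthly Lambda cost at a given steady request volume
const lambdaMonthly = (requests, seconds = 0.1, memoryMB = 512) =>
  requests * (REQUEST_RATE + seconds * (memoryMB / 1024) * DURATION_RATE);

// Find the request volume where Lambda overtakes the flat EC2 price
let requests = 0;
while (lambdaMonthly(requests) < EC2_MONTHLY) requests += 100_000;
console.log(requests); // break-even lands in the millions of requests/month
```

Below that volume Lambda wins on cost alone; above it, the decision rests on operational overhead rather than the infrastructure bill.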

Serverless isn't a silver bullet. But used in the right context, it can dramatically reduce both infrastructure costs and operational burden. Start small — pick one project and give it a try.

If you're exploring a Lambda migration and need guidance, feel free to reach out.