Introduction to Microservices Architecture: Strategies for Migrating from a Monolith, with a Practical Guide

Design patterns and implementation approaches for building scalable systems. A comprehensive guide covering gradual migration from monolith, service decomposition principles, and fault tolerance strategies.
Representative / Engineer
TL;DR
- Microservices are not a silver bullet. Assess your organization's maturity and system scale before committing
- Split services along business domain boundaries — not for purely technical reasons
- Understand the complexity of distributed systems and design fault tolerance in from the start
- Gradual migration is the key to success. Avoid Big Bang rewrites
Introduction: Why Microservices Now?
"Our monolith has hit its limits."
Many engineers have heard this before. As the team grows and features pile up, every deployment becomes a stressful event. Tests take hours to run, and even small changes cause unexpected side effects.
Our team faced the same challenges. We had a Ruby on Rails monolith built at the company's founding — what started as a 2-person project had grown into a 15-person team over four years.
The results were telling:
- Deploy frequency: dropped from weekly to twice a month
- Test duration: grew from 15 minutes to 2 hours
- Time to release new features: stretched from 2 weeks to 2 months
To break out of this situation, we decided to migrate to microservices. This article shares what we learned along the way.
What Are Microservices?
Definition
Microservices architecture is an approach to building applications as a collection of small, independent services. Each service:
- Focuses on a single business capability
- Can be deployed independently
- Owns its own data store
- Communicates via lightweight protocols (HTTP/REST, gRPC)
Monolith vs. Microservices
[Monolithic Architecture]
┌─────────────────────────────────────┐
│         Single Application          │
│ ┌─────┐ ┌──────┐ ┌───────┐ ┌─────┐  │
│ │ UI  │ │ Auth │ │ Order │ │Stock│  │
│ └─────┘ └──────┘ └───────┘ └─────┘  │
│           ┌─────────────┐           │
│           │  Single DB  │           │
│           └─────────────┘           │
└─────────────────────────────────────┘
[Microservices Architecture]
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ UI/BFF  │ │  Auth   │ │  Order  │ │  Stock  │
│ Service │ │ Service │ │ Service │ │ Service │
└────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘
     │           │           │           │
     │      ┌────┴────┐ ┌────┴────┐ ┌────┴────┐
     │      │  Auth   │ │  Order  │ │  Stock  │
     │      │   DB    │ │   DB    │ │   DB    │
     │      └─────────┘ └─────────┘ └─────────┘
Microservices Pros and Cons
| Aspect | Benefits | Drawbacks |
|---|---|---|
| Development speed | Teams develop and deploy independently | Requires cross-service coordination |
| Scalability | Scale only the services that need it | More complex infrastructure management |
| Tech choices | Choose the best stack per service | Risk of technology sprawl |
| Fault isolation | Failures are less likely to cascade | Distributed system failures are harder to debug |
| Team structure | Small teams can operate autonomously | Communication overhead shifts from within teams to between teams |
When to Migrate: Do You Actually Need Microservices?
Microservices bring real complexity. Use the checklist below to decide whether migration is warranted.
Situations where migration makes sense
□ Team size exceeds 10 people and merge conflicts are frequent
□ Deploy frequency has dropped to once a month or less
□ Tests take over an hour to run
□ You need to scale specific features independently
□ Different features call for different tech stacks
□ A failure in one area takes down the entire system
Situations where migration should be avoided
□ Team size is 5 or fewer
□ Product is in early stage (pre-PMF)
□ Domain understanding is shallow
□ Operational experience or infrastructure knowledge is lacking
□ The reason is "because everyone's doing it"
Important: For small teams and early-stage products, a modular monolith is a valid alternative. It preserves the simplicity of a monolith while keeping the architecture ready for future decomposition.
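As a rough illustration of what a modular monolith can look like in code, the sketch below keeps two modules in one process but lets them interact only through a narrow interface. The names (`InventoryApi`, `InMemoryInventory`, `OrderModule`) are hypothetical and not taken from our system. The point is only that the order module never touches inventory data directly, so the same interface could later be backed by a remote call.

```typescript
// Hypothetical module boundary inside a single deployable (modular monolith).
// All names here are illustrative assumptions, not part of the original system.

// Public contract of the inventory module; other modules see only this.
interface InventoryApi {
  available(productId: string): number;
  reserve(productId: string, quantity: number): boolean;
}

// Inventory module: owns its data and exposes it only through InventoryApi.
class InMemoryInventory implements InventoryApi {
  private stock = new Map<string, number>();

  constructor(initial: Record<string, number>) {
    for (const productId of Object.keys(initial)) {
      this.stock.set(productId, initial[productId]);
    }
  }

  available(productId: string): number {
    return this.stock.get(productId) ?? 0;
  }

  // Reserves stock if enough is available; returns false otherwise.
  reserve(productId: string, quantity: number): boolean {
    const current = this.available(productId);
    if (quantity <= 0 || current < quantity) return false;
    this.stock.set(productId, current - quantity);
    return true;
  }
}

// Order module: depends on the interface, never on inventory internals.
class OrderModule {
  private nextId = 1;
  private inventory: InventoryApi;

  constructor(inventory: InventoryApi) {
    this.inventory = inventory;
  }

  // Returns the new order, or null when the stock cannot be reserved.
  placeOrder(productId: string, quantity: number): { orderId: number } | null {
    if (!this.inventory.reserve(productId, quantity)) return null;
    return { orderId: this.nextId++ };
  }
}
```

If the inventory module is extracted later, only the class behind `InventoryApi` would need to change, although a remote version would also have to become asynchronous.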
Service Decomposition Principles
Domain-Driven Design (DDD)-Based Decomposition
Service boundaries should be defined by business domains. Splitting services for purely technical reasons (e.g., "I want to write this part in Go") tends to backfire.
[Bounded Contexts for an E-Commerce Site]
┌─────────────────┐  ┌─────────────────┐
│ Product Catalog │  │ Orders          │
│ · Product info  │  │ · Create order  │
│ · Categories    │  │ · Order history │
│ · Search        │  │ · Cancellation  │
└─────────────────┘  └─────────────────┘
┌─────────────────┐  ┌─────────────────┐
│ Inventory       │  │ Payments        │
│ · Stock levels  │  │ · Processing    │
│ · Stock in/out  │  │ · Refunds       │
│ · Alerts        │  │ · Receipts      │
└─────────────────┘  └─────────────────┘
┌─────────────────┐  ┌─────────────────┐
│ Shipping        │  │ Customers       │
│ · Arrangement   │  │ · Profiles      │
│ · Tracking      │  │ · Auth          │
│ · Delivery      │  │ · Points        │
└─────────────────┘  └─────────────────┘
Decomposition Guidelines
- High cohesion: Keep related functionality in the same service
- Low coupling: Minimize dependencies between services
- Align with team boundaries: One team per service (or set of services)
- Data ownership: Each service owns its own data
Anti-patterns: Splits to Avoid
// ❌ Splitting by technical layer (don't do this)
// - API Gateway Service
// - Business Logic Service
// - Data Access Service
// ✅ Splitting by business domain (correct approach)
// - Order Service (includes all layers for orders)
// - Inventory Service (includes all layers for inventory)
// - Payment Service (includes all layers for payments)
Service Communication Patterns
1. Synchronous Communication (REST / gRPC)
Use when an immediate response is required.
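// REST API example
// Order Service calls Inventory Service.
// INVENTORY_SERVICE_URL, orderRepository, CreateOrderDto, and Order are
// assumed to be defined elsewhere in the Order Service.
async function createOrder(orderData: CreateOrderDto): Promise<Order> {
  // Check stock (blocking request/response call)
  const stockResponse = await fetch(
    `${INVENTORY_SERVICE_URL}/api/stock/${orderData.productId}`
  );
  // fetch() does not reject on HTTP error statuses, so check explicitly
  if (!stockResponse.ok) {
    throw new Error(`Stock check failed: HTTP ${stockResponse.status}`);
  }
  const stock = await stockResponse.json();
  if (stock.quantity < orderData.quantity) {
    throw new Error('Insufficient stock');
  }
  // Create order
  const order = await orderRepository.create(orderData);
  // Reserve stock (blocking request/response call)
  const reserveResponse = await fetch(`${INVENTORY_SERVICE_URL}/api/stock/reserve`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      productId: orderData.productId,
      quantity: orderData.quantity,
      orderId: order.id
    })
  });
  if (!reserveResponse.ok) {
    // The order already exists at this point; the Saga pattern described
    // later in this article covers how to undo it.
    throw new Error(`Stock reservation failed: HTTP ${reserveResponse.status}`);
  }
  return order;
}
When to choose gRPC: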
- High-speed communication between internal services
- When type-safe interfaces are required
- When bidirectional streaming is needed
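// inventory.proto
syntax = "proto3";
service InventoryService {
  rpc CheckStock(StockRequest) returns (StockResponse);
  rpc ReserveStock(ReserveRequest) returns (ReserveResponse);
}
message StockRequest {
  string product_id = 1;
}
message StockResponse {
  int32 quantity = 1;
  bool available = 2;
}
// The two messages below are referenced by ReserveStock; their fields are
// an assumed shape that mirrors the REST reserve payload.
message ReserveRequest {
  string product_id = 1;
  int32 quantity = 2;
  string order_id = 3;
}
message ReserveResponse {
  bool reserved = 1;
}
2. Asynchronous Communication (Message Queues)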
Use when an immediate response is not required. The publisher does not need to know who consumes its events, which keeps services loosely coupled.
// Event-driven architecture example
// Order Service publishes an event
async function completeOrder(orderId: string): Promise<void> {
  const order = await orderRepository.findById(orderId);
  if (!order) {
    throw new Error(`Order not found: ${orderId}`);
  }
  order.status = 'completed';
  await orderRepository.save(order);
  // Publish event (asynchronous). Note: the save above and this publish are
  // not atomic; if publish fails, the event is lost unless it is retried.
  await messageQueue.publish('order.completed', {
    orderId: order.id,
    customerId: order.customerId,
    items: order.items,
    totalAmount: order.totalAmount
  });
}

// Inventory Service subscribes to the event
messageQueue.subscribe('order.completed', async (event) => {
  // Finalize stock reservation
  for (const item of event.items) {
    await inventoryService.confirmReservation(item.productId, item.quantity);
  }
});

// Notification Service subscribes to the event
messageQueue.subscribe('order.completed', async (event) => {
  // Send confirmation email
  await emailService.sendOrderConfirmation(event.customerId, event.orderId);
});
Communication Pattern Selection Guide
| Pattern | Use Case | Benefits | Drawbacks |
|---|---|---|---|
| REST | External APIs, simple CRUD | Widely adopted, easy to debug | Higher per-request overhead (text payloads) |
| gRPC | Internal communication, high performance | Fast, type-safe | Learning curve |
| Message queue | Async processing, event-driven | Loose coupling, scalable | Adds a broker to operate; flows are harder to trace |
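The event-driven example above uses a `messageQueue` object that this article does not define. To experiment with the publish/subscribe flow locally, a minimal in-process stand-in could look like the sketch below. `InMemoryMessageQueue` is a hypothetical name, and the sketch deliberately leaves out what a real broker provides: persistence, retries, and delivery guarantees.

```typescript
// Minimal in-process stand-in for the `messageQueue` used in the event-driven
// example. Illustrative only: no persistence, retries, or delivery guarantees.

type Handler = (event: unknown) => Promise<void> | void;

class InMemoryMessageQueue {
  private handlers = new Map<string, Handler[]>();

  // Register a handler for a topic; several services may share one topic.
  subscribe(topic: string, handler: Handler): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler);
    this.handlers.set(topic, list);
  }

  // Deliver the event to every subscriber in registration order. A failing
  // subscriber does not block the others; failures are collected and returned.
  async publish(topic: string, event: unknown): Promise<Error[]> {
    const errors: Error[] = [];
    for (const handler of this.handlers.get(topic) ?? []) {
      try {
        await handler(event);
      } catch (err) {
        errors.push(err instanceof Error ? err : new Error(String(err)));
      }
    }
    return errors;
  }
}
```

Returning the collected errors instead of throwing is a design choice for this sketch: it makes visible that one subscriber's failure (for example, the email sender) does not affect the others, which is the loose coupling the table refers to.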
Data Management Strategy
Database per Service Pattern
Each service owns its own database.
# docker-compose.yml
# Credentials below are placeholders for local development only.
version: '3.8'
services:
  order-service:
    build: ./services/order
    environment:
      DATABASE_URL: postgresql://app:example@order-db:5432/orders
  order-db:
    image: postgres:15
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: example
      POSTGRES_DB: orders
    volumes:
      - order-data:/var/lib/postgresql/data
  inventory-service:
    build: ./services/inventory
    environment:
      DATABASE_URL: postgresql://app:example@inventory-db:5432/inventory
  inventory-db:
    image: postgres:15
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: example
      POSTGRES_DB: inventory
    volumes:
      - inventory-data:/var/lib/postgresql/data
  payment-service:
    build: ./services/payment
    environment:
      DATABASE_URL: postgresql://app:example@payment-db:5432/payments
  payment-db:
    image: postgres:15
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: example
      POSTGRES_DB: payments
    volumes:
      - payment-data:/var/lib/postgresql/data
volumes:
  order-data:
  inventory-data:
  payment-data:
Data Consistency Challenges and Solutions
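Once each service owns its own database, a single ACID transaction can no longer span multiple services (each service still gets ACID guarantees within its own database). Cross-service operations therefore have to rely on eventual consistency.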
The Saga Pattern
Multi-service operations are implemented as a series of local transactions.
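Before the order-specific example, it may help to see the idea in its most generic form. The sketch below is not our production code; `SagaStep` and `runSaga` are hypothetical names. It pairs every local transaction with the action that undoes it, and on failure it compensates only the steps that actually completed, in reverse order.

```typescript
// A generic saga runner: a sketch under stated assumptions, not the
// article's implementation. Each step pairs a local transaction with
// the action that undoes it.
interface SagaStep {
  name: string;
  execute: () => Promise<void>;
  compensate: () => Promise<void>;
}

// Runs steps in order. If one fails, compensates the steps that already
// completed, in reverse order, then rethrows the original error.
async function runSaga(steps: SagaStep[]): Promise<void> {
  const completed: SagaStep[] = [];
  try {
    for (const step of steps) {
      await step.execute();
      completed.push(step);
    }
  } catch (error) {
    for (const step of completed.reverse()) {
      try {
        await step.compensate();
      } catch (compensationError) {
        // A failed compensation needs follow-up (alerting, manual repair);
        // here it is only logged so the remaining steps still get undone.
        console.error(`Compensation failed for step "${step.name}"`, compensationError);
      }
    }
    throw error;
  }
}
```

Tracking completed steps means a compensation never runs for a step that did not happen, for example no refund is attempted when the payment itself was the step that failed.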
// Order creation Saga (Orchestration style: one coordinator drives each step)
// 1. Order Service: Create order in "pending" state
// 2. Inventory Service: Reserve stock
// 3. Payment Service: Process payment
// 4. Order Service: Update order to "confirmed"
// Compensating transactions on failure
// Payment fails → Inventory: release reservation → Order: cancel order
class OrderSaga {
  // orderService, inventoryService, and paymentService are assumed to be
  // injected clients for the respective services.
  async execute(orderData: CreateOrderDto): Promise<Order> {
    const sagaId = generateId();
    try {
      // Step 1: Create order (pending state)
      const order = await this.orderService.createPending(orderData, sagaId);
      // Step 2: Reserve stock
      await this.inventoryService.reserve(orderData.items, sagaId);
      // Step 3: Process payment
      await this.paymentService.charge(order.totalAmount, sagaId);
      // Step 4: Confirm order
      return await this.orderService.confirm(order.id);
    } catch (error) {
      // Execute compensating transactions
      await this.compensate(sagaId, error as Error);
      throw error;
    }
  }

  private async compensate(sagaId: string, error: Error): Promise<void> {
    console.error(`Saga ${sagaId} failed, compensating: ${error.message}`);
    // Undo steps in reverse order. Progress is not tracked here, so each
    // compensation must be idempotent and a no-op if its step never ran.
    // Failures are logged rather than silently swallowed.
    const logFailure = (step: string) => (e: unknown) =>
      console.error(`Saga ${sagaId}: ${step} compensation failed`, e);
    await this.paymentService.refund(sagaId).catch(logFailure('refund'));
    await this.inventoryService.releaseReservation(sagaId).catch(logFailure('release'));
    await this.orderService.cancel(sagaId).catch(logFailure('cancel'));
  }
}
Fault Tolerance and Resilience
In distributed systems, network failures and service outages should be treated as expected events, not exceptions.
Circuit Breaker Pattern
Detects consecutive failures and temporarily stops calls to the failing service.
class CircuitBreaker {
  private failureCount = 0;
  private lastFailureTime: Date | null = null;
  private state: 'CLOSED' | 'OPEN' | 'HALF_OPEN' = 'CLOSED';

  constructor(
    private threshold: number = 5, // consecutive failures before opening
    private timeout: number = 30000 // ms to stay OPEN before a trial call
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime!.getTime() > this.timeout) {
        // Cool-down elapsed: let a trial call through
        this.state = 'HALF_OPEN';
      } else {
        // Fail fast without calling the service
        throw new Error('Circuit is OPEN');
      }
    }
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  private onSuccess(): void {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  private onFailure(): void {
    this.failureCount++;
    this.lastFailureTime = new Date();
    // A failed trial call in HALF_OPEN reopens the circuit immediately
    if (this.state === 'HALF_OPEN' || this.failureCount >= this.threshold) {
      this.state = 'OPEN';
    }
  }
}

// Usage example
const inventoryCircuit = new CircuitBreaker(5, 30000);
async function checkStock(productId: string) {
  return inventoryCircuit.call(async () => {
    const res = await fetch(`${INVENTORY_SERVICE_URL}/api/stock/${productId}`);
    // fetch() resolves even on HTTP 5xx, so throw to count it as a failure
    if (res.status >= 500) {
      throw new Error(`Inventory service returned HTTP ${res.status}`);
    }
    return res;
  });
}
Retry with Exponential Backoff
Attempts recovery from transient failures.
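// Resolves after `ms` milliseconds
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxRetries: number = 3, // total number of attempts, including the first
  baseDelay: number = 1000 // delay before the first retry, in milliseconds
): Promise<T> {
  let lastError: unknown = new Error('retryWithBackoff: maxRetries must be at least 1');
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < maxRetries - 1) {
        // Exponential backoff + jitter
        const delay = baseDelay * Math.pow(2, attempt) + Math.random() * 1000;
        await sleep(delay);
      }
    }
  }
  throw lastError;
}
Timeout Configuration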
Always set timeouts on all external calls.
async function fetchWithTimeout(
  url: string,
  options: RequestInit = {},
  timeout: number = 5000
): Promise<Response> {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeout);
  try {
    return await fetch(url, {
      ...options,
      signal: controller.signal
    });
  } finally {
    clearTimeout(timeoutId);
  }
}
Monitoring and Observability
In distributed systems, pinpointing problems becomes significantly harder. Use the three pillars of observability to maintain visibility.
1. Logs
Structured logging with correlation IDs makes requests traceable across services.
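The logger example that follows calls `getCorrelationId()`, which this article does not define. One possible way to provide it in Node.js is `AsyncLocalStorage`, sketched below. The header name `x-correlation-id` and the helper names `withCorrelationId` and `outgoingHeaders` are assumptions for illustration, not an established convention of any particular framework.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';
import { randomUUID } from 'node:crypto';

// Holds the correlation ID for the request currently being processed.
const correlationStore = new AsyncLocalStorage<string>();

// Header name is an assumption; use whatever your services agree on.
const CORRELATION_HEADER = 'x-correlation-id';

// Runs `fn` with a correlation ID: reuses the incoming one if the caller
// sent it, otherwise generates a new one at the edge of the system.
function withCorrelationId<T>(incomingId: string | undefined, fn: () => T): T {
  return correlationStore.run(incomingId ?? randomUUID(), fn);
}

// Returns the current correlation ID, or 'unknown' outside a request scope.
function getCorrelationId(): string {
  return correlationStore.getStore() ?? 'unknown';
}

// Headers to attach to outgoing calls so the next service continues the ID.
function outgoingHeaders(): Record<string, string> {
  return { [CORRELATION_HEADER]: getCorrelationId() };
}
```

In an HTTP server, the request handler would be wrapped in `withCorrelationId(incomingHeaderValue, ...)`, and every downstream `fetch` would merge in `outgoingHeaders()`, so that log lines from all services involved share one ID.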
// Structured logging with correlation ID
// getCorrelationId() is assumed to return the ID of the request currently
// being handled, so that every log line can be tied back to that request.
const logger = {
  info: (message: string, context: object) => {
    console.log(JSON.stringify({
      level: 'info',
      message,
      timestamp: new Date().toISOString(),
      correlationId: getCorrelationId(),
      service: process.env.SERVICE_NAME,
      ...context
    }));
  }
};

// Usage example
logger.info('Order created', {
  orderId: order.id,
  customerId: order.customerId,
  totalAmount: order.totalAmount
});
2. Metrics
Monitor service health with Prometheus.
import express from 'express';
import { Counter, Histogram, Registry } from 'prom-client';

const app = express();
const registry = new Registry();

// Request counter
const httpRequestsTotal = new Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'path', 'status'],
  registers: [registry]
});

// Response time
const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'HTTP request duration in seconds',
  labelNames: ['method', 'path'],
  buckets: [0.1, 0.5, 1, 2, 5],
  registers: [registry]
});

// Measure via middleware
app.use((req, res, next) => {
  const start = Date.now();
  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    // Label with the route pattern (e.g. /orders/:id) rather than the raw
    // URL, so that IDs in paths do not create an unbounded number of series
    const path = req.route?.path ?? 'unmatched';
    httpRequestsTotal.inc({ method: req.method, path, status: res.statusCode });
    httpRequestDuration.observe({ method: req.method, path }, duration);
  });
  next();
});

// Expose the collected metrics for Prometheus to scrape
app.get('/metrics', async (_req, res) => {
  res.set('Content-Type', registry.contentType);
  res.end(await registry.metrics());
});
3. Traces
Visualize the flow of requests across services with OpenTelemetry.
import { trace, SpanKind, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('order-service');

async function createOrder(orderData: CreateOrderDto): Promise<Order> {
  return tracer.startActiveSpan('createOrder', async (span) => {
    try {
      span.setAttribute('order.customer_id', orderData.customerId);
      // Check inventory (child span)
      const stock = await tracer.startActiveSpan('checkInventory', {
        kind: SpanKind.CLIENT,
      }, async (childSpan) => {
        try {
          return await inventoryService.checkStock(orderData.productId);
        } finally {
          // End the child span even if the call throws
          childSpan.end();
        }
      });
      // (Validation of `stock` omitted for brevity)
      // Create order
      const order = await orderRepository.create(orderData);
      span.setAttribute('order.id', order.id);
      return order;
    } catch (error) {
      // Record the failure on the span so it shows up as an error in the trace
      span.recordException(error as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw error;
    } finally {
      span.end();
    }
  });
}
Gradual Migration in Practice
The Strangler Fig Pattern
Rather than replacing the monolith all at once, migrate functionality piece by piece.
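The routing decision at the heart of this pattern can be sketched in a few lines. In the example below, the path prefixes and upstream URLs are hypothetical, and a real gateway (or reverse proxy configuration) would replace this function; it only illustrates that migrated paths go to the new service while everything else still reaches the monolith.

```typescript
// Strangler Fig routing sketch. Prefixes and upstream URLs are hypothetical.
interface Route {
  prefix: string;   // path prefix owned by an extracted service
  upstream: string; // base URL of that service
}

// Default upstream: everything not yet migrated still goes to the monolith.
const MONOLITH_URL = 'http://monolith.internal';

// This list grows as features are migrated; here only orders have moved.
const migratedRoutes: Route[] = [
  { prefix: '/api/orders', upstream: 'http://order-service.internal' },
];

// True when `path` is the prefix itself or lies beneath it ("/api/orders/1"),
// but not when it merely shares leading characters ("/api/orders-export").
function matchesPrefix(path: string, prefix: string): boolean {
  return path === prefix || path.startsWith(prefix + '/');
}

// Picks the upstream for a request path, preferring the longest matching prefix.
function resolveUpstream(path: string, routes: Route[] = migratedRoutes): string {
  let best: Route | undefined;
  for (const route of routes) {
    if (matchesPrefix(path, route.prefix) && (!best || route.prefix.length > best.prefix.length)) {
      best = route;
    }
  }
  return best ? best.upstream : MONOLITH_URL;
}
```

Migrating another feature then amounts to adding one entry to the route list, and rolling back means removing it again, which is what keeps each migration step small and reversible.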
[Phase 1: Running in Parallel]
              ┌─────────────┐
              │ API Gateway │
              └──────┬──────┘
                     │
      ┌──────────────┼──────────────┐
      ▼              ▼              ▼
┌───────────┐  ┌───────────┐  ┌───────────┐
│ Monolith  │  │ Monolith  │  │ New Order │
│  (Auth)   │  │  (Stock)  │  │  Service  │
└───────────┘  └───────────┘  └───────────┘
[Phase 2: Feature Migration Complete]
              ┌─────────────┐
              │ API Gateway │
              └──────┬──────┘
                     │
      ┌──────────────┼──────────────┐
      ▼              ▼              ▼
┌───────────┐  ┌───────────┐  ┌───────────┐
│ New Auth  │  │ New Stock │  │   Order   │
│  Service  │  │  Service  │  │  Service  │
└───────────┘  └───────────┘  └───────────┘
Migration Priorities
- Start with features that have fewer dependencies
- Prioritize features that change frequently
- Split along team boundaries
- Plan data migration carefully
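To make the first two priorities a little more tangible, the sketch below ranks extraction candidates by how often they change and how few dependencies they have. Everything in it is an assumption for illustration: the `Candidate` fields, the scoring formula, and the idea that these two numbers are enough. Treat it as a conversation starter for the team, not as a validated model.

```typescript
// Hypothetical ranking of extraction candidates. The scoring formula is an
// arbitrary illustration of the first two priorities, not a validated model.
interface Candidate {
  name: string;
  dependencies: number;    // modules this feature depends on or is called by
  changesPerMonth: number; // how often the feature's code changes
}

// Higher score = more attractive first candidate:
// frequent change raises the score, each dependency lowers it.
function extractionScore(c: Candidate): number {
  return c.changesPerMonth / (1 + c.dependencies);
}

// Returns candidates sorted from most to least attractive, without
// mutating the input array.
function rankCandidates(candidates: Candidate[]): Candidate[] {
  return [...candidates].sort((a, b) => extractionScore(b) - extractionScore(a));
}
```

Team boundaries and data migration effort are not captured by this score and still need to be weighed by hand.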
Conclusion: Making Microservices Work
A microservices architecture is powerful, but it is not a universal solution.
Before You Migrate, Confirm
- Organizational readiness: DevOps culture, autonomous team structure
- Technical readiness: Containers, CI/CD, monitoring infrastructure
- Domain understanding: Deep familiarity with the business domain
Principles for Success
- Start small: Begin with a single service
- Migrate incrementally: Avoid Big Bang rewrites
- Ensure observability: Logs, metrics, and traces
- Design for failure: Circuit Breaker, Retry, Timeout
Migrating to microservices is as much an organizational transformation as it is a technical one. Take your time, move steadily, and success will follow.
Resources
- Building Microservices (Sam Newman)
- Domain-Driven Design (Eric Evans)
- Kubernetes Official Documentation
- OpenTelemetry
If you're struggling with a microservices migration, feel free to reach out.