BACKEND · 2025-02-15 · 5 min read

Introduction to Microservices Architecture: Migration Strategies from Monolith and Practical Guide

Design patterns and implementation approaches for building scalable systems. A comprehensive guide covering gradual migration from monolith, service decomposition principles, and fault tolerance strategies.

髙木 晃宏

Representative / Engineer

TL;DR

  • Microservices are not a silver bullet. Assess your organization's maturity and system scale before committing
  • Split services along business domain boundaries — not for purely technical reasons
  • Understand the complexity of distributed systems and design fault tolerance in from the start
  • Gradual migration is the key to success. Avoid Big Bang rewrites

Introduction: Why Microservices Now?

"Our monolith has hit its limits."

Many engineers have heard this before. As the team grows and features pile up, every deployment becomes a stressful event. Tests take hours to run, and even small changes cause unexpected side effects.

Our team faced the same challenges. We had a Ruby on Rails monolith built at the company's founding — what started as a 2-person project had grown into a 15-person team over four years.

The results were telling:

  • Deploy frequency: dropped from weekly to twice a month
  • Test duration: grew from 15 minutes to 2 hours
  • Time to release new features: stretched from 2 weeks to 2 months

To break out of this situation, we decided to migrate to microservices. This article shares what we learned along the way.

What Are Microservices?

Definition

Microservices architecture is an approach to building applications as a collection of small, independent services. Each service:

  • Focuses on a single business capability
  • Can be deployed independently
  • Owns its own data store
  • Communicates via lightweight protocols (HTTP/REST, gRPC)

Monolith vs. Microservices

[Monolithic Architecture]
┌─────────────────────────────────────┐
│         Single Application          │
│ ┌─────┐ ┌──────┐ ┌───────┐ ┌─────┐  │
│ │ UI  │ │ Auth │ │ Order │ │Stock│  │
│ └─────┘ └──────┘ └───────┘ └─────┘  │
│           ┌─────────────┐           │
│           │  Single DB  │           │
│           └─────────────┘           │
└─────────────────────────────────────┘

[Microservices Architecture]
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ UI/BFF  │ │  Auth   │ │  Order  │ │  Stock  │
│ Service │ │ Service │ │ Service │ │ Service │
└────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘
     │           │           │           │
     │      ┌────┴────┐ ┌────┴────┐ ┌────┴────┐
     │      │  Auth   │ │  Order  │ │  Stock  │
     │      │   DB    │ │   DB    │ │   DB    │
     │      └─────────┘ └─────────┘ └─────────┘

Microservices Pros and Cons

Aspect            | Benefits                               | Drawbacks
------------------|----------------------------------------|------------------------------------------------
Development speed | Teams develop and deploy independently | Requires cross-service coordination
Scalability       | Scale only the services that need it   | More complex infrastructure management
Tech choices      | Choose the best stack per service      | Risk of technology sprawl
Fault isolation   | Failures are less likely to cascade    | Distributed system failures are harder to debug
Team structure    | Small teams can operate autonomously   | Communication overhead shifts

When to Migrate: Do You Actually Need Microservices?

Microservices bring real complexity. Use the checklist below to decide whether migration is warranted.

Situations where migration makes sense

□ Team size exceeds 10 people and merge conflicts are frequent
□ Deploy frequency has dropped to once a month or less
□ Tests take over an hour to run
□ You need to scale specific features independently
□ Different features call for different tech stacks
□ A failure in one area takes down the entire system

Situations where migration should be avoided

□ Team size is 5 or fewer
□ Product is in early stage (pre-PMF)
□ Domain understanding is shallow
□ Operational experience or infrastructure knowledge is lacking
□ The reason is "because everyone's doing it"

Important: For small teams and early-stage products, a modular monolith is a valid alternative. It preserves the simplicity of a monolith while keeping the architecture ready for future decomposition.
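As a rough illustration of what "ready for future decomposition" can mean in code, the sketch below keeps inventory and orders in one process but lets them talk only through a narrow interface. All names here (`InventoryModule`, `InMemoryInventory`, `OrderModule`) are hypothetical and chosen for illustration.

```typescript
// The only surface other modules may depend on (hypothetical names).
interface InventoryModule {
  available(productId: string): number;
  reserve(productId: string, quantity: number): boolean;
}

// In-process implementation; its storage stays private to the module.
class InMemoryInventory implements InventoryModule {
  private stock = new Map<string, number>();

  constructor(initial: Record<string, number>) {
    for (const [id, qty] of Object.entries(initial)) {
      this.stock.set(id, qty);
    }
  }

  available(productId: string): number {
    return this.stock.get(productId) ?? 0;
  }

  reserve(productId: string, quantity: number): boolean {
    const current = this.available(productId);
    if (quantity <= 0 || current < quantity) return false;
    this.stock.set(productId, current - quantity);
    return true;
  }
}

// The order module sees the interface, never the storage behind it.
class OrderModule {
  private inventory: InventoryModule;

  constructor(inventory: InventoryModule) {
    this.inventory = inventory;
  }

  placeOrder(productId: string, quantity: number): 'confirmed' | 'rejected' {
    return this.inventory.reserve(productId, quantity) ? 'confirmed' : 'rejected';
  }
}

const inventory = new InMemoryInventory({ 'sku-1': 3 });
const orders = new OrderModule(inventory);
console.log(orders.placeOrder('sku-1', 2)); // confirmed
console.log(orders.placeOrder('sku-1', 2)); // rejected (only 1 left)
```

If inventory is later extracted into its own service, only the `InventoryModule` implementation would need to change (for example to a remote client, at which point the methods would likely become asynchronous); `OrderModule` keeps depending on the same boundary.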

Service Decomposition Principles

Domain-Driven Design (DDD)-Based Decomposition

Service boundaries should be defined by business domains. Splitting services for purely technical reasons (e.g., "I want to write this part in Go") tends to backfire.

[Bounded Contexts for an E-Commerce Site]
┌─────────────────┐ ┌─────────────────┐
│ Product Catalog │ │     Orders      │
│ · Product info  │ │ · Create order  │
│ · Categories    │ │ · Order history │
│ · Search        │ │ · Cancellation  │
└─────────────────┘ └─────────────────┘
┌─────────────────┐ ┌─────────────────┐
│    Inventory    │ │    Payments     │
│ · Stock levels  │ │ · Processing    │
│ · Stock in/out  │ │ · Refunds       │
│ · Alerts        │ │ · Receipts      │
└─────────────────┘ └─────────────────┘
┌─────────────────┐ ┌─────────────────┐
│    Shipping     │ │    Customers    │
│ · Arrangement   │ │ · Profiles      │
│ · Tracking      │ │ · Auth          │
│ · Delivery      │ │ · Points        │
└─────────────────┘ └─────────────────┘

Decomposition Guidelines

  1. High cohesion: Keep related functionality in the same service
  2. Low coupling: Minimize dependencies between services
  3. Align with team boundaries: One team per service (or set of services)
  4. Data ownership: Each service owns its own data

Anti-patterns: Splits to Avoid

// ❌ Splitting by technical layer (don't do this)
// - API Gateway Service
// - Business Logic Service
// - Data Access Service

// ✅ Splitting by business domain (correct approach)
// - Order Service (includes all layers for orders)
// - Inventory Service (includes all layers for inventory)
// - Payment Service (includes all layers for payments)

Service Communication Patterns

1. Synchronous Communication (REST / gRPC)

Use when an immediate response is required.

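// REST API example
// Order Service calls Inventory Service
async function createOrder(orderData: CreateOrderDto): Promise<Order> {
  // Check stock (synchronous call)
  const stockResponse = await fetch(
    `${INVENTORY_SERVICE_URL}/api/stock/${orderData.productId}`
  );
  // fetch only rejects on network errors, so check the HTTP status explicitly
  if (!stockResponse.ok) {
    throw new Error(`Stock check failed: HTTP ${stockResponse.status}`);
  }
  const stock = await stockResponse.json();

  if (stock.quantity < orderData.quantity) {
    throw new Error('Insufficient stock');
  }

  // Create order
  const order = await orderRepository.create(orderData);

  // Reserve stock (synchronous call)
  const reserveResponse = await fetch(`${INVENTORY_SERVICE_URL}/api/stock/reserve`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      productId: orderData.productId,
      quantity: orderData.quantity,
      orderId: order.id
    })
  });
  if (!reserveResponse.ok) {
    throw new Error(`Stock reservation failed: HTTP ${reserveResponse.status}`);
  }

  return order;
}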

When to choose gRPC:

  • High-speed communication between internal services
  • When type-safe interfaces are required
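  • When bidirectional streaming is needed

// inventory.proto
syntax = "proto3";

service InventoryService {
  rpc CheckStock(StockRequest) returns (StockResponse);
  rpc ReserveStock(ReserveRequest) returns (ReserveResponse);
}

message StockRequest {
  string product_id = 1;
}

message StockResponse {
  int32 quantity = 1;
  bool available = 2;
}

// Assumed shapes for the reserve call, mirroring the REST reserve payload.
message ReserveRequest {
  string product_id = 1;
  int32 quantity = 2;
  string order_id = 3;
}

message ReserveResponse {
  bool reserved = 1;
}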

2. Asynchronous Communication (Message Queues)

Use when an immediate response is not required. Publishing events instead of calling services directly keeps the publisher loosely coupled from its consumers.

// Event-driven architecture example
// Order Service publishes an event
async function completeOrder(orderId: string): Promise<void> {
  const order = await orderRepository.findById(orderId);
  order.status = 'completed';
  await orderRepository.save(order);

  // Publish event (asynchronous)
  await messageQueue.publish('order.completed', {
    orderId: order.id,
    customerId: order.customerId,
    items: order.items,
    totalAmount: order.totalAmount
  });
}

// Inventory Service subscribes to the event
messageQueue.subscribe('order.completed', async (event) => {
  // Finalize stock reservation
  for (const item of event.items) {
    await inventoryService.confirmReservation(item.productId, item.quantity);
  }
});

// Notification Service subscribes to the event
messageQueue.subscribe('order.completed', async (event) => {
  // Send confirmation email
  await emailService.sendOrderConfirmation(event.customerId, event.orderId);
});
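The `messageQueue` object in the example above stands in for a real broker. For local tests, a minimal in-memory stand-in with the same `publish`/`subscribe` shape can be enough. The sketch below is an assumption about that shape, not a broker client: `InMemoryBus` is a hypothetical name, nothing is persisted, and there are no retries or ordering guarantees.

```typescript
type Handler<T> = (event: T) => Promise<void> | void;

class InMemoryBus {
  private handlers = new Map<string, Handler<any>[]>();

  subscribe<T>(topic: string, handler: Handler<T>): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler);
    this.handlers.set(topic, list);
  }

  // Delivers to every subscriber of the topic and returns how many succeeded.
  // One failing subscriber does not prevent delivery to the others.
  async publish<T>(topic: string, event: T): Promise<number> {
    const list = this.handlers.get(topic) ?? [];
    const results = await Promise.allSettled(list.map(async (h) => h(event)));
    return results.filter((r) => r.status === 'fulfilled').length;
  }
}

// Usage: two healthy subscribers and one that throws.
const bus = new InMemoryBus();
const received: string[] = [];

bus.subscribe<{ orderId: string }>('order.completed', (e) => {
  received.push(`inventory:${e.orderId}`);
});
bus.subscribe<{ orderId: string }>('order.completed', async (e) => {
  received.push(`email:${e.orderId}`);
});
bus.subscribe('order.completed', () => {
  throw new Error('subscriber down');
});

bus.publish('order.completed', { orderId: 'o-1' }).then((ok) => {
  console.log(`${ok} of 3 subscribers succeeded`); // 2 of 3 subscribers succeeded
});
```

The point of the sketch is the decoupling: the publisher knows the topic name and the event shape, but nothing about who consumes it or whether a given consumer is currently healthy.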

Communication Pattern Selection Guide

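Pattern       | Use Case                                 | Benefits                      | Drawbacks
--------------|------------------------------------------|-------------------------------|-----------------
REST          | External APIs, simple CRUD               | Widely adopted, easy to debug | Overhead
gRPC          | Internal communication, high performance | Fast, type-safe               | Learning curve
Message queue | Async processing, event-driven           | Loose coupling, scalable      | Added complexity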

Data Management Strategy

Database per Service Pattern

Each service owns its own database.

# docker-compose.yml
# Note: the official postgres image will not start without POSTGRES_PASSWORD
# (or POSTGRES_HOST_AUTH_METHOD); credentials are omitted here for brevity.
version: '3.8'

services:
  order-service:
    build: ./services/order
    environment:
      DATABASE_URL: postgresql://order-db:5432/orders
  order-db:
    image: postgres:15
    volumes:
      - order-data:/var/lib/postgresql/data

  inventory-service:
    build: ./services/inventory
    environment:
      DATABASE_URL: postgresql://inventory-db:5432/inventory
  inventory-db:
    image: postgres:15
    volumes:
      - inventory-data:/var/lib/postgresql/data

  payment-service:
    build: ./services/payment
    environment:
      DATABASE_URL: postgresql://payment-db:5432/payments
  payment-db:
    image: postgres:15
    volumes:
      - payment-data:/var/lib/postgresql/data

volumes:
  order-data:
  inventory-data:
  payment-data:

Data Consistency Challenges and Solutions

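Once each service owns its own database, a single ACID transaction can no longer span an operation that touches several services. Each service still gets ACID guarantees for its local writes, but consistency across services has to be eventual.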

The Saga Pattern

Multi-service operations are implemented as a series of local transactions.

// Order creation Saga (orchestration style: one coordinator drives every step)
// 1. Order Service: Create order in "pending" state
// 2. Inventory Service: Reserve stock
// 3. Payment Service: Process payment
// 4. Order Service: Update order to "confirmed"

// Compensating transactions on failure
// Payment fails → Inventory: release reservation → Order: cancel order

class OrderSaga {
  // orderService, inventoryService and paymentService are injected clients
  // (constructor omitted).

  async execute(orderData: CreateOrderDto): Promise<Order> {
    const sagaId = generateId();

    try {
      // Step 1: Create order (pending state)
      const order = await this.orderService.createPending(orderData, sagaId);

      // Step 2: Reserve stock
      await this.inventoryService.reserve(orderData.items, sagaId);

      // Step 3: Process payment
      await this.paymentService.charge(order.totalAmount, sagaId);

      // Step 4: Confirm order
      return await this.orderService.confirm(order.id);
    } catch (error) {
      // Execute compensating transactions, then surface the original error
      await this.compensate(sagaId);
      throw error;
    }
  }

  private async compensate(sagaId: string): Promise<void> {
    // Undo in reverse order. Every undo is attempted, so each handler must be
    // idempotent and a no-op when its step never ran for this sagaId.
    // Failures are logged rather than silently swallowed.
    await this.paymentService.refund(sagaId)
      .catch((e) => console.error('refund failed', sagaId, e));
    await this.inventoryService.releaseReservation(sagaId)
      .catch((e) => console.error('release failed', sagaId, e));
    await this.orderService.cancel(sagaId)
      .catch((e) => console.error('cancel failed', sagaId, e));
  }
}
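The `compensate` method above attempts every undo regardless of which steps actually ran, which only works when each handler is idempotent. An alternative is to record the steps that completed and undo only those, newest first. The sketch below shows that idea under stated assumptions: `SagaStep` and `runSaga` are hypothetical names, and the steps are stubs rather than real service calls.

```typescript
interface SagaStep {
  name: string;
  action: () => Promise<void>;
  compensate: () => Promise<void>;
}

async function runSaga(steps: SagaStep[]): Promise<void> {
  const completed: SagaStep[] = [];
  try {
    for (const step of steps) {
      await step.action();
      completed.push(step);
    }
  } catch (error) {
    // Undo only the steps that finished, newest first.
    for (const step of completed.reverse()) {
      try {
        await step.compensate();
      } catch (compensationError) {
        // A failed compensation needs follow-up (retry, alert); here it is only logged.
        console.error(`compensation failed: ${step.name}`, compensationError);
      }
    }
    throw error;
  }
}

// Usage with stub steps: payment fails, so inventory and order are undone.
const sagaLog: string[] = [];

function stubStep(name: string, fail = false): SagaStep {
  return {
    name,
    action: async () => {
      if (fail) throw new Error(`${name} failed`);
      sagaLog.push(`do:${name}`);
    },
    compensate: async () => {
      sagaLog.push(`undo:${name}`);
    },
  };
}

const sagaOutcome = runSaga([
  stubStep('order'),
  stubStep('inventory'),
  stubStep('payment', true),
]).catch((e: Error) => e.message);

sagaOutcome.then((message) => {
  console.log(message);  // payment failed
  console.log(sagaLog);  // [ 'do:order', 'do:inventory', 'undo:inventory', 'undo:order' ]
});
```

Tracking completed steps avoids issuing a refund for a payment that was never charged, but it keeps the saga state in memory. If the coordinator crashes mid-saga, that state is lost, so a production version would need to persist it.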

Fault Tolerance and Resilience

In distributed systems, network failures and service outages should be treated as expected events, not exceptions.

Circuit Breaker Pattern

Detects consecutive failures and temporarily stops calls to the failing service.

class CircuitBreaker {
  private failureCount = 0;
  private lastFailureTime: Date | null = null;
  private state: 'CLOSED' | 'OPEN' | 'HALF_OPEN' = 'CLOSED';

  constructor(
    private threshold: number = 5,    // failures before the circuit opens
    private timeout: number = 30000   // ms to wait before a trial call
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime!.getTime() > this.timeout) {
        // Cool-down elapsed: let one trial call through
        this.state = 'HALF_OPEN';
      } else {
        throw new Error('Circuit is OPEN');
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  private onSuccess(): void {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  private onFailure(): void {
    this.failureCount++;
    this.lastFailureTime = new Date();
    if (this.failureCount >= this.threshold) {
      this.state = 'OPEN';
    }
  }
}

// Usage example
const inventoryCircuit = new CircuitBreaker(5, 30000);

async function checkStock(productId: string) {
  return inventoryCircuit.call(async () => {
    const response = await fetch(`${INVENTORY_SERVICE_URL}/api/stock/${productId}`);
    // fetch resolves on HTTP errors, so turn 5xx responses into failures
    // that the breaker can count
    if (response.status >= 500) {
      throw new Error(`Inventory service error: HTTP ${response.status}`);
    }
    return response;
  });
}

Retry with Exponential Backoff

Attempts recovery from transient failures.

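// Resolves after the given number of milliseconds
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxRetries: number = 3,   // total number of attempts
  baseDelay: number = 1000  // ms
): Promise<T> {
  let lastError: Error = new Error('retryWithBackoff: maxRetries must be at least 1');

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error as Error;
      if (attempt < maxRetries - 1) {
        // Exponential backoff + jitter
        const delay = baseDelay * Math.pow(2, attempt) + Math.random() * 1000;
        await sleep(delay);
      }
    }
  }

  throw lastError;
}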
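To make the delay formula concrete, the sketch below computes only the deterministic part of the schedule, `baseDelay * 2^attempt`, leaving out the random jitter. `backoffSchedule` is a hypothetical helper written for this illustration.

```typescript
// Deterministic part of the delay used above: baseDelay * 2^attempt.
// Delays occur only between attempts, so maxRetries attempts yield maxRetries - 1 delays.
function backoffSchedule(maxRetries: number, baseDelay: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries - 1; attempt++) {
    delays.push(baseDelay * Math.pow(2, attempt));
  }
  return delays;
}

console.log(backoffSchedule(3, 1000)); // [ 1000, 2000 ]
console.log(backoffSchedule(5, 200));  // [ 200, 400, 800, 1600 ]
```

With the defaults (3 attempts, 1000 ms base), a caller waits 3 seconds in total before the final attempt, plus up to 1 second of jitter per delay. That worst case is worth checking against any timeout the caller itself is subject to.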

Timeout Configuration

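Set a timeout on every external call. Without one, a dependency that hangs can tie up the caller's resources for as long as it stays unresponsive.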

async function fetchWithTimeout(
  url: string,
  options: RequestInit = {},
  timeout: number = 5000
): Promise<Response> {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeout);

  try {
    return await fetch(url, {
      ...options,
      signal: controller.signal
    });
  } finally {
    clearTimeout(timeoutId);
  }
}

Monitoring and Observability

In distributed systems, pinpointing problems becomes significantly harder. Use the three pillars of observability to maintain visibility.

1. Logs

Structured logging with correlation IDs makes requests traceable across services.

// Structured logging with correlation ID
// getCorrelationId() is assumed to return the ID attached to the current
// request; it is defined elsewhere in the service.
const logger = {
  info: (message: string, context: object) => {
    console.log(JSON.stringify({
      level: 'info',
      message,
      timestamp: new Date().toISOString(),
      correlationId: getCorrelationId(),
      service: process.env.SERVICE_NAME,
      ...context
    }));
  }
};

// Usage example
logger.info('Order created', {
  orderId: order.id,
  customerId: order.customerId,
  totalAmount: order.totalAmount
});
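The logger above relies on a `getCorrelationId()` helper. One way to provide it in Node.js, sketched here as an assumption rather than as our actual implementation, is `AsyncLocalStorage` from `node:async_hooks`, which carries a value along an async call chain without threading it through every function signature. `runWithCorrelationId` is a hypothetical wrapper name.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';
import { randomUUID } from 'node:crypto';

const correlationStore = new AsyncLocalStorage<string>();

// Runs fn with a correlation ID bound to the current async call chain.
// Reuses the caller's ID when one was passed in, otherwise generates a new one.
function runWithCorrelationId<T>(incomingId: string | undefined, fn: () => T): T {
  return correlationStore.run(incomingId ?? randomUUID(), fn);
}

// Returns the ID for the current call chain, or undefined outside of one.
function getCorrelationId(): string | undefined {
  return correlationStore.getStore();
}

// Usage: the ID is visible anywhere inside the callback, however deeply nested.
const seenId = runWithCorrelationId('req-123', () => getCorrelationId());
console.log(seenId);             // req-123
console.log(getCorrelationId()); // undefined (outside any request)
```

In an HTTP service, the wrapper would typically be called once per request in middleware, taking the incoming ID from a request header. The same ID then has to be sent along on outgoing calls so that the next service can log it too.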

2. Metrics

Monitor service health with Prometheus.

import express from 'express';
import { Counter, Histogram, Registry } from 'prom-client';

const app = express();
const registry = new Registry();

// Request counter
const httpRequestsTotal = new Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'path', 'status'],
  registers: [registry]
});

// Response time
const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'HTTP request duration in seconds',
  labelNames: ['method', 'path'],
  buckets: [0.1, 0.5, 1, 2, 5],
  registers: [registry]
});

// Measure via middleware
// Note: req.path contains concrete IDs (e.g. /orders/42); labelling by route
// pattern instead keeps the number of time series bounded.
app.use((req, res, next) => {
  const start = Date.now();

  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    httpRequestsTotal.inc({
      method: req.method,
      path: req.path,
      status: res.statusCode
    });
    httpRequestDuration.observe(
      { method: req.method, path: req.path },
      duration
    );
  });

  next();
});

// Expose the metrics for Prometheus to scrape
app.get('/metrics', async (_req, res) => {
  res.set('Content-Type', registry.contentType);
  res.end(await registry.metrics());
});

3. Traces

Visualize the flow of requests across services with OpenTelemetry.

import { trace, SpanKind, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('order-service');

async function createOrder(orderData: CreateOrderDto): Promise<Order> {
  return tracer.startActiveSpan('createOrder', async (span) => {
    try {
      span.setAttribute('order.customer_id', orderData.customerId);

      // Check inventory (child span)
      const stock = await tracer.startActiveSpan(
        'checkInventory',
        { kind: SpanKind.CLIENT },
        async (childSpan) => {
          try {
            return await inventoryService.checkStock(orderData.productId);
          } finally {
            // End the child span even when the call throws
            childSpan.end();
          }
        }
      );
      // (validation of `stock` omitted here)

      // Create order
      const order = await orderRepository.create(orderData);
      span.setAttribute('order.id', order.id);

      return order;
    } catch (error) {
      span.recordException(error as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw error;
    } finally {
      span.end();
    }
  });
}

Gradual Migration in Practice

The Strangler Fig Pattern

Rather than replacing the monolith all at once, migrate functionality piece by piece.

[Phase 1: Running in Parallel]
                ┌─────────────┐
                │ API Gateway │
                └──────┬──────┘
        ┌──────────────┼──────────────┐
        ▼              ▼              ▼
  ┌───────────┐  ┌───────────┐  ┌───────────┐
  │ Monolith  │  │ Monolith  │  │ New Order │
  │  (Auth)   │  │  (Stock)  │  │  Service  │
  └───────────┘  └───────────┘  └───────────┘

[Phase 2: Feature Migration Complete]
                ┌─────────────┐
                │ API Gateway │
                └──────┬──────┘
        ┌──────────────┼──────────────┐
        ▼              ▼              ▼
  ┌───────────┐  ┌───────────┐  ┌───────────┐
  │ New Auth  │  │ New Stock │  │   Order   │
  │  Service  │  │  Service  │  │  Service  │
  └───────────┘  └───────────┘  └───────────┘
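At the gateway, Phase 1 boils down to a routing decision: paths that have been migrated go to the new service, everything else still goes to the monolith. The sketch below illustrates that decision as a pure function. The path prefixes and upstream names are hypothetical, and a real gateway would express the same rule in its own configuration format.

```typescript
// Hypothetical upstream names and path prefixes, for illustration only.
type Upstream = 'monolith' | 'order-service';

const migratedPrefixes: Array<{ prefix: string; upstream: Upstream }> = [
  { prefix: '/api/orders', upstream: 'order-service' },
];

// Anything not yet migrated falls through to the monolith.
function resolveUpstream(path: string): Upstream {
  const match = migratedPrefixes.find(
    (route) => path === route.prefix || path.startsWith(route.prefix + '/')
  );
  return match ? match.upstream : 'monolith';
}

console.log(resolveUpstream('/api/orders/42'));     // order-service
console.log(resolveUpstream('/api/stock/7'));       // monolith
console.log(resolveUpstream('/api/orders-export')); // monolith (prefix must match a whole segment)
```

Migrating the next feature then means adding one entry to the table, and rolling back means removing it, which is what keeps each step of the migration small and reversible.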

Migration Priorities

  1. Start with features that have fewer dependencies
  2. Prioritize features that change frequently
  3. Split along team boundaries
  4. Plan data migration carefully

Conclusion: Making Microservices Work

Microservices are a powerful architectural style, but they are not a universal solution.

Before You Migrate, Confirm

  1. Organizational readiness: DevOps culture, autonomous team structure
  2. Technical readiness: Containers, CI/CD, monitoring infrastructure
  3. Domain understanding: Deep familiarity with the business domain

Principles for Success

  1. Start small: Begin with a single service
  2. Migrate incrementally: Avoid Big Bang rewrites
  3. Ensure observability: Logs, metrics, and traces
  4. Design for failure: Circuit Breaker, Retry, Timeout

Migrating to microservices is as much an organizational transformation as it is a technical one. Take your time, move steadily, and success will follow.


If you're struggling with a microservices migration, feel free to reach out.