Introduction to Microservices Architecture: Migration Strategies from Monolith and Practical Guide

TL;DR

Microservices are not a silver bullet. Assess your organization's maturity and system scale before committing
Split services along business domain boundaries — not for purely technical reasons
Understand the complexity of distributed systems and design fault tolerance in from the start
Gradual migration is the key to success. Avoid Big Bang rewrites

Introduction: Why Microservices Now?

"Our monolith has hit its limits."

Many engineers have heard this before. As the team grows and features pile up, every deployment becomes a stressful event. Tests take hours to run, and even small changes cause unexpected side effects.

Our team faced the same challenges. We had a Ruby on Rails monolith dating back to the company's founding. Over four years, the team working on it had grown from 2 people to 15.

The results were telling:

Deploy frequency: dropped from weekly to twice a month
Test duration: grew from 15 minutes to 2 hours
Time to release new features: stretched from 2 weeks to 2 months

To break out of this situation, we decided to migrate to microservices. This article shares what we learned along the way.

What Are Microservices?

Definition

Microservices architecture is an approach to building applications as a collection of small, independent services. Each service:

Focuses on a single business capability
Can be deployed independently
Owns its own data store
Communicates via lightweight protocols (HTTP/REST, gRPC)

Monolith vs. Microservices

[Monolithic Architecture]
┌─────────────────────────────────────┐
│         Single Application          │
│  ┌─────┐ ┌──────┐ ┌───────┐ ┌─────┐│
│  │ UI  │ │ Auth │ │ Order │ │Stock││
│  └─────┘ └──────┘ └───────┘ └─────┘│
│           ┌─────────────┐           │
│           │  Single DB  │           │
│           └─────────────┘           │
└─────────────────────────────────────┘

[Microservices Architecture]
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│  UI/BFF │ │  Auth   │ │  Order  │ │  Stock  │
│ Service │ │ Service │ │ Service │ │ Service │
└────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘
     │           │           │           │
     │      ┌────┴────┐ ┌────┴────┐ ┌────┴────┐
     │      │  Auth   │ │  Order  │ │  Stock  │
     │      │   DB    │ │   DB    │ │   DB    │
     │      └─────────┘ └─────────┘ └─────────┘

Microservices Pros and Cons

Aspect	Benefits	Drawbacks
Development speed	Teams develop and deploy independently	Requires cross-service coordination
Scalability	Scale only the services that need it	More complex infrastructure management
Tech choices	Choose the best stack per service	Risk of technology sprawl
Fault isolation	Failures are less likely to cascade	Distributed system failures are harder to debug
Team structure	Small teams can operate autonomously	Communication overhead shifts to cross-team coordination (API contracts, versioning)

When to Migrate: Do You Actually Need Microservices?

Microservices bring real complexity. Use the checklist below to decide whether migration is warranted.

Situations where migration makes sense

□ Team size exceeds 10 people and merge conflicts are frequent
□ Deploy frequency has dropped to once a month or less
□ Tests take over an hour to run
□ You need to scale specific features independently
□ Different features call for different tech stacks
□ A failure in one area takes down the entire system

Situations where migration should be avoided

□ Team size is 5 or fewer
□ Product is in early stage (pre-PMF)
□ Domain understanding is shallow
□ Operational experience or infrastructure knowledge is lacking
□ The reason is "because everyone's doing it"

Important: For small teams and early-stage products, a modular monolith is a valid alternative. It preserves the simplicity of a monolith while keeping the architecture ready for future decomposition.

Service Decomposition Principles

Domain-Driven Design (DDD)-Based Decomposition

Service boundaries should be defined by business domains. Splitting services for purely technical reasons (e.g., "I want to write this part in Go") tends to backfire.

[Bounded Contexts for an E-Commerce Site]

┌─────────────────┐    ┌─────────────────┐
│  Product Catalog │    │     Orders      │
│  · Product info  │    │  · Create order │
│  · Categories    │    │  · Order history│
│  · Search        │    │  · Cancellation │
└─────────────────┘    └─────────────────┘

┌─────────────────┐    ┌─────────────────┐
│    Inventory    │    │    Payments     │
│  · Stock levels  │    │  · Processing  │
│  · Stock in/out  │    │  · Refunds     │
│  · Alerts        │    │  · Receipts    │
└─────────────────┘    └─────────────────┘

┌─────────────────┐    ┌─────────────────┐
│    Shipping     │    │   Customers     │
│  · Arrangement   │    │  · Profiles    │
│  · Tracking      │    │  · Auth        │
│  · Delivery      │    │  · Points      │
└─────────────────┘    └─────────────────┘

Decomposition Guidelines

High cohesion: Keep related functionality in the same service
Low coupling: Minimize dependencies between services
Align with team boundaries: One team per service (or set of services)
Data ownership: Each service owns its own data

Anti-patterns: Splits to Avoid

// ❌ Splitting by technical layer (don't do this)
// - API Gateway Service
// - Business Logic Service
// - Data Access Service

// ✅ Splitting by business domain (correct approach)
// - Order Service (includes all layers for orders)
// - Inventory Service (includes all layers for inventory)
// - Payment Service (includes all layers for payments)

Service Communication Patterns

1. Synchronous Communication (REST / gRPC)

Use when an immediate response is required.

// REST API example
// Order Service calls Inventory Service
async function createOrder(orderData: CreateOrderDto): Promise<Order> {
  // Check stock (synchronous call)
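  const stockResponse = await fetch(
    `${INVENTORY_SERVICE_URL}/api/stock/${orderData.productId}`
  );
  // fetch() only rejects on network errors, so check the HTTP status explicitly
  if (!stockResponse.ok) {
    throw new Error(`Stock check failed: HTTP ${stockResponse.status}`);
  }
  const stock = await stockResponse.json();

  if (stock.quantity < orderData.quantity) {
    throw new Error('Insufficient stock');
  }

  // Create order
  const order = await orderRepository.create(orderData);

  // Reserve stock (synchronous call)
  const reserveResponse = await fetch(`${INVENTORY_SERVICE_URL}/api/stock/reserve`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      productId: orderData.productId,
      quantity: orderData.quantity,
      orderId: order.id
    })
  });
  if (!reserveResponse.ok) {
    throw new Error(`Stock reservation failed: HTTP ${reserveResponse.status}`);
  }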

  return order;
}
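
One caveat with the check-then-reserve flow above: two concurrent orders can both pass the stock check before either one reserves. A common way around this is to let the reservation itself be the check. The sketch below illustrates the idea with a hypothetical in-memory `InventoryStore`; the class, its method names, and the result type are assumptions for illustration, and a real Inventory Service would more likely do this inside a single database transaction.

```typescript
// Hypothetical in-memory inventory, for illustration only.
type ReserveResult =
  | { ok: true; remaining: number }
  | { ok: false; reason: 'insufficient_stock' | 'unknown_product' };

class InventoryStore {
  private stock = new Map<string, number>();

  setStock(productId: string, quantity: number): void {
    this.stock.set(productId, quantity);
  }

  // Checks and decrements in one step, so no caller acts on a stale quantity.
  reserve(productId: string, quantity: number): ReserveResult {
    const available = this.stock.get(productId);
    if (available === undefined) {
      return { ok: false, reason: 'unknown_product' };
    }
    if (available < quantity) {
      return { ok: false, reason: 'insufficient_stock' };
    }
    this.stock.set(productId, available - quantity);
    return { ok: true, remaining: available - quantity };
  }
}
```

With this shape, the Order Service could skip the separate stock check and treat a failed reservation as "insufficient stock".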

When to choose gRPC:

You need high-speed communication between internal services
You need type-safe interfaces
You need bidirectional streaming

// inventory.proto
syntax = "proto3";

service InventoryService {
  rpc CheckStock(StockRequest) returns (StockResponse);
  rpc ReserveStock(ReserveRequest) returns (ReserveResponse);
}

message StockRequest {
  string product_id = 1;
}

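message StockResponse {
  int32 quantity = 1;
  bool available = 2;
}

// Fields mirror the REST reservation payload shown earlier
message ReserveRequest {
  string product_id = 1;
  int32 quantity = 2;
  string order_id = 3;
}

// Assumed shape: the original does not specify the reservation response
message ReserveResponse {
  bool reserved = 1;
}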

2. Asynchronous Communication (Message Queues)

Use when an immediate response is not required. The publisher does not need to know who consumes its events, which keeps services loosely coupled.

// Event-driven architecture example
// Order Service publishes an event
async function completeOrder(orderId: string): Promise<void> {
  const order = await orderRepository.findById(orderId);
  order.status = 'completed';
  await orderRepository.save(order);

  // Publish event (asynchronous)
  await messageQueue.publish('order.completed', {
    orderId: order.id,
    customerId: order.customerId,
    items: order.items,
    totalAmount: order.totalAmount
  });
}

// Inventory Service subscribes to the event
messageQueue.subscribe('order.completed', async (event) => {
  // Finalize stock reservation
  for (const item of event.items) {
    await inventoryService.confirmReservation(item.productId, item.quantity);
  }
});

// Notification Service subscribes to the event
messageQueue.subscribe('order.completed', async (event) => {
  // Send confirmation email
  await emailService.sendOrderConfirmation(event.customerId, event.orderId);
});
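
The `messageQueue` object above is not defined in the snippets. The sketch below is a hypothetical in-memory stand-in that makes the publish/subscribe flow concrete. It also adds an `idempotent` wrapper, since most real message brokers can deliver the same message more than once. The names `InMemoryQueue` and `idempotent` are assumptions for illustration, not the API of any particular broker.

```typescript
type Handler<T> = (event: T) => Promise<void>;

// Hypothetical in-memory stand-in for a message broker client.
class InMemoryQueue {
  private handlers = new Map<string, Handler<any>[]>();

  subscribe<T>(topic: string, handler: Handler<T>): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler);
    this.handlers.set(topic, list);
  }

  // Delivers to every subscriber; a failing subscriber is logged and
  // does not prevent delivery to the others.
  async publish<T>(topic: string, event: T): Promise<void> {
    const list = this.handlers.get(topic) ?? [];
    await Promise.all(
      list.map((handler) =>
        handler(event).catch((err) => console.error(`Handler for ${topic} failed`, err))
      )
    );
  }
}

// Wraps a handler so that a redelivered event (same key) is processed once.
function idempotent<T>(keyOf: (event: T) => string, handler: Handler<T>): Handler<T> {
  const seen = new Set<string>();
  return async (event) => {
    const key = keyOf(event);
    if (seen.has(key)) return; // duplicate delivery: skip
    seen.add(key);
    try {
      await handler(event);
    } catch (err) {
      seen.delete(key); // allow a later redelivery to retry
      throw err;
    }
  };
}
```

In a real service the set of processed keys would need to live in durable storage, since an in-memory set is lost on restart.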

Communication Pattern Selection Guide

Pattern	Use Case	Benefits	Drawbacks
REST	External APIs, simple CRUD	Widely adopted, easy to debug	Text-based payloads and per-request overhead
gRPC	Internal communication, high performance	Fast, type-safe	Learning curve
Message queue	Async processing, event-driven	Loose coupling, scalable	Added complexity

Data Management Strategy

Database per Service Pattern

Each service owns its own database.

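# docker-compose.yml
# Credentials below are placeholders for local development only
version: '3.8'
services:
  order-service:
    build: ./services/order
    environment:
      DATABASE_URL: postgresql://postgres:example@order-db:5432/orders

  order-db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example  # required by the official postgres image on first start
      POSTGRES_DB: orders
    volumes:
      - order-data:/var/lib/postgresql/data

  inventory-service:
    build: ./services/inventory
    environment:
      DATABASE_URL: postgresql://postgres:example@inventory-db:5432/inventory

  inventory-db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example
      POSTGRES_DB: inventory
    volumes:
      - inventory-data:/var/lib/postgresql/data

  payment-service:
    build: ./services/payment
    environment:
      DATABASE_URL: postgresql://postgres:example@payment-db:5432/payments

  payment-db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example
      POSTGRES_DB: payments
    volumes:
      - payment-data:/var/lib/postgresql/data

volumes:
  order-data:
  inventory-data:
  payment-data: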

Data Consistency Challenges and Solutions

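Once each service owns its own database, a single ACID transaction can no longer practically span multiple services. Each service still commits its own changes atomically, but across services we embrace eventual consistency.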

The Saga Pattern

Multi-service operations are implemented as a series of local transactions.

// Order creation Saga (Orchestration style: a central coordinator drives each step)
// 1. Order Service: Create order in "pending" state
// 2. Inventory Service: Reserve stock
// 3. Payment Service: Process payment
// 4. Order Service: Update order to "confirmed"

// Compensating transactions on failure
// Payment fails → Inventory: release reservation → Order: cancel order

class OrderSaga {
  async execute(orderData: CreateOrderDto): Promise<Order> {
    const sagaId = generateId();

    try {
      // Step 1: Create order (pending state)
      const order = await this.orderService.createPending(orderData, sagaId);

      // Step 2: Reserve stock
      await this.inventoryService.reserve(orderData.items, sagaId);

      // Step 3: Process payment
      await this.paymentService.charge(order.totalAmount, sagaId);

      // Step 4: Confirm order
      return await this.orderService.confirm(order.id);

    } catch (error) {
      // Execute compensating transactions
      await this.compensate(sagaId, error);
      throw error;
    }
  }

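  private async compensate(sagaId: string, error: Error): Promise<void> {
    console.error(`Saga ${sagaId} failed: ${error.message}. Compensating.`);
    // Log failed compensations instead of silently swallowing them,
    // so they can be retried or repaired later
    const logFailure = (step: string) => (e: unknown) =>
      console.error(`Saga ${sagaId}: compensation "${step}" failed`, e);

    // Undo steps in reverse order. Each compensation must be idempotent and a
    // no-op if its step never ran (e.g. refund when no charge was made).
    await this.paymentService.refund(sagaId).catch(logFailure('refund'));
    await this.inventoryService.releaseReservation(sagaId).catch(logFailure('releaseReservation'));
    await this.orderService.cancel(sagaId).catch(logFailure('cancel'));
  }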
}
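
The `compensate` method above calls every compensation regardless of which steps actually completed, so it relies on each one being safe to call when there is nothing to undo. An alternative is to track completed steps and undo only those. The following is a minimal sketch of that idea; `SagaStep` and `runSaga` are hypothetical names introduced here, not part of the original design.

```typescript
interface SagaStep {
  name: string;
  action: () => Promise<void>;
  compensate: () => Promise<void>;
}

// Runs steps in order. If one fails, runs the compensations of the steps
// that completed, in reverse order, then rethrows the original error.
async function runSaga(steps: SagaStep[]): Promise<void> {
  const completed: SagaStep[] = [];

  for (const step of steps) {
    try {
      await step.action();
      completed.push(step);
    } catch (error) {
      for (const done of [...completed].reverse()) {
        try {
          await done.compensate();
        } catch (compensationError) {
          // A failed compensation needs a retry or manual repair; this sketch only logs it.
          console.error(`Compensation for "${done.name}" failed`, compensationError);
        }
      }
      throw error;
    }
  }
}
```

If the payment step fails in a three-step order saga, only the stock reservation and the pending order are undone, and no refund is attempted.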

Fault Tolerance and Resilience

In distributed systems, network failures and service outages should be treated as expected events, not exceptions.

Circuit Breaker Pattern

Detects consecutive failures and temporarily stops calls to the failing service.

class CircuitBreaker {
  private failureCount = 0;
  private lastFailureTime: Date | null = null;
  private state: 'CLOSED' | 'OPEN' | 'HALF_OPEN' = 'CLOSED';

  constructor(
    private threshold: number = 5,
    private timeout: number = 30000
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime!.getTime() > this.timeout) {
        this.state = 'HALF_OPEN';
      } else {
        throw new Error('Circuit is OPEN');
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  private onSuccess(): void {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  private onFailure(): void {
    this.failureCount++;
    this.lastFailureTime = new Date();
    if (this.failureCount >= this.threshold) {
      this.state = 'OPEN';
    }
  }
}

// Usage example
const inventoryCircuit = new CircuitBreaker(5, 30000);

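async function checkStock(productId: string) {
  return inventoryCircuit.call(async () => {
    const response = await fetch(`${INVENTORY_SERVICE_URL}/api/stock/${productId}`);
    // fetch() resolves even on HTTP errors; throw on 5xx so the breaker counts them
    if (response.status >= 500) {
      throw new Error(`Inventory service error: HTTP ${response.status}`);
    }
    return response;
  });
}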

Retry with Exponential Backoff

Attempts recovery from transient failures.

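// Helper: resolve after the given number of milliseconds
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff<T>(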
  fn: () => Promise<T>,
  maxRetries: number = 3,
  baseDelay: number = 1000
): Promise<T> {
  let lastError: Error;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error as Error;

      if (attempt < maxRetries - 1) {
        // Exponential backoff + jitter
        const delay = baseDelay * Math.pow(2, attempt) + Math.random() * 1000;
        await sleep(delay);
      }
    }
  }

  throw lastError!;
}
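
To make the arithmetic concrete: with the defaults above (`maxRetries = 3`, `baseDelay = 1000`), the function waits roughly 1 second after the first failure and 2 seconds after the second, each plus up to 1 second of jitter, and does not wait after the final attempt. The sketch below computes the deterministic part of that schedule. The `maxDelay` cap is an added assumption that the original function does not have; it is a common safeguard when the retry count is larger.

```typescript
// Deterministic part of the backoff delay: baseDelay * 2^attempt, capped at maxDelay.
// `backoffSchedule` and `maxDelay` are illustrative names, not part of the original code.
function backoffSchedule(
  maxRetries: number,
  baseDelay: number,
  maxDelay: number = 30000
): number[] {
  const delays: number[] = [];
  // There is a wait after every failed attempt except the last one.
  for (let attempt = 0; attempt < maxRetries - 1; attempt++) {
    delays.push(Math.min(baseDelay * Math.pow(2, attempt), maxDelay));
  }
  return delays;
}

console.log(backoffSchedule(3, 1000));        // [1000, 2000]
console.log(backoffSchedule(6, 1000, 10000)); // [1000, 2000, 4000, 8000, 10000]
```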

Timeout Configuration

Always set timeouts on all external calls.

async function fetchWithTimeout(
  url: string,
  options: RequestInit = {},
  timeout: number = 5000
): Promise<Response> {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeout);

  try {
    return await fetch(url, {
      ...options,
      signal: controller.signal
    });
  } finally {
    clearTimeout(timeoutId);
  }
}

Monitoring and Observability

In distributed systems, pinpointing problems becomes significantly harder. Use the three pillars of observability (logs, metrics, and traces) to maintain visibility.

1. Logs

Structured logging with correlation IDs makes requests traceable across services.

// Structured logging with correlation ID
const logger = {
  info: (message: string, context: object) => {
    console.log(JSON.stringify({
      level: 'info',
      message,
      timestamp: new Date().toISOString(),
      correlationId: getCorrelationId(),
      service: process.env.SERVICE_NAME,
      ...context
    }));
  }
};

// Usage example
logger.info('Order created', {
  orderId: order.id,
  customerId: order.customerId,
  totalAmount: order.totalAmount
});
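
The logger above calls `getCorrelationId()`, which the snippet does not define. One possible way to provide it in Node.js is `AsyncLocalStorage`, which carries a value along an asynchronous call chain without passing it through every function. This is a minimal sketch under that assumption; `runWithCorrelationId` is a hypothetical helper name.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';
import { randomUUID } from 'node:crypto';

const correlationStore = new AsyncLocalStorage<string>();

// Runs `fn` with a correlation ID bound to the current async call chain.
// Reuses the caller's ID when one is given, otherwise generates a new one.
function runWithCorrelationId<T>(incomingId: string | undefined, fn: () => T): T {
  return correlationStore.run(incomingId ?? randomUUID(), fn);
}

// Returns the ID bound by runWithCorrelationId, or undefined outside of it.
function getCorrelationId(): string | undefined {
  return correlationStore.getStore();
}
```

A request middleware would typically read the ID from an incoming header (the header name is a team convention), wrap the rest of the request in `runWithCorrelationId`, and forward the same ID on outgoing calls to other services.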

2. Metrics

Monitor service health with Prometheus.

import { Counter, Histogram, Registry } from 'prom-client';

const registry = new Registry();

// Request counter
const httpRequestsTotal = new Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'path', 'status'],
  registers: [registry]
});

// Response time
const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'HTTP request duration in seconds',
  labelNames: ['method', 'path'],
  buckets: [0.1, 0.5, 1, 2, 5],
  registers: [registry]
});

// Measure via middleware
app.use((req, res, next) => {
  const start = Date.now();

  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    httpRequestsTotal.inc({ method: req.method, path: req.path, status: res.statusCode });
    httpRequestDuration.observe({ method: req.method, path: req.path }, duration);
  });

  next();
});
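
One thing to watch in the middleware above: Prometheus stores a separate time series for every distinct combination of label values, so using the raw `req.path` as a label creates a new series for each order or product ID. A possible mitigation is to normalize the path before using it as a label. The sketch below is one way to do that; `normalizePath` and its patterns are illustrative assumptions. In Express, the matched route pattern (`req.route.path`) may serve the same purpose for requests that hit a defined route.

```typescript
// Replaces ID-like path segments with a placeholder so that /api/orders/1 and
// /api/orders/2 share one label value. The patterns are illustrative assumptions.
const UUID_SEGMENT = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const NUMERIC_SEGMENT = /^\d+$/;

function normalizePath(path: string): string {
  return path
    .split('/')
    .map((segment) =>
      NUMERIC_SEGMENT.test(segment) || UUID_SEGMENT.test(segment) ? ':id' : segment
    )
    .join('/');
}

console.log(normalizePath('/api/orders/12345'));      // /api/orders/:id
console.log(normalizePath('/api/stock/123/reserve')); // /api/stock/:id/reserve
```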

3. Traces

Visualize the flow of requests across services with OpenTelemetry.

import { trace, SpanKind } from '@opentelemetry/api';

const tracer = trace.getTracer('order-service');

async function createOrder(orderData: CreateOrderDto): Promise<Order> {
  return tracer.startActiveSpan('createOrder', async (span) => {
    try {
      span.setAttribute('order.customer_id', orderData.customerId);

      // Check inventory (child span)
      const stock = await tracer.startActiveSpan('checkInventory', {
        kind: SpanKind.CLIENT,
      }, async (childSpan) => {
        try {
          return await inventoryService.checkStock(orderData.productId);
        } finally {
          // End the child span even if the call throws
          childSpan.end();
        }
      });

      // Create order
      const order = await orderRepository.create(orderData);
      span.setAttribute('order.id', order.id);

      return order;
    } catch (error) {
      span.recordException(error as Error);
      throw error;
    } finally {
      span.end();
    }
  });
}

Gradual Migration in Practice

The Strangler Fig Pattern

Rather than replacing the monolith all at once, migrate functionality piece by piece.

[Phase 1: Running in Parallel]
                    ┌─────────────┐
                    │ API Gateway │
                    └──────┬──────┘
                           │
            ┌──────────────┼──────────────┐
            ▼              ▼              ▼
    ┌───────────┐   ┌───────────┐   ┌───────────┐
    │ Monolith  │   │ Monolith  │   │ New Order │
    │  (Auth)   │   │ (Stock)   │   │  Service  │
    └───────────┘   └───────────┘   └───────────┘

[Phase 2: Feature Migration Complete]
                    ┌─────────────┐
                    │ API Gateway │
                    └──────┬──────┘
                           │
            ┌──────────────┼──────────────┐
            ▼              ▼              ▼
    ┌───────────┐   ┌───────────┐   ┌───────────┐
    │ New Auth  │   │ New Stock │   │   Order   │
    │  Service  │   │  Service  │   │  Service  │
    └───────────┘   └───────────┘   └───────────┘
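
The gateway's role in the diagrams above comes down to a routing table that grows as features move out of the monolith. Here is a minimal sketch of that decision, assuming hypothetical path prefixes and upstream names; a real API gateway would express the same rules in its own configuration format.

```typescript
// Hypothetical upstream names and path prefixes, for illustration only.
type Upstream = 'monolith' | 'order-service' | 'auth-service' | 'stock-service';

interface Route {
  prefix: string;
  upstream: Upstream;
}

// Phase 1: only order traffic goes to the new service.
const phase1Routes: Route[] = [{ prefix: '/api/orders', upstream: 'order-service' }];

// Phase 2: every migrated feature has its own route.
const phase2Routes: Route[] = [
  ...phase1Routes,
  { prefix: '/api/auth', upstream: 'auth-service' },
  { prefix: '/api/stock', upstream: 'stock-service' },
];

function resolveUpstream(path: string, routes: Route[]): Upstream {
  // Longest matching prefix wins; anything unmatched stays on the monolith.
  const match = routes
    .filter((route) => path === route.prefix || path.startsWith(route.prefix + '/'))
    .sort((a, b) => b.prefix.length - a.prefix.length)[0];
  return match ? match.upstream : 'monolith';
}
```

Because unmatched paths fall through to the monolith, moving a feature is a one-line routing change, and rolling it back means removing that line again.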

Migration Priorities

Start with features that have fewer dependencies
Prioritize features that change frequently
Split along team boundaries
Plan data migration carefully

Conclusion: Making Microservices Work

Microservices is a powerful architectural style, but it is not a universal solution.

Before You Migrate, Confirm

Organizational readiness: DevOps culture, autonomous team structure
Technical readiness: Containers, CI/CD, monitoring infrastructure
Domain understanding: Deep familiarity with the business domain

Principles for Success

Start small: Begin with a single service
Migrate incrementally: Avoid Big Bang rewrites
Ensure observability: Logs, metrics, and traces
Design for failure: Circuit Breaker, Retry, Timeout

Migrating to microservices is as much an organizational transformation as it is a technical one. Take your time and move one service at a time.

Resources

If you're struggling with a microservices migration, feel free to reach out.