
API Load Testing and Performance Tuning

Kevin, API Security Specialist

Introduction to API Load Testing

API load testing evaluates how an API performs under expected and peak traffic conditions. It helps identify bottlenecks, measure response times, and ensure reliability. Modern APIs must handle thousands of requests per second while maintaining low latency and high availability.

Key Metrics to Monitor

  1. Throughput: Requests processed per second (RPS)
  2. Latency: Time between request and response (P50, P90, P99)
  3. Error Rate: Percentage of failed requests
  4. Resource Utilization: CPU, memory, and network usage
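These definitions are easy to make concrete in a few lines. The sketch below (plain Python, made-up latency samples, a simple nearest-rank percentile) shows how throughput, percentile latency, and error rate fall out of a batch of raw request timings:

```python
# Illustrative metric computation over one batch of request timings.
# The latency samples and counts below are made up for demonstration.

def percentile(samples, p):
    """Nearest-rank percentile of the samples."""
    ordered = sorted(samples)
    index = max(0, int(round(p / 100 * len(ordered))) - 1)
    return ordered[index]

latencies_ms = [12, 15, 18, 22, 25, 30, 45, 60, 120, 480]  # one per request
errors = 1            # failed requests in the batch
window_seconds = 2    # wall-clock span of the batch

throughput_rps = len(latencies_ms) / window_seconds
error_rate = errors / len(latencies_ms)

print(f"throughput: {throughput_rps:.1f} RPS")     # 5.0 RPS
print(f"p50: {percentile(latencies_ms, 50)} ms")   # 25 ms
print(f"p99: {percentile(latencies_ms, 99)} ms")   # 480 ms
print(f"error rate: {error_rate:.1%}")             # 10.0%
```

Note how a single slow outlier (480 ms) barely moves the p50 but dominates the p99, which is why load tests gate on high percentiles rather than averages.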

Load Testing Tools

1. k6 (Open Source)

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '30s', target: 100 },  // Ramp-up
    { duration: '1m', target: 100 },   // Maintain
    { duration: '30s', target: 0 },    // Ramp-down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'],  // 95% < 500ms
    http_req_failed: ['rate<0.01'],    // <1% errors
  },
};

export default function () {
  const res = http.get('https://api.example.com/v1/users');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}

2. Locust (Python-based)

from locust import HttpUser, task, between

class ApiUser(HttpUser):
    wait_time = between(1, 5)

    @task
    def get_users(self):
        self.client.get("/v1/users")
    
    @task(3)  # 3x more frequent
    def create_user(self):
        self.client.post("/v1/users", json={
            "name": "test",
            "email": "test@example.com"
        })

Performance Tuning Strategies

Database Optimization

  1. Indexing: Add indexes for frequently queried fields
  2. Query Optimization: Use EXPLAIN ANALYZE to identify slow queries
  3. Connection Pooling: Reuse database connections

-- PostgreSQL example
CREATE INDEX idx_users_email ON users(email);
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';

Caching Strategies

  1. Redis Cache: Store frequently accessed data
  2. CDN Caching: For static API responses
  3. HTTP Caching Headers: Cache-Control, ETag

# Flask caching example
from flask import Flask
from flask_caching import Cache

app = Flask(__name__)
cache = Cache(app, config={'CACHE_TYPE': 'RedisCache'})

@app.route('/v1/users/<user_id>')
@cache.cached(timeout=300)  # cache the response for 5 minutes
def get_user(user_id):
    return db.get_user(user_id)  # db: your application's data-access layer
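The HTTP caching headers mentioned above can be illustrated without any framework: the server derives a strong ETag from the response body, and a matching `If-None-Match` from the client lets it answer `304 Not Modified` with an empty body. This is a hypothetical handler, not a real library API:

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Strong ETag derived from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def conditional_get(body, if_none_match):
    """Return (status, headers, payload), honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=300"}
    if if_none_match == etag:
        return 304, headers, b""  # client's cached copy is still fresh
    return 200, headers, body

# First request: full 200 response carrying the ETag
status, headers, payload = conditional_get(b'{"id": 1}', None)
# Revalidation with that ETag: empty 304, no body transferred
status2, _, payload2 = conditional_get(b'{"id": 1}', headers["ETag"])
```

Under load, revalidation saves bandwidth and serialization time even when the data itself cannot be cached for long.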

Horizontal Scaling

  1. Container Orchestration: Kubernetes with auto-scaling
  2. Load Balancing: Distribute traffic evenly
  3. Stateless Design: Enable easy scaling

# Kubernetes HPA example
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-deployment
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Advanced Techniques

Rate Limiting

// Go rate limiting middleware example
// (requires golang.org/x/time/rate)
package middleware

import (
    "net/http"

    "golang.org/x/time/rate"
)

func RateLimit(next http.Handler) http.Handler {
    limiter := rate.NewLimiter(100, 200) // 100 RPS steady rate, burst of 200
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        if !limiter.Allow() {
            http.Error(w, "Too many requests", http.StatusTooManyRequests)
            return
        }
        next.ServeHTTP(w, r)
    })
}
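The token-bucket algorithm behind limiters like this is worth seeing in the open: tokens refill at a fixed rate up to a burst capacity, and each request spends one. A simplified, single-threaded Python sketch (the injectable clock is just for illustration):

```python
import time

class TokenBucket:
    """Simplified token bucket: `rate` tokens/sec, capacity `burst`."""

    def __init__(self, rate, burst, now=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start full
        self.now = now
        self.last = now()

    def allow(self):
        current = self.now()
        # Refill in proportion to elapsed time, capped at burst capacity.
        self.tokens = min(self.burst, self.tokens + (current - self.last) * self.rate)
        self.last = current
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# With a fake clock, burst exhaustion is easy to demonstrate:
clock = [0.0]
bucket = TokenBucket(rate=100, burst=2, now=lambda: clock[0])
print(bucket.allow(), bucket.allow(), bucket.allow())  # True True False
clock[0] += 0.01  # 1/100 s elapses: one token refills
print(bucket.allow())  # True
```

Production limiters add locking and per-client buckets, but the refill-then-spend core is the same.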

Connection Pool Tuning

// HikariCP configuration example
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://localhost/api_db");
config.setUsername("user");
config.setPassword("password");
config.setMaximumPoolSize(20);
config.setConnectionTimeout(30000);
config.setIdleTimeout(600000);
config.setMaxLifetime(1800000);

HikariDataSource ds = new HikariDataSource(config);
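The pooling idea itself is language-agnostic: a fixed set of connections is created once and borrowed per request instead of opened fresh each time. A minimal standard-library sketch, with a hypothetical `factory` standing in for a real driver's `connect()`:

```python
import queue

class ConnectionPool:
    """Minimal fixed-size pool: borrow with acquire(), return with release()."""

    def __init__(self, factory, size):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())  # create all connections up front

    def acquire(self, timeout=30.0):
        # Blocks up to `timeout` seconds, akin to HikariCP's connectionTimeout;
        # raises queue.Empty if the pool stays exhausted.
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)

# Hypothetical usage with a dummy factory:
pool = ConnectionPool(factory=lambda: object(), size=20)
conn = pool.acquire()
# ... run queries ...
pool.release(conn)
```

Real pools also validate borrowed connections and retire them after a max lifetime (HikariCP's `maxLifetime` above); the borrow/return cycle is the part that eliminates per-request connection setup.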

Real-World Testing Scenario

E-commerce API Load Test

// k6 test for checkout flow
import http from 'k6/http';
import { check, group } from 'k6';

export const options = {
  scenarios: {
    spike: {
      executor: 'ramping-arrival-rate',
      startRate: 50,
      timeUnit: '1s',
      preAllocatedVUs: 50,
      maxVUs: 500,
      stages: [
        { target: 200, duration: '30s' },  // Black Friday spike
        { target: 50, duration: '1m' },   // Normal traffic
      ],
    },
  },
};

export default function () {
  group('checkout flow', function () {
    const product = http.get('https://api.store.com/products/123');
    check(product, { 'product status 200': (r) => r.status === 200 });

    const params = { headers: { 'Content-Type': 'application/json' } };

    const cart = http.post('https://api.store.com/cart', JSON.stringify({
      productId: 123,
      quantity: 1
    }), params);
    check(cart, { 'cart status 201': (r) => r.status === 201 });

    const checkout = http.post('https://api.store.com/checkout', JSON.stringify({
      cartId: cart.json().cartId,
      payment: { /* ... */ }
    }), params);
    check(checkout, { 'checkout status 200': (r) => r.status === 200 });
  });
}

Monitoring and Analysis

Prometheus + Grafana Setup

# prometheus.yml example: scrape the API's /metrics endpoint directly
scrape_configs:
  - job_name: 'api'
    metrics_path: '/metrics'
    scrape_interval: 15s
    static_configs:
      - targets: ['api-service:8080']

Key Grafana panels to create:

  1. Requests per second
  2. Error rate percentage
  3. 95th percentile latency
  4. Database query duration
  5. CPU/memory usage

Cloud-Specific Considerations

AWS API Gateway Tuning

resource "aws_api_gateway_rest_api" "example" {
  name = "example-api"
}

resource "aws_api_gateway_stage" "prod" {
  stage_name    = "prod"
  rest_api_id   = aws_api_gateway_rest_api.example.id
  deployment_id = aws_api_gateway_deployment.example.id

  cache_cluster_enabled = true
  cache_cluster_size    = "0.5"  # 0.5GB cache
}

resource "aws_api_gateway_usage_plan" "pro" {
  name = "pro-plan"

  throttle_settings {
    burst_limit = 1000
    rate_limit  = 500
  }
}