
API Load Testing and Performance Tuning

Kevin, API Security Specialist

Introduction to API Load Testing

API load testing evaluates how an API performs under expected and peak traffic conditions. It helps identify bottlenecks, measure response times, and ensure reliability. Modern APIs must handle thousands of requests per second while maintaining low latency and high availability.

Key Metrics to Monitor

  1. Throughput: Requests processed per second (RPS)
  2. Latency: Time between request and response (P50, P90, P99)
  3. Error Rate: Percentage of failed requests
  4. Resource Utilization: CPU, memory, and network usage
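These definitions are easy to make concrete in a few lines. The sketch below (plain Python, made-up latency samples, a simple nearest-rank percentile) shows how throughput, percentile latency, and error rate fall out of a batch of raw request timings:

```python
# Illustrative metric computation over one batch of request timings.
# The latency samples and counts below are made up for demonstration.

def percentile(samples, p):
    """Nearest-rank percentile of the samples."""
    ordered = sorted(samples)
    index = max(0, int(round(p / 100 * len(ordered))) - 1)
    return ordered[index]

latencies_ms = [12, 15, 18, 22, 25, 30, 45, 60, 120, 480]  # one per request
errors = 1            # failed requests in the batch
window_seconds = 2    # wall-clock span of the batch

throughput_rps = len(latencies_ms) / window_seconds
error_rate = errors / len(latencies_ms)

print(f"throughput: {throughput_rps:.1f} RPS")     # 5.0 RPS
print(f"p50: {percentile(latencies_ms, 50)} ms")   # 25 ms
print(f"p99: {percentile(latencies_ms, 99)} ms")   # 480 ms
print(f"error rate: {error_rate:.1%}")             # 10.0%
```

Note how a single slow outlier (480 ms) barely moves the p50 but dominates the p99, which is why load tests gate on high percentiles rather than averages.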

Load Testing Tools

1. k6 (Open Source)

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '30s', target: 100 },  // Ramp-up
    { duration: '1m', target: 100 },   // Maintain
    { duration: '30s', target: 0 },    // Ramp-down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'],  // 95% < 500ms
    http_req_failed: ['rate<0.01'],    // <1% errors
  },
};

export default function () {
  const res = http.get('https://api.example.com/v1/users');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}

2. Locust (Python-based)

from locust import HttpUser, task, between

class ApiUser(HttpUser):
    wait_time = between(1, 5)

    @task
    def get_users(self):
        self.client.get("/v1/users")
    
    @task(3)  # 3x more frequent
    def create_user(self):
        self.client.post("/v1/users", json={
            "name": "test",
            "email": "test@example.com"
        })

Performance Tuning Strategies

Database Optimization

  1. Indexing: Add indexes for frequently queried fields
  2. Query Optimization: Use EXPLAIN ANALYZE to identify slow queries
  3. Connection Pooling: Reuse database connections

-- PostgreSQL example
CREATE INDEX idx_users_email ON users(email);
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';

Caching Strategies

  1. Redis Cache: Store frequently accessed data
  2. CDN Caching: For static API responses
  3. HTTP Caching Headers: Cache-Control, ETag

# Flask caching example
from flask import Flask
from flask_caching import Cache

app = Flask(__name__)
cache = Cache(app, config={'CACHE_TYPE': 'RedisCache'})

@app.route('/v1/users/<user_id>')
@cache.cached(timeout=300)  # cache the response for 5 minutes
def get_user(user_id):
    return db.get_user(user_id)  # db: your application's data-access layer
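The HTTP caching headers mentioned above can be illustrated without any framework: the server derives a strong ETag from the response body, and a matching `If-None-Match` from the client lets it answer `304 Not Modified` with an empty body. This is a hypothetical handler, not a real library API:

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Strong ETag derived from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def conditional_get(body, if_none_match):
    """Return (status, headers, payload), honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=300"}
    if if_none_match == etag:
        return 304, headers, b""  # client's cached copy is still fresh
    return 200, headers, body

# First request: full 200 response carrying the ETag
status, headers, payload = conditional_get(b'{"id": 1}', None)
# Revalidation with that ETag: empty 304, no body transferred
status2, _, payload2 = conditional_get(b'{"id": 1}', headers["ETag"])
```

Under load, revalidation saves bandwidth and serialization time even when the data itself cannot be cached for long.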

Horizontal Scaling

  1. Container Orchestration: Kubernetes with auto-scaling
  2. Load Balancing: Distribute traffic evenly
  3. Stateless Design: Enable easy scaling

# Kubernetes HPA example
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-deployment
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Advanced Techniques

Rate Limiting

// Go rate limiting middleware example
// (requires golang.org/x/time/rate)
package middleware

import (
    "net/http"

    "golang.org/x/time/rate"
)

func RateLimit(next http.Handler) http.Handler {
    limiter := rate.NewLimiter(100, 200) // 100 RPS steady rate, burst of 200
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        if !limiter.Allow() {
            http.Error(w, "Too many requests", http.StatusTooManyRequests)
            return
        }
        next.ServeHTTP(w, r)
    })
}
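The token-bucket algorithm behind limiters like this is worth seeing in the open: tokens refill at a fixed rate up to a burst capacity, and each request spends one. A simplified, single-threaded Python sketch (the injectable clock is just for illustration):

```python
import time

class TokenBucket:
    """Simplified token bucket: `rate` tokens/sec, capacity `burst`."""

    def __init__(self, rate, burst, now=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start full
        self.now = now
        self.last = now()

    def allow(self):
        current = self.now()
        # Refill in proportion to elapsed time, capped at burst capacity.
        self.tokens = min(self.burst, self.tokens + (current - self.last) * self.rate)
        self.last = current
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# With a fake clock, burst exhaustion is easy to demonstrate:
clock = [0.0]
bucket = TokenBucket(rate=100, burst=2, now=lambda: clock[0])
print(bucket.allow(), bucket.allow(), bucket.allow())  # True True False
clock[0] += 0.01  # 1/100 s elapses: one token refills
print(bucket.allow())  # True
```

Production limiters add locking and per-client buckets, but the refill-then-spend core is the same.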

Connection Pool Tuning

// HikariCP configuration example
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://localhost/api_db");
config.setUsername("user");
config.setPassword("password");
config.setMaximumPoolSize(20);
config.setConnectionTimeout(30000);
config.setIdleTimeout(600000);
config.setMaxLifetime(1800000);

HikariDataSource ds = new HikariDataSource(config);
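The pooling idea itself is language-agnostic: a fixed set of connections is created once and borrowed per request instead of opened fresh each time. A minimal standard-library sketch, with a hypothetical `factory` standing in for a real driver's `connect()`:

```python
import queue

class ConnectionPool:
    """Minimal fixed-size pool: borrow with acquire(), return with release()."""

    def __init__(self, factory, size):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())  # create all connections up front

    def acquire(self, timeout=30.0):
        # Blocks up to `timeout` seconds, akin to HikariCP's connectionTimeout;
        # raises queue.Empty if the pool stays exhausted.
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)

# Hypothetical usage with a dummy factory:
pool = ConnectionPool(factory=lambda: object(), size=20)
conn = pool.acquire()
# ... run queries ...
pool.release(conn)
```

Real pools also validate borrowed connections and retire them after a max lifetime (HikariCP's `maxLifetime` above); the borrow/return cycle is the part that eliminates per-request connection setup.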

Real-World Testing Scenario

E-commerce API Load Test

// k6 test for checkout flow
import http from 'k6/http';
import { check, group } from 'k6';

export const options = {
  scenarios: {
    spike: {
      executor: 'ramping-arrival-rate',
      startRate: 50,
      timeUnit: '1s',
      preAllocatedVUs: 50,
      maxVUs: 500,
      stages: [
        { target: 200, duration: '30s' },  // Black Friday spike
        { target: 50, duration: '1m' },   // Normal traffic
      ],
    },
  },
};

export default function () {
  group('checkout flow', function () {
    const product = http.get('https://api.store.com/products/123');
    check(product, { 'product status 200': (r) => r.status === 200 });

    const params = { headers: { 'Content-Type': 'application/json' } };

    const cart = http.post('https://api.store.com/cart', JSON.stringify({
      productId: 123,
      quantity: 1
    }), params);
    check(cart, { 'cart status 201': (r) => r.status === 201 });

    const checkout = http.post('https://api.store.com/checkout', JSON.stringify({
      cartId: cart.json().cartId,
      payment: { /* ... */ }
    }), params);
    check(checkout, { 'checkout status 200': (r) => r.status === 200 });
  });
}

Monitoring and Analysis

Prometheus + Grafana Setup

# prometheus.yml example: scrape the API's /metrics endpoint directly
scrape_configs:
  - job_name: 'api'
    metrics_path: '/metrics'
    scrape_interval: 15s
    static_configs:
      - targets: ['api-service:8080']

Key Grafana panels to create:

  1. Requests per second
  2. Error rate percentage
  3. 95th percentile latency
  4. Database query duration
  5. CPU/memory usage

Cloud-Specific Considerations

AWS API Gateway Tuning

resource "aws_api_gateway_rest_api" "example" {
  name = "example-api"
}

resource "aws_api_gateway_stage" "prod" {
  stage_name    = "prod"
  rest_api_id   = aws_api_gateway_rest_api.example.id
  deployment_id = aws_api_gateway_deployment.example.id

  cache_cluster_enabled = true
  cache_cluster_size    = "0.5"  # 0.5GB cache
}

resource "aws_api_gateway_usage_plan" "pro" {
  name = "pro-plan"

  throttle_settings {
    burst_limit = 1000
    rate_limit  = 500
  }
}