# API Gateway Architecture Specification

## Overview

This document specifies the API Gateway architecture that provides unified edge API access, authentication, authorization, rate limiting, and request routing for all explorer platform services.

## Architecture

```mermaid
flowchart TB
    subgraph Clients
        Web[Web Clients]
        Mobile[Mobile Apps]
        API_Client[API Clients]
    end

    subgraph Gateway[API Gateway]
        Router[Request Router]
        Auth[Authentication]
        Authz[Authorization]
        RateLimit[Rate Limiter]
        Cache[Response Cache]
        Transform[Request Transformer]
        LB[Load Balancer]
    end

    subgraph Services[Backend Services]
        Explorer[Explorer API]
        Search[Search Service]
        Analytics[Analytics Service]
        Actions[Action Service]
        Banking[Banking API]
    end

    Clients --> Router
    Router --> Auth
    Auth --> Authz
    Authz --> RateLimit
    RateLimit --> Cache
    Cache --> Transform
    Transform --> LB
    LB --> Explorer
    LB --> Search
    LB --> Analytics
    LB --> Actions
    LB --> Banking
```

## Gateway Components

### Request Router

**Purpose**: Route requests to appropriate backend services based on path, method, and headers.

**Routing Rules**:
- `/api/v1/blocks/*` → Explorer API
- `/api/v1/transactions/*` → Explorer API
- `/api/v1/addresses/*` → Explorer API
- `/api/v1/search/*` → Search Service
- `/api/v1/analytics/*` → Analytics Service
- `/api/v1/swap/*` → Action Service
- `/api/v1/bridge/*` → Action Service
- `/api/v1/banking/*` → Banking API
- `/graphql` → Explorer API (GraphQL endpoint)

**Implementation**: NGINX, Kong, AWS API Gateway, or custom router
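
As an illustration of the routing rules above, the following minimal Python sketch resolves a request path to a backend by longest matching prefix. The service identifiers (`explorer_api`, `search_service`, and so on) and the `resolve_service` helper are hypothetical names chosen for this example, and it assumes that a rule such as `/api/v1/blocks/*` also matches the bare `/api/v1/blocks` path. A real deployment would more likely express the same table in NGINX, Kong, or AWS API Gateway configuration.

```python
from typing import Optional

# Route table mirroring the rules above: (path prefix, backend name).
# Backend names are illustrative, not part of the specification.
ROUTES = [
    ("/api/v1/blocks", "explorer_api"),
    ("/api/v1/transactions", "explorer_api"),
    ("/api/v1/addresses", "explorer_api"),
    ("/api/v1/search", "search_service"),
    ("/api/v1/analytics", "analytics_service"),
    ("/api/v1/swap", "action_service"),
    ("/api/v1/bridge", "action_service"),
    ("/api/v1/banking", "banking_api"),
    ("/graphql", "explorer_api"),
]


def resolve_service(path: str) -> Optional[str]:
    """Return the backend for a path, or None when no rule matches."""
    best = None
    for prefix, service in ROUTES:
        # Match the prefix itself or anything below it, but not
        # look-alike paths such as "/api/v1/blocksmith".
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, service)
    return best[1] if best else None
```

A path with no matching rule returns `None`, which the gateway would translate into `404 Not Found`.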

### Authentication

**Purpose**: Authenticate requests and extract user/API key information.

**Methods**:
1. **API Key Authentication**:
   - Header: `X-API-Key: <key>`
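   - Query parameter: `?api_key=<key>` (discouraged: keys in URLs leak into access logs, browser history, and referrer headers)
   - Hash the presented key and validate the hash against the database
   - Look up `user_id` and tier from the matching key record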

2. **OAuth 2.0**:
   - Bearer token: `Authorization: Bearer <token>`
   - Validate the JWT (signature, expiry, issuer, audience)
   - Extract user claims from the validated token

3. **Session Authentication**:
   - Cookie-based for web clients
   - Validate session token

**Anonymous Access**: Unauthenticated requests are allowed but receive the lowest rate limits (the Anonymous tier under Rate Limiting).
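
The API key path can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: it assumes keys are stored as SHA-256 hashes, uses an in-memory dictionary (`KEY_STORE`) as a stand-in for the database, and assumes header names arrive in the canonical casing shown. `KeyRecord`, `ANONYMOUS`, and `authenticate` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class KeyRecord:
    user_id: str
    tier: str


def hash_key(raw_key: str) -> str:
    """Hash a raw API key so that only the digest is ever stored."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


# Stand-in for the key table: digest -> record (assumption for this sketch).
KEY_STORE: Dict[str, KeyRecord] = {
    hash_key("demo-key-123"): KeyRecord(user_id="user-1", tier="pro"),
}

ANONYMOUS = KeyRecord(user_id="anonymous", tier="anonymous")


def authenticate(headers: Dict[str, str]) -> Optional[KeyRecord]:
    """Return the caller's record, ANONYMOUS if no key was sent,
    or None if a key was sent but is not recognized (401)."""
    raw_key = headers.get("X-API-Key")
    if not raw_key:
        return ANONYMOUS
    return KEY_STORE.get(hash_key(raw_key))
```

Distinguishing "no key" from "bad key" lets the gateway serve anonymous traffic at the lowest tier while still rejecting invalid credentials with `401 Unauthorized`.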

### Authorization

**Purpose**: Authorize requests based on user permissions and API key tier.

**Authorization Checks**:
- API key tier (free, pro, enterprise)
- User roles and permissions
- Resource-level permissions (user's own data)
- IP whitelist restrictions

**Enforcement**:
- Block unauthorized requests (403 Forbidden)
- Filter responses based on permissions
- Log authorization failures
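
A minimal sketch of the tier and ownership checks, assuming tiers are strictly ordered (anonymous < free < pro < enterprise) and that each protected resource has a single owner. `TIER_RANK` and `is_authorized` are hypothetical names; role-based checks and IP restrictions are omitted for brevity.

```python
from typing import Optional

# Assumed ordering of tiers, lowest to highest.
TIER_RANK = {"anonymous": 0, "free": 1, "pro": 2, "enterprise": 3}


def is_authorized(
    tier: str,
    user_id: str,
    required_tier: str,
    resource_owner: Optional[str] = None,
) -> bool:
    """Return True when the caller meets the tier requirement and,
    for user-owned resources, is the owner."""
    # Unknown tiers rank below everything and are always rejected.
    if TIER_RANK.get(tier, -1) < TIER_RANK[required_tier]:
        return False
    if resource_owner is not None and resource_owner != user_id:
        return False
    return True
```

A `False` result maps to `403 Forbidden` and should be logged as an authorization failure.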

### Rate Limiting

**Purpose**: Prevent abuse and ensure fair usage.

**Rate Limiting Strategy**:

**Tiers** (per-second burst limit, per-minute sustained limit):
1. **Anonymous**: 10 req/s, 100 req/min
2. **Free API Key**: 100 req/s, 1000 req/min
3. **Pro API Key**: 500 req/s, 5000 req/min
4. **Enterprise API Key**: 1000 req/s, unlimited

**Per-Endpoint Limits**:
- Simple queries (GET): Higher limits
- Complex queries (search, analytics): Lower limits
- Write operations (POST): Strict limits

**Implementation**:
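- Token bucket algorithm, which permits short bursts while capping the sustained rate
- Redis as the shared store so limits are enforced consistently across gateway instances
- Sliding window counters to avoid the boundary spikes of fixed windows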
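
The token bucket idea can be sketched in a few lines. This is a single-process illustration only: the distributed version would keep the bucket state in Redis and update it atomically, which is not shown here. The `TokenBucket` class and its injectable `clock` parameter are assumptions made for this example.

```python
import time
from typing import Callable


class TokenBucket:
    """In-memory token bucket: refills at `rate` tokens per second,
    holding at most `capacity` tokens."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity      # start full so bursts are allowed
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; return whether the request passes."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For the Anonymous tier, for example, one bucket per client IP with `capacity=10` would model the burst limit, while `rate` would be set from the sustained limit.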

**Rate Limit Headers**:
```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640995200
```

**Response on Limit Exceeded**:
```json
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please try again later.",
    "details": {
      "retry_after": 60
    },
    "request_id": "uuid",
    "timestamp": "2024-01-01T00:00:00Z"
  }
}
```
HTTP Status: 429 Too Many Requests
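
A small sketch of how the rate limit headers might be assembled. The `rate_limit_headers` helper is hypothetical, and sending the standard `Retry-After` header alongside the 429 response is an assumption that goes beyond what this specification states.

```python
from typing import Dict


def rate_limit_headers(
    limit: int, remaining: int, reset_epoch: int, now_epoch: int
) -> Dict[str, str]:
    """Build the X-RateLimit-* headers; add Retry-After once the limit is spent."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(remaining, 0)),
        "X-RateLimit-Reset": str(reset_epoch),  # Unix time when the window resets
    }
    if remaining <= 0:
        # Seconds the client should wait before retrying.
        headers["Retry-After"] = str(max(reset_epoch - now_epoch, 0))
    return headers
```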

### Request Transformation

**Purpose**: Transform requests before forwarding to backend services.

**Transformations**:
- Add user context headers (user_id, tier)
- Normalize request format
- Add tracing headers (request_id, trace_id)
- Validate and sanitize input
- Add default parameters

**Example Headers Added**:
```
X-User-ID: <uuid>
X-API-Tier: pro
X-Request-ID: <uuid>
X-Trace-ID: <uuid>
```
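
The header injection step could look like the sketch below. It assumes the gateway discards any client-supplied copies of the context headers so that callers cannot spoof `X-User-ID` or `X-API-Tier`, and that an inbound `X-Trace-ID` is reused to keep an existing trace intact. Both behaviors, and the `add_context_headers` name, are assumptions for illustration.

```python
import uuid
from typing import Dict, Optional

# Headers only the gateway may set (compared case-insensitively).
GATEWAY_OWNED = {"x-user-id", "x-api-tier", "x-request-id", "x-trace-id"}


def add_context_headers(
    headers: Dict[str, str], user_id: Optional[str], tier: str
) -> Dict[str, str]:
    """Return a copy of `headers` with gateway context headers applied."""
    inbound_trace = next(
        (v for k, v in headers.items() if k.lower() == "x-trace-id"), None
    )
    # Drop client-supplied versions of gateway-owned headers.
    out = {k: v for k, v in headers.items() if k.lower() not in GATEWAY_OWNED}
    if user_id is not None:
        out["X-User-ID"] = user_id
    out["X-API-Tier"] = tier
    out["X-Request-ID"] = str(uuid.uuid4())
    out["X-Trace-ID"] = inbound_trace or str(uuid.uuid4())
    return out
```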

### Response Caching

**Purpose**: Cache responses to reduce backend load and improve latency.

**Cacheable Endpoints**:
- Block data (cache for 10 seconds)
- Token metadata (cache for 1 hour)
- Contract ABIs (cache for 1 hour)
- Analytics aggregates (cache for 5 minutes)

**Cache Keys**:
- Include path, query parameters, API key tier
- Exclude user-specific parameters

**Cache Headers**:
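- `Cache-Control: public, max-age=<ttl>`, with `max-age` matching the endpoint's TTL (for example, `max-age=10` for block data)
- `ETag` for conditional requests
- `Vary: X-API-Key` so that downstream caches do not reuse a response across keys; note that this varies per key rather than per tier, so tier-level sharing applies only to the gateway's own cache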

**Cache Invalidation**:
- Time-based expiration
- Manual invalidation on data updates
- Event-driven invalidation
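
A minimal sketch of cache key construction and time-based expiration. The production cache is Redis; the in-memory `TTLCache` here only illustrates the semantics. The set of user-specific parameters to exclude (`USER_SPECIFIC`) is an assumption, as are all helper names.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import urlencode

# Parameters assumed to be user-specific and therefore excluded from the key.
USER_SPECIFIC = {"api_key", "user_id"}


def cache_key(path: str, query: Dict[str, str], tier: str) -> str:
    """Key on tier, path, and sorted non-user-specific query parameters."""
    params = sorted((k, v) for k, v in query.items() if k not in USER_SPECIFIC)
    raw = f"{tier}|{path}?{urlencode(params)}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class TTLCache:
    """Tiny in-memory cache with per-entry time-based expiration."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[str]:
        item = self._items.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._items[key]  # expired: evict and report a miss
            return None
        return value
```

Sorting the parameters means `?a=1&b=2` and `?b=2&a=1` share one cache entry, which improves the hit rate without changing semantics.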

### Load Balancing

**Purpose**: Distribute requests across backend service instances.

**Strategies**:
- Round-robin: Even distribution
- Least connections: Send to least loaded instance
- Health-aware: Skip unhealthy instances

**Health Checks**:
- HTTP health check endpoint: `/health`
- Check interval: 10 seconds
- Unhealthy threshold: 3 consecutive failures
- Recovery threshold: 2 successful checks
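
The health check thresholds above can be combined with round-robin selection as in the following sketch. It models only the bookkeeping: issuing the actual HTTP checks against `/health` every 10 seconds is left out, and the `HealthAwareRoundRobin` class is a hypothetical name for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    url: str
    healthy: bool = True
    failures: int = 0   # consecutive failed checks
    successes: int = 0  # consecutive successful checks


class HealthAwareRoundRobin:
    UNHEALTHY_AFTER = 3  # consecutive failures before removal
    HEALTHY_AFTER = 2    # consecutive successes before reinstatement

    def __init__(self, urls: List[str]) -> None:
        self.instances = [Instance(u) for u in urls]
        self._next = 0

    def record_check(self, url: str, ok: bool) -> None:
        """Update an instance's state from one health check result."""
        inst = next(i for i in self.instances if i.url == url)
        if ok:
            inst.failures = 0
            inst.successes += 1
            if not inst.healthy and inst.successes >= self.HEALTHY_AFTER:
                inst.healthy = True
        else:
            inst.successes = 0
            inst.failures += 1
            if inst.healthy and inst.failures >= self.UNHEALTHY_AFTER:
                inst.healthy = False

    def pick(self) -> Optional[str]:
        """Return the next healthy instance in rotation, or None if all are down."""
        n = len(self.instances)
        for offset in range(n):
            inst = self.instances[(self._next + offset) % n]
            if inst.healthy:
                self._next = (self._next + offset + 1) % n
                return inst.url
        return None
```

When `pick` returns `None`, the gateway has no healthy backend and would respond with `503 Service Unavailable`.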

## Request/Response Flow

### Request Flow

1. **Client Request** → Gateway
2. **Routing** → Determine target service
3. **Authentication** → Validate credentials
4. **Authorization** → Check permissions
5. **Rate Limiting** → Check limits
6. **Cache Check** → Return cached if available
7. **Request Transformation** → Add headers, normalize
8. **Load Balancing** → Select backend instance
9. **Forward Request** → Send to backend service
10. **Response Handling** → Transform, cache, return
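
The ordering of these steps matters because any stage may end the request early: a failed authentication returns 401, an exceeded limit returns 429, and a cache hit returns the stored response without touching a backend. The sketch below shows that control flow with deliberately trivial stand-in stages; `run_pipeline` and every stage function are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Request = Dict[str, object]
Response = Tuple[int, str]  # (HTTP status, body)
Stage = Callable[[Request], Optional[Response]]


def run_pipeline(
    request: Request, stages: List[Stage], forward: Callable[[Request], Response]
) -> Response:
    """Run stages in order; the first stage that returns a response ends the request."""
    for stage in stages:
        early = stage(request)
        if early is not None:
            return early
    return forward(request)


# Stand-in stages, reduced to the minimum needed to show short-circuiting.
def authenticate(request: Request) -> Optional[Response]:
    return (401, "unauthorized") if request.get("api_key") == "invalid" else None


def rate_limit(request: Request) -> Optional[Response]:
    return (429, "rate_limit_exceeded") if request.get("over_limit") else None


CACHE = {"/api/v1/blocks/1": "cached block"}


def cache_check(request: Request) -> Optional[Response]:
    body = CACHE.get(str(request["path"]))
    return (200, body) if body is not None else None


def forward(request: Request) -> Response:
    return (200, "fresh response for " + str(request["path"]))


STAGES = [authenticate, rate_limit, cache_check]
```

Placing the cache check after rate limiting means cached responses still count against a caller's quota, which matches the step order listed above.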

### Response Flow

1. **Backend Response** → Gateway
2. **Response Transformation** → Add headers, format
3. **Cache Storage** → Store if cacheable
4. **Error Handling** → Format errors consistently
5. **Response to Client** → Return response

## Error Handling

### Error Format

**Standard Error Response**:
```json
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please try again later.",
    "details": {},
    "request_id": "uuid",
    "timestamp": "2024-01-01T00:00:00Z"
  }
}
```
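
A helper that produces this envelope might look like the following sketch. The `error_body` function and its optional `now` parameter (useful for deterministic tests) are assumptions; the field names follow the format above.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def error_body(
    code: str,
    message: str,
    request_id: Optional[str] = None,
    details: Optional[Dict[str, Any]] = None,
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build the standard error envelope with a UTC ISO 8601 timestamp."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "error": {
            "code": code,
            "message": message,
            "details": details or {},
            "request_id": request_id or str(uuid.uuid4()),
            "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        }
    }
```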

### Error Codes

- `400 Bad Request`: Invalid request
- `401 Unauthorized`: Authentication required
- `403 Forbidden`: Authorization failed
- `404 Not Found`: Resource not found
- `429 Too Many Requests`: Rate limit exceeded
- `500 Internal Server Error`: Server error
- `502 Bad Gateway`: Backend service unreachable or returned an invalid response
- `503 Service Unavailable`: Gateway or backend temporarily unable to handle the request (overload or maintenance)
- `504 Gateway Timeout`: Backend timeout

## Security Considerations

### Request Validation

- Validate request size (max 10MB)
- Validate content-type
- Sanitize input to prevent injection
- Validate API key format

### DDoS Protection

- Rate limiting per IP
- Connection limits
- Request size limits
- WAF integration (Cloudflare)

### TLS/SSL

- Require HTTPS for all requests
- TLS 1.2+ only
- Strong cipher suites
- Certificate management

## Monitoring and Observability

### Metrics

**Key Metrics**:
- Request rate (requests/second)
- Response time (p50, p95, p99)
- Error rate (by error type)
- Cache hit rate
- Rate limit hits
- Backend service health

### Logging

**Log Fields**:
- Request ID
- User ID / API key identifier (never the raw key)
- Method, path, query parameters (with `api_key` values redacted)
- Response status, latency
- Backend service called
- Error messages

**Log Level**: INFO for normal requests, ERROR for failures
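
A sketch of a structured log entry covering the fields above. It assumes JSON-formatted log lines, treats only 5xx responses as failures for the purpose of choosing the level, and redacts any `api_key` query parameter so raw keys never reach the logs. `redact_query` and `build_log_entry` are hypothetical helpers.

```python
import json
import logging
from typing import Optional, Tuple
from urllib.parse import parse_qsl, urlencode


def redact_query(query: str) -> str:
    """Replace the value of any api_key parameter before logging."""
    pairs = [
        (k, "REDACTED" if k == "api_key" else v)
        for k, v in parse_qsl(query, keep_blank_values=True)
    ]
    return urlencode(pairs)


def build_log_entry(
    request_id: str,
    user_id: str,
    method: str,
    path: str,
    query: str,
    status: int,
    latency_ms: float,
    backend: str,
    error: Optional[str] = None,
) -> Tuple[int, str]:
    """Return (log level, JSON line) for one handled request."""
    level = logging.ERROR if status >= 500 else logging.INFO
    entry = {
        "request_id": request_id,
        "user_id": user_id,
        "method": method,
        "path": path,
        "query": redact_query(query),
        "status": status,
        "latency_ms": latency_ms,
        "backend": backend,
    }
    if error:
        entry["error"] = error
    return level, json.dumps(entry, sort_keys=True)
```

The returned pair can be passed directly to a standard logger, for example `logging.getLogger("gateway").log(level, line)`.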

### Tracing

- Add trace ID to requests
- Propagate trace ID to backend services
- Correlate requests across services

## Configuration

### Gateway Configuration

```yaml
gateway:
  port: 443
  health_check_path: /health

  rate_limiting:
    redis_url: "redis://..."
    default_limit: 100   # requests allowed per period
    default_period: 60   # period length in seconds

  cache:
    redis_url: "redis://..."
    default_ttl: 60      # seconds

  services:
    explorer_api:
      base_url: "http://explorer-api:8080"
      health_check: "/health"

    search_service:
      base_url: "http://search-service:8080"
      health_check: "/health"
```

## Implementation Options

### Option 1: NGINX + Lua

- High performance
- Custom logic with Lua
- Good caching support

### Option 2: Kong

- API Gateway features out of the box
- Plugin ecosystem
- Good rate limiting

### Option 3: AWS API Gateway

- Managed service
- Good integration with AWS services
- Per-request pricing can become costly at high traffic volumes

### Option 4: Custom Gateway (Go/Node.js)

- Full control
- Custom features
- Highest development and maintenance burden

**Recommendation**: Start with Kong or NGINX; migrate to a custom gateway only if requirements outgrow them.

## References

- REST API: See `rest-api.md`
- Authentication: See `../security/auth-spec.md`
- Rate Limiting: See `../security/ddos-protection.md`