# API Gateway Architecture Specification
## Overview
This document specifies the API Gateway architecture that provides unified edge API access, authentication, authorization, rate limiting, and request routing for all explorer platform services.
## Architecture
```mermaid
flowchart TB
    subgraph Clients
        Web[Web Clients]
        Mobile[Mobile Apps]
        API_Client[API Clients]
    end
    subgraph Gateway[API Gateway]
        Router[Request Router]
        Auth[Authentication]
        RateLimit[Rate Limiter]
        Transform[Request Transformer]
        Cache[Response Cache]
        LB[Load Balancer]
    end
    subgraph Services[Backend Services]
        Explorer[Explorer API]
        Search[Search Service]
        Analytics[Analytics Service]
        Actions[Action Service]
        Banking[Banking API]
    end
    Clients --> Router
    Router --> Auth
    Auth --> RateLimit
    RateLimit --> Cache
    Cache --> Transform
    Transform --> LB
    LB --> Explorer
    LB --> Search
    LB --> Analytics
    LB --> Actions
    LB --> Banking
```
## Gateway Components
### Request Router
**Purpose**: Route requests to appropriate backend services based on path, method, and headers.
**Routing Rules**:
- `/api/v1/blocks/*` → Explorer API
- `/api/v1/transactions/*` → Explorer API
- `/api/v1/addresses/*` → Explorer API
- `/api/v1/search/*` → Search Service
- `/api/v1/analytics/*` → Analytics Service
- `/api/v1/swap/*` → Action Service
- `/api/v1/bridge/*` → Action Service
- `/api/v1/banking/*` → Banking API
- `/graphql` → Explorer API (GraphQL endpoint)
**Implementation**: NGINX, Kong, AWS API Gateway, or custom router
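
As a rough illustration of the routing rules above, the following sketch resolves a request path to a backend by prefix. It is not tied to any of the listed implementations. The service names `explorer-api` and `search-service` follow the base URLs in the configuration section; the other names, and the choice to require a trailing slash on prefixes, are assumptions made for the example.

```python
from typing import Optional

# Prefix -> backend service name, mirroring the routing rules in this section.
# "analytics-service", "action-service" and "banking-api" are hypothetical names.
PREFIX_ROUTES = [
    ("/api/v1/blocks/", "explorer-api"),
    ("/api/v1/transactions/", "explorer-api"),
    ("/api/v1/addresses/", "explorer-api"),
    ("/api/v1/search/", "search-service"),
    ("/api/v1/analytics/", "analytics-service"),
    ("/api/v1/swap/", "action-service"),
    ("/api/v1/bridge/", "action-service"),
    ("/api/v1/banking/", "banking-api"),
]

# Paths that must match exactly rather than by prefix.
EXACT_ROUTES = {"/graphql": "explorer-api"}


def resolve_service(path: str) -> Optional[str]:
    """Return the backend name for a request path, or None if no rule matches."""
    if path in EXACT_ROUTES:
        return EXACT_ROUTES[path]
    for prefix, service in PREFIX_ROUTES:
        if path.startswith(prefix):
            return service
    return None
```

A `None` result would typically map to `404 Not Found`. Routing on method or headers, which the purpose statement also mentions, is left out of this sketch.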
### Authentication
**Purpose**: Authenticate requests and extract user/API key information.
**Methods**:
1. **API Key Authentication**:
   - Header: `X-API-Key: <key>`
   - Query parameter: `?api_key=<key>` (less secure, because keys in URLs can leak through logs and browser history; prefer the header)
   - Validate key hash against database
   - Extract user_id and tier from key
2. **OAuth 2.0**:
   - Bearer token: `Authorization: Bearer <token>`
   - Validate JWT token
   - Extract user claims
3. **Session Authentication**:
   - Cookie-based for web clients
   - Validate session token
**Anonymous Access**: Allow unauthenticated requests with lower rate limits
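
The API key path can be sketched as follows: hash the presented key, look the hash up, and fall back to an anonymous principal when no key is sent. The in-memory `KEY_STORE`, the `Principal` type, the demo key, and the use of SHA-256 are all assumptions for illustration; the spec only says that a key hash is validated against a database.

```python
import hashlib
from typing import Dict, NamedTuple, Optional


class Principal(NamedTuple):
    user_id: str
    tier: str


# Principal used for unauthenticated requests (lower rate limits apply).
ANONYMOUS = Principal(user_id="anonymous", tier="anonymous")


def hash_key(api_key: str) -> str:
    """Hash a raw API key so that only the hash needs to be stored."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


# Hypothetical stand-in for the key table: key hash -> principal.
KEY_STORE: Dict[str, Principal] = {
    hash_key("demo-key-123"): Principal(user_id="user-42", tier="pro"),
}


def authenticate(headers: Dict[str, str]) -> Optional[Principal]:
    """Return the caller's principal, ANONYMOUS if no key was sent,
    or None if a key was sent but is not recognised (respond with 401)."""
    api_key = headers.get("X-API-Key")
    if api_key is None:
        return ANONYMOUS
    return KEY_STORE.get(hash_key(api_key))
```

The sketch assumes header names arrive already normalized to the casing shown; a real gateway would compare them case-insensitively.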
### Authorization
**Purpose**: Authorize requests based on user permissions and API key tier.
**Authorization Checks**:
- API key tier (free, pro, enterprise)
- User roles and permissions
- Resource-level permissions (user's own data)
- IP whitelist restrictions
**Enforcement**:
- Block unauthorized requests (403 Forbidden)
- Filter responses based on permissions
- Log authorization failures
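
A minimal sketch of how the checks above might combine into a single decision. The tier ordering (`anonymous` < `free` < `pro` < `enterprise`) and the function signature are assumptions; role and permission lookups are omitted.

```python
from typing import Optional, Set

# Assumed ordering of tiers, lowest to highest.
TIER_RANK = {"anonymous": 0, "free": 1, "pro": 2, "enterprise": 3}


def is_authorized(
    tier: str,
    user_id: str,
    required_tier: str,
    resource_owner: Optional[str] = None,
    client_ip: Optional[str] = None,
    allowed_ips: Optional[Set[str]] = None,
) -> bool:
    """Return True if the request passes tier, ownership and IP checks."""
    # Tier check: the caller's tier must be at least the endpoint's minimum.
    if TIER_RANK.get(tier, -1) < TIER_RANK[required_tier]:
        return False
    # Resource-level check: user-owned data is only visible to its owner.
    if resource_owner is not None and resource_owner != user_id:
        return False
    # IP restriction: applies only when the key has a configured IP list.
    if allowed_ips is not None and client_ip not in allowed_ips:
        return False
    return True
```

A `False` result corresponds to the `403 Forbidden` response and the failure log entry listed under Enforcement.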
### Rate Limiting
**Purpose**: Prevent abuse and ensure fair usage.
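**Rate Limiting Strategy**: Each tier has a per-second burst limit and a per-minute sustained limit; a request must pass both.
**Tiers**:
1. **Anonymous**: 10 req/s burst, 100 req/min sustained
2. **Free API Key**: 100 req/s burst, 1000 req/min sustained
3. **Pro API Key**: 500 req/s burst, 5000 req/min sustained
4. **Enterprise API Key**: 1000 req/s burst, no per-minute limit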
**Per-Endpoint Limits**:
- Simple queries (GET): Higher limits
- Complex queries (search, analytics): Lower limits
- Write operations (POST): Strict limits
**Implementation**:
- Token bucket algorithm
- Redis for distributed rate limiting
- Sliding window for smooth rate limiting
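
To make the token bucket concrete, here is a single-process, in-memory sketch. It is not the distributed Redis-backed version the spec calls for, which would need the refill-and-take step to run atomically on the Redis side. The class name and parameters are illustrative, and the clock is injectable so the behaviour can be checked deterministically.

```python
import time
from typing import Optional


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` tokens/second."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so an idle client can burst
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; return False when the limit is hit."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For example, `TokenBucket(rate=10, capacity=10)` would approximate a 10 req/s limit; one bucket would be kept per API key or client IP.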
**Rate Limit Headers**:
```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640995200
```
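**Response on Limit Exceeded** (uses the standard error format; `retry_after` is in seconds):
```json
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please try again later.",
    "details": {
      "retry_after": 60
    },
    "request_id": "uuid",
    "timestamp": "2024-01-01T00:00:00Z"
  }
}
```
HTTP Status: 429 Too Many Requests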
### Request Transformation
**Purpose**: Transform requests before forwarding to backend services.
**Transformations**:
- Add user context headers (user_id, tier)
- Normalize request format
- Add tracing headers (request_id, trace_id)
- Validate and sanitize input
- Add default parameters
**Example Headers Added**:
```
X-User-ID: <uuid>
X-API-Tier: pro
X-Request-ID: <uuid>
X-Trace-ID: <uuid>
```
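
One way to produce these headers is sketched below. The function name is hypothetical. The sketch assumes header names are already normalized, that client-supplied identity headers must be discarded so they cannot be spoofed, and that an incoming `X-Trace-ID` is preserved so traces continue across hops.

```python
import uuid
from typing import Dict, Optional


def add_context_headers(
    headers: Dict[str, str], user_id: Optional[str], tier: str
) -> Dict[str, str]:
    """Return a copy of `headers` with gateway-owned context headers applied."""
    out = dict(headers)
    # Never trust identity headers sent by the client.
    out.pop("X-User-ID", None)
    out.pop("X-API-Tier", None)
    if user_id is not None:
        out["X-User-ID"] = user_id
    out["X-API-Tier"] = tier
    # A fresh request ID per hop; keep an existing trace ID if one was sent.
    out["X-Request-ID"] = str(uuid.uuid4())
    out.setdefault("X-Trace-ID", str(uuid.uuid4()))
    return out
```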
### Response Caching
**Purpose**: Cache responses to reduce backend load and improve latency.
**Cacheable Endpoints**:
- Block data (cache for 10 seconds)
- Token metadata (cache for 1 hour)
- Contract ABIs (cache for 1 hour)
- Analytics aggregates (cache for 5 minutes)
**Cache Keys**:
- Include path, query parameters, API key tier
- Exclude user-specific parameters
**Cache Headers**:
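- `Cache-Control: public, max-age=60` (set `max-age` to the endpoint's cache duration listed above, for example 10 for block data)
- `ETag` for conditional requests
- `Vary: X-API-Key` so that downstream caches keep separate entries per API key and never serve one tier's response to another; the gateway's own cache keys on the tier, not the raw key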
**Cache Invalidation**:
- Time-based expiration
- Manual invalidation on data updates
- Event-driven invalidation
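
The cache key rule and time-based expiration can be sketched as below. The set of user-specific parameters (`api_key` only), the key layout, and the in-memory `TTLCache` are assumptions for the example; the configuration section suggests the real store is Redis.

```python
import hashlib
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

# Query parameters that identify the caller and must not affect the key (assumed).
USER_SPECIFIC_PARAMS = {"api_key"}


def cache_key(path: str, query: Dict[str, str], tier: str) -> str:
    """Build a key from path, shared query parameters and API key tier."""
    shared = sorted((k, v) for k, v in query.items() if k not in USER_SPECIFIC_PARAMS)
    raw = f"{tier}|{path}?{urlencode(shared)}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (time-based invalidation only)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[bytes, float]] = {}

    def set(self, key: str, value: bytes, ttl: float, now: float) -> None:
        self._entries[key] = (value, now + ttl)

    def get(self, key: str, now: float) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            del self._entries[key]  # expired: drop and report a miss
            return None
        return value
```

Sorting the parameters means `?page=2&limit=10` and `?limit=10&page=2` share an entry, while requests from different tiers do not.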
### Load Balancing
**Purpose**: Distribute requests across backend service instances.
**Strategies**:
- Round-robin: Even distribution
- Least connections: Send to least loaded instance
- Health-aware: Skip unhealthy instances
**Health Checks**:
- HTTP health check endpoint: `/health`
- Check interval: 10 seconds
- Unhealthy threshold: 3 consecutive failures
- Recovery threshold: 2 successful checks
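
The health check thresholds above can be combined with round-robin selection roughly as follows. Class names are hypothetical, and the sketch assumes the recovery threshold means two consecutive successful checks. The periodic `/health` probe itself is not shown; its result is passed to `record_check`.

```python
from typing import List, Optional

UNHEALTHY_AFTER = 3  # consecutive failures before an instance is skipped
HEALTHY_AFTER = 2    # consecutive successes before it is used again (assumed consecutive)


class Instance:
    def __init__(self, url: str) -> None:
        self.url = url
        self.healthy = True
        self.failures = 0
        self.successes = 0

    def record_check(self, ok: bool) -> None:
        """Update health state from one health check result."""
        if ok:
            self.failures = 0
            self.successes += 1
            if not self.healthy and self.successes >= HEALTHY_AFTER:
                self.healthy = True
        else:
            self.successes = 0
            self.failures += 1
            if self.healthy and self.failures >= UNHEALTHY_AFTER:
                self.healthy = False


class RoundRobin:
    """Health-aware round-robin: cycle through instances, skipping unhealthy ones."""

    def __init__(self, instances: List[Instance]) -> None:
        self.instances = instances
        self._next = 0

    def pick(self) -> Optional[Instance]:
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next % len(self.instances)]
            self._next += 1
            if candidate.healthy:
                return candidate
        return None  # no healthy instance: caller should answer 503
```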
## Request/Response Flow
### Request Flow
1. **Client Request** → Gateway
2. **Routing** → Determine target service
3. **Authentication** → Validate credentials
4. **Authorization** → Check permissions
5. **Rate Limiting** → Check limits
6. **Cache Check** → Return cached if available
7. **Request Transformation** → Add headers, normalize
8. **Load Balancing** → Select backend instance
9. **Forward Request** → Send to backend service
10. **Response Handling** → Transform, cache, return
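
The steps above behave like a chain in which any stage may answer the request itself (an auth failure, a rate limit rejection, a cache hit) and stop further processing. The sketch below shows only that short-circuiting shape. The dictionary-based request and response types and the two sample stages are assumptions, not part of the spec.

```python
from typing import Callable, Dict, List, Optional

Request = Dict[str, object]
Response = Dict[str, object]
# A stage returns a Response to stop the pipeline early, or None to continue.
Stage = Callable[[Request], Optional[Response]]


def run_pipeline(
    request: Request, stages: List[Stage], forward: Callable[[Request], Response]
) -> Response:
    """Run stages in order; forward to the backend only if none short-circuits."""
    for stage in stages:
        early = stage(request)
        if early is not None:
            return early
    return forward(request)


def check_rate_limit(request: Request) -> Optional[Response]:
    """Sample stage: reject requests flagged as over their limit."""
    if request.get("over_limit"):
        return {"status": 429}
    return None


def make_cache_stage(cache: Dict[str, Response]) -> Stage:
    """Sample stage: answer from a path-keyed cache when an entry exists."""
    def stage(request: Request) -> Optional[Response]:
        return cache.get(str(request["path"]))
    return stage
```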
### Response Flow
1. **Backend Response** → Gateway
2. **Response Transformation** → Add headers, format
3. **Cache Storage** → Store if cacheable
4. **Error Handling** → Format errors consistently
5. **Response to Client** → Return response
## Error Handling
### Error Format
**Standard Error Response**:
```json
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please try again later.",
    "details": {},
    "request_id": "uuid",
    "timestamp": "2024-01-01T00:00:00Z"
  }
}
```
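
A small helper that emits this shape might look like the following. The function name and parameters are illustrative; the timestamp is rendered in UTC with a `Z` suffix to match the example above.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def error_response(
    code: str,
    message: str,
    request_id: str,
    details: Optional[Dict[str, object]] = None,
    now: Optional[datetime] = None,
) -> Dict[str, object]:
    """Build the standard error body as a JSON-serializable dict."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "error": {
            "code": code,
            "message": message,
            "details": details or {},
            "request_id": request_id,
            "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        }
    }
```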
### Error Codes
- `400 Bad Request`: Invalid request
- `401 Unauthorized`: Authentication required
- `403 Forbidden`: Authorization failed
- `404 Not Found`: Resource not found
- `429 Too Many Requests`: Rate limit exceeded
- `500 Internal Server Error`: Server error
- `502 Bad Gateway`: Backend service returned an invalid response or could not be reached
- `503 Service Unavailable`: Gateway or backend temporarily unable to handle the request
- `504 Gateway Timeout`: Backend timeout
## Security Considerations
### Request Validation
- Validate request size (max 10 MB)
- Validate content-type
- Sanitize input to prevent injection
- Validate API key format
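
The checks above could be applied before authentication along these lines. Several details are assumptions: 10 MB is read as 10 × 1024 × 1024 bytes, only `application/json` bodies are accepted, and the API key pattern is a placeholder, since the spec does not define the key format. Input sanitization is not covered here.

```python
import re
from typing import Dict, Optional

MAX_BODY_BYTES = 10 * 1024 * 1024             # assumed meaning of "10 MB"
ALLOWED_CONTENT_TYPES = {"application/json"}  # assumed; spec does not list types
API_KEY_PATTERN = re.compile(r"[A-Za-z0-9_-]{8,64}")  # hypothetical key format
METHODS_WITH_BODY = {"POST", "PUT", "PATCH"}


def validate_request(
    method: str, headers: Dict[str, str], body_length: int
) -> Optional[str]:
    """Return None if the request is acceptable, else a short reason (answer 400)."""
    if body_length > MAX_BODY_BYTES:
        return "request body too large"
    if method.upper() in METHODS_WITH_BODY:
        # Ignore parameters such as "; charset=utf-8" when comparing media types.
        media_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
        if media_type not in ALLOWED_CONTENT_TYPES:
            return "unsupported content type"
    api_key = headers.get("X-API-Key")
    if api_key is not None and not API_KEY_PATTERN.fullmatch(api_key):
        return "malformed API key"
    return None
```

Rejecting malformed keys here avoids a database lookup for input that could never match a stored key.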
### DDoS Protection
- Rate limiting per IP
- Connection limits
- Request size limits
- WAF integration (Cloudflare)
### TLS/SSL
- Require HTTPS for all requests
- TLS 1.2+ only
- Strong cipher suites
- Certificate management
## Monitoring and Observability
### Metrics
**Key Metrics**:
- Request rate (requests/second)
- Response time (p50, p95, p99)
- Error rate (by error type)
- Cache hit rate
- Rate limit hits
- Backend service health
### Logging
**Log Fields**:
- Request ID
- User ID / API key identifier (never the raw key)
- Method, path, query parameters (with `api_key` redacted)
- Response status, latency
- Backend service called
- Error messages
**Log Level**: INFO for normal requests, ERROR for failures
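
A structured access log line carrying these fields could be produced as sketched below. The JSON field names and the `REDACTED` placeholder are assumptions. Because API keys may arrive in the `api_key` query parameter, the sketch masks that parameter before the query string is logged.

```python
import json
from urllib.parse import parse_qsl, urlencode

# Query parameters whose values must never reach the logs (assumed list).
SENSITIVE_PARAMS = {"api_key"}


def redact_query(query: str) -> str:
    """Replace the values of sensitive query parameters with a placeholder."""
    pairs = parse_qsl(query, keep_blank_values=True)
    return urlencode([(k, "REDACTED" if k in SENSITIVE_PARAMS else v) for k, v in pairs])


def access_log_line(
    request_id: str,
    user_id: str,
    method: str,
    path: str,
    query: str,
    status: int,
    latency_ms: float,
    backend: str,
) -> str:
    """Serialize one request's log fields as a single JSON line."""
    record = {
        "request_id": request_id,
        "user_id": user_id,
        "method": method,
        "path": path,
        "query": redact_query(query),
        "status": status,
        "latency_ms": latency_ms,
        "backend": backend,
        # ERROR for server-side failures, INFO otherwise (assumed mapping).
        "level": "ERROR" if status >= 500 else "INFO",
    }
    return json.dumps(record, sort_keys=True)
```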
### Tracing
- Add trace ID to requests
- Propagate trace ID to backend services
- Correlate requests across services
## Configuration
### Gateway Configuration
```yaml
gateway:
  port: 443
  health_check_path: /health
  rate_limiting:
    redis_url: "redis://..."
    default_limit: 100
    default_period: 60
  cache:
    redis_url: "redis://..."
    default_ttl: 60
  services:
    explorer_api:
      base_url: "http://explorer-api:8080"
      health_check: "/health"
    search_service:
      base_url: "http://search-service:8080"
      health_check: "/health"
```
## Implementation Options
### Option 1: NGINX + Lua
- High performance
- Custom logic with Lua
- Good caching support
### Option 2: Kong
- API Gateway features out of the box
- Plugin ecosystem
- Good rate limiting
### Option 3: AWS API Gateway
- Managed service
- Good integration with AWS services
- Cost considerations
### Option 4: Custom Gateway (Go/Node.js)
- Full control
- Custom features
- More maintenance
**Recommendation**: Start with Kong or NGINX, migrate to custom if needed.
## References
- REST API: See `rest-api.md`
- Authentication: See `../security/auth-spec.md`
- Rate Limiting: See `../security/ddos-protection.md`