# Node & RPC Architecture Specification

## Overview

This document specifies the architecture for ChainID 138 blockchain nodes and RPC endpoints that power the explorer platform. The architecture ensures high availability, performance, and reliability for block data retrieval and transaction submission.

## Architecture

```mermaid
flowchart TB
    subgraph Nodes[ChainID 138 Nodes]
        AN[Archive Node<br/>Full History]
        TN[Tracing Node<br/>Call Traces]
        VN[Validator Node<br/>Block Production]
    end

    subgraph LoadBalancer[Load Balancer Layer]
        LB[NGINX/HAProxy<br/>SSL Termination]
    end

    subgraph RPCPool[RPC Endpoint Pool]
        RPC1[RPC Endpoint 1<br/>Archive + Trace]
        RPC2[RPC Endpoint 2<br/>Archive + Trace]
        RPC3[RPC Endpoint 3<br/>Archive Only]
    end

    subgraph Services[Explorer Services]
        Indexer[Indexer Service]
        API[API Gateway]
        Mempool[Mempool Service]
    end

    AN --> RPC1
    TN --> RPC1
    AN --> RPC2
    TN --> RPC2
    AN --> RPC3

    LB --> RPC1
    LB --> RPC2
    LB --> RPC3

    Indexer --> LB
    API --> LB
    Mempool --> LB
```

## Node Setup Requirements

### Archive Node Configuration

**Purpose**: Store complete blockchain history for historical queries and data indexing.

**Requirements**:
- **Archive mode**: Full historical state access (in Besu, `sync-mode="FULL"` with `data-storage-format="FOREST"`; the Geth equivalent is `--syncmode=full --gcmode=archive`)
- **Storage**: High-capacity storage with SSD for recent blocks, HDD for cold storage
- **Memory**: Minimum 32GB RAM for efficient state access
- **Database**: Fast state database (LevelDB, RocksDB)
- **Network**: High-bandwidth connection for syncing

**Configuration Example** (Hyperledger Besu):
```toml
data-path="./data"
genesis-file="./genesis.json"
network-id=138

# HTTP JSON-RPC. ADMIN and DEBUG are exposed here, so keep this port
# behind the load balancer and firewall rules.
rpc-http-enabled=true
rpc-http-host="0.0.0.0"
rpc-http-port=8545
rpc-http-api=["ETH","NET","WEB3","ADMIN","DEBUG","TRACE","EEA","PRIV"]
rpc-http-cors-origins=["*"]

# WebSocket JSON-RPC
rpc-ws-enabled=true
rpc-ws-host="0.0.0.0"
rpc-ws-port=8546
rpc-ws-api=["ETH","NET","WEB3"]

# Archive node: full sync with Forest storage retains all historical state
sync-mode="FULL"
data-storage-format="FOREST"
```

### Tracing Node Configuration

**Purpose**: Provide call trace capabilities for transaction debugging and internal transaction tracking.

**Requirements**:
- Archive mode enabled for historical trace access
- Debug and trace API endpoints enabled
- Additional CPU for trace computation
- Trace storage strategy (on-demand vs cached)

**Trace Types Supported**:
- `trace_call`: Single call trace
- `trace_block`: All traces in a block
- `trace_transaction`: All traces in a transaction
- `trace_replayBlockTransactions`: Replay with state
- `trace_replayTransaction`: Replay single transaction
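As an illustration of how a client might call the trace methods listed above, the following Python sketch builds JSON-RPC 2.0 request bodies. The helper name `trace_request` and the placeholder transaction hash are assumptions made for this example; they are not part of the node software.

```python
import json
from itertools import count

# Monotonic request ids so responses can be matched to requests.
_request_ids = count(1)

# Trace methods listed in this specification.
TRACE_METHODS = {
    "trace_call",
    "trace_block",
    "trace_transaction",
    "trace_replayBlockTransactions",
    "trace_replayTransaction",
}


def trace_request(method: str, params: list) -> str:
    """Serialize a JSON-RPC 2.0 request for one of the trace methods."""
    if method not in TRACE_METHODS:
        raise ValueError(f"unsupported trace method: {method}")
    return json.dumps(
        {"jsonrpc": "2.0", "id": next(_request_ids), "method": method, "params": params}
    )


# Placeholder transaction hash, for illustration only.
TX_HASH = "0x" + "ab" * 32

tx_trace_body = trace_request("trace_transaction", [TX_HASH])
block_trace_body = trace_request("trace_block", ["latest"])
```

The resulting strings can be sent as the body of an HTTP POST to a trace-capable RPC endpoint.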

### Node Health Monitoring

**Health Check Methods**:
- `eth_blockNumber`: Returns the latest block number (hex-encoded)
- `eth_syncing`: Returns `false` when the node is in sync, otherwise a sync progress object
- `net_peerCount`: Returns the connected peer count (hex-encoded)

**Health Check Criteria**:
- Latest block is within 5 blocks of network head
- Peer count > 5 (configurable)
- Response time < 500ms for simple queries
- No critical errors in logs

**Monitoring Metrics**:
- Block height lag vs network
- Peer connection count
- RPC request latency (p50, p95, p99)
- Error rate by endpoint
- CPU and memory usage
- Disk I/O and storage usage
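The health check criteria above can be sketched as a single predicate over the raw RPC results. This is a minimal example, not the monitoring implementation: `HealthSample` and `is_healthy` are hypothetical names, treating any non-`false` `eth_syncing` result as unhealthy is an assumption, and the log-based criterion is left out because it cannot be derived from RPC responses.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class HealthSample:
    block_number_hex: str        # result of eth_blockNumber, e.g. "0x10d4f"
    syncing: Union[bool, dict]   # result of eth_syncing
    peer_count_hex: str          # result of net_peerCount, e.g. "0x8"
    latency_ms: float            # measured round trip of a simple query


def is_healthy(
    sample: HealthSample,
    network_head: int,
    max_lag: int = 5,
    min_peers: int = 5,
    max_latency_ms: float = 500.0,
) -> bool:
    """Apply the block lag, peer count and latency thresholds from this section."""
    lag = network_head - int(sample.block_number_hex, 16)
    peers = int(sample.peer_count_hex, 16)
    return (
        sample.syncing is False          # assumption: a syncing node is not served
        and lag <= max_lag               # within 5 blocks of network head
        and peers > min_peers            # strictly more than 5 peers
        and sample.latency_ms < max_latency_ms
    )


sample = HealthSample("0x64", False, "0x8", 42.0)
node_ok = is_healthy(sample, network_head=102)
```

The thresholds are keyword arguments so that the peer count, which the criteria mark as configurable, can be tuned per environment.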

## RPC Endpoint Architecture

### Load Balancing Strategy

**Distribution Method**: Round-robin with health-aware routing

**Failover Strategy**:
- Primary: Archive + Trace capable nodes
- Secondary: Archive-only nodes (fallback for non-trace requests)
- Automatic failover on health check failure
- Circuit breaker pattern to prevent cascading failures

**Session Affinity**: Not required for stateless RPC requests
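One way to picture the routing rules above is a small selector that prefers healthy trace-capable endpoints, falls back to archive-only endpoints for non-trace requests, and rotates round-robin among the candidates. The `Endpoint` and `Router` classes are hypothetical; in production this logic would live in the NGINX/HAProxy configuration, and the circuit breaker is not modelled here.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Endpoint:
    name: str
    supports_trace: bool
    healthy: bool = True


class Router:
    def __init__(self, endpoints):
        self.endpoints = list(endpoints)
        self._turn = count()  # shared round-robin counter

    def pick(self, method: str) -> Endpoint:
        """Choose an endpoint for a JSON-RPC method name."""
        healthy = [e for e in self.endpoints if e.healthy]
        primary = [e for e in healthy if e.supports_trace]
        if method.startswith(("trace_", "debug_")):
            candidates = primary             # only trace-capable nodes qualify
        else:
            candidates = primary or healthy  # fall back to archive-only nodes
        if not candidates:
            raise RuntimeError(f"no healthy endpoint can serve {method}")
        return candidates[next(self._turn) % len(candidates)]


pool = Router(
    [
        Endpoint("RPC1", supports_trace=True),
        Endpoint("RPC2", supports_trace=True),
        Endpoint("RPC3", supports_trace=False),
    ]
)
first_choice = pool.pick("eth_getBalance").name
```

Because requests are stateless, the selector keeps no per-client state, which matches the absence of session affinity.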

### RPC Capabilities Matrix

| Capability | Archive Node | Tracing Node | Light Node |
|------------|--------------|--------------|------------|
| Historical blocks | ✅ Full | ✅ Full | ❌ Limited |
| Current state | ✅ Full | ✅ Full | ✅ Full |
| Call traces | ❌ No | ✅ Yes | ❌ No |
| Archive state queries | ✅ Yes | ✅ Yes | ❌ No |
| Transaction submission | ✅ Yes | ✅ Yes | ✅ Yes |

### Endpoint Types

#### Public RPC Endpoint
- **Internal URL**: `http://192.168.11.221:8545` (VMID 2201 - direct connection, recommended for internal services)
- **Public URL**: `https://rpc-http-pub.d-bis.org` (via proxy - for external access)
- **Authentication**: Optional API keys for rate limiting
- **Rate Limits**:
  - Unauthenticated: 10 req/s
  - Authenticated: 100 req/s (tiered)
- **CORS**: Enabled for web applications

#### Private RPC Endpoint
- **URL**: `https://rpc-http-prv.d-bis.org`
- **Authentication**: Required (authorized access)
- **Rate Limits**: Higher limits for authenticated users
- **CORS**: Enabled for authorized domains

#### Internal RPC Endpoint
- **URL**: `http://192.168.11.211:8545` (internal network - VMID 2101)
- **Authentication**: Network-based (firewall rules)
- **Rate Limits**: Higher limits for internal services
- **CORS**: Disabled

### WebSocket Endpoint

**Configuration**:
- **Public WS (Internal)**: `ws://192.168.11.221:8546` (VMID 2201 - direct connection, recommended for internal services)
- **Public WS (Public)**: `wss://rpc-ws-pub.d-bis.org` (via proxy - for external access)
- **Private WS (Internal)**: `ws://192.168.11.211:8546` (VMID 2101 - direct connection)
- **Private WS (Public)**: `wss://rpc-ws-prv.d-bis.org` (via proxy - for external access)
- **Subscriptions**: New blocks, pending transactions, logs
- **Connection Limits**: 100 concurrent connections per IP
- **Heartbeat**: 30-second ping/pong

**Subscription Types**:
- `newHeads`: New block headers
- `logs`: Event logs matching filter
- `newPendingTransactions`: Pending transactions
- `syncing`: Sync status updates
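Subscriptions are opened by sending an `eth_subscribe` request over the WebSocket connection. The sketch below only builds the request payloads; `subscribe_request` is a hypothetical helper and the contract address is a placeholder. The pending-transaction feed is requested under the name `newPendingTransactions`, which is the identifier used by Besu and Geth.

```python
import json
from typing import Optional


def subscribe_request(request_id: int, subscription: str, options: Optional[dict] = None) -> str:
    """Build an eth_subscribe payload; `options` carries the filter for `logs`."""
    params = [subscription] if options is None else [subscription, options]
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": "eth_subscribe", "params": params}
    )


# Placeholder contract address, for illustration only.
CONTRACT = "0x" + "00" * 20

subscriptions = [
    subscribe_request(1, "newHeads"),
    subscribe_request(2, "logs", {"address": CONTRACT, "topics": []}),
    subscribe_request(3, "newPendingTransactions"),
    subscribe_request(4, "syncing"),
]
```

Each payload would be written to one of the `ws://` or `wss://` endpoints listed above; the node replies with a subscription id and then pushes notifications on the same connection.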

## Rate Limiting and Access Control

### Rate Limiting Strategy

**Tiers**:
1. **Anonymous**: 10 requests/second, 100 requests/minute
2. **API Key (Free)**: 100 requests/second, 1000 requests/minute
3. **API Key (Pro)**: 500 requests/second, 5000 requests/minute
4. **Internal Services**: 1000 requests/second, no per-minute limit

**Implementation**:
- Token bucket algorithm
- Per-IP and per-API-key limits
- Separate limits for different endpoint categories:
  - Simple queries (`eth_blockNumber`, `eth_getBalance`): Higher limits
  - Complex queries (`trace_block`, `eth_call` with large state): Lower limits
  - Write operations (`eth_sendTransaction`): Strict limits
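A token bucket, as named in the implementation list above, can be sketched in a few lines. This is an illustrative in-memory version rather than the gateway's implementation: the `TokenBucket` class, the injectable `clock`, and the idea of charging a higher `cost` for complex queries are all assumptions of this example.

```python
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = float(rate)          # tokens refilled per second
        self.capacity = float(capacity)  # maximum burst size
        self._clock = clock
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then try to spend `cost` tokens."""
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Anonymous tier: 10 requests/second. One bucket per IP or API key.
anonymous = TokenBucket(rate=10, capacity=10)
accepted = anonymous.allow()          # simple query, cost 1
heavy = anonymous.allow(cost=5.0)     # hypothetical weight for a trace query
```

Keeping one bucket per client key and charging heavier methods more tokens is one way to express the separate limits per endpoint category within a single mechanism.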

### Access Control

**API Key Management**:
- Key generation with secure random tokens
- Key metadata: name, tier, creation date, last used
- Key rotation support
- Revocation capability

**IP Allowlisting**:
- Support for IP allowlisting per API key
- CIDR notation support
- Geographic restrictions (optional)

**Rate Limit Headers**:
```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640995200
```
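The access control items above map onto standard-library primitives. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, keys are generated with `secrets.token_urlsafe`, CIDR matching uses the `ipaddress` module, and the header values mirror the example response headers.

```python
import ipaddress
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe token built from cryptographically secure random bytes."""
    return secrets.token_urlsafe(num_bytes)


def ip_allowed(client_ip: str, allowlist: list) -> bool:
    """Check a client address against per-key CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowlist)


def rate_limit_headers(limit: int, remaining: int, reset_epoch: int) -> dict:
    """Build the X-RateLimit-* response headers."""
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(remaining, 0)),
        "X-RateLimit-Reset": str(reset_epoch),
    }


api_key = generate_api_key()
internal_ok = ip_allowed("192.168.11.221", ["192.168.11.0/24"])
headers = rate_limit_headers(100, 95, 1640995200)
```

Only a hash of the generated key would normally be stored server-side, so that a leaked database does not expose usable credentials.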

## Failover Strategies

### Automatic Failover

**Detection**:
- Health checks every 10 seconds
- Consecutive failures threshold: 3
- Recovery threshold: 2 successful checks

**Failover Actions**:
1. Mark node as unhealthy
2. Route traffic to healthy nodes
3. Continue health checking for recovery
4. Log failover events
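The detection thresholds and failover actions above amount to a small state machine per node. The following sketch is illustrative only: `NodeHealth` is a hypothetical class, the `events` list stands in for real logging, and the caller is assumed to invoke `record` once per 10-second health check.

```python
class NodeHealth:
    def __init__(self, failure_threshold: int = 3, recovery_threshold: int = 2):
        self.failure_threshold = failure_threshold
        self.recovery_threshold = recovery_threshold
        self.healthy = True
        self.events = []      # stand-in for failover event logging
        self._failures = 0    # consecutive failed checks
        self._successes = 0   # consecutive passed checks

    def record(self, check_passed: bool) -> bool:
        """Feed one health check result; return whether the node is routable."""
        if check_passed:
            self._failures = 0
            self._successes += 1
            if not self.healthy and self._successes >= self.recovery_threshold:
                self.healthy = True
                self.events.append("recovered")
        else:
            self._successes = 0
            self._failures += 1
            if self.healthy and self._failures >= self.failure_threshold:
                self.healthy = False
                self.events.append("marked unhealthy")
        return self.healthy


node = NodeHealth()
for result in (False, False, False, True, True):
    node.record(result)
```

With a 10-second check interval, three consecutive failures mean a failing node is removed from rotation roughly 30 seconds after its first failed check.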

### Manual Failover

**Scenarios**:
- Planned maintenance
- Node performance degradation
- Security incidents

**Procedure**:
1. Drain connections (wait for active requests)
2. Remove from load balancer pool
3. Perform maintenance
4. Verify health
5. Re-add to pool

## Performance Requirements

### Latency Targets

| Operation Type | p50 Target | p95 Target | p99 Target |
|----------------|------------|------------|------------|
| Simple queries (blockNumber, balance) | < 50ms | < 100ms | < 200ms |
| Block queries | < 100ms | < 200ms | < 500ms |
| Transaction queries | < 150ms | < 300ms | < 1000ms |
| Trace queries | < 500ms | < 2000ms | < 5000ms |
| Historical state queries | < 1000ms | < 5000ms | < 10000ms |
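To check measured latencies against the table above, a load test can compute percentiles and compare them with the targets. The sketch below uses the nearest-rank percentile definition, which is an assumption (monitoring systems often interpolate instead), and the dictionary keys are hypothetical labels for the table rows.

```python
# (p50, p95, p99) targets in milliseconds, copied from the latency table.
TARGETS_MS = {
    "simple": (50, 100, 200),
    "block": (100, 200, 500),
    "transaction": (150, 300, 1000),
    "trace": (500, 2000, 5000),
    "historical_state": (1000, 5000, 10000),
}


def percentile(samples, pct: int) -> float:
    """Nearest-rank percentile of a non-empty sample list (pct in 1..100)."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct% of n
    return ordered[max(rank, 1) - 1]


def meets_targets(samples, operation: str) -> bool:
    """True when p50, p95 and p99 are all strictly below their targets."""
    p50_target, p95_target, p99_target = TARGETS_MS[operation]
    return (
        percentile(samples, 50) < p50_target
        and percentile(samples, 95) < p95_target
        and percentile(samples, 99) < p99_target
    )


latencies_ms = list(range(1, 101))  # synthetic samples: 1 ms .. 100 ms
block_ok = meets_targets(latencies_ms, "block")
```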

### Throughput Targets

- **Read operations**: 1000 req/s per node
- **Write operations**: 100 req/s per node
- **WebSocket subscriptions**: 1000 concurrent per node

### Availability Targets

- **Uptime**: 99.9% (8.76 hours downtime/year)
- **Failover time**: < 30 seconds
- **Data consistency**: Zero data loss

## Security Considerations

### Network Security

- **TLS/SSL**: Required for public endpoints (TLS 1.2+)
- **Firewall Rules**: Restrict access to internal endpoints
- **DDoS Protection**: WAF/CDN integration for public endpoints
- **IP Filtering**: Support for IP allowlists/blocklists

### Authentication

- **API Keys**: HMAC-based authentication
- **Key Storage**: Encrypted storage with KMS
- **Key Rotation**: Support for periodic rotation
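HMAC-based authentication can take several forms, and this specification does not fix the message format. The sketch below assumes one common scheme, in which the client signs a timestamp and the request body with its secret and the server recomputes the signature; the function names and the `timestamp.body` layout are assumptions of this example.

```python
import hashlib
import hmac


def sign_request(secret: bytes, timestamp: str, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of `timestamp.body` under the API secret."""
    message = timestamp.encode("ascii") + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, timestamp: str, body: bytes, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_request(secret, timestamp, body)
    return hmac.compare_digest(expected, signature)


secret = b"example-secret"  # placeholder; real secrets come from KMS-backed storage
body = b'{"jsonrpc":"2.0","id":1,"method":"eth_blockNumber","params":[]}'
signature = sign_request(secret, "1640995200", body)
is_valid = verify_request(secret, "1640995200", body, signature)
```

Including the timestamp in the signed message lets the server reject stale requests, and `hmac.compare_digest` avoids leaking information through comparison timing.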

### Audit Logging

- All RPC requests logged (with PII sanitization)
- Log retention: 90 days
- Log fields: timestamp, IP, API key (hash), endpoint, response code, latency

## Integration Points

### Indexer Service Integration

- Primary connection to archive nodes
- Batch requests for efficiency
- Connection pooling
- Retry logic with exponential backoff
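Two of the indexer behaviours above, batch requests and retry with exponential backoff, can be sketched without any network code. The helper names, the 0.5-second base delay, the 30-second cap and the use of full jitter are assumptions chosen for illustration.

```python
import json
import random


def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0,
                   jitter: bool = True, rng=random.random) -> list:
    """Delay before each retry: base * 2**attempt, capped, optionally jittered."""
    delays = []
    for attempt in range(retries):
        delay = min(cap, base * 2 ** attempt)
        if jitter:
            delay *= rng()  # full jitter: uniform in [0, delay)
        delays.append(delay)
    return delays


def batch_block_requests(first: int, last: int) -> str:
    """Build one JSON-RPC batch fetching blocks first..last with full transactions."""
    return json.dumps(
        [
            {
                "jsonrpc": "2.0",
                "id": number,
                "method": "eth_getBlockByNumber",
                "params": [hex(number), True],
            }
            for number in range(first, last + 1)
        ]
    )


schedule = backoff_delays(5, jitter=False)
batch_body = batch_block_requests(16, 18)
```

Using the block number as the request id makes it straightforward to match responses in a batch, since a server may return them in any order.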

### API Gateway Integration

- Proxy requests to RPC pool
- Add authentication/authorization
- Apply rate limiting
- Cache common queries

### Mempool Service Integration

- WebSocket subscription to pending transactions
- Direct connection for transaction submission
- Priority queuing for indexer submissions

## Implementation Guidelines

### Node Deployment

**Infrastructure**:
- Use container orchestration (Kubernetes) for scalability
- Separate deployments for archive and tracing nodes
- Horizontal scaling based on load

**Configuration Management**:
- Use configuration files (TOML/YAML)
- Environment-specific overrides
- Secrets management via KMS/Vault

### Monitoring Integration

**Metrics to Export**:
- Prometheus metrics for node health
- Custom metrics for RPC performance
- Integration with Grafana dashboards

**Alerting Rules**:
- Node down alert
- High latency alert
- High error rate alert
- Low peer count alert

## Testing Strategy

### Unit Tests

- RPC endpoint implementations
- Rate limiting logic
- Health check logic

### Integration Tests

- Load balancer failover
- Rate limiting enforcement
- Authentication/authorization

### Load Tests

- Simulate production load
- Test failover scenarios
- Validate performance targets

## Migration from Existing Setup

### Current Setup (Blockscout Integration)

- **Public RPC (Internal)**: `http://192.168.11.221:8545` (VMID 2201 - recommended for internal services)
- **Public RPC (Public)**: `https://rpc-http-pub.d-bis.org` (via proxy - for external access)
- **Private RPC (Internal)**: `http://192.168.11.211:8545` (VMID 2101)
- **Private RPC (Public)**: `https://rpc-http-prv.d-bis.org` (via proxy)
- **Deprecated**: `https://rpc-core.d-bis.org` (no longer public, internal only)
- **Internal RPC**: `http://192.168.11.250:8545`
- **WebSocket**: `ws://192.168.11.250:8546`

### Migration Steps

1. Deploy new load balancer layer
2. Configure health checks
3. Gradually migrate traffic
4. Monitor performance
5. Decommission old direct connections

## References

- Existing Blockscout deployment: `docs/BLOCKSCOUT_COMPLETE_SUMMARY.md`
- Network architecture: See `network-topology.md`
- API Gateway: See `../api/api-gateway.md`