Troubleshooting Guide
Common Issues and Solutions
Network Issues
Blocks Not Being Produced
Symptoms: No new blocks, validators not responding
Diagnosis:
# Check validator status
kubectl get pods -n besu-network -l component=validator
# Check logs
kubectl logs -n besu-network <validator-pod> --tail=100
# Check block number
curl -X POST -H "Content-Type: application/json" \
--data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
http://<rpc-endpoint>
Solutions:
- Restart validators:
  kubectl rollout restart statefulset/besu-validator -n besu-network
- Check network connectivity
- Verify validator keys
- Check IBFT configuration
- Verify genesis file
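A single eth_blockNumber reading does not show whether the chain is stalled; two readings taken some seconds apart do. The following is a minimal sketch of that comparison. The helper names (hex_to_dec, rpc_result, block_height, chain_advancing) and the 10-second wait are illustrative assumptions, not part of the deployment, and the JSON parsing uses sed so that jq is not required.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the wait interval are assumptions.

# Convert a hex quantity such as 0x1a to decimal.
hex_to_dec() {
  printf '%d\n' "$1"
}

# Read a JSON-RPC response on stdin and print its string "result" field.
rpc_result() {
  sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Print the current block height (decimal) reported by the RPC URL in $1.
block_height() {
  local resp
  resp=$(curl -s --max-time 5 -X POST -H "Content-Type: application/json" \
    --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
    "$1") || return 1
  hex_to_dec "$(printf '%s\n' "$resp" | rpc_result)"
}

# Succeed when the second height ($2) is greater than the first ($1).
chain_advancing() {
  [ "$2" -gt "$1" ]
}

# Usage (replace the placeholder with a real endpoint):
#   first=$(block_height "http://<rpc-endpoint>")
#   sleep 10
#   second=$(block_height "http://<rpc-endpoint>")
#   chain_advancing "$first" "$second" || echo "No new blocks in 10s"
```

The wait should be longer than the configured block period, otherwise a healthy chain can look stalled.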
Validators Not Peering
Symptoms: Validators not connecting to each other
Diagnosis:
# List connected peers (admin_peers requires the ADMIN RPC API to be enabled)
kubectl exec -n besu-network <validator-pod> -- \
curl -X POST -H "Content-Type: application/json" \
--data '{"jsonrpc":"2.0","method":"admin_peers","params":[],"id":1}' \
http://localhost:8545
# Check static nodes
kubectl get configmap besu-validator-config -n besu-network -o yaml
Solutions:
- Verify static-nodes.json configuration
- Check network policies
- Verify firewall rules
- Check P2P port (30303) connectivity
- Verify enode addresses
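Malformed enode URLs in static-nodes.json are easy to miss by eye. As a rough sketch, the check below tests that a string has the shape enode://<128 hex characters>@<host>:<port>, with an optional ?discport= suffix. The function name valid_enode is hypothetical, and the pattern assumes IPv4 addresses or hostnames (bracketed IPv6 hosts are not handled). It checks the format only, not whether the node ID or address is correct.

```shell
#!/usr/bin/env bash
# Sketch only: valid_enode is an illustrative helper, not a Besu command.

# Succeed when $1 looks like enode://<128 hex chars>@<host>:<port>.
valid_enode() {
  printf '%s\n' "$1" | grep -Eq \
    '^enode://[0-9a-fA-F]{128}@[^:@/]+:[0-9]{1,5}(\?discport=[0-9]{1,5})?$'
}

# Usage:
#   valid_enode "enode://<node-id>@10.0.0.5:30303" || echo "Bad enode URL"
```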
RPC Issues
RPC Endpoints Not Responding
Symptoms: RPC calls failing, timeouts
Diagnosis:
# Check RPC pod status
kubectl get pods -n besu-network -l component=rpc
# Check logs
kubectl logs -n besu-network <rpc-pod> --tail=100
# Test RPC endpoint
curl -X POST -H "Content-Type: application/json" \
--data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
http://<rpc-endpoint>
Solutions:
- Restart RPC pods:
  kubectl rollout restart statefulset/besu-rpc -n besu-network
- Check Application Gateway status
- Verify network policies
- Check rate limiting
- Scale RPC nodes if needed
High Latency
Symptoms: Slow RPC responses
Diagnosis:
# Check pod resources
kubectl top pods -n besu-network -l component=rpc
# Check metrics
curl http://<rpc-pod>:9545/metrics
# Check sync status
curl -X POST -H "Content-Type: application/json" \
--data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}' \
http://<rpc-endpoint>
Solutions:
- Scale RPC nodes
- Increase resource limits
- Check disk I/O
- Verify network connectivity
- Check for sync issues
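The eth_syncing call returns false when the node considers itself in sync, and otherwise an object with hex fields such as currentBlock and highestBlock. A minimal sketch for turning that response into a "blocks behind" number follows. The function names json_hex_field and sync_lag are assumptions, and the sed-based parsing is a jq-free shortcut rather than a general JSON parser.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative.

# Read JSON on stdin and print the hex value of the field named in $1.
json_hex_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\(0x[0-9a-fA-F]*\)".*/\1/p'
}

# Read an eth_syncing response on stdin and print how many blocks the
# node is behind (0 when the result is false, meaning not syncing).
sync_lag() {
  local resp current highest
  resp=$(cat)
  case "$resp" in
    *'"result":false'*|*'"result": false'*) echo 0; return 0 ;;
  esac
  current=$(printf '%s\n' "$resp" | json_hex_field currentBlock)
  highest=$(printf '%s\n' "$resp" | json_hex_field highestBlock)
  # Fail if either field is missing from the response.
  [ -n "$current" ] && [ -n "$highest" ] || return 1
  echo $(( $highest - $current ))
}

# Usage:
#   curl -s -X POST -H "Content-Type: application/json" \
#     --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}' \
#     "http://<rpc-endpoint>" | sync_lag
```

A lag that stays high or keeps growing suggests the node is still catching up, in which case slow responses may be a sync problem rather than an RPC capacity problem.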
Oracle Issues
Oracle Not Updating
Symptoms: Oracle price not updating, circuit breaker open
Diagnosis:
# Check oracle publisher status
kubectl get pods -n besu-network -l app=oracle-publisher
# Check logs
kubectl logs -n besu-network <oracle-pod> --tail=100
# Check health endpoint
curl http://<oracle-pod>:8080/health
# Check metrics
curl http://<oracle-pod>:8000/metrics
Solutions:
- Restart oracle publisher
- Check data sources
- Verify RPC connectivity
- Check private key access
- Verify circuit breaker configuration
Data Source Failures
Symptoms: Failed to fetch from data sources
Diagnosis:
# Check data source connectivity
curl <data-source-url>
# Check oracle publisher logs
kubectl logs -n besu-network <oracle-pod> | grep -i "data source"
Solutions:
- Verify data source URLs
- Check network connectivity
- Verify API keys
- Check rate limiting
- Update data source configuration
Storage Issues
Disk Full
Symptoms: Pods failing, disk space errors
Diagnosis:
# Check disk usage
kubectl exec -n besu-network <pod> -- df -h
# Check PVC usage
kubectl get pvc -n besu-network
# Check pod logs
kubectl logs -n besu-network <pod> | grep -i "disk\|space\|full"
Solutions:
- Increase PVC size
- Clean up old data
- Archive chaindata
- Use snap sync for RPC nodes
- Implement data retention policies
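To spot volumes that are close to full before pods start failing, the df output can be filtered by usage percentage. The sketch below assumes POSIX-format output (df -P), a hypothetical helper name over_threshold, and an 85% limit chosen only for illustration. Mount points containing spaces are not handled.

```shell
#!/usr/bin/env bash
# Sketch only: over_threshold and the 85% limit are illustrative.

# Read `df -P` output on stdin and print "<mount> <use%>" for every
# filesystem whose usage is at or above the percentage given in $1.
over_threshold() {
  awk -v limit="$1" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 >= limit + 0) print $6, $5
  }'
}

# Usage:
#   kubectl exec -n besu-network <pod> -- df -P | over_threshold 85
```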
Slow Disk I/O
Symptoms: Slow sync, high latency
Diagnosis:
# Check disk I/O (five one-second samples; requires iostat in the container image)
kubectl exec -n besu-network <pod> -- iostat -x 1 5
# Check metrics
curl http://<pod>:9545/metrics | grep -i "disk\|io"
Solutions:
- Upgrade to Premium SSD
- Increase disk size
- Optimize Besu configuration
- Check for disk contention
- Use faster storage class
Monitoring Issues
Metrics Not Collecting
Symptoms: No metrics in Prometheus
Diagnosis:
# Check Prometheus targets
curl http://<prometheus>:9090/api/v1/targets
# Check service discovery
kubectl get servicemonitors -n besu-network
# Check pod metrics endpoint
curl http://<pod>:9545/metrics
Solutions:
- Verify ServiceMonitor configuration
- Check network policies
- Verify metrics endpoint
- Restart Prometheus
- Check service discovery configuration
Alerts Not Firing
Symptoms: Alerts not triggering
Diagnosis:
# Check Alertmanager status (API v2; the v1 API was removed in recent Alertmanager releases)
curl http://<alertmanager>:9093/api/v2/status
# Check alert rules
kubectl get prometheusrules -n besu-network
# Check notification channels
kubectl get secret alertmanager-config -n besu-network -o yaml
Solutions:
- Verify alert rules
- Check Alertmanager configuration
- Verify notification channels
- Check alert thresholds
- Test alert rules
Debugging Commands
Network Debugging
# Check pod networking
kubectl exec -n besu-network <pod> -- ip addr
# Check DNS
kubectl exec -n besu-network <pod> -- nslookup <service>
# Check connectivity
kubectl exec -n besu-network <pod> -- ping <target>
Besu Debugging
# Check Besu version
kubectl exec -n besu-network <pod> -- /opt/besu/bin/besu --version
# Check configuration
kubectl exec -n besu-network <pod> -- cat /config/besu-config.toml
# Follow logs (Ctrl+C to stop)
kubectl logs -n besu-network <pod> --tail=100 -f
Kubernetes Debugging
# Check pod status
kubectl describe pod <pod> -n besu-network
# Check events
kubectl get events -n besu-network --sort-by='.lastTimestamp'
# Check resources
kubectl top nodes
kubectl top pods -n besu-network
Useful Resources
Getting Help
- Check logs first
- Review monitoring dashboards
- Consult runbooks
- Contact on-call engineer
- Escalate if needed