Troubleshooting FAQ
Common issues and solutions for Besu validated set deployment.
Table of Contents
- Container Issues
- Service Issues
- Network Issues
- Consensus Issues
- Configuration Issues
- Performance Issues
Container Issues
Q: Container won't start
Symptoms: pct status <vmid> shows "stopped" or errors during startup
Solutions:
# Check container status
pct status <vmid>
# View container console
pct console <vmid>
# Check logs
journalctl -u pve-container@<vmid>
# Check container configuration
pct config <vmid>
# Try starting manually
pct start <vmid>
Common Causes:
- Insufficient resources (RAM, disk)
- Network configuration errors
- Invalid container configuration
- OS template issues
Q: Container runs out of disk space
Symptoms: Services fail, "No space left on device" errors
Solutions:
# Check disk usage
pct exec <vmid> -- df -h
# Check Besu database size
pct exec <vmid> -- du -sh /data/besu/database/
# Clean up old logs
pct exec <vmid> -- journalctl --vacuum-time=7d
# Increase disk size (if using LVM)
pct resize <vmid> rootfs +10G
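The `df` check above can be turned into a pass/fail signal. The sketch below parses POSIX-format `df -P` output and prints every filesystem at or above a usage threshold. The function name `disk_over_threshold` and the 90% default are illustrative assumptions, not part of the deployment scripts.

```shell
# Print "<mount> <use%>" for each filesystem at or above a usage threshold.
# Reads `df -P` output on stdin; the threshold defaults to 90 (assumed value).
disk_over_threshold() {
    local threshold="${1:-90}"
    awk -v limit="$threshold" '
        NR == 1 { next }                 # skip the header row
        {
            use = $5
            sub(/%/, "", use)            # "87%" -> "87"
            if (use + 0 >= limit) print $6, $5
        }'
}

# Usage (on the Proxmox host):
#   pct exec <vmid> -- df -P | disk_over_threshold 85
```

Mount points containing spaces are not handled; for the paths used in this guide that should not matter.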
Q: Container network issues
Symptoms: Cannot ping, cannot connect to services
Solutions:
# Check network configuration
pct config <vmid> | grep net0
# Check if container has IP
pct exec <vmid> -- ip addr show
# Check routing
pct exec <vmid> -- ip route
# Restart container networking
pct stop <vmid>
pct start <vmid>
Service Issues
Q: Besu service won't start
Symptoms: systemctl status besu-validator shows failed
Solutions:
# Check service status
pct exec <vmid> -- systemctl status besu-validator
# View service logs
pct exec <vmid> -- journalctl -u besu-validator -n 100
# Check for configuration errors
pct exec <vmid> -- besu --config-file=/etc/besu/config-validator.toml --help
# Verify configuration file syntax
pct exec <vmid> -- cat /etc/besu/config-validator.toml
Common Causes:
- Missing configuration files
- Invalid configuration syntax
- Missing validator keys
- Port conflicts
- Insufficient resources
Q: Service starts but crashes
Symptoms: Service starts then stops, high restart count
Solutions:
# Check crash logs
pct exec <vmid> -- journalctl -u besu-validator --since "10 minutes ago"
# Check for out of memory
pct exec <vmid> -- dmesg | grep -i "out of memory"
# Check system resources
pct exec <vmid> -- free -h
pct exec <vmid> -- df -h
# Check JVM heap settings
pct exec <vmid> -- cat /etc/systemd/system/besu-validator.service | grep BESU_OPTS
Q: Service shows as active but not responding
Symptoms: Service status shows "active" but RPC/P2P not responding
Solutions:
# Check if process is actually running
pct exec <vmid> -- ps aux | grep besu
# Check if ports are listening
pct exec <vmid> -- netstat -tuln | grep -E "30303|8545|9545"
# Check firewall rules
pct exec <vmid> -- iptables -L -n
# Test connectivity
pct exec <vmid> -- curl -s http://localhost:8545
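To see at a glance which of the expected ports is missing, the listening-socket output can be filtered per port. This is a minimal sketch; `ports_not_listening` is a hypothetical helper name, and the ports passed to it are the ones used in the check above.

```shell
# Print each expected port that has no listening socket.
# Reads `netstat -tuln` (or `ss -tuln`) output on stdin; ports are arguments.
ports_not_listening() {
    local listing port
    listing="$(cat)"
    for port in "$@"; do
        # ":<port>" followed by whitespace matches the local-address column
        if ! printf '%s\n' "$listing" | grep -Eq ":${port}[[:space:]]"; then
            echo "$port"
        fi
    done
}

# Usage (on the Proxmox host):
#   pct exec <vmid> -- netstat -tuln | ports_not_listening 30303 8545 9545
```

An empty result means every listed port has a listener. On validators a missing 8545 may be expected if RPC is disabled.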
Network Issues
Q: Nodes cannot connect to peers
Symptoms: Low or zero peer count, "No peers" in logs
Solutions:
# Check static-nodes.json
pct exec <vmid> -- cat /etc/besu/static-nodes.json
# Check permissions-nodes.toml
pct exec <vmid> -- cat /etc/besu/permissions-nodes.toml
# Verify enode URLs are correct
pct exec <vmid> -- besu public-key export --node-private-key-file=/data/besu/nodekey --format=enode
# Check P2P port is open
pct exec <vmid> -- netstat -tuln | grep 30303
# Test connectivity to peer
pct exec <vmid> -- ping -c 3 <peer-ip>
Common Causes:
- Incorrect enode URLs in static-nodes.json
- Firewall blocking P2P port (30303)
- Nodes not in permissions-nodes.toml
- Network connectivity issues
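When static-nodes.json lists several peers, it can help to test each one rather than a single `<peer-ip>`. The sketch below extracts host and port from every enode URL it finds. `enode_endpoints` is a hypothetical helper, and it assumes IPv4 addresses or hostnames (no bracketed IPv6).

```shell
# Print "host port" for every enode URL found on stdin.
enode_endpoints() {
    grep -oE 'enode://[0-9a-fA-F]+@[^:"]+:[0-9]+' |
        sed -E 's|^enode://[0-9a-fA-F]+@([^:]+):([0-9]+)$|\1 \2|'
}

# Usage (on the Proxmox host). </dev/null keeps the inner pct exec
# from consuming the loop's input:
#   pct exec <vmid> -- cat /etc/besu/static-nodes.json | enode_endpoints |
#   while read -r host port; do
#       if pct exec <vmid> -- ping -c 1 "$host" </dev/null >/dev/null; then
#           echo "$host reachable"
#       else
#           echo "$host UNREACHABLE"
#       fi
#   done
```

A successful ping only shows the host is up; it does not prove the P2P port itself is reachable.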
Q: Invalid enode URL errors
Symptoms: "Invalid enode URL syntax" or "Invalid node ID" in logs
Solutions:
# Check node ID length (must be 128 hex chars)
pct exec <vmid> -- besu public-key export --node-private-key-file=/data/besu/nodekey --format=enode | \
sed 's|^enode://||' | cut -d'@' -f1 | wc -c
# Should output 129 (128 chars + newline)
# Fix node IDs using allowlist scripts
./scripts/besu-collect-all-enodes.sh
./scripts/besu-generate-allowlist.sh
./scripts/besu-deploy-allowlist.sh
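The length check above can be extended to the whole URL. The following is a sketch of a format check, not Besu's own validation: `is_valid_enode` is a hypothetical helper that accepts `enode://<128 hex chars>@<host>:<port>` with an optional `?discport=` suffix, and it assumes IPv4 addresses or hostnames.

```shell
# Return 0 if the argument looks like enode://<128 hex chars>@<host>:<port>.
is_valid_enode() {
    printf '%s\n' "$1" |
        grep -Eq '^enode://[0-9a-fA-F]{128}@[^@:]+:[0-9]+(\?discport=[0-9]+)?$'
}

# Usage (on the Proxmox host):
#   enode="$(pct exec <vmid> -- besu public-key export \
#       --node-private-key-file=/data/besu/nodekey --format=enode)"
#   is_valid_enode "$enode" && echo "enode OK" || echo "enode INVALID"
```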
Q: RPC endpoint not accessible
Symptoms: Cannot connect to RPC on port 8545
Solutions:
# Check if RPC is enabled (validators typically don't have RPC)
pct exec <vmid> -- grep -i "rpc-http-enabled" /etc/besu/config-*.toml
# Check if RPC port is listening
pct exec <vmid> -- netstat -tuln | grep 8545
# Check firewall
pct exec <vmid> -- iptables -L -n | grep 8545
# Test from container
pct exec <vmid> -- curl -X POST -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
http://localhost:8545
# Check host allowlist in config
pct exec <vmid> -- grep -i "host-allowlist\|rpc-http-host" /etc/besu/config-*.toml
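A successful `eth_blockNumber` call returns the height as a hex string in the `result` field. The sketch below converts that response to a decimal number and fails if the response carries no result (for example, an error object). `block_height` is a hypothetical helper name.

```shell
# Convert an eth_blockNumber JSON-RPC response (stdin) to a decimal height.
# Returns non-zero if no hex "result" field is present.
block_height() {
    local hex
    hex="$(sed -nE 's/.*"result"[[:space:]]*:[[:space:]]*"(0x[0-9a-fA-F]+)".*/\1/p')"
    [ -n "$hex" ] || return 1
    printf '%d\n' "$hex"
}

# Usage (on the Proxmox host):
#   pct exec <vmid> -- curl -s -X POST -H "Content-Type: application/json" \
#       -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
#       http://localhost:8545 | block_height
```

Running it twice a few seconds apart and comparing the two numbers gives a quick check that the chain is advancing.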
Consensus Issues
Q: No blocks being produced
Symptoms: Block height not increasing, "No blocks" in logs
Solutions:
# Check validator service is running
pct exec <vmid> -- systemctl status besu-validator
# Check validator keys
pct exec <vmid> -- ls -la /keys/validators/
# Check consensus logs
pct exec <vmid> -- journalctl -u besu-validator | grep -i "consensus\|qbft\|proposing"
# Verify validators are in genesis (if static validators)
pct exec <vmid> -- cat /etc/besu/genesis.json | grep -A 20 "qbft"
# Check peer connectivity
pct exec <vmid> -- curl -s -X POST -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"admin_peers","params":[],"id":1}' \
http://localhost:8545
Common Causes:
- Validator keys missing or incorrect
- Not enough validators online
- Network connectivity issues
- Consensus configuration errors
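"Not enough validators online" can be made concrete with a little arithmetic. The sketch below assumes the usual BFT supermajority rule, where at least ceil(2N/3) of N validators must take part for blocks to be committed. The helper names `qbft_quorum` and `has_quorum` are illustrative.

```shell
# Minimum number of validators that must participate: ceil(2N/3),
# computed with integer arithmetic.
qbft_quorum() {
    echo $(( (2 * $1 + 2) / 3 ))
}

# Succeed if <online> validators are enough for a set of <total> validators.
has_quorum() {
    [ "$2" -ge "$(qbft_quorum "$1")" ]
}

# Usage:
#   qbft_quorum 5       # validators required in a 5-validator set
#   has_quorum 5 3 || echo "not enough validators online"
```

Under this assumption a five-validator set needs four validators online, so losing two validators halts block production.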
Q: Validator not participating in consensus
Symptoms: Validator running but not producing blocks
Solutions:
# Verify validator address
pct exec <vmid> -- cat /keys/validators/validator-*/address.txt
# Check if address is in validator contract (for dynamic validators)
# Or check genesis.json (for static validators)
pct exec <vmid> -- cat /etc/besu/genesis.json | python3 -m json.tool | grep -A 10 "qbft"
# Verify validator keys are loaded
pct exec <vmid> -- journalctl -u besu-validator | grep -i "validator.*key"
# Check for permission errors
pct exec <vmid> -- journalctl -u besu-validator | grep -i "permission\|denied"
Configuration Issues
Q: Configuration file not found
Symptoms: "File not found" errors, service won't start
Solutions:
# List all config files
pct exec <vmid> -- ls -la /etc/besu/
# Verify required files exist
pct exec <vmid> -- test -f /etc/besu/genesis.json && echo "genesis.json OK" || echo "genesis.json MISSING"
pct exec <vmid> -- test -f /etc/besu/config-validator.toml && echo "config OK" || echo "config MISSING"
# Copy missing files
# (Use copy-besu-config.sh script)
./scripts/copy-besu-config.sh /path/to/smom-dbis-138
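The per-file checks above can be collapsed into one pass over a directory listing. This is a minimal sketch; `missing_files` is a hypothetical helper, and the file names in the usage line are the ones referenced elsewhere in this FAQ.

```shell
# Print each required file name that is absent from a directory listing.
# Reads `ls` output (one name per line) on stdin; required names are arguments.
missing_files() {
    local listing name
    listing="$(cat)"
    for name in "$@"; do
        # -F fixed string, -x whole-line match, -q quiet
        printf '%s\n' "$listing" | grep -Fxq -- "$name" || echo "$name"
    done
}

# Usage (on the Proxmox host):
#   pct exec <vmid> -- ls -1 /etc/besu/ | missing_files \
#       genesis.json config-validator.toml static-nodes.json permissions-nodes.toml
```

No output means every required file is present.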
Q: Invalid configuration syntax
Symptoms: "Invalid option" or syntax errors in logs
Solutions:
# Validate TOML syntax (tomllib requires Python 3.11+)
pct exec <vmid> -- python3 -c "import tomllib; tomllib.load(open('/etc/besu/config-validator.toml', 'rb'))" 2>&1
# Validate JSON syntax
pct exec <vmid> -- python3 -m json.tool /etc/besu/genesis.json > /dev/null
# Check for deprecated options
pct exec <vmid> -- journalctl -u besu-validator | grep -i "deprecated\|unknown option"
# Review Besu documentation for current options
Q: Path errors in configuration
Symptoms: "File not found" errors with paths like "/config/genesis.json"
Solutions:
# Check configuration file paths
pct exec <vmid> -- grep -E "genesis-file|data-path" /etc/besu/config-validator.toml
# Correct paths should be:
# genesis-file="/etc/besu/genesis.json"
# data-path="/data/besu"
# Fix paths if needed
pct exec <vmid> -- sed -i 's|/config/|/etc/besu/|g' /etc/besu/config-validator.toml
Performance Issues
Q: High CPU usage
Symptoms: Container CPU usage > 80% consistently
Solutions:
# Check CPU usage
pct exec <vmid> -- top -bn1 | head -20
# Check JVM GC activity
pct exec <vmid> -- journalctl -u besu-validator | grep -i "gc\|pause"
# Adjust JVM settings if needed
# Edit /etc/systemd/system/besu-validator.service
# Adjust BESU_OPTS and JAVA_OPTS
# Consider allocating more CPU cores
pct set <vmid> --cores 4
Q: High memory usage
Symptoms: Container running out of memory, OOM kills
Solutions:
# Check memory usage
pct exec <vmid> -- free -h
# Check JVM heap settings
pct exec <vmid> -- ps aux | grep besu | grep -oP 'Xm[xs]\K[0-9]+[gm]'
# Reduce heap size if too large
# Edit /etc/systemd/system/besu-validator.service
# Adjust BESU_OPTS="-Xmx4g" to appropriate size
# Or increase container memory
pct set <vmid> --memory 8192
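The JVM uses memory beyond the heap, so a heap equal to the container's memory limit tends to end in OOM kills. The sketch below compares an `-Xmx` value with the container memory in MiB (the unit `pct set --memory` takes). The helper names and the 1024 MiB default headroom are assumptions for illustration, not a Besu recommendation.

```shell
# Convert a JVM heap size such as "4g" or "512m" to MiB.
heap_to_mib() {
    local value unit
    value="${1%[gGmM]}"          # strip the trailing unit letter
    unit="${1#"$value"}"         # what remains is the unit
    case "$unit" in
        g|G) echo $(( $value * 1024 )) ;;
        m|M) echo "$value" ;;
        *)   return 1 ;;         # no recognised unit
    esac
}

# Succeed if heap + headroom fits in the container memory.
# Arguments: <heap> <container MiB> [headroom MiB, assumed default 1024]
heap_fits() {
    local heap_mib
    heap_mib="$(heap_to_mib "$1")" || return 2
    [ $(( heap_mib + ${3:-1024} )) -le "$2" ]
}

# Usage:
#   heap_fits 4g 8192 && echo "heap OK" || echo "heap too large for container"
```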
Q: Slow sync or block processing
Symptoms: Blocks processing slowly, falling behind
Solutions:
# Check database size and health
pct exec <vmid> -- du -sh /data/besu/database/
# Check disk I/O
pct exec <vmid> -- iostat -x 1 5
# Consider using SSD storage
# Check network latency
pct exec <vmid> -- ping -c 10 <peer-ip>
# Verify sufficient peers
pct exec <vmid> -- curl -s -X POST -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"admin_peers","params":[],"id":1}' \
http://localhost:8545 | python3 -c "import sys, json; print(len(json.load(sys.stdin).get('result', [])))"
General Troubleshooting Commands
# View all container statuses
for vmid in 1000 1001 1002 1003 1004 1500 1501 1502 1503 2500 2501 2502; do
    echo "=== Container $vmid ==="
    pct status "$vmid"
done
# Check all service statuses
for vmid in 1000 1001 1002 1003 1004; do
    pct exec "$vmid" -- systemctl status besu-validator --no-pager -l | head -10
done
# View recent logs from all nodes
for vmid in 1000 1001 1002 1003 1004; do
    echo "=== Logs for container $vmid ==="
    pct exec "$vmid" -- journalctl -u besu-validator -n 20 --no-pager
done
# Check network connectivity between nodes
pct exec 1000 -- ping -c 3 192.168.11.14 # validator to validator
# Verify RPC endpoint (RPC nodes only)
pct exec 2500 -- curl -s -X POST -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
http://localhost:8545 | python3 -m json.tool
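For a larger set of containers, listing only the ones that are not running is quicker to read than the full status loop. This is a sketch; `not_running` is a hypothetical helper that relies on `pct status` printing `status: running` for a running container.

```shell
# Print the IDs of containers that `pct status` does not report as running.
not_running() {
    local vmid
    for vmid in "$@"; do
        case "$(pct status "$vmid" 2>/dev/null)" in
            "status: running") ;;        # healthy, print nothing
            *) echo "$vmid" ;;           # stopped, missing, or error
        esac
    done
}

# Usage (on the Proxmox host):
#   not_running 1000 1001 1002 1003 1004 1500 1501 1502 1503 2500 2501 2502
```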
Getting Help
If issues persist:
- Collect Information:
  - Service logs: journalctl -u besu-validator -n 100
  - Container status: pct status <vmid>
  - Configuration: pct exec <vmid> -- cat /etc/besu/config-validator.toml
  - Network: pct exec <vmid> -- ip addr show
- Check Documentation: see Related Documentation below
- Validate Configuration:
  - Run prerequisites check: ./scripts/validation/check-prerequisites.sh
  - Validate validators: ./scripts/validation/validate-validator-set.sh
- Review Logs:
  - Check deployment logs: logs/deploy-validated-set-*.log
  - Check service logs in containers
  - Check Proxmox host logs
Related Documentation
Operational Procedures
- OPERATIONAL_RUNBOOKS.md - Complete operational runbooks
- QBFT_TROUBLESHOOTING.md - QBFT consensus troubleshooting
- BESU_ALLOWLIST_QUICK_START.md - Allowlist troubleshooting
Deployment & Configuration
- DEPLOYMENT_STATUS_CONSOLIDATED.md - Current deployment status
- NETWORK_ARCHITECTURE.md - Network architecture reference
- VALIDATED_SET_DEPLOYMENT_GUIDE.md - Deployment guide
Monitoring
- MONITORING_SUMMARY.md - Monitoring setup
- BLOCK_PRODUCTION_MONITORING.md - Block production monitoring
Reference
- MASTER_INDEX.md - Complete documentation index
Last Updated: 2025-01-20
Version: 1.0