# Validated Set Deployment Guide
Complete guide for deploying a validated Besu node set using the script-based approach.
## Overview
This guide covers deploying a validated set of Besu nodes (validators, sentries, RPC) on Proxmox VE LXC containers using automated scripts. The deployment uses a **script-based approach** with `static-nodes.json` for peer discovery (no boot node required).
## Prerequisites
- Proxmox VE 7.0+ installed
- Root access to Proxmox host
- Sufficient resources (RAM, disk, CPU)
- Network connectivity
- Source project with Besu configuration files
## Deployment Methods
### Method 1: Complete Deployment (Recommended)
Deploy everything from scratch in one command:
```bash
cd /opt/smom-dbis-138-proxmox
sudo ./scripts/deployment/deploy-validated-set.sh \
    --source-project /path/to/smom-dbis-138
```
**What this does:**
1. Deploys all containers (validators, sentries, RPC)
2. Copies configuration files from source project
3. Bootstraps the network (generates and deploys static-nodes.json)
4. Validates the deployment
### Method 2: Step-by-Step Deployment
If you prefer more control, deploy step by step:
```bash
# Step 1: Deploy containers
sudo ./scripts/deployment/deploy-besu-nodes.sh
# Step 2: Copy configuration files
SOURCE_PROJECT=/path/to/smom-dbis-138 \
    ./scripts/copy-besu-config.sh
# Step 3: Bootstrap network
sudo ./scripts/network/bootstrap-network.sh
# Step 4: Validate validators
sudo ./scripts/validation/validate-validator-set.sh
```
### Method 3: Bootstrap Existing Containers
If containers are already deployed and configured:
```bash
# Quick bootstrap (just network bootstrap)
sudo ./scripts/deployment/bootstrap-quick.sh
# Or use the full script with skip options
sudo ./scripts/deployment/deploy-validated-set.sh \
    --skip-deployment \
    --skip-config \
    --source-project /path/to/smom-dbis-138
```
## Detailed Steps
### Step 1: Prepare Source Project
Ensure your source project has the required files:
```
smom-dbis-138/
├── config/
│   ├── genesis.json
│   ├── permissions-nodes.toml
│   ├── permissions-accounts.toml
│   ├── static-nodes.json (will be generated/updated)
│   ├── config-validator.toml
│   ├── config-sentry.toml
│   └── config-rpc-public.toml
└── keys/
    └── validators/
        ├── validator-1/
        ├── validator-2/
        ├── validator-3/
        ├── validator-4/
        └── validator-5/
```
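Before deploying, it can help to confirm that the layout above is actually in place. The following is a minimal sketch, not part of the project's scripts: `check_source_project` is a hypothetical helper that only tests for the files listed above. It skips `static-nodes.json`, since that file is generated during bootstrap.

```bash
# Hypothetical pre-flight check: report any required file or directory that is
# missing from the source project. Returns non-zero if anything is missing.
check_source_project() {
    local root missing path
    root=$1
    missing=0
    for path in \
        config/genesis.json \
        config/permissions-nodes.toml \
        config/permissions-accounts.toml \
        config/config-validator.toml \
        config/config-sentry.toml \
        config/config-rpc-public.toml \
        keys/validators
    do
        if [ ! -e "$root/$path" ]; then
            echo "missing: $path"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Usage:
#   check_source_project /path/to/smom-dbis-138 && echo "source project looks complete"
```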
### Step 2: Review Configuration
Check your deployment configuration:
```bash
cat config/proxmox.conf
cat config/network.conf
```
Key settings:
- `VALIDATOR_START`, `VALIDATOR_COUNT` - First validator VMID and number of validator containers
- `SENTRY_START`, `SENTRY_COUNT` - First sentry VMID and number of sentry containers
- `RPC_START`, `RPC_COUNT` - First RPC VMID and number of RPC containers
- `CONTAINER_OS_TEMPLATE` - OS template to use
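
To see which VMIDs a given `*_START` / `*_COUNT` pair covers, the range can be expanded by hand. This is a sketch under two assumptions: VMIDs are allocated consecutively from the start value, and the example values below (chosen to match the VMIDs used later in this guide) reflect your `config/proxmox.conf`. `vmid_range` is a hypothetical helper, not one of the deployment scripts.

```bash
# Hypothetical helper: print the VMIDs covered by a START/COUNT pair,
# assuming consecutive allocation starting at START.
vmid_range() {
    local start count i out
    start=$1
    count=$2
    i=0
    out=""
    while [ "$i" -lt "$count" ]; do
        out="${out:+$out }$((start + i))"
        i=$((i + 1))
    done
    echo "$out"
}

# Example values only; read the real ones from config/proxmox.conf.
VALIDATOR_START=1000; VALIDATOR_COUNT=5
SENTRY_START=1500;    SENTRY_COUNT=4
RPC_START=2500;       RPC_COUNT=3

echo "validators: $(vmid_range "$VALIDATOR_START" "$VALIDATOR_COUNT")"
echo "sentries:   $(vmid_range "$SENTRY_START" "$SENTRY_COUNT")"
echo "rpc:        $(vmid_range "$RPC_START" "$RPC_COUNT")"
```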
### Step 3: Run Deployment
Execute the deployment script:
```bash
sudo ./scripts/deployment/deploy-validated-set.sh \
    --source-project /path/to/smom-dbis-138
```
### Step 4: Monitor Progress
The script will output progress for each phase:
```
=========================================
Phase 1: Deploy Containers
=========================================
[INFO] Deploying Besu nodes...
[✓] Besu nodes deployed
=========================================
Phase 2: Copy Configuration Files
=========================================
[INFO] Copying Besu configuration files...
[✓] Configuration files copied
=========================================
Phase 3: Bootstrap Network
=========================================
[INFO] Bootstrapping network...
[INFO] Collecting enodes from validators...
[✓] Network bootstrapped
=========================================
Phase 4: Validate Deployment
=========================================
[INFO] Validating validator set...
[✓] All validators validated successfully!
```
### Step 5: Verify Deployment
After deployment completes, verify everything is working:
```bash
# Check all containers are running
pct list | grep -E "1000|1001|1002|1003|1004|1500|1501|1502|1503|2500|2501|2502"
# Check service status
for vmid in 1000 1001 1002 1003 1004; do
    echo "=== Validator $vmid ==="
    pct exec "$vmid" -- systemctl status besu-validator --no-pager -l
done
# Check consensus is active (blocks being produced)
pct exec 2500 -- curl -s -X POST \
    -H "Content-Type: application/json" \
    -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
    http://localhost:8545 | python3 -m json.tool
```
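`eth_blockNumber` returns the height as a hex string (for example `"0x1a"`), so a single response does not show whether blocks are still being produced. One way to check is to decode the value and compare two samples taken some time apart. The helpers below are a hypothetical sketch: the function names are not part of the project, and the `sed` expression assumes a standard JSON-RPC response containing a `"result"` field.

```bash
# Extract the hex "result" field from an eth_blockNumber JSON-RPC response and
# print it as a decimal block height. Returns non-zero if no result is present.
block_number_from_response() {
    local hex
    hex=$(printf '%s\n' "$1" |
        sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\(0x[0-9a-fA-F]\{1,\}\)".*/\1/p')
    [ -n "$hex" ] || return 1
    printf '%d\n' "$hex"
}

# Query a node's block height from the Proxmox host (same request as above).
query_block_number() {
    local vmid response
    vmid=$1
    response=$(pct exec "$vmid" -- curl -s -X POST \
        -H "Content-Type: application/json" \
        -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
        http://localhost:8545) || return 1
    block_number_from_response "$response"
}

# Usage (the sleep should exceed your network's block period; 10s is a guess):
#   first=$(query_block_number 2500)
#   sleep 10
#   second=$(query_block_number 2500)
#   [ "$second" -gt "$first" ] && echo "blocks are being produced"
```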
## Health Checks
### Check Individual Node Health
```bash
# Human-readable output
sudo ./scripts/health/check-node-health.sh 1000
# JSON output (for automation)
sudo ./scripts/health/check-node-health.sh 1000 --json
```
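To run the same check across several nodes, the script can be wrapped in a loop. This sketch assumes that `check-node-health.sh` exits non-zero when a node is unhealthy, which this guide does not confirm, and that it is run as root from the project directory. `check_all_nodes` and the `HEALTH_CHECK` variable are hypothetical names introduced for illustration.

```bash
# Path to the health-check script; overridable for testing.
HEALTH_CHECK=${HEALTH_CHECK:-./scripts/health/check-node-health.sh}

# Run the health check for each VMID given and summarize the failures.
# ASSUMPTION: the script signals an unhealthy node through its exit status.
check_all_nodes() {
    local vmid failed
    failed=""
    for vmid in "$@"; do
        if ! "$HEALTH_CHECK" "$vmid" >/dev/null 2>&1; then
            failed="${failed:+$failed }$vmid"
        fi
    done
    if [ -n "$failed" ]; then
        echo "unhealthy: $failed"
        return 1
    fi
    echo "all nodes healthy"
}

# Usage:
#   check_all_nodes 1000 1001 1002 1003 1004
```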
### Validate Validator Set
```bash
sudo ./scripts/validation/validate-validator-set.sh
```
This checks:
- Container and service status
- Validator keys exist and are accessible
- Configuration files are present
- Consensus participation
## Troubleshooting
### Containers Won't Start
```bash
# Check container status
pct status <vmid>
# View container console
pct console <vmid>
# Check logs
pct exec <vmid> -- journalctl -xe
```
### Services Won't Start
```bash
# Check service status
pct exec <vmid> -- systemctl status besu-validator
# View service logs
pct exec <vmid> -- journalctl -u besu-validator -f
# Check configuration
pct exec <vmid> -- cat /etc/besu/config-validator.toml
```
### Network Connectivity Issues
```bash
# Check P2P port is listening (use `ss -tuln` if netstat is not installed in the container)
pct exec <vmid> -- netstat -tuln | grep 30303
# Check peer connections (if RPC enabled)
pct exec <vmid> -- curl -s -X POST \
    -H "Content-Type: application/json" \
    -d '{"jsonrpc":"2.0","method":"admin_peers","params":[],"id":1}' \
    http://localhost:8545
# Verify static-nodes.json
pct exec <vmid> -- cat /etc/besu/static-nodes.json
```
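Besu expects `static-nodes.json` to be a JSON array of enode URLs, each with a 128-character hex node ID followed by `@host:port`. A rough sanity check on the file's entries might look like the sketch below. `check_static_nodes` is a hypothetical helper, and its pattern is deliberately simple: it would reject valid URLs that carry extra query parameters such as `?discport=`.

```bash
# Read static-nodes.json content on stdin and verify that every enode entry
# has a 128-hex-character node ID and a host:port suffix.
check_static_nodes() {
    local entries total valid
    entries=$(grep -oE 'enode://[^"]*' || true)
    if [ -z "$entries" ]; then
        echo "no enode entries found"
        return 1
    fi
    total=$(printf '%s\n' "$entries" | grep -c '')
    valid=$(printf '%s\n' "$entries" |
        grep -cE '^enode://[0-9a-fA-F]{128}@[^@]+:[0-9]+$' || true)
    echo "$valid of $total entries look well-formed"
    [ "$valid" -eq "$total" ]
}

# Usage:
#   pct exec <vmid> -- cat /etc/besu/static-nodes.json | check_static_nodes
```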
### Consensus Issues
```bash
# Check validator is participating
pct exec <vmid> -- journalctl -u besu-validator --no-pager | grep -i "consensus\|qbft\|proposing"
# Verify validator keys
pct exec <vmid> -- ls -la /keys/validators/
# Check genesis file
pct exec <vmid> -- cat /etc/besu/genesis.json | python3 -m json.tool
```
## Rollback
If deployment fails, you can remove the containers. Note that `pct destroy` permanently deletes each container and its disks:
```bash
# Remove specific containers
for vmid in 1000 1001 1002 1003 1004 1500 1501 1502 1503 2500 2501 2502; do
    pct stop "$vmid" 2>/dev/null || true
    pct destroy "$vmid" 2>/dev/null || true
done
```
Then re-run the deployment after fixing any issues.
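Because the rollback is destructive, it may be worth previewing it first. The following is a sketch of a dry-run wrapper around the same `pct stop` / `pct destroy` sequence. The `DRY_RUN` variable and both function names are hypothetical, and the wrapper defaults to printing commands rather than running them.

```bash
# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "[dry-run] $*"
    else
        "$@"
    fi
}

# Stop and destroy each VMID given, ignoring errors for containers that
# are already stopped or do not exist.
rollback_containers() {
    local vmid
    for vmid in "$@"; do
        run pct stop "$vmid" 2>/dev/null || true
        run pct destroy "$vmid" 2>/dev/null || true
    done
}

# Usage:
#   rollback_containers 1000 1001 1002 1003 1004   # preview only
#   DRY_RUN=0
#   rollback_containers 1000 1001 1002 1003 1004   # actually remove
```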
## Post-Deployment
After successful deployment:
1. **Monitor Logs**: Keep an eye on service logs for the first few hours
2. **Verify Consensus**: Ensure blocks are being produced
3. **Check Resources**: Monitor CPU, memory, and disk usage
4. **Network Health**: Verify all nodes are connected
5. **Backup**: Consider creating snapshots of working containers
## Next Steps
- Set up monitoring (Prometheus, Grafana)
- Configure backups
- Document node endpoints
- Set up alerting
- Plan for maintenance windows
## Additional Resources
- [Besu Nodes File Reference](BESU_NODES_FILE_REFERENCE.md)
- [Network Bootstrap Guide](NETWORK_BOOTSTRAP_GUIDE.md)
- [Boot Node Runbook](BOOT_NODE_RUNBOOK.md) (if using boot node)
- [Besu Allowlist Runbook](BESU_ALLOWLIST_RUNBOOK.md)