RPC Nodes Block Production — Fix Runbook
Purpose: Fix RPC nodes that do not respond or report block 0 so all RPCs see chain 138 and current block production.
Core Besu RPC (VMID 2101) — quick fix and full runbook
VMID 2101 is the Chain 138 Core RPC (admin/deploy; RPC_URL_138 = http://192.168.11.211:8545). It runs on r630-01 (192.168.11.11).
Health check (run first to see status):
./scripts/maintenance/health-check-rpc-2101.sh
One-command fix (from a host with SSH to r630-01):
./scripts/maintenance/fix-core-rpc-2101.sh
Options: --dry-run (print actions only); --restart-only (skip starting the container; only restart Besu inside CT). If Besu fails with JNA/NoClassDefFoundError, run ./scripts/maintenance/fix-rpc-2101-jna-reinstall.sh then re-run the fix script.
Manual steps (if script cannot be used):
- SSH to r630-01:
  ssh root@192.168.11.11
- Start container if stopped:
  pct start 2101
- Inside CT, start/restart Besu:
  pct exec 2101 -- systemctl restart besu-rpc
  or
  pct exec 2101 -- systemctl start besu
- Wait 30–60s, then verify:
  curl -s -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' http://192.168.11.211:8545/
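The "wait 30–60s, then verify" step can also be scripted as a polling loop instead of a fixed sleep. The following is a minimal sketch, assuming a shell with curl on the client; the function names retry_until and rpc_chain_id_ok are illustrative and are not part of the repository scripts.

```shell
# retry_until TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts. Returns 0 on success, 1 otherwise.
retry_until() {
  local tries="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$tries" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$i" -lt "$tries" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# rpc_chain_id_ok URL: succeed only if eth_chainId returns 0x8a (chain 138).
rpc_chain_id_ok() {
  curl -s -m 3 -X POST -H "Content-Type: application/json" \
    -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' "$1" \
    | grep -q '"result" *: *"0x8a"'
}

# Example (not executed here): poll for up to about 60s in 5s steps.
# retry_until 12 5 rpc_chain_id_ok http://192.168.11.211:8545/ && echo "RPC is up"
```

Polling returns as soon as Besu answers, and a non-zero exit after the last attempt gives a calling script a clear signal to move on to the troubleshooting steps.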
Ping works but curl to :8545 doesn’t?
If you can ping the Core RPC IP (e.g. 192.168.11.211) but curl to http://192.168.11.211:8545 returns nothing or fails:
- Use POST with JSON-RPC: Besu’s HTTP RPC expects a POST with a JSON body. A plain curl http://...:8545 (GET) often returns nothing or an empty body. Always test with:
  curl -s -X POST -H "Content-Type: application/json" \
    -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' \
    http://192.168.11.211:8545
  Expect something like {"jsonrpc":"2.0","id":1,"result":"0x8a"} for chain 138.
- Check Besu is running on the RPC host (VMID 2101): from the Proxmox host (e.g. r630-01), run
  pct exec 2101 -- systemctl status besu-rpc
  If it’s down, start/restart as in the manual steps above.
- Check firewall: ensure TCP port 8545 is allowed from the client to the RPC host. From the client:
  nc -zv 192.168.11.211 8545
  (or telnet 192.168.11.211 8545). If it doesn’t connect, a firewall (host or network) is likely blocking.
- Config binding: the Besu config should have rpc-http-host="0.0.0.0" and rpc-http-port=8545 so it listens on all interfaces. If you changed it to 127.0.0.1, remote curl will not reach it.
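The checks above can be condensed into a small triage helper that maps the raw reply body to the most likely cause. This is a sketch only: classify_rpc_reply is a hypothetical name, and the matching assumes the compact JSON layout shown in the expected reply above.

```shell
# classify_rpc_reply REPLY: map the body returned by an eth_chainId POST
# to a short diagnosis. REPLY is the raw string captured from curl.
classify_rpc_reply() {
  local reply="$1"
  case "$reply" in
    "")
      # Empty body: service down, port blocked, or Besu bound to 127.0.0.1.
      echo "no-response: check besu-rpc service, firewall on 8545, rpc-http-host" ;;
    *"Host not authorized"*)
      echo "host-not-authorized: check host-allowlist in the Besu config" ;;
    *'"result":"0x8a"'*)
      echo "ok: chain 138" ;;
    *'"result"'*)
      echo "wrong-chain: eth_chainId did not return 0x8a" ;;
    *)
      echo "unexpected: $reply" ;;
  esac
}

# Example (not executed here):
# reply=$(curl -s -m 3 -X POST -H "Content-Type: application/json" \
#   -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' \
#   http://192.168.11.211:8545)
# classify_rpc_reply "$reply"
```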
VMID 2101: All steps to fix and prevent
- Check container:
  ssh root@192.168.11.11 "pct status 2101"
  If not running, pct start 2101.
- Check disk:
  pct exec 2101 -- df -h /
  pct exec 2101 -- du -sh /data/besu
  If near full, free space or resize the CT disk (see Common fix #8).
- Check Besu service:
  pct exec 2101 -- systemctl status besu-rpc
  If failing:
  pct exec 2101 -- journalctl -u besu-rpc -n 80 --no-pager
- Fix "No space left": free space or resize the LV, then systemctl restart besu-rpc (see #8).
- Fix JNA / NoClassDefFoundError: run
  ./scripts/maintenance/fix-rpc-2101-jna-reinstall.sh
  (reinstalls Besu in CT), then
  ./scripts/maintenance/fix-core-rpc-2101.sh
  (see Common fix #9).
- Verify RPC:
  curl -s -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' http://192.168.11.211:8545/
  Expect "result":"0x8a".
- Prevent recurrence: run ./scripts/maintenance/check-disk-all-vmids.sh and ./scripts/storage-monitor.sh regularly; alert when root or /data/besu usage > 85%.
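The 85% alert rule can be illustrated with a short threshold check. This is a sketch, not the logic of check-disk-all-vmids.sh or storage-monitor.sh: usage_percent and check_usage are hypothetical helpers, and the parsing assumes the POSIX df -P layout, where the fifth column is the capacity percentage.

```shell
# usage_percent: read `df -P <path>` output on stdin and print the
# capacity column of the first data row as a bare integer (e.g. 91).
usage_percent() {
  awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# check_usage LABEL PERCENT [THRESHOLD]: print ALERT and return 1 when
# PERCENT is above THRESHOLD (default 85); otherwise print ok.
check_usage() {
  local label="$1" pct="$2" threshold="${3:-85}"
  if [ "$pct" -gt "$threshold" ]; then
    echo "ALERT $label ${pct}% > ${threshold}%"
    return 1
  fi
  echo "ok $label ${pct}%"
}

# Example on the Proxmox host (not executed here):
# check_usage "2101 /" "$(pct exec 2101 -- df -P / | usage_percent)"
# check_usage "2101 /data/besu" "$(pct exec 2101 -- df -P /data/besu | usage_percent)"
```

Because check_usage returns a non-zero status on alert, it can be chained into a cron job or monitoring wrapper that acts on the exit code.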
See also: 502_DEEP_DIVE_ROOT_CAUSES_AND_FIXES.md (rpc-http-prv).
Quick status check
# From project root; requires curl and network to 192.168.11.x
for entry in 2101:192.168.11.211 2102:192.168.11.212 2201:192.168.11.221 2301:192.168.11.232 2303:192.168.11.233 2304:192.168.11.234 2305:192.168.11.235 2306:192.168.11.236 2400:192.168.11.240 2401:192.168.11.241 2402:192.168.11.242 2403:192.168.11.243 2500:192.168.11.172 2501:192.168.11.173 2502:192.168.11.174 2503:192.168.11.246 2504:192.168.11.247 2505:192.168.11.248; do
vmid="${entry%%:*}"; ip="${entry#*:}"
r=$(curl -s -m 3 -X POST -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' -H "Content-Type: application/json" "http://${ip}:8545" 2>/dev/null)
echo "$vmid $ip: ${r:-no response}"
done
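The loop above prints raw JSON, in which the block number is hex encoded. A small decoder makes "block 0" and stalled nodes easier to spot. This is a sketch; block_height and describe_reply are illustrative names, and the sed pattern assumes the reply carries a hex "result" field as Besu returns for eth_blockNumber.

```shell
# block_height REPLY: print the decimal block number from an eth_blockNumber
# reply, or print nothing if the reply has no hex "result" field.
block_height() {
  local hex
  hex=$(printf '%s' "$1" | sed -n 's/.*"result" *: *"0x\([0-9a-fA-F][0-9a-fA-F]*\)".*/\1/p')
  if [ -n "$hex" ]; then
    printf '%d\n' "0x$hex"
  fi
}

# describe_reply REPLY: short human-readable status for one node.
describe_reply() {
  local h
  h=$(block_height "$1")
  if [ -z "$h" ]; then
    echo "no-block-number"
  elif [ "$h" -eq 0 ]; then
    echo "block 0 (syncing or stuck)"
  else
    echo "block $h"
  fi
}

# Example inside the loop above (not executed here):
# echo "$vmid $ip: $(describe_reply "$r")"
```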
Fixes applied (2026-02-09)
| VMID | Issue | Fix |
|---|---|---|
| 2102 | "Host not authorized" | Added host-allowlist=["*"] to /etc/besu/config-rpc.toml, restarted besu-rpc.service. |
| 2201 | Unknown option tx-pool-min-score | Removed line from config, restarted besu-rpc.service. |
| 2303 | Wrong permissions path + tx-pool-min-score | Set permissions-nodes-config-file="/etc/besu/permissions-nodes.toml", removed tx-pool-min-score, restarted. |
| 2301 | Block 0 (syncing) | No config change; node is syncing. Wait or check peers. |
| 2401 | discovery + allowlist + paths | Set static-nodes/permissions/genesis to /etc/besu/, discovery-enabled=false. Still failing: genesis mismatch with existing /data/besu — either restore original genesis or resync (clear /data/besu and restart). |
| 2500–2505 | besu.service: /opt/besu/bin/besu missing or config errors | 2500: Installed Besu 23.10.3 to /opt, fixed config (removed qbft-enabled, log-destination, rpc-http-api-enable-unsafe-txsigning, fast-sync-min-peers, PERSONAL/MINER from API). Still failing: "Supplied file does not contain valid keyPair" (nodekey). 2501–2505: Same pattern — ensure /opt/besu/bin/besu exists (run fix-besu-installation.sh or install tarball), fix config.toml for Besu 23.10, ensure genesis.json and valid nodekey. |
Common fixes
1. Host not authorized (RPC returns JSON "Host not authorized")
Add to the node’s Besu TOML config (e.g. /etc/besu/config-rpc.toml):
host-allowlist=["*"]
Then: systemctl restart besu-rpc.service (or besu.service).
2. Unknown option tx-pool-min-score
Remove the line from the config (not supported in some Besu versions):
pct exec VMID -- sed -i '/tx-pool-min-score/d' /etc/besu/*.toml
pct exec VMID -- systemctl restart besu-rpc.service
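The sed one-liner above deletes every line that mentions the option, including comments, and keeps no copy of the original. A slightly more cautious variant is sketched below; remove_toml_option is a hypothetical helper, and it assumes the option is set with key=value at the start of a line, as in the config examples in this runbook.

```shell
# remove_toml_option FILE OPTION: if FILE sets OPTION, save a copy as
# FILE.bak and rewrite FILE without the lines that set OPTION.
remove_toml_option() {
  local file="$1" option="$2"
  local pattern="^[[:space:]]*${option}[[:space:]]*="
  # Nothing to do (and no backup written) if the option is absent.
  grep -q "$pattern" "$file" || return 0
  cp -p "$file" "${file}.bak"
  # Read from the backup and write in place so the file keeps its owner and mode.
  sed "/${pattern}/d" "${file}.bak" > "$file"
}

# Example inside the container, e.g. after `pct enter VMID` (not executed here):
# remove_toml_option /etc/besu/config-rpc.toml tx-pool-min-score
# systemctl restart besu-rpc.service
```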
3. Wrong permissions or static-nodes path
Ensure config uses /etc/besu/:
permissions-nodes-config-file="/etc/besu/permissions-nodes.toml"
static-nodes-file="/etc/besu/static-nodes.json"
genesis-file="/etc/besu/genesis.json"
Redeploy canonical lists: bash scripts/deploy-besu-node-lists-to-all.sh.
4. Discovery vs allowlist
If you see "Specified node(s) not in nodes-allowlist", either add those enodes to permissions-nodes.toml and redeploy, or set discovery-enabled=false so the node only uses static-nodes (all of which must be in the allowlist).
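Before restarting, it can help to confirm that every static node really is in the allowlist. The sketch below compares the enode URLs in the two files as exact strings; list_enodes and missing_from_allowlist are illustrative names, and the approach assumes both files quote each enode URL in double quotes.

```shell
# list_enodes FILE: print each enode URL found in FILE, sorted and de-duplicated.
list_enodes() {
  grep -o 'enode://[^"]*' "$1" | sort -u
}

# missing_from_allowlist STATIC_NODES PERMISSIONS: print the enodes that
# appear in STATIC_NODES but not in PERMISSIONS. Empty output means all match.
missing_from_allowlist() {
  local static_tmp allow_tmp
  static_tmp=$(mktemp)
  allow_tmp=$(mktemp)
  list_enodes "$1" > "$static_tmp"
  list_enodes "$2" > "$allow_tmp"
  # comm -23 keeps lines unique to the first (sorted) input.
  comm -23 "$static_tmp" "$allow_tmp"
  rm -f "$static_tmp" "$allow_tmp"
}

# Example inside the container (not executed here):
# missing_from_allowlist /etc/besu/static-nodes.json /etc/besu/permissions-nodes.toml
```

Because the comparison is textual, two URLs for the same node that differ in host or port are reported as missing, which is usually worth reviewing anyway.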
5. Besu binary missing (/opt/besu/bin/besu)
On containers that lack Besu (1505–1508 sentries, 2501–2505 RPCs):
- Permanent install (recommended):
  bash scripts/besu/install-besu-permanent-on-missing-nodes.sh
  Installs Besu 23.10.3 in each CT (download inside container), deploys config/genesis/node lists, enables and starts the service. Sentries get besu-sentry.service, RPCs get besu.service + config.toml. Allow ~5–10 minutes per node (first run installs Java + Besu). Use --dry-run to see which VMIDs would be updated.
- Legacy (tarball already in CT):
  scripts/fix-besu-installation.sh
  (expects tarball in each container /opt).
6. Genesis mismatch ("Supplied genesis block does not match chain data")
Either:
- Restore the original genesis file that matches existing /data/besu, or
- Resync from block 0: back up then remove /data/besu (or use a new data-path), set correct genesis, restart.
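For the resync option, renaming the data directory is a cheap way to "back up then remove" in one step. The following is a sketch under several assumptions: set_aside_data_dir is a hypothetical helper, Besu is already stopped, the data path is a plain directory rather than a mount point, and GNU coreutils are available inside the container (for --reference).

```shell
# set_aside_data_dir DIR: rename DIR to DIR.bak-<timestamp>, recreate DIR
# empty with the same owner and mode, and print the backup path.
set_aside_data_dir() {
  local dir="${1%/}" backup
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  backup="${dir}.bak-$(date +%Y%m%d-%H%M%S)"
  # A rename on the same filesystem is fast and needs no extra space.
  mv "$dir" "$backup"
  mkdir "$dir"
  chmod --reference="$backup" "$dir"
  chown --reference="$backup" "$dir"
  echo "$backup"
}

# Example inside the container (not executed here):
# systemctl stop besu-rpc
# set_aside_data_dir /data/besu
# systemctl start besu-rpc
```

If the node key lives under the data path (this runbook mentions /data/besu/nodekey), copy it back from the backup before starting Besu, otherwise the node is likely to come up with a new enode identity that is missing from the allowlist. Delete the backup only after the resync is confirmed, since it still occupies disk space.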
7. Invalid keyPair / nodekey
Ensure the node has a valid nodekey (e.g. /data/besu/nodekey). If the config references a key file, fix the path or regenerate (Besu can create nodekey on first run if data-path is empty).
8. No space left on device (RocksDB in /data/besu)
If Besu fails with RocksDBException: ... No space left on device for a file under /data/besu/database/:
- Immediate: free space by removing old logs, temporary files, or snapshots inside the CT; or, from the Proxmox host, resize the CT’s disk (e.g. pct resize 2101 rootfs +20G if the storage allows).
- Inside CT: check usage with df -h / and du -sh /data/besu. Clear caches or old chain data only if you accept a resync, e.g. rm -rf /data/besu/caches/* (Besu will recreate them). Do not delete /data/besu/database unless you intend to resync from genesis.
- Then: systemctl restart besu-rpc (or besu.service).
9. JNA / NoClassDefFoundError (Besu fails to start)
If journalctl -u besu-rpc shows NoClassDefFoundError: Could not initialize class com.sun.jna.Native or "Did not find Udev class from JNA":
- Usually a Java/Besu or classpath issue (conflicting JNA, or broken install). Options: reinstall Besu in the CT (same or supported version), or ensure a single consistent JNA on the classpath; upgrade Java if it’s outdated.
- On VMID 2101: run
  ./scripts/maintenance/fix-rpc-2101-jna-reinstall.sh
  (reinstalls Besu in the CT to fix JNA), then
  ./scripts/maintenance/fix-core-rpc-2101.sh
  Or reinstall Besu manually per BESU_PATH_REFERENCE.md and restart the service.
Scripts
- Health check VMID 2101:
  scripts/maintenance/health-check-rpc-2101.sh
  Checks container, besu-rpc service, port 8545, eth_chainId, eth_blockNumber. Run from LAN.
- Fix Core RPC 2101:
  scripts/maintenance/fix-core-rpc-2101.sh
  Starts CT if needed, restarts Besu, verifies RPC.
- Fix 2101 JNA (reinstall Besu):
  scripts/maintenance/fix-rpc-2101-jna-reinstall.sh [--dry-run]
  Use when Besu fails with NoClassDefFoundError/JNA; then re-run fix-core-rpc-2101.sh.
- Check disk in all VMIDs:
  scripts/maintenance/check-disk-all-vmids.sh [--csv]
  Root filesystem usage for every running container on ml110, r630-01, r630-02. Use for prevention and audits.
- Host-level storage:
  scripts/storage-monitor.sh
  Proxmox storage and volume groups; alerts at 80%/90%.
- Deploy node lists to all:
  scripts/deploy-besu-node-lists-to-all.sh
- Verify lists on all:
  scripts/verify/verify-static-permissions-on-all-besu-nodes.sh --checksum
- Restart Besu on all:
  scripts/besu/restart-besu-reload-node-lists.sh
- Install Besu permanently on nodes missing it (1505–1508, 2501–2505):
  scripts/besu/install-besu-permanent-on-missing-nodes.sh
  (no tarball needed; downloads inside each CT).
- Fix Besu install when tarball already in CT:
  scripts/fix-besu-installation.sh
Reference
- RPC IPs and VMIDs:
  config/ip-addresses.conf, docs/06-besu/BESU_NODES_FILE_REFERENCE.md
- Canonical node lists:
  config/besu-node-lists/