# Concrete Next Steps: RPC 2101 and Storage (thin5 / data)
Last updated: 2026-02-28
## 1. VMID 2101 (Core RPC) — RPC not responding
Symptom: the container is running and the besu-rpc service is active, but JSON-RPC calls (e.g. eth_blockNumber) to 192.168.11.211:8545 get no response.
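Before reaching for the scripts, the failure can be confirmed by hand. A minimal probe sketch (endpoint from the symptom above; the payload is the standard Ethereum JSON-RPC eth_blockNumber call):

```shell
#!/usr/bin/env bash
# Probe sketch; IP/port are taken from this runbook, adjust if your LAN differs.
RPC=http://192.168.11.211:8545
PAYLOAD='{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'
# On the LAN, uncomment (a 5s timeout makes a dead node fail fast):
# curl -s -m 5 -H 'Content-Type: application/json' -d "$PAYLOAD" "$RPC"
# A healthy node answers e.g. {"jsonrpc":"2.0","id":1,"result":"0x1b4"}.
hex_to_dec() { printf '%d\n' "$(( $1 ))"; }  # hex result -> decimal block number
hex_to_dec 0x1b4   # -> 436
```

No response (curl timeout) at this step is exactly the symptom the run order below is meant to fix.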
Run order (from project root, on LAN with SSH to r630-01)
| Step | Action | Command |
|---|---|---|
| 1 | Diagnose | bash scripts/maintenance/health-check-rpc-2101.sh |
| 2a | If read-only / database not writable | bash scripts/maintenance/make-rpc-vmids-writable-via-ssh.sh (then re-run step 1) |
| 2b | If JNA / NoClassDefFoundError in logs | bash scripts/maintenance/fix-rpc-2101-jna-reinstall.sh (then step 3) |
| 3 | Fix (start CT if needed, restart Besu, verify) | bash scripts/maintenance/fix-core-rpc-2101.sh |
| 4 | Verify | bash scripts/health/check-rpc-vms-health.sh — 2101 should show block number |
Optional: fix-core-rpc-2101.sh --restart-only if the container is already running and you only want to restart the Besu service.
Docs: docs/09-troubleshooting/RPC_NODES_BLOCK_PRODUCTION_FIX.md, docs/03-deployment/RPC_2101_READONLY_FIX.md (if present).
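Step 2a exists because the volume can come back read-only; the underlying test is simply whether the filesystem accepts writes. A sketch (pct exec is the standard Proxmox CLI; using /data for 2101 is an assumption based on the du example later in this doc):

```shell
rw_check() {  # report whether a directory currently accepts writes
  if touch "$1/.rw-probe" 2>/dev/null && rm -f "$1/.rw-probe"; then
    echo "writable"
  else
    echo "read-only"
  fi
}
# Same idea inside the container, run from the Proxmox host:
#   pct exec 2101 -- sh -c 'touch /data/.rw-probe && rm /data/.rw-probe'
rw_check /tmp
```

If the probe reports read-only, run step 2a before any restart; restarting Besu on a read-only database will not help.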
## 2. r630-02 thin5 — 84.6% used (monitor / reduce)
Risk: thin5 is approaching the 85% WARN threshold; LVM thin pools can become slow or fail above ~90%.
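The WARN condition above is a plain threshold check; a sketch of it (the lvs feed in the comment assumes standard LVM tooling on the host, IP from this doc):

```shell
warn_if_full() {            # warn_if_full <pool> <used_percent>
  local name=$1 pct=${2%.*} # drop decimals for the integer comparison
  if [ "$pct" -ge 85 ]; then
    echo "WARN: $name at $2%"
  else
    echo "OK: $name at $2%"
  fi
}
# Feed it live numbers with e.g.:
#   ssh root@192.168.11.12 "lvs --noheadings -o lv_name,data_percent"
warn_if_full thin5 84.6   # -> OK: thin5 at 84.6%
```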
### Immediate
| Step | Action | Command / notes |
|---|---|---|
| 1 | See which containers use thin5 | On r630-02: `ssh root@192.168.11.12 'pct list; for v in $(pct list 2>/dev/null \| …` (command truncated) |
| 2 | Check disk usage inside those CTs | bash scripts/maintenance/check-disk-all-vmids.sh — find VMIDs on r630-02 with high % |
| 3 | Free space inside CTs (Besu/DB, logs) | Per VMID: pct exec <vmid> -- du -sh /data /var/log 2>/dev/null; prune logs, old snapshots, or Besu temp if safe |
| 4 | Optional: migrate one CT to another thin | If thin5 stays high: backup CT, restore to thin2/thin3/thin4/thin6 (e.g. pct restore <vmid> /path/to/dump --storage thin2) |
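Step 1's command is cut off above; a standalone sketch of the same idea, assuming the standard pct CLI and the usual `rootfs: <storage>:<volume>` config syntax (the VMID in the sample line is made up):

```shell
uses_storage() { grep -q "$1:"; }  # reads one pct config on stdin
# On r630-02:
#   for v in $(pct list | awk 'NR>1 {print $1}'); do
#     pct config "$v" | uses_storage thin5 && echo "CT $v uses thin5"
#   done
# Offline check of the filter against a typical config line:
echo "rootfs: thin5:vm-2200-disk-0,size=400G" | uses_storage thin5 && echo match
```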
### Ongoing
| Step | Action | Command / notes |
|---|---|---|
| 5 | Track growth | bash scripts/monitoring/collect-storage-growth-data.sh --append (or install cron: bash scripts/maintenance/schedule-storage-growth-cron.sh --install) |
| 6 | Prune old snapshots (on host) | bash scripts/monitoring/prune-storage-snapshots.sh (weekly; keeps last 30 days) |
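The 30-day retention mentioned for prune-storage-snapshots.sh reduces to an mtime filter; a self-contained demo of the mechanism only (the real script's paths and flags may differ):

```shell
demo=$(mktemp -d)                        # stand-in for the snapshot directory
touch -d '40 days ago' "$demo/old.snap"  # past the 30-day window
touch "$demo/new.snap"                   # inside the window
# Files older than 30 days are the prune candidates:
find "$demo" -type f -mtime +30 -printf '%f\n'   # prints old.snap
rm -rf "$demo"
```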
## 3. r630-01 data / local-lvm — 71.9% used (monitor)
Risk: still healthy, but keep monitoring so usage does not reach the 85%+ WARN range.
### Immediate
| Step | Action | Command / notes |
|---|---|---|
| 1 | Snapshot + growth check | bash scripts/monitoring/collect-storage-growth-data.sh — review logs/storage-growth/ |
| 2 | Identify large CTs on r630-01 | bash scripts/maintenance/check-disk-all-vmids.sh — ml110 + r630-01; VMIDs 2101, 2500–2505 are on r630-01 |
### Ongoing
| Step | Action | Command / notes |
|---|---|---|
| 3 | Same as thin5 | Use schedule-storage-growth-cron.sh --install for weekly collection + prune |
| 4 | Before new deployments | Re-run bash scripts/audit-proxmox-rpc-storage.sh and check data% / local-lvm% |
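Step 4's percentage check can be scripted; the column layout below mirrors typical `pvesm status` output (Name Type Status Total Used Available %), which is an assumption to verify on your hosts:

```shell
pvesm_pct() { awk -v p="$1" '$1 == p { print $NF }'; }  # usage% for one storage
# Live (on r630-01):  pvesm status | pvesm_pct data
# Offline example using the 71.9% figure from this section:
echo "data lvmthin active 1000000000 719000000 281000000 71.90%" | pvesm_pct data
```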
## Quick reference

| Item | Script | Purpose |
|---|---|---|
| 2101 health | scripts/maintenance/health-check-rpc-2101.sh | Diagnose Core RPC |
| 2101 fix | scripts/maintenance/fix-core-rpc-2101.sh | Restart Besu, verify RPC |
| 2101 read-only | scripts/maintenance/make-rpc-vmids-writable-via-ssh.sh | e2fsck RPC VMIDs on r630-01 |
| 2101 JNA | scripts/maintenance/fix-rpc-2101-jna-reinstall.sh | Reinstall Besu in 2101 |
| Storage audit | scripts/audit-proxmox-rpc-storage.sh | All hosts + RPC rootfs mapping |
| Disk in CTs | scripts/maintenance/check-disk-all-vmids.sh | Root / usage per running CT |
| Storage growth | scripts/monitoring/collect-storage-growth-data.sh | Snapshot pvesm/lvs/df |
| Growth cron | scripts/maintenance/schedule-storage-growth-cron.sh --install | Weekly collect + prune |