# Concrete Next Steps: RPC 2101 and Storage (thin5 / data)

Last updated: 2026-02-28


## 1. VMID 2101 (Core RPC): RPC not responding

**Symptom:** The container is running and `besu-rpc` is active, but the RPC endpoint at 192.168.11.211:8545 (e.g. `eth_blockNumber`) returns no response.
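Before running the scripts below, a quick manual probe can confirm the symptom. This is a minimal sketch using the endpoint from this doc; `eth_blockNumber` is standard Ethereum JSON-RPC and returns a hex quantity, which the last lines convert to decimal.

```shell
# Manual probe of the Core RPC endpoint (IP/port from this doc).
# eth_blockNumber returns a hex quantity like "0x1a2b"; convert it for readability.
RPC_URL="http://192.168.11.211:8545"
resp=$(curl -s -m 5 -X POST "$RPC_URL" \
  -H 'Content-Type: application/json' \
  --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}')
hex=$(printf '%s' "$resp" | sed -n 's/.*"result":"0x\([0-9a-fA-F]*\)".*/\1/p')
if [ -n "$hex" ]; then
  echo "block: $((16#$hex))"
else
  echo "no response from $RPC_URL"
fi
```

"no response" here is exactly the failure mode this section addresses; a healthy node prints a growing block number.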

### Run order (from the project root, on the LAN, with SSH access to r630-01)

| Step | Action | Command |
|------|--------|---------|
| 1 | Diagnose | `bash scripts/maintenance/health-check-rpc-2101.sh` |
| 2a | If read-only / database not writable | `bash scripts/maintenance/make-rpc-vmids-writable-via-ssh.sh` (then re-run step 1) |
| 2b | If JNA / `NoClassDefFoundError` in logs | `bash scripts/maintenance/fix-rpc-2101-jna-reinstall.sh` (then step 3) |
| 3 | Fix (start CT if needed, restart Besu, verify) | `bash scripts/maintenance/fix-core-rpc-2101.sh` |
| 4 | Verify | `bash scripts/health/check-rpc-vms-health.sh` (2101 should show a block number) |

Optional: run `bash scripts/maintenance/fix-core-rpc-2101.sh --restart-only` if the container is already running and you only need to restart the Besu service.

Docs: `docs/09-troubleshooting/RPC_NODES_BLOCK_PRODUCTION_FIX.md`, `docs/03-deployment/RPC_2101_READONLY_FIX.md` (if present).


## 2. r630-02 thin5: 84.6% used (monitor / reduce)

**Risk:** thin5 is approaching the 85% WARN threshold; LVM thin pools can slow down or start failing writes above roughly 90% data usage.
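A one-liner check of the pool's fill level can be run from any LAN host. This is a sketch assuming the r630-02 IP from this doc; `lvs --select` narrows the output to the thin5 LV, and the `awk` stage applies the 85% WARN threshold mentioned above.

```shell
# Sketch: report thin5 data/metadata fill and flag the 85% WARN threshold
# (host IP and pool name taken from this doc; adjust --select for other pools).
ssh root@192.168.11.12 \
  "lvs --noheadings -o lv_name,data_percent,metadata_percent --select 'lv_name=thin5'" |
  awk '$2+0 >= 85 {print $1": "$2"% data (WARN)"; next} {print $1": "$2"% data (ok)"}'
```

`metadata_percent` is worth watching too: a thin pool whose metadata fills up fails even with data space left.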

### Immediate

| Step | Action | Command / notes |
|------|--------|-----------------|
| 1 | See which containers use thin5 | On r630-02: `ssh root@192.168.11.12 'pct list; for v in $(pct list 2>/dev/null …` |
| 2 | Check disk usage inside those CTs | `bash scripts/maintenance/check-disk-all-vmids.sh`; find VMIDs on r630-02 with high usage |
| 3 | Free space inside CTs (Besu/DB, logs) | Per VMID: `pct exec <vmid> -- du -sh /data /var/log 2>/dev/null`; prune logs, old snapshots, or Besu temp files if safe |
| 4 | Optional: migrate one CT to another thin pool | If thin5 stays high: back up the CT, then restore to thin2/thin3/thin4/thin6 (e.g. `pct restore <vmid> /path/to/dump --storage thin2`) |
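Step 3 above can be looped over every running container on the host. A minimal sketch, run on the PVE host itself; it assumes the standard `pct list` output where column 2 is the container status:

```shell
# Per-container usage of /data and /var/log, run on the PVE host.
# `pct list` prints a header row; column 2 is the container status,
# and `pct exec` only works for running containers.
for vmid in $(pct list | awk 'NR>1 && $2=="running" {print $1}'); do
  echo "== CT $vmid =="
  pct exec "$vmid" -- du -sh /data /var/log 2>/dev/null
done
```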

### Ongoing

| Step | Action | Command / notes |
|------|--------|-----------------|
| 5 | Track growth | `bash scripts/monitoring/collect-storage-growth-data.sh --append` (or install a cron job: `bash scripts/maintenance/schedule-storage-growth-cron.sh --install`) |
| 6 | Prune old snapshots (on the host) | `bash scripts/monitoring/prune-storage-snapshots.sh` (weekly; keeps the last 30 days) |
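For reference, the schedule installed by the helper might look like the following crontab fragment. The times and absolute paths here are assumptions for illustration, not necessarily what `schedule-storage-growth-cron.sh --install` actually writes:

```shell
# Hypothetical crontab entries (paths and times assumed; the --install helper
# may use different values): weekly growth snapshot, then prune 30 min later.
0 3 * * 0  cd /root/proxmox && bash scripts/monitoring/collect-storage-growth-data.sh --append
30 3 * * 0 cd /root/proxmox && bash scripts/monitoring/prune-storage-snapshots.sh
```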

## 3. r630-01 data / local-lvm: 71.9% used (monitor)

**Risk:** Still healthy; monitor so usage stays below the 85% WARN threshold.

### Immediate

| Step | Action | Command / notes |
|------|--------|-----------------|
| 1 | Snapshot + growth check | `bash scripts/monitoring/collect-storage-growth-data.sh`; review `logs/storage-growth/` |
| 2 | Identify large CTs on r630-01 | `bash scripts/maintenance/check-disk-all-vmids.sh` (covers ml110 + r630-01); VMIDs 2101 and 2500–2505 are on r630-01 |

### Ongoing

| Step | Action | Command / notes |
|------|--------|-----------------|
| 3 | Same as thin5 | Use `bash scripts/maintenance/schedule-storage-growth-cron.sh --install` for weekly collection + prune |
| 4 | Before new deployments | Re-run `bash scripts/audit-proxmox-rpc-storage.sh` and check data% / local-lvm% |
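The data% / local-lvm% check in step 4 can also be done directly on the host from `pvesm status`. A sketch assuming the usual column layout (Name Type Status Total Used Available %, with Total/Used in KiB); if your PVE version prints different columns, adjust the field indexes:

```shell
# Usage % per storage pool, computed from `pvesm status` columns 4 (Total)
# and 5 (Used); skips the header row and pools reporting zero total.
pvesm status | awk 'NR>1 && $4+0 > 0 {printf "%s %.1f%%\n", $1, 100*$5/$4}'
```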

## Quick reference

| Item | Script | Purpose |
|------|--------|---------|
| 2101 health | `scripts/maintenance/health-check-rpc-2101.sh` | Diagnose Core RPC |
| 2101 fix | `scripts/maintenance/fix-core-rpc-2101.sh` | Restart Besu, verify RPC |
| 2101 read-only | `scripts/maintenance/make-rpc-vmids-writable-via-ssh.sh` | `e2fsck` RPC VMIDs on r630-01 |
| 2101 JNA | `scripts/maintenance/fix-rpc-2101-jna-reinstall.sh` | Reinstall Besu in 2101 |
| Storage audit | `scripts/audit-proxmox-rpc-storage.sh` | All hosts + RPC rootfs mapping |
| Disk in CTs | `scripts/maintenance/check-disk-all-vmids.sh` | Root / usage per running CT |
| Storage growth | `scripts/monitoring/collect-storage-growth-data.sh` | Snapshot pvesm/lvs/df |
| Growth cron | `scripts/maintenance/schedule-storage-growth-cron.sh --install` | Weekly collect + prune |