VMID 2101 (Core RPC) — Changes and Why Failures Continue
Purpose: List all changes (including write/lock-related) made to VMID 2101 and what is causing the Core RPC to keep failing.
1. Changes made to VMID 2101
1.1 Make-writable (e2fsck) — make-rpc-vmids-writable-via-ssh.sh
| Action | What it does | Effect on 2101 |
|---|---|---|
| `pct stop 2101` | Stops the container | Unmounts rootfs; LV may stay active or be deactivated by Proxmox. |
| `lvchange -ay /dev/pve/vm-2101-disk-0` | Activates the LV | Lets e2fsck run on the block device. |
| `e2fsck -f -y /dev/pve/vm-2101-disk-0` | Full fsck, non-interactive | Reports FILE SYSTEM WAS MODIFIED: fixes ext4 errors, journal, inodes. Clears the condition that caused the kernel to remount root read-only. No “write lock” is added; this allows the fs to be mounted read-write again. |
| `lvchange -an /dev/pve/vm-2101-disk-0` | Deactivates the LV | LV is taken offline before start. On some setups this could be problematic if `pct start` does not reliably re-activate the LV before mount. |
| `pct start 2101` | Starts the container | Rootfs is mounted (typically rw after a successful e2fsck). |
Evidence (from run): Logs showed FILE SYSTEM WAS MODIFIED and “e2fsck done for 2101”, then “VMID 2101 writable”. So the filesystem was corrected and the CT was brought back up.
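The "FILE SYSTEM WAS MODIFIED" message typically corresponds to a non-zero but successful e2fsck exit status, so a script should not treat every non-zero status as a failure. The following is a minimal sketch of how the status could be classified, based on the exit-code bitmask documented in e2fsck(8). The helper name `fsck_outcome` is hypothetical and is not taken from `make-rpc-vmids-writable-via-ssh.sh`.

```shell
# Sketch: classify an e2fsck exit status (bitmask per e2fsck(8)).
# fsck_outcome is a hypothetical helper, not part of the original script.
fsck_outcome() {
    local rc=$1
    if [ "$rc" -eq 0 ]; then
        # 0 = no errors found
        echo "clean"
    elif [ "$rc" -lt 4 ]; then
        # 1 = errors corrected, 2 = corrected and reboot advised, 3 = both
        echo "repaired"
    else
        # 4 = errors left uncorrected, 8 = operational error, 16 = usage error, ...
        echo "failed"
    fi
}

# Possible use on the Proxmox host (commands as listed in the table above):
#   e2fsck -f -y /dev/pve/vm-2101-disk-0
#   rc=$?
#   [ "$(fsck_outcome "$rc")" = "failed" ] && echo "do not start 2101 yet" >&2
```

With this classification, a run that modified the filesystem reports `repaired`, and only statuses of 4 or higher block the subsequent `pct start`.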
1.2 Fix 2101 JNA reinstall — fix-rpc-2101-jna-reinstall.sh
| Action | What it does | Effect on 2101 |
|---|---|---|
| Writability check | `touch /tmp/.w` and `touch /opt/.w` in CT | Fails and exits if `/tmp` (or `/opt`) is not writable. |
| Stop Besu | `systemctl stop besu-rpc.service besu.service` | Stops RPC so files can be replaced. |
| Backup `/opt/besu` | `mv /opt/besu /opt/besu.bak.<timestamp>` | Removes or renames existing Besu install. |
| Installer in `/tmp` | Copies `install-besu-in-ct-standalone.sh` to CT `/tmp`, runs with `TMPDIR=/tmp` | Uses `/tmp` so install works even if `/root` is read-only. |
| Run install script | `install-besu-in-ct-standalone.sh` (`NODE_TYPE=rpc`) | Runs `apt-get update` and `apt-get install openjdk-17-jdk wget ...`, downloads Besu tarball, extracts to `/opt/besu`, creates `besu-rpc.service` with `-Djava.io.tmpdir=/data/besu/tmp`. |
| Post-install | `mkdir -p /data/besu/tmp`, ensure `-Djava.io.tmpdir=/data/besu/tmp` in `besu-rpc.service`, `daemon-reload` | So JNA/native libs use a writable dir; avoids “Read-only file system” for JNA. |
| Deploy genesis/node lists | Push `genesis.json`, `static-nodes.json`, `permissions-nodes.toml` to `/etc/besu` | Config for Chain 138. |
| Start besu-rpc | `systemctl start besu-rpc.service` | Brings Core RPC up. |
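The writability check in the first row can be sketched as a small probe function. This is an illustration of the idea only: the helper name `dir_writable` is hypothetical, and the real script may implement the check differently.

```shell
# Sketch of a writability probe; dir_writable is a hypothetical helper name.
dir_writable() {
    local probe="$1/.w"
    # touch fails (e.g. "Read-only file system") when the directory cannot be written
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"   # clean up the probe file on success
        return 0
    fi
    return 1
}

# Possible use inside the CT, mirroring the /tmp and /opt checks:
#   dir_writable /tmp && dir_writable /opt || { echo "CT not writable" >&2; exit 1; }
```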
What actually happened in runs: The script stalled at “Installing packages…” (apt inside the CT). So:
- Root was made writable (e2fsck).
- The JNA reinstall script did not complete: apt hung or was very slow.
- Result: no valid `/opt/besu/bin/besu` (or an incomplete install), `besu-rpc` inactive, so Core RPC keeps failing.
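Because the stall at "Installing packages…" was silent, one option is to bound the runtime of the apt step so that a hang surfaces as an explicit failure. The sketch below uses `timeout` from GNU coreutils, which exits with status 124 when the limit is reached. The helper name `run_bounded` and the 900-second limit in the usage comment are assumptions for illustration, not values from the original scripts.

```shell
# Sketch: bound a command's runtime so a stall becomes a visible failure.
# run_bounded is a hypothetical helper; timeout(1) is from GNU coreutils.
run_bounded() {
    local limit=$1
    shift
    local rc=0
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        # timeout(1) exits with 124 when the time limit was reached
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$rc"
}

# Possible use from the Proxmox host (900 seconds is an arbitrary assumption):
#   run_bounded 900 pct exec 2101 -- apt-get update
```

Note that killing the host-side `pct exec` may leave the apt process running inside the CT, so a timeout here is a detection aid rather than a cleanup mechanism.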
1.3 Other scripts that touch 2101
- `fix-core-rpc-2101.sh`: Only starts/restarts the CT and the `besu-rpc`/`besu` service; no filesystem or “write lock” changes.
- `fix-all-502s-comprehensive.sh`: Ensures nodekey in `/data/besu`, then runs `fix-core-rpc-2101.sh`; no e2fsck or LV changes.
- `install-besu-in-ct-standalone.sh` (when run inside 2101): Writes to `/opt/besu`, `/etc/besu`, `/data/besu`, `/var/log/besu` and creates the systemd unit; adds `-Djava.io.tmpdir=/data/besu/tmp` (reduces risk of JNA write issues, does not add a lock).
2. “Write locks” and read-only behavior
- No explicit “write lock” is set by these scripts. The only lock-like behavior is the read-only root that the kernel sets when ext4 hits errors; e2fsck is what removes that by repairing the fs.
- e2fsck can set the “filesystem needs checking” flag or clear it; it does not leave a persistent write lock. After a successful e2fsck and `pct start`, the rootfs should mount read-write.
- `lvchange -an` in the make-writable script deactivates the LV right before `pct start`. In normal Proxmox behavior, starting the CT should activate the LV again. If your host or storage stack behaves differently, deactivating the LV before start could in theory lead to start failures or odd state; removing `lvchange -an` (or running it only when the CT is not about to be started) avoids that possibility.
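Whether the kernel has remounted root read-only can be read from the mount options rather than inferred from failed writes. The following sketch parses the standard `/proc/mounts` layout (mount point in field 2, options in field 4). The helper name `mount_is_ro` is hypothetical, and the optional second argument exists only so the function can be exercised against a sample file.

```shell
# Sketch: report whether a mount point is currently mounted read-only.
# mount_is_ro is a hypothetical helper; it reads /proc/mounts by default.
mount_is_ro() {
    local mountpoint=$1
    local mounts_file=${2:-/proc/mounts}
    local opts
    # Field 2 is the mount point, field 4 the comma-separated options.
    # If the mount point is listed more than once, the last entry wins.
    opts=$(awk -v mp="$mountpoint" '$2 == mp { o = $4 } END { print o }' "$mounts_file")
    case ",$opts," in
        *,ro,*) return 0 ;;   # exact "ro" option; "errors=remount-ro" does not match
        *)      return 1 ;;
    esac
}

# Possible use inside the CT:
#   mount_is_ro / && echo "root is read-only; run the make-writable script" >&2
```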
3. Why Core RPC (2101) continues to fail
From logs and summaries:
- JNA reinstall never finished
  The fix script repeatedly stalls at “Installing packages…” (apt in the CT). So:
  - `/opt/besu` is missing or from an old/incomplete install.
  - `besu-rpc.service` is inactive or fails (e.g. NoClassDefFoundError for JNA, or missing binary).
  - RPC on 192.168.11.211:8545 never comes up or stays down.
- Root was fixed, but the service was not
  Making the CT writable (e2fsck) succeeded; the service fix (reinstall Besu + JNA tmpdir) did not complete, so 2101 stays in a “writable but no working Besu” state.
- Possible contributing factors
  - `lvchange -an` before `pct start` in the make-writable script (see above).
  - Apt in the CT slow or hanging (network, mirrors, or I/O).
  - If root ever goes read-only again (e.g. new ext4 errors), later fix attempts will again hit “/tmp not writable” until make-writable is run again.
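To tell the failure states in this section apart, a quick check of the two artifacts the reinstall is supposed to produce may help: the Besu binary and the tmpdir flag in the unit file. This is a minimal sketch; `besu_install_state` is a hypothetical helper, and the unit file location in the usage comment is an assumption, since the source names the unit but not its path.

```shell
# Sketch: report which part of the Besu install is missing.
# besu_install_state is a hypothetical helper name.
besu_install_state() {
    local besu_bin=$1    # e.g. /opt/besu/bin/besu
    local unit_file=$2   # path to besu-rpc.service (location is an assumption)
    if [ ! -x "$besu_bin" ]; then
        echo "no-binary"
    elif ! grep -qF -- '-Djava.io.tmpdir=/data/besu/tmp' "$unit_file" 2>/dev/null; then
        echo "no-tmpdir-flag"
    else
        echo "installed"
    fi
}

# Possible use inside the CT (unit path assumed, adjust as needed):
#   besu_install_state /opt/besu/bin/besu /etc/systemd/system/besu-rpc.service
```

A result of `no-binary` matches the stalled-install case described above, while `no-tmpdir-flag` points at the JNA “Read-only file system” risk.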
4. Recommended next steps
- Complete the 2101 fix once (no e2fsck unless needed)
  - Ensure 2101 is running and writable (if in doubt, run `make-rpc-vmids-writable-via-ssh.sh` once).
  - Run only the 2101 fix with enough time for apt to finish: `./scripts/maintenance/fix-rpc-2101-jna-reinstall.sh`
  - If it still stalls on apt, log into the CT and run apt by hand, then re-run the fix script (or install Besu manually, set `-Djava.io.tmpdir=/data/besu/tmp`, and start `besu-rpc`).
- Optional: make-writable script
  - Remove `lvchange -an` before `pct start` in `make-rpc-vmids-writable-via-ssh.sh`, or run it only when the CT will not be started immediately, so the LV is not deactivated right before start.
- Verify
  - After Besu is installed and `besu-rpc` is started:
    - `pct exec 2101 -- systemctl status besu-rpc`
    - `pct exec 2101 -- ss -tlnp | grep 8545`
    - `curl -s -X POST -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' http://192.168.11.211:8545/`
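The `eth_chainId` call returns the chain ID as a hex string in the `result` field, so the verification can go one step further than "something answered on 8545". The sketch below extracts that field and prints it in decimal. The helper name `chain_id_from_response` is hypothetical, and the sed-based parsing is a simplification that assumes a single-line JSON response.

```shell
# Sketch: extract "result" from an eth_chainId JSON-RPC response and print it in decimal.
# chain_id_from_response is a hypothetical helper; it assumes a single-line response.
chain_id_from_response() {
    local hex
    hex=$(printf '%s' "$1" |
        sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\(0x[0-9a-fA-F][0-9a-fA-F]*\)".*/\1/p')
    # No hex result (e.g. an "error" object or an empty reply) counts as failure.
    [ -n "$hex" ] || return 1
    printf '%d\n' "$hex"
}

# Possible use, with the curl command from the Verify step:
#   resp=$(curl -s -X POST -H 'Content-Type: application/json' \
#     -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' \
#     http://192.168.11.211:8545/)
#   chain_id_from_response "$resp"
```

If the node serves Chain 138 as described in section 1.2, the response would carry `"result":"0x8a"` and the function would print 138.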
5. Reference
- 502_DEEP_DIVE: 502_DEEP_DIVE_ROOT_CAUSES_AND_FIXES.md — Read-only CT, 2101 JNA, make-writable.
- DO_ALL summary: docs/04-configuration/verification-evidence/DO_ALL_20260215_SUMMARY.md — What was run, 2101 stall at “Installing packages…”.