proxmox/docs/03-deployment/RPC_2101_READONLY_FIX.md

# RPC 2101 (Core) — Read-only filesystem fix

**VMID 2101** (192.168.11.211, Chain 138 Core RPC) can fail with Besu in a crash loop and **port 8545 connection refused**. Root cause observed: **Read-only file system** on `/data/besu/database/`.

## Cause

- **Kernel I/O errors** on the host (Proxmox 192.168.11.11): `Buffer I/O error on device dm-*`, `EXT4-fs: failed to convert unwritten extents`, `potential data loss`.
- ext4 remounts the filesystem **read-only** to avoid further corruption. Besu then fails with:
  `RocksDBException: While appending to file: /data/besu/database/... : Read-only file system`.
- Besu may also crash at startup with a **JNA** error: `UnsatisfiedLinkError: Failed to create temporary file for ... libjnidispatch.so: Read-only file system`. JNA needs a writable temp dir (e.g. `/tmp` or `java.io.tmpdir`); if the entire root filesystem is read-only, startup fails before the RPC port binds.
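
To confirm the read-only remount without waiting for Besu to crash, the mount options can be inspected directly. The helper below is a minimal sketch, not part of the repo scripts: `mount_is_readonly` is a hypothetical name, and it assumes the Besu data lives on the container root (`/`), which is an assumption about this deployment's layout.

```shell
# mount_is_readonly MOUNTPOINT
# Reads mount records in /proc/mounts format from stdin:
#   device mountpoint fstype options dump pass
# Exit status: 0 = read-only, 1 = read-write, 2 = mount point not listed.
mount_is_readonly() (
    target=$1
    # Keep the last matching entry: later mounts shadow earlier ones.
    opts=$(awk -v mp="$target" '$2 == mp { o = $4 } END { print o }')
    [ -n "$opts" ] || exit 2
    # Match "ro" only as a whole option, so "errors=remount-ro" is not a hit.
    case ",$opts," in
        *,ro,*) exit 0 ;;
        *)      exit 1 ;;
    esac
)
```

From the Proxmox host, a possible invocation is `pct exec 2101 -- cat /proc/mounts | mount_is_readonly / && echo "root is read-only"`.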

## Before deploying contracts

Contract deployments use **Core RPC only** (no Public fallback). Fix read-only and verify health first:

1. **Fix read-only:** `./scripts/maintenance/make-rpc-vmids-writable-via-ssh.sh`
2. **Health check:** `./scripts/maintenance/health-check-rpc-2101.sh` (must pass)
3. **Deploy:** `./scripts/deployment/deploy-transaction-mirror-and-pmm-pool-after-txpool-clear.sh`

If the deploy fails with **"Known transaction"** (a stuck tx at the deployer nonce), clear the Core RPC tx pool with `./scripts/clear-all-transaction-pools.sh`, then retry the deploy.
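
One way to see whether the deployer has transactions sitting in the pool is to compare its `latest` and `pending` nonces from the standard `eth_getTransactionCount` JSON-RPC method. The sketch below is illustrative only: `rpc_result` and `pending_gap` are hypothetical helper names, `$DEPLOYER` stands for the deployer address (not given in this document), and a non-zero gap shows that transactions are pending, not that they are necessarily stuck.

```shell
# Extract the "result" string from a single-line JSON-RPC response on stdin.
rpc_result() {
    sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# pending_gap LATEST_HEX PENDING_HEX
# Prints how many transactions from the account are in the pool but not yet mined.
pending_gap() {
    echo $(( $2 - $1 ))
}

# Example (run where the Core RPC is reachable; DEPLOYER is a placeholder):
# nonce() {
#     curl -s -X POST -H 'Content-Type: application/json' \
#         --data "{\"jsonrpc\":\"2.0\",\"method\":\"eth_getTransactionCount\",\"params\":[\"$DEPLOYER\",\"$1\"],\"id\":1}" \
#         http://192.168.11.211:8545 | rpc_result
# }
# pending_gap "$(nonce latest)" "$(nonce pending)"
```

A result of `0` suggests the pool holds nothing for the deployer, in which case clearing the pool is unlikely to change the outcome.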

## Fixing 2101 (operator)

1. **SSH to Proxmox host:** `ssh root@192.168.11.11`
2. **Check kernel logs for I/O errors:**
   `dmesg | grep -E "Buffer I/O|EXT4-fs|dm-"`
   Identify which dm-* (LV) is affected; `ls -la /dev/mapper/pve-vm--2101--disk--0` shows 2101’s device (e.g. dm-45).
3. **Storage health:** Check LVM and disks (e.g. `lvs`, `pvs`, `smartctl` on underlying disks). Replace or repair failing hardware.
4. **Remount read-write (only if storage is known good):**
   - Stop the container: `pct stop 2101`
   - Proxmox mounts the container root from the host. After fixing storage you may need to run `fsck` on the LV or reboot the host. If the filesystem was remounted read-only because of a transient error, a container stop/start sometimes helps, because the host remounts the LV.
   - Start the container: `pct start 2101`
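   - Verify writability inside the container (run from the host): `pct exec 2101 -- sh -c 'touch /data/besu/database/.write_test && rm /data/besu/database/.write_test'`. The `sh -c` wrapper keeps the `rm` inside the container; without it, the host shell runs `rm` against the host filesystem.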
5. **Restart Besu RPC:**
   `pct exec 2101 -- systemctl restart besu-rpc.service`
   Then: `./scripts/check-network-rpc-138.sh 192.168.11.211 8545`
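
Besu may need some time after a restart before port 8545 accepts connections, so a single immediate check can fail spuriously. The following is a small polling sketch under stated assumptions: `wait_until` is a hypothetical helper, and it assumes `check-network-rpc-138.sh` exits non-zero when the RPC is unreachable, which this document does not confirm.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Exit status: 0 on the first success, 1 if every attempt failed.
wait_until() (
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            exit 0
        fi
        # Do not sleep after the final failed attempt.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$(( i + 1 ))
    done
    exit 1
)
```

For example, `wait_until 30 2 ./scripts/check-network-rpc-138.sh 192.168.11.211 8545` would poll for roughly a minute before giving up.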

### If still read-only after make-writable

If `make-rpc-vmids-writable-via-ssh.sh` completes but inside the container **`/tmp`, `/data/besu/database`, or `/data/besu/tmp`** are still read-only (`touch` fails with "Read-only file system"):

- **e2fsck** may have reported `Error writing file system info: Input/output error` — the **underlying storage** (LV or disk on the host) may be failing.
- **Thin pool 100% full:** CT 2101 (and other RPC nodes) use the LVM thin pool **pve/data**. If the pool is 100% full (`lvs pve/data` shows Data% 100.00), writes can fail and the kernel may remount the filesystem read-only. **Fix:** On the Proxmox host, extend the pool if the VG has free space: `lvextend -L +80G pve/data` (adjust size). Then re-run make-writable and restart the container. Alternatively migrate the CT to another pool (e.g. thin1) or free space by removing/moving other LVs.
- On the Proxmox host: check `dmesg | grep -iE 'I/O error|dm-|ext4'` (`-i` so the pattern also matches the kernel's uppercase `EXT4-fs` prefix), and run `smartctl` / LVM checks on the storage backing the CT. If the LV or disk has persistent I/O errors, fix or replace storage, then re-run `make-rpc-vmids-writable-via-ssh.sh`, or migrate the CT to healthy storage.
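
The thin-pool check above can be turned into a threshold test so that a nearly full pool is caught before it reaches 100%. This is a sketch only: `pool_over_threshold` is a hypothetical helper and the default limit of 90 is an arbitrary choice, not a value from this runbook. It takes the percentage as printed by `lvs` (for example `100.00`, often with leading spaces).

```shell
# pool_over_threshold PERCENT [LIMIT]
# PERCENT is a value such as " 100.00" taken from lvs output; LIMIT defaults to 90.
# Exit status: 0 = at or above LIMIT, 1 = below LIMIT, 2 = no value supplied.
pool_over_threshold() (
    used=$(printf '%s' "$1" | tr -d ' ')
    limit=${2:-90}
    [ -n "$used" ] || exit 2
    # lvs prints decimals, so compare as floating point in awk.
    awk -v u="$used" -v l="$limit" 'BEGIN { exit !(u + 0 >= l + 0) }'
)
```

On the Proxmox host this could be used as `pool_over_threshold "$(lvs --noheadings -o data_percent pve/data)" 95 && echo "pve/data is nearly full"`. Checking `metadata_percent` the same way is worthwhile, since a full metadata volume also blocks writes.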

## TransactionMirror address

Set `TRANSACTION_MIRROR_ADDRESS` in `smom-dbis-138/.env` from the deploy script output. A previous deploy used **0xE362aa10D3Af1A16880A799b78D18F923403B55a**; use the script output as source of truth.
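
Copying the address by hand is error-prone, so a small helper can validate its shape and write it into the env file. The functions below are a sketch with hypothetical names (`is_eth_address`, `set_env_var`); they assume `.env` uses plain `KEY=value` lines without quoting or `export` prefixes. The format check only verifies `0x` plus 40 hex digits, not the checksum.

```shell
# is_eth_address VALUE: succeeds if VALUE looks like a 20-byte hex address.
is_eth_address() {
    printf '%s\n' "$1" | grep -Eq '^0x[0-9a-fA-F]{40}$'
}

# set_env_var FILE KEY VALUE
# Drops any existing KEY= line, then appends KEY=VALUE at the end of FILE.
set_env_var() (
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || exit 1
    if [ -f "$file" ]; then
        # grep -v exits 1 when nothing is left to print; that is not an error here.
        grep -v "^${key}=" "$file" > "$tmp" || true
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write through the original file so its permissions are preserved.
    cat "$tmp" > "$file" && rm -f "$tmp"
)
```

With `ADDR` set from the deploy script output, a possible call is `is_eth_address "$ADDR" && set_env_var smom-dbis-138/.env TRANSACTION_MIRROR_ADDRESS "$ADDR"`.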

## Scripts

- **Make Core writable (fix read-only):** `./scripts/maintenance/make-rpc-vmids-writable-via-ssh.sh` — run first when 2101 is read-only.
- **Health check:** `./scripts/maintenance/health-check-rpc-2101.sh` — container, service, port, RPC eth_chainId/eth_blockNumber, and database writability.
- **Fix/restart Besu:** `./scripts/maintenance/fix-core-rpc-2101.sh [--dry-run] [--restart-only]`.
- **Check/start RPC service:** `./scripts/check-and-start-rpc-2101.sh` (cannot fix read-only; only restarts the service).
- **Network check:** `./scripts/check-network-rpc-138.sh [HOST] [PORT]` (default 192.168.11.211 8545).
- **Deploy (Core only):** `./scripts/deployment/deploy-transaction-mirror-and-pmm-pool-after-txpool-clear.sh`. No Public fallback; fix Core first.