ops: oracle publisher LXC 3500/3501, CT migrate docs, Besu/RPC maintenance

- Provision oracle-publisher on CT 3500 (quoted DATA_SOURCE URLs, dotenv).
- Host-side pct-lxc-3501-net-up for ccip-monitor eth0 after migrate.
- CoinGecko key script: avoid sed & corruption; document quoted URLs.
- Besu node list reload, fstrim/RPC scripts, storage health docs.
- Submodule smom-dbis-138: web3 v6 pin, oracle check default host r630-02.

Made-with: Cursor
@@ -26,7 +26,17 @@ If both show the same cluster and the other node is listed, migration is:
pct migrate <VMID> r630-02 --restart
```

Storage will be copied to the target; choose the target storage when prompted (e.g. `thin1`). Then **delete or leave stopped** the container on r630-01 so the same VMID/IP are only on r630-02.
**CLI caveat:** `pct migrate` may fail if the CT references storages that do not exist on the target (e.g. `local-lvm` on r630-02) or if the source storage ID is inactive on the target (e.g. `thin1` on r630-02 vs `thin1-r630-02`). Remove **stale** `unusedN` volumes only after verifying with `lvs` that they are not the same LV as `rootfs` (see incident note below).

**Recommended (PVE API, maps rootfs to target pool):** use `pvesh` from the source node so disks land on e.g. `thin5`:

```bash
ssh root@192.168.11.11 "pvesh create /nodes/r630-01/lxc/<VMID>/migrate --target r630-02 --target-storage thin5 --restart 1"
```

This is the path that succeeded for **3501** (ccip-monitor) on 2026-03-28.
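
A minimal sketch of building the API call above from parameters, so the command can be reviewed before it is run on the source node. The function name `build_migrate_cmd` and its argument order are assumptions for illustration; only the `pvesh create … /migrate` path and the `--target`, `--target-storage`, and `--restart` options are taken from the command above.

```bash
# Hypothetical helper: print (do not run) the pvesh migrate command for a CT.
# Arguments: <vmid> <source-node> <target-node> <target-storage>
build_migrate_cmd() {
  if [ "$#" -ne 4 ]; then
    echo "usage: build_migrate_cmd <vmid> <source-node> <target-node> <target-storage>" >&2
    return 1
  fi
  # Refuse anything that is not a plain numeric VMID.
  case "$1" in
    ''|*[!0-9]*) echo "invalid VMID: $1" >&2; return 1 ;;
  esac
  printf 'pvesh create /nodes/%s/lxc/%s/migrate --target %s --target-storage %s --restart 1\n' \
    "$2" "$1" "$3" "$4"
}

# Review the printed command first; run it on the source node once it looks right.
build_migrate_cmd 3501 r630-01 r630-02 thin5
```

Printing instead of executing keeps the storage mapping visible in the shell history, which may help when several CTs are moved in one maintenance window.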

Storage will be copied to the target. The source volume is removed after a successful migrate. **Do not** use `pct set <vmid> --delete unused0` when `unused0` and `rootfs` both name `vm-<id>-disk-0` on different storages — Proxmox can delete the **only** root LV (Oracle publisher **3500** incident, 2026-03-28).
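
The name collision described above can be screened for before any `--delete`. The following is a sketch only: `unused_matches_rootfs` is a hypothetical helper, it assumes the usual `key: storage:volume[,options]` layout of `pct config` output, and it does not replace the `lvs` verification.

```bash
# Hypothetical pre-check: read `pct config <vmid>` output on stdin and list every
# unusedN key whose volume name equals the rootfs volume name.
# Prints the risky keys (one per line) and returns 1 if any are found.
unused_matches_rootfs() {
  awk '
    # Strip the "storage:" prefix and any ",size=..." options from a volume spec.
    function volname(spec) {
      sub(/,.*/, "", spec)
      sub(/^[^:]*:/, "", spec)
      return spec
    }
    BEGIN { n = 0; bad = 0 }
    $1 == "rootfs:"        { root = volname($2) }
    $1 ~ /^unused[0-9]+:$/ { key[n] = $1; vol[n] = volname($2); n++ }
    END {
      for (i = 0; i < n; i++)
        if (vol[i] == root) { sub(/:$/, "", key[i]); print key[i]; bad = 1 }
      exit bad
    }
  '
}

# Usage on the Proxmox node that owns the CT:
#   pct config 3500 | unused_matches_rootfs || echo "do NOT delete the keys listed above"
```

A match only means the two entries share a volume name; whether they are the same LV still has to be confirmed with `lvs` on the node.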

If the nodes are **not** in a cluster, use the backup/restore method below.
@@ -124,8 +134,8 @@ Containers that free meaningful space on r630-01 and are reasonable to run on r6
| 6401 | indy-alltra-1 | 100G | ✅ Migrated (thin6) |
| 6402 | indy-hybx-1 | 100G | ✅ Migrated (thin6) |
| 5700 | dev-vm | 400G (thin) | ✅ Migrated (thin6) |
| 3500 | oracle-publisher-1 | — | Oracle publisher |
| 3501 | ccip-monitor-1 | — | CCIP monitor |
| 3500 | oracle-publisher-1 | 20G thin1 (was) | **2026-03-28:** root LV accidentally removed; CT **recreated** on r630-02 `thin5` (fresh template). **Redeploy** app + `.env`. |
| 3501 | ccip-monitor-1 | 20G | **2026-03-28:** migrated to r630-02 **`thin5`** via `pvesh … /migrate --target-storage thin5`. **Networking:** unprivileged Ubuntu image may leave **eth0 DOWN** after migrate; `unprivileged` cannot be toggled later. Mitigation: on **r630-02** install `scripts/maintenance/pct-lxc-3501-net-up.sh` to `/usr/local/sbin/` and optional **`@reboot`** cron (see script header). |

**High impact (larger disks):**
@@ -169,6 +179,23 @@ Example:

See the script for exact steps (stop, vzdump, scp, restore, start, optional destroy on source).

**Unprivileged CTs:** `vzdump` often fails with tar `Permission denied` under `lxc-usernsexec`. Prefer **section 1** `pvesh … /migrate` with `--target-storage` instead of this script for those guests.

## 5a. Reprovision Oracle Publisher (VMID 3500) on r630-02

After a fresh LXC template or data loss, from project root (LAN, secrets loaded):

```bash
source scripts/lib/load-project-env.sh # or ensure PRIVATE_KEY / smom-dbis-138/.env
./scripts/deployment/provision-oracle-publisher-lxc-3500.sh
```

Uses `web3` 6.x (POA middleware). If on-chain `updateAnswer` fails, use a `PRIVATE_KEY` for an EOA allowed on the aggregator contract.

## 5b. r630-02 disk / VG limits (cannot automate)

Each `thin1`–`thin6` VG on r630-02 is a **single ~231 GiB SSD** with **~124 MiB `vg_free`**. There is **no** space to `lvextend` pools until you **grow the partition/PV** or add hardware. Guest `fstrim` and migration to `thin5` reduce **data** usage only within existing pools.
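
The `vg_free` figure above can be watched with a small filter over `vgs` output. This is a sketch under assumptions: `vgs_below_free` is a hypothetical helper, and the 1024 MiB threshold in the usage comment is an arbitrary example, not a value from this runbook. On the node, the input would come from `vgs --noheadings --nosuffix --units m -o vg_name,vg_free`.

```bash
# Hypothetical check: read "<vg_name> <vg_free in MiB>" lines on stdin and print
# every VG whose free space is below the given threshold (MiB).
# Argument: <min-free-MiB>
vgs_below_free() {
  awk -v min="$1" '
    # Skip blank lines; compare numerically and print the free space rounded.
    NF >= 2 && ($2 + 0) < (min + 0) { printf "%s %.0f\n", $1, $2 }
  '
}

# Usage on r630-02 (threshold is an example value):
#   vgs --noheadings --nosuffix --units m -o vg_name,vg_free | vgs_below_free 1024
```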

---

## 6. References
@@ -316,8 +316,8 @@ The following VMIDs have been permanently removed:

| VMID | IP Address | Hostname | Status | Endpoints | Purpose |
|------|------------|----------|--------|-----------|---------|
| 3500 | 192.168.11.29 | oracle-publisher-1 | ✅ Running | Oracle: Various | Oracle publisher service |
| 3501 | 192.168.11.28 | ccip-monitor-1 | ✅ Running | Monitor: Various | CCIP monitoring service |
| 3500 | 192.168.11.29 | oracle-publisher-1 | ✅ Running (verify on-chain) | Oracle: Various | **r630-02** `thin5`. Reprovisioned 2026-03-28 via `scripts/deployment/provision-oracle-publisher-lxc-3500.sh` (systemd `oracle-publisher`). If `updateAnswer` txs revert, set `PRIVATE_KEY` in `/opt/oracle-publisher/.env` to an EOA **authorized on the aggregator** (may differ from deployer). Metrics: `:8000/metrics`. |
| 3501 | 192.168.11.28 | ccip-monitor-1 | ✅ Running | Monitor: Various | CCIP monitoring; **migrated 2026-03-28** to **r630-02** `thin5` (`pvesh` … `/migrate --target-storage thin5`). |
| 5200 | 192.168.11.80 | cacti-1 | ✅ Running | Web: 80, 443 | Network monitoring (Cacti); **host r630-02** (migrated 2026-02-15) |

---
@@ -103,13 +103,19 @@ systemctl restart token-aggregation

### For Oracle Publisher Service

**Gas (Chain 138 / Besu):** In `/opt/oracle-publisher/.env`, use **`GAS_LIMIT=400000`** (not `100000`). The aggregator `updateAnswer` call can **run out of gas** at 100k (`gasUsed == gasLimit`, failed receipt) even when `isTransmitter` is true. Align with `ORACLE_UPDATE_GAS_LIMIT` in `smom-dbis-138/scripts/update-oracle-price.sh`. **`GAS_PRICE=1000000000`** (1 gwei) matches that script’s legacy defaults.
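
The out-of-gas signature above (`gasUsed == gasLimit` on a failed receipt) can be tested mechanically. A minimal sketch, assuming the three values have already been read from the receipt (`status`, `gasUsed`) and from the transaction (its gas limit); `looks_out_of_gas` is a hypothetical name, and fetching the values over RPC is left out.

```bash
# Hypothetical helper: decide whether a failed receipt looks like out-of-gas.
# Arguments: <status> <gasUsed> <gasLimit>, each decimal or 0x-prefixed hex
# (JSON-RPC returns these quantities as hex strings).
looks_out_of_gas() {
  status=$(($1)); used=$(($2)); limit=$(($3))
  # status 0 means the receipt failed; gasUsed == gasLimit means the call
  # consumed its whole allowance.
  [ "$status" -eq 0 ] && [ "$used" -eq "$limit" ]
}

# Example: a failed updateAnswer sent with GAS_LIMIT=100000 (0x186a0).
if looks_out_of_gas 0x0 0x186a0 100000; then
  echo "likely out of gas: raise GAS_LIMIT"
fi
```

A failed receipt with `gasUsed` below the limit points elsewhere, for example at the transmitter authorization mentioned above.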

**Quoted URLs in `.env`:** `DATA_SOURCE_1_URL` (and Coinbase `DATA_SOURCE_2_URL`) must be **double-quoted** when passed through **systemd `EnvironmentFile`**, because unquoted `&` in query strings can be parsed incorrectly and corrupt the value. **`scripts/update-oracle-publisher-coingecko-key.sh`** uses `grep` + `append` (not `sed` with `&` in the replacement). Do not use `sed 's|...|...&...|'` for URLs that contain `&`.
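
To illustrate the `sed` pitfall and the grep-and-append alternative, here is a small sketch. It is not the contents of `scripts/update-oracle-publisher-coingecko-key.sh`; `set_env_quoted` and the `example.invalid` URL are assumptions made for the example, and values containing `"`, `$`, or backslashes would need extra escaping that is not handled here.

```bash
# Pitfall: "&" in a sed replacement expands to the whole matched text.
printf 'URL=old\n' | sed 's|^URL=.*|URL=https://example.invalid/p?a=1&b=2|'
# prints: URL=https://example.invalid/p?a=1URL=oldb=2

# Hypothetical helper: set KEY="value" in an env file without sed.
# Arguments: <env-file> <key> <value>
set_env_quoted() {
  file=$1; key=$2; value=$3
  tmp=$(mktemp) || return 1
  # Keep every line except existing assignments of this key.
  # grep exits non-zero when nothing is left to print, which is fine here.
  grep -v "^${key}=" "$file" > "$tmp" || true
  # Append the new assignment, double-quoted for systemd EnvironmentFile.
  printf '%s="%s"\n' "$key" "$value" >> "$tmp"
  # Write back through the original path so its mode and owner are kept.
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

Because the value only ever passes through `printf '%s'`, no character in the URL is interpreted as a pattern or replacement metacharacter.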

**Dotenv sources for provisioning:** `scripts/lib/load-project-env.sh` loads **project root `.env`** then **`smom-dbis-138/.env`** — so `PRIVATE_KEY` / `DEPLOYER_PRIVATE_KEY`, `COINGECKO_API_KEY` (root `.env`), and `AGGREGATOR_ADDRESS` are available to `scripts/deployment/provision-oracle-publisher-lxc-3500.sh` and `scripts/update-oracle-publisher-coingecko-key.sh`.
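
The load order described above can be pictured with a short sketch. This is not the contents of `scripts/lib/load-project-env.sh`; `load_env_in_order` is a hypothetical function, and it assumes both files are plain `KEY=value` shell assignments, in which case a key set in both files ends up with the value from the file loaded second.

```bash
# Hypothetical loader: export variables from the project root .env, then from
# smom-dbis-138/.env, so the second file wins on conflicting keys.
# Argument: <project-root>
load_env_in_order() {
  root=$1
  for f in "$root/.env" "$root/smom-dbis-138/.env"; do
    [ -f "$f" ] || continue
    set -a      # export everything assigned while the file is sourced
    . "$f"
    set +a
  done
}

# Usage from the project root:
#   load_env_in_order "$PWD"
```

Sourcing is also one more reason to keep URL values quoted: an unquoted `&` in a sourced line would be read by the shell as a background operator.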

**Step 1: SSH to Proxmox host**

```bash
ssh root@192.168.11.10
ssh root@192.168.11.12
```

**Step 2: Access Oracle Publisher container**
**Step 2: Access Oracle Publisher container** (VMID 3500 runs on **r630-02**)

```bash
pct exec 3500 -- bash
@@ -162,10 +168,10 @@ npm run test -- coingecko-adapter.test.ts

```bash
# Check .env file
ssh root@192.168.11.10 "pct exec 3500 -- cat /opt/oracle-publisher/.env | grep COINGECKO"
ssh root@192.168.11.12 "pct exec 3500 -- cat /opt/oracle-publisher/.env | grep COINGECKO"

# Check service logs
ssh root@192.168.11.10 "pct exec 3500 -- journalctl -u oracle-publisher -n 50 | grep -i coingecko"
ssh root@192.168.11.12 "pct exec 3500 -- journalctl -u oracle-publisher -n 50 | grep -i coingecko"

# Should see successful price fetches without 429 rate limit errors
```
@@ -1,8 +1,18 @@
# Storage Growth and Health — Predictable Growth Table & Proactive Monitoring

**Last updated:** 2026-02-15
**Last updated:** 2026-03-28
**Purpose:** Real-time data collection and a predictable growth table so we can stay ahead of disk space issues on hosts and VMs.

### Recent operator maintenance (2026-03-28)

- **r630-01 `pve/data` (local-lvm):** Thin pool extended (+80 GiB data, +512 MiB metadata earlier); **LVM thin auto-extend** enabled in `lvm.conf` (`thin_pool_autoextend_threshold = 80`, `thin_pool_autoextend_percent = 20`); **dmeventd** must stay active.
- **r630-01 `pve/thin1`:** Pool extended (+48 GiB data, +256 MiB metadata) to reduce pressure; metadata percent dropped accordingly.
- **r630-01 `/var/lib/vz/dump`:** Removed obsolete **2026-02-15** vzdump archives/logs (~9 GiB); newer logs from 2026-02-28 retained.
- **Fleet guest `fstrim`:** `scripts/maintenance/fstrim-all-running-ct.sh` supports **`FSTRIM_TIMEOUT_SEC`** and **`FSTRIM_HOSTS`** (e.g. `ml110`, `r630-01`, `r630-02`). Many CTs return FITRIM “not permitted” (guest/filesystem); others reclaim space on the thin pools (notably on **r630-02**).
- **r630-02 `thin1`–`thin6` VGs:** Each VG is on a **single PV** with only **~124 MiB `vg_free`**; you **cannot** `lvextend` those thin pools until the underlying partition/disk is grown or a second PV is added. Monitor `pvesm status` and plan disk expansion before pools tighten.
- **CT migration** off r630-01 for load balance remains a **planned** action when maintenance windows and target storage allow (not automated here).
- **2026-03-28 (migration follow-up):** CT **3501** migrated to r630-02 **`thin5`** via `pvesh … lxc/3501/migrate --target-storage thin5`. CT **3500** had root LV removed after a mistaken `pct set --delete unused0` (config had `unused0: local-lvm:vm-3500-disk-0` and `rootfs: thin1:vm-3500-disk-0`); **3500** was recreated empty on r630-02 `thin5` — **reinstall Oracle Publisher** on the guest. See `MIGRATE_CT_R630_01_TO_R630_02.md`.
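
As a worked example of the auto-extend settings in the first bullet: with `thin_pool_autoextend_threshold = 80` and `thin_pool_autoextend_percent = 20`, a pool whose usage exceeds 80 % is grown by 20 % of its current size, provided the VG has that much free space. The sketch below only does the arithmetic; `next_pool_size_mib` is a hypothetical name, sizes are in MiB, and extent rounding is ignored. A VG with only ~124 MiB `vg_free`, as on r630-02, could not satisfy an extension of this size.

```bash
# Sketch of the thin auto-extend rule: once usage exceeds the threshold,
# the pool grows by <percent> of its current size.
# Arguments: <pool-size-MiB> <used-MiB> [threshold-%] [extend-%]
next_pool_size_mib() {
  size=$1; used=$2; threshold=${3:-80}; percent=${4:-20}
  if [ $((used * 100)) -gt $((size * threshold)) ]; then
    echo $((size + size * percent / 100))
  else
    echo "$size"
  fi
}

next_pool_size_mib 102400 81921   # 100 GiB pool just over 80 % full: prints 122880
next_pool_size_mib 102400 51200   # 50 % full, no extension: prints 102400
```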

---

## 1. Real-time data collection