Files
proxmox/docs/00-meta/OPERATOR_READY_CHECKLIST.md
defiQUG a086c451c3 docs: sync The Order routing (10210 HAProxy) and fix stale TBDs
- E2E, ALL_VMIDS, operator checklist, RPC_ENDPOINTS_MASTER, DNS/NPM architecture
- PROXMOX deployment template: the-order wired via 10210
- Placeholders master + r630-02 incomplete summary for 10210
- CT 10210: chown /var/cache on host idmap (mandb clean) — applied on cluster

Made-with: Cursor
2026-03-27 15:06:06 -07:00

283 lines
15 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Operator Ready Checklist — Copy-Paste Commands
**Last Updated:** 2026-03-27
**Purpose:** Single page with exact commands to complete every pending todo. Run from **repo root** on a host with **LAN** access (and `smom-dbis-138/.env` with `PRIVATE_KEY`, `NPM_PASSWORD` where noted).
**Do you have all necessary creds?** See [OPERATOR_CREDENTIALS_CHECKLIST.md](OPERATOR_CREDENTIALS_CHECKLIST.md) — per-task list of LAN, PRIVATE_KEY, NPM_PASSWORD, RPC_URL_138, SSH, LINK, gas, token balance.
**From anywhere (no LAN):** `./scripts/run-completable-tasks-from-anywhere.sh`
**Ensure this machine always has Proxmox SSH access:** `./scripts/security/ensure-proxmox-ssh-access.sh` (verifies key-based SSH to .10, .11, .12; use `--copy` to install key if missing). **NPMplus from this machine (if direct 192.168.11.167:81 unreachable):** `ssh -L 8181:192.168.11.167:81 -N root@192.168.11.11` then use `http://127.0.0.1:8181` for NPMplus API.
**If deployer needs gas on currently active public chains:** Run `./scripts/deployment/deployer-gas-auto-route.sh` (optional: `--dry-run`, `--chain 138`). See [DEPLOYER_GAS_AUTO_ROUTE_RUNBOOK.md](../03-deployment/DEPLOYER_GAS_AUTO_ROUTE_RUNBOOK.md). **Current policy:** Wemix is deferred.
**Current live execution path:** [LIVE_SESSION_CRONOS_AND_TIER1_PHASE_C.md](../03-deployment/LIVE_SESSION_CRONOS_AND_TIER1_PHASE_C.md) — close Cronos config + LINK, then activate Tier 1 Phase C on Gnosis, Polygon, and BSC. **Current priority docs:** [FULLY_OPERATIONAL_EXECUTION_CHECKLIST.md](FULLY_OPERATIONAL_EXECUTION_CHECKLIST.md), [PHASE_C_PROFIT_FIRST_PRIORITY.md](../03-deployment/PHASE_C_PROFIT_FIRST_PRIORITY.md), [PHASE_C_TIER1_EXECUTION_TASK_SHEET.md](../03-deployment/PHASE_C_TIER1_EXECUTION_TASK_SHEET.md).
---
## Completed in this session (2026-03-26)
| Item | Result |
|------|--------|
| NPMplus recovery | VMID `10233` was wedged on `192.168.11.167:81` (TCP connect, no HTTP). `pct reboot 10233` on `r630-01` restored the expected `301` response on port `81`. |
| NPMplus API updater | `NPM_URL=https://192.168.11.167:81 bash scripts/nginx-proxy-manager/update-npmplus-proxy-hosts-api.sh` completed with **39 hosts updated, 0 failed**. |
| Sankofa / Order / Studio routing | **Superseded 2026-03-27:** Order hostnames default to **order-haproxy** `http://192.168.11.39:80` (10210 → `.51:3000`). Through 2026-03-26 NPM pointed Order directly at portal `:3000`. `studio.sankofa.nexus``http://192.168.11.72:8000`. |
| Public E2E | Latest run `bash scripts/verify/verify-end-to-end-routing.sh --profile=public` exited `0` with **Failed: 0**, **DNS passed: 37**, **HTTPS passed: 22**. Sankofa, Phoenix, Studio, The Order, DBIS, Mifos, and MIM4U public endpoints passed. Evidence: `docs/04-configuration/verification-evidence/e2e-verification-20260326_115013/`. |
| Private E2E | Latest run `bash scripts/verify/verify-end-to-end-routing.sh --profile=private` exited `0` with **Failed: 0** and **DNS passed: 4**. `rpc-http-prv.d-bis.org`, `rpc-fireblocks.d-bis.org`, `rpc-ws-prv.d-bis.org`, and `ws.rpc-fireblocks.d-bis.org` all passed. Evidence: `docs/04-configuration/verification-evidence/e2e-verification-20260326_120939/`. |
| NPMplus backup | Fresh backup completed: `backups/npmplus/backup-20260326_115622.tar.gz`. API exports succeeded; direct SQLite file copy and certbot path copy were partial/warn-only, but the backup manifest and compressed bundle were created successfully. |
| Blockscout verification run | `./scripts/verify/run-contract-verification-with-proxy.sh` completed; contracts were submitted or skipped if already verified. `WETH10` returned `The address is not a smart contract`; others like `Multicall`, `Aggregator`, `Proxy`, `CCIPSender`, `CCIPWETH10Bridge`, and `CCIPWETH9Bridge` submitted successfully. |
| Private RPC redirect fix | `rpc-http-prv.d-bis.org` no longer returns HTTP `301` on JSON-RPC POST. Live NPMplus host `11` was updated to `ssl_forced=false` while preserving upstream `192.168.11.211:8545`. |
| NPM creds loading | For NPM-only runs, prefer targeted `grep` of `NPM_EMAIL` / `NPM_PASSWORD` if full `.env` export triggers `Argument list too long`. |
---
## 1. High: Cronos closure + reachable CCIP funding
**Ref:** [CONFIG_READY_CHAINS_COMPLETION_RUNBOOK](../07-ccip/CONFIG_READY_CHAINS_COMPLETION_RUNBOOK.md)
**Prereqs:** Confirm [CCIP supports](https://docs.chain.link/ccip/supported-networks) for the chains you are actively using. Current focus: **Cronos (25)**, plus reachable funded lanes. Per chain: RPC, CCIP Router, LINK, WETH9/WETH10, deployer with native gas. **Do not block the session on Wemix.**
```bash
cd smom-dbis-138
source .env
# Per chain (set RPC_URL, CCIP_ROUTER_ADDRESS, LINK_TOKEN_ADDRESS, WETH9_ADDRESS, WETH10_ADDRESS, PRIVATE_KEY)
forge script script/deploy/bridge/DeployWETHBridges.s.sol:DeployWETHBridges --rpc-url "$RPC_URL" --broadcast -vvvv
```
Then add destinations (Chain 138 ↔ each chain) and fund with LINK — use:
```bash
DRY_RUN=1 ./scripts/deployment/complete-config-ready-chains.sh # print commands
./scripts/deployment/complete-config-ready-chains.sh # run (requires bridge addresses in .env)
```
**Cronos closure:** Cronos bridges are already present on-chain. Use:
```bash
cd smom-dbis-138
DRY_RUN=1 ./scripts/deployment/complete-config-ready-chains.sh
./scripts/deployment/complete-config-ready-chains.sh
./scripts/deployment/fund-ccip-bridges-with-link.sh --dry-run
./scripts/deployment/fund-ccip-bridges-with-link.sh
```
**Wemix:** deferred by policy. Revisit only after profitable routes fund expansion gas.
**Full live-session order:** See [LIVE_SESSION_CRONOS_AND_TIER1_PHASE_C.md](../03-deployment/LIVE_SESSION_CRONOS_AND_TIER1_PHASE_C.md).
---
## 2. Medium: LINK support on Mainnet relay
**Ref:** [RELAY_BRIDGE_ADD_LINK_SUPPORT_RUNBOOK](../07-ccip/RELAY_BRIDGE_ADD_LINK_SUPPORT_RUNBOOK.md)
**Options:** A = extend CCIPRelayBridge to accept LINK; B = deploy separate LINK receiver. After implement + deploy + fund:
```bash
# In config/token-mapping.json set relaySupported: true for LINK
# Update TOKEN_MAPPING_AND_MAINNET_ADDRESSES.md and CCIP_BRIDGE_MAINNET_CONNECTION.md
# Restart relay service on r630-01: /opt/smom-dbis-138/services/relay
```
---
## 3. LAN: Blockscout verification
```bash
source smom-dbis-138/.env 2>/dev/null
./scripts/verify/run-contract-verification-with-proxy.sh
```
Single contract retry: `./scripts/verify/run-contract-verification-with-proxy.sh --only ContractName`
---
## 4. LAN: Fix E2E 502s
```bash
./scripts/maintenance/run-all-maintenance-via-proxmox-ssh.sh --e2e
# Or lighter:
./scripts/maintenance/address-all-remaining-502s.sh --run-besu-fix --e2e
```
**Runbook:** [502_DEEP_DIVE_ROOT_CAUSES_AND_FIXES.md](502_DEEP_DIVE_ROOT_CAUSES_AND_FIXES.md)
**Current status after 2026-03-26:** no public 502s reproduced in the latest public E2E run. Use this section only if those endpoints regress.
---
## 5. LAN: Run all operator tasks (backup + verify ± deploy ± create-vms)
```bash
./scripts/run-all-operator-tasks-from-lan.sh --dry-run # print steps
./scripts/run-all-operator-tasks-from-lan.sh # backup + Blockscout verify
./scripts/run-all-operator-tasks-from-lan.sh --deploy # + contract deploy
./scripts/run-all-operator-tasks-from-lan.sh --create-vms # + create DBIS Core + TsunamiSwap VM (5010)
./scripts/run-all-operator-tasks-from-lan.sh --deploy --create-vms
```
---
## 5c. LAN: TsunamiSwap VM (5010) and CCIP funding
**TsunamiSwap VM:** Create once (default r630-01, 8 vCPU, 16 GB, 160 GB at 192.168.11.91). For r630-02 use `STORAGE=thin2 ./scripts/create-tsunamiswap-vm.sh --node r630-02`. Then run post-create setup (Docker + dirs):
```bash
./scripts/create-tsunamiswap-vm.sh --dry-run # print steps
./scripts/create-tsunamiswap-vm.sh # create VMID 5010
./scripts/setup-tsunamiswap-vm-5010.sh [--dry-run] # install Docker, create /opt/tsunamiswap (from LAN)
./scripts/deploy-tsunamiswap-to-5010.sh [--dry-run] # deploy backend+UI to 5010 (first run installs Node, ~510 min)
```
**CCIP funding (LINK):** After deployer has LINK and native gas on each chain:
```bash
cd smom-dbis-138
./scripts/deployment/fund-ccip-bridges-with-link.sh --dry-run # print commands
./scripts/deployment/fund-ccip-bridges-with-link.sh [--link 10] # run (non-fatal per chain)
```
**Ref:** [AAVE_CHAIN138_AND_MARIONETTE_TSUNAMISWAP_PLAN.md](AAVE_CHAIN138_AND_MARIONETTE_TSUNAMISWAP_PLAN.md), [OPERATIONAL_RUNBOOKS.md](../03-deployment/OPERATIONAL_RUNBOOKS.md) § TsunamiSwap.
---
## 5d. Sankofa Phoenix API — Enable railing proxy
**Ref:** [PHOENIX_RAILING_OPERATOR_SETUP.md](../04-configuration/PHOENIX_RAILING_OPERATOR_SETUP.md)
In the environment where **Sankofa Phoenix API** runs, set:
```bash
export PHOENIX_RAILING_URL=http://phoenix-deploy-api:4001 # or your Phoenix Deploy API URL
# Optional if railing enforces partner keys:
export PHOENIX_RAILING_API_KEY=<key>
```
Restart the API; then `/api/v1/infra/nodes`, `/api/v1/health/summary`, etc. will proxy to the railing.
---
## 5a. LAN: Token-aggregation DB and migrations (VMID 5000)
If `/health` returns "database token_aggregation does not exist":
```bash
./scripts/apply-token-aggregation-fix.sh # create DB, run migrations, restart (via Proxmox)
./scripts/apply-token-aggregation-fix.sh --dry-run # print steps only
```
If VMID 5000 has no `postgres` user, run `createdb` and migrations on the host where PostgreSQL runs, or set token-aggregation `DATABASE_URL` to `explorer_db` and run `smom-dbis-138/services/token-aggregation/scripts/run-migrations.sh` there.
---
## 5b. LAN: Chain 138 next steps (Phase 2: preflight → mirror+pool → register c* as GRU → verify)
**Ref:** [DEPLOYMENT_ORDER_OF_OPERATIONS](../03-deployment/DEPLOYMENT_ORDER_OF_OPERATIONS.md) Phase 2. Use when mirror/pool/GRU registration or verify are pending.
```bash
./scripts/deployment/run-all-next-steps-chain138.sh --dry-run # print steps only
./scripts/deployment/run-all-next-steps-chain138.sh # run all (preflight, deploy mirror+pool, register c*, verify)
./scripts/deployment/run-all-next-steps-chain138.sh --skip-mirror # pool + register + verify only (set TRANSACTION_MIRROR_ADDRESS in smom-dbis-138/.env first)
```
If TransactionMirror deploy fails with **CreateCollision:** set `TRANSACTION_MIRROR_ADDRESS=0xC7f2Cf4845C6db0e1a1e91ED41Bcd0FcC1b0E141` in `smom-dbis-138/.env` and re-run with `--skip-mirror`. See [TRANSACTION_MIRROR_CHAIN138_COLLISION_FIX](../03-deployment/TRANSACTION_MIRROR_CHAIN138_COLLISION_FIX.md).
---
## 6. Low: DODO PMM on Chain 138
**Ref:** [OPTIONAL_DEPLOYMENTS_START_HERE](../07-ccip/OPTIONAL_DEPLOYMENTS_START_HERE.md) §2B
**Prereqs:** Set in `smom-dbis-138/.env`: `DODO_VENDING_MACHINE_ADDRESS`, `COMPLIANT_USDT_ADDRESS`, `COMPLIANT_USDC_ADDRESS`.
```bash
./scripts/run-optional-deployments.sh --execute --phases 7
# Or from smom-dbis-138: ./scripts/deployment/deploy-optional-future-all.sh (Phase 7 = DODO)
```
---
## 7. Low: Mainnet trustless stack (Lockbox138 + Mainnet)
**Ref:** [OPTIONAL_DEPLOYMENTS_START_HERE](../07-ccip/OPTIONAL_DEPLOYMENTS_START_HERE.md) §2C
**Prereqs:** `ETHEREUM_MAINNET_RPC`, Mainnet ETH for deployer.
```bash
cd smom-dbis-138
source .env
forge script script/bridge/trustless/DeployTrustlessBridge.s.sol:DeployTrustlessBridge \
--rpc-url "$ETHEREUM_MAINNET_RPC" --broadcast --via-ir --verify
# Then: Lockbox138 on 138; configure Lockbox138↔InboxETH; fund liquidity. See runbook §C.
```
---
## 8. Wave 0: sendCrossChain (real) and NPMplus backup
**sendCrossChain (real):** Requires `PRIVATE_KEY` and LINK approved in `.env`. Bridge: `0xcacfd227A040002e49e2e01626363071324f820a`.
```bash
bash scripts/bridge/run-send-cross-chain.sh 0.01 [recipient_address]
# Omit --dry-run to execute. Example: bash scripts/bridge/run-send-cross-chain.sh 0.01 0x...
```
**NPMplus backup:** Requires `NPM_PASSWORD` in `.env` and host on LAN.
```bash
bash scripts/verify/backup-npmplus.sh
# Or combined Wave 0: bash scripts/run-wave0-from-lan.sh
```
**NPMplus RPC fix (405):** From LAN: `bash scripts/nginx-proxy-manager/update-npmplus-proxy-hosts-api.sh`. Verify: `bash scripts/verify/verify-end-to-end-routing.sh`.
**Status (2026-03-26):** main NPMplus API update completed successfully with `39 hosts updated, 0 failed`; public E2E now passes for Sankofa root, Phoenix, Studio, and The Order. Re-run only when upstream targets or proxy definitions change.
**Latest backup evidence:** `backups/npmplus/backup-20260326_115622.tar.gz`
**NPMplus API unreachable (167/169):** Restart Docker inside NPMplus LXC: `./scripts/maintenance/fix-npmplus-services-via-proxmox-ssh.sh` (SSH to r630-01, restarts npmplus in 10233 and 10235).
**If port 81 accepts TCP but hangs at HTTP:** reboot CT `10233` with `pct reboot 10233` on `r630-01`, then retry the API updater.
**E2E from LAN (no public DNS):** If E2E fails at DNS (`Could not resolve host`), use [E2E_DNS_FROM_LAN_RUNBOOK.md](../04-configuration/E2E_DNS_FROM_LAN_RUNBOOK.md): append `config/e2e-hosts-append.txt` to `/etc/hosts`, then run `E2E_USE_SYSTEM_RESOLVER=1 ./scripts/verify/verify-end-to-end-routing.sh --profile=public`. Revert with `sudo ./scripts/verify/remove-e2e-hosts-from-etc-hosts.sh`.
**E2E profiles:** Use `--profile=public` for public endpoints (default) or `--profile=private` for private/admin RPC only. Run sequentially to avoid timestamp collision in evidence dirs. **Known E2E warnings** (502/404 and WS): [E2E_ENDPOINTS_LIST.md](../04-configuration/E2E_ENDPOINTS_LIST.md) § Known E2E warnings and Remediation. MIM4U web 502s and WS test-format warnings are **non-blocking** for contract/pool completion.
**Pre-PR validation:** Before opening PRs (Chainlist, token list, Trust Wallet), run `./scripts/run-before-pr-validations.sh` from repo root.
---
## 8.5 PMM mesh (6s oracle / keeper / PMMWETH poll)
**Ref:** `smom-dbis-138/docs/integration/ORACLE_AND_KEEPER_CHAIN138.md` (PMM mesh automation)
```bash
cd smom-dbis-138
# .env should include: PRIVATE_KEY, AGGREGATOR_ADDRESS, PRICE_FEED_KEEPER_ADDRESS (optional: KEEPER_PRIVATE_KEY if different from PRIVATE_KEY)
./scripts/reserve/set-price-feed-keeper-interval.sh 6 # once per keeper deployment if interval was 30s
./scripts/update-oracle-price.sh # verify transmitter + gas (Besu needs explicit gas limit in script)
./scripts/reserve/sync-weth-mock-price.sh # if CHAIN138_WETH_MOCK_PRICE_FEED is set (keeper WETH path)
mkdir -p logs
nohup ./scripts/reserve/pmm-mesh-6s-automation.sh >> logs/pmm-mesh-automation.log 2>&1 &
# journalctl equivalent: tail -f logs/pmm-mesh-automation.log
```
**systemd:** `config/systemd/chain138-pmm-mesh-automation.service.example` — copy, set `User` and absolute paths, `enable --now`.
---
## 9. Wemix token verification (Deferred)
This is intentionally deferred with the rest of the Wemix path. If the chain is brought back into scope later, open [scan.wemix.com/tokens](https://scan.wemix.com/tokens); confirm WETH, USDT, USDC addresses. If different, update `config/token-mapping-multichain.json` and [WEMIX_TOKEN_VERIFICATION.md](../07-ccip/WEMIX_TOKEN_VERIFICATION.md). Then:
```bash
./scripts/validation/validate-config-files.sh
```
---
## References
- [COMPLETE_REQUIRED_OPTIONAL_RECOMMENDED_INDEX.md](COMPLETE_REQUIRED_OPTIONAL_RECOMMENDED_INDEX.md) — full plan (required, optional, recommended)
- [TODOS_CONSOLIDATED.md](TODOS_CONSOLIDATED.md) — full task list
- [NEXT_STEPS_AND_REMAINING_TODOS.md](NEXT_STEPS_AND_REMAINING_TODOS.md) — detail and completed items
- [STEPS_FROM_PROXMOX_OR_LAN_WITH_SECRETS.md](STEPS_FROM_PROXMOX_OR_LAN_WITH_SECRETS.md) — full LAN steps