Next Steps — Operator Runbook
Last Updated: 2026-02-20
Purpose: Single runbook of copy-paste commands for all remaining operator/LAN/creds steps. Use after automated steps are done.
References: REMAINING_WORK_DETAILED_STEPS.md, WAVE2_WAVE3_OPERATOR_CHECKLIST.md, INFRA_DEPLOYMENT_LOCKED_AND_LOADED.md. Single fixes checklist (required + optional): FIXES_PREPARED.md. Full fixes (validators, block/tx, Sentries, RPCs, network, optional): FULL_FIXES_PREPARED.md. All next steps (consolidated): NEXT_STEPS_ALL.md. Dev/Codespaces (76.53.10.40): DEV_CODESPACES_NEXT_STEPS_CHECKLIST.md. Dev/Codespaces completion evidence: DEV_CODESPACES_COMPLETION_20260207.md.
Completed in this session (2026-02-20)
| Item | Result |
|---|---|
| Completable tasks | run-completable-tasks-from-anywhere.sh — config validation OK, on-chain 45/45, run-all-validation --skip-genesis OK, reconcile-env --print. |
| Doc consolidation | NEXT_STEPS_INDEX, DOCUMENTATION_CONSOLIDATION_PLAN; Batch 4+5 → 00-meta-pruned; root cleanup → archive/root-cleanup-20260220; ARCHIVE_CANDIDATES "Last reviewed" set. |
Completed in previous session (2026-02-19)
| Item | Result |
|---|---|
| Completable tasks | run-completable-tasks-from-anywhere.sh — config, 46 on-chain, validation passed. |
| Operator script | run-all-operator-tasks-from-lan.sh — W0-1 skipped (off-LAN); Blockscout verify attempted (Blockscout unreachable). |
| RPC 2101 verify | verify-rpc-2101-approve-and-sync.sh — ✅ Chain 138, 19 peers, 5 validators, blocks advancing. |
| 502 script | address-all-remaining-502s.sh — backends 10130/10150/10151 OK; Besu 2101 restarted (finish from LAN for NPMplus). |
| Optional Phase 9 | Smart accounts kit (informational) — ran; next: deploy EntryPoint/AccountFactory/Paymaster. |
| E2E verification | verify-end-to-end-routing.sh with E2E_ACCEPT_502_INTERNAL=1 — run (report in verification-evidence). |
Still from LAN: NPMplus backup, Blockscout verification, full 502/NPMplus proxy update. See COMPLETION_STATUS_20260215.
Completed in previous session (2026-02-06)
| Item | Result |
|---|---|
| Validation | run-all-validation.sh --skip-genesis — passed |
| W1-1 dry-run | setup-ssh-key-auth.sh --dry-run — steps printed |
| W1-2 dry-run | firewall-proxmox-8006.sh --dry-run — UFW commands printed (ADMIN_CIDR=192.168.11.0/24) |
| NPMplus backup | backup-npmplus.sh — ran successfully (local + on host); backup pulled to backups/npmplus/backup-20260206_171756.tar.gz |
| Bridge dry-run | run-send-cross-chain.sh 0.01 --dry-run — simulated (real run when PRIVATE_KEY/LINK ready) |
| .env NPM | NPM_URL/NPM_HOST set to 192.168.11.167:81 (use .167 if .166 refuses) |
| Copy to host | Scripts copied to root@192.168.11.11:/tmp/proxmox-scripts-run (wave0, backup, secure-validator-keys, create-missing-containers, schedule cron scripts, daily-weekly-checks) |
| Wave 0 on host | Ran on r630-01: W0-1 (19 NPMplus proxy hosts updated), W0-3 (backup); backup also on host at .../backups/npmplus/backup-20260206_171756.tar.gz |
| Backup pulled | Host backup copied to local backups/npmplus/backup-20260206_171756.tar.gz |
| Validator keys | secure-validator-keys.sh --dry-run run on host — 1000–1002 would be secured; 1003–1004 not running, skipped. Use --apply on host when ready. |
| Cron scripts on host | schedule-npmplus-backup-cron.sh and schedule-daily-weekly-cron.sh (and daily-weekly-checks.sh) copied; use --show then --install from /tmp/proxmox-scripts-run if you want cron there (note: /tmp may be cleared on reboot; for permanent cron, clone repo to a persistent path on the host). |
| Cron installed on host | NPMplus backup cron (03:00) and daily/weekly cron (08:00 daily, Sun 09:00 weekly) installed on root@192.168.11.11. Logs: /tmp/proxmox-scripts-run/logs/npmplus-backup.log, daily-weekly-checks.log. |
| Validator keys applied | secure-validator-keys.sh run on host (no --dry-run): VMIDs 1000, 1001, 1002 secured (chmod 600/700, chown besu); 1003, 1004 not running, skipped. |
Wave 0 — Gates
W0-2: sendCrossChain (real)
When: PRIVATE_KEY and LINK (or fee token) approved in .env; you are ready to broadcast.
cd /path/to/proxmox
# Optional: dry-run first
bash scripts/bridge/run-send-cross-chain.sh 0.01 --dry-run
# Real (no --dry-run)
bash scripts/bridge/run-send-cross-chain.sh 0.01
# Or with recipient:
bash scripts/bridge/run-send-cross-chain.sh 0.01 0xYourRecipientAddress
Bridge contract (reference): 0x971cD9D156f193df8051E48043C476e53ECd4693. Ensure CCIPWETH9_BRIDGE_CHAIN138 and RPC_URL_138/CHAIN138_RPC in .env.
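Before the real broadcast, it may help to confirm that the variables named above are actually present in .env. The following is a minimal sketch, not part of the repo: the function names are illustrative, and it assumes .env holds plain KEY=value lines (optionally prefixed with export). It only checks that something follows the equals sign, so a value such as "" would still pass.

```shell
# Succeeds if KEY has something after the equals sign in the given env file.
env_has() {
  local file="$1" key="$2"
  grep -Eq "^[[:space:]]*(export[[:space:]]+)?${key}=.+" "$file"
}

# Prints each missing variable and returns non-zero if any is missing.
check_bridge_env() {
  local file="${1:-.env}" missing=0 key
  for key in PRIVATE_KEY CCIPWETH9_BRIDGE_CHAIN138; do
    if ! env_has "$file" "$key"; then
      echo "missing: $key"
      missing=1
    fi
  done
  # Either RPC variable name is accepted, per the note above.
  if ! env_has "$file" RPC_URL_138 && ! env_has "$file" CHAIN138_RPC; then
    echo "missing: RPC_URL_138 or CHAIN138_RPC"
    missing=1
  fi
  return "$missing"
}
```

Usage: run check_bridge_env .env from the repo root and proceed to the real send only if it prints nothing. The fee-token variable is not checked because its name is not stated here.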
W0-3: NPMplus backup (re-run anytime)
Backup already ran once; re-run when NPMplus is up and you want a fresh backup:
cd /path/to/proxmox
bash scripts/verify/backup-npmplus.sh
From a host without NPM API access, use: bash scripts/run-via-proxmox-ssh.sh wave0 --host 192.168.11.11 (r630-01) to run W0-1 + W0-3 on the host.
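After a backup run, a quick integrity check on the resulting archive can catch a truncated or empty file early. This is a sketch under the assumption that backups are gzip-compressed tarballs, as the backup-20260206_171756.tar.gz name suggests; verify_backup is a hypothetical helper, not a script in the repo.

```shell
# Succeeds only if the file exists, is non-empty, and lists cleanly as a gzip tar.
verify_backup() {
  local archive="$1"
  [ -s "$archive" ] || { echo "missing or empty: $archive"; return 1; }
  tar -tzf "$archive" >/dev/null 2>&1 || { echo "unreadable archive: $archive"; return 1; }
  echo "ok: $archive"
}
```

Usage: verify_backup backups/npmplus/backup-20260206_171756.tar.gz. Listing the archive proves it is readable, not that its contents are complete.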
Crontab (install on jump host or Proxmox node)
cd /path/to/proxmox
# Show lines
bash scripts/maintenance/schedule-npmplus-backup-cron.sh --show
bash scripts/maintenance/schedule-daily-weekly-cron.sh --show
# Install
bash scripts/maintenance/schedule-npmplus-backup-cron.sh --install
bash scripts/maintenance/schedule-daily-weekly-cron.sh --install
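For orientation, the schedule recorded in this runbook (NPMplus backup at 03:00, daily checks at 08:00, weekly checks Sunday at 09:00) maps onto crontab fields roughly as sketched below. The exact lines the --show option prints may differ: the location of daily-weekly-checks.sh and its daily/weekly arguments are assumptions made for illustration, and print_cron_lines is a hypothetical helper.

```shell
# Prints illustrative crontab entries for a repo cloned at REPO.
# Fields are: minute hour day-of-month month day-of-week command.
print_cron_lines() {
  local repo="$1" logs="$1/logs"
  echo "0 3 * * * cd $repo && bash scripts/verify/backup-npmplus.sh >> $logs/npmplus-backup.log 2>&1"
  # Assumed path and arguments for the checks script:
  echo "0 8 * * * cd $repo && bash scripts/maintenance/daily-weekly-checks.sh daily >> $logs/daily-weekly-checks.log 2>&1"
  echo "0 9 * * 0 cd $repo && bash scripts/maintenance/daily-weekly-checks.sh weekly >> $logs/daily-weekly-checks.log 2>&1"
}
```

Comparing this against the real --show output is a quick way to confirm the repo path baked into the cron lines is a persistent one rather than /tmp.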
Wave 1 — Security (run on each Proxmox host or via SSH)
W1-1: SSH key-based auth (disable password)
Pre-requisite: Deploy SSH keys to all hosts (ssh-copy-id root@<host>); test login; have break-glass access.
cd /path/to/proxmox
# On each Proxmox host (or: ssh root@192.168.11.11 'cd /path/to/proxmox && bash scripts/security/setup-ssh-key-auth.sh --apply')
bash scripts/security/setup-ssh-key-auth.sh --apply
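Because --apply disables password login, the "test login" prerequisite is worth scripting. The sketch below uses standard OpenSSH client options (BatchMode, PasswordAuthentication, ConnectTimeout) so that the check fails instead of falling back to a password prompt. The function names and the SSH_BIN override are illustrative and exist only so the logic can be exercised without a network.

```shell
# Succeeds only if HOST accepts key-based login with no password prompt.
key_login_ok() {
  local host="$1"
  "${SSH_BIN:-ssh}" -o BatchMode=yes -o PasswordAuthentication=no \
    -o ConnectTimeout=5 "root@$host" true
}

# Checks every host given; prints one line per host, fails if any host fails.
check_all_hosts() {
  local rc=0 host
  for host in "$@"; do
    if key_login_ok "$host"; then
      echo "key login ok: $host"
    else
      echo "key login FAILED: $host"
      rc=1
    fi
  done
  return "$rc"
}
```

Usage: check_all_hosts 192.168.11.11, adding the other Proxmox hosts as arguments. Run --apply only on hosts that report ok, and keep break-glass access available regardless.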
W1-2: Firewall — restrict Proxmox API port 8006
Pre-requisite: Run on host where UFW is used (or apply equivalent iptables). Default CIDR: 192.168.11.0/24.
cd /path/to/proxmox
# Dry-run (already done)
bash scripts/security/firewall-proxmox-8006.sh --dry-run
# Apply (allow only ADMIN_CIDR)
bash scripts/security/firewall-proxmox-8006.sh --apply
# Or pass the CIDR explicitly (this example repeats the default):
bash scripts/security/firewall-proxmox-8006.sh --apply 192.168.11.0/24
Then verify: https://<proxmox-ip>:8006 only from allowed IPs.
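The verification step can be sketched as a small probe run once from an address inside ADMIN_CIDR and once from outside it. This is an illustration, not a repo script: probe_8006 and the CURL_BIN override are hypothetical names, and -k is used on the assumption that the host still serves a self-signed certificate.

```shell
# Prints "reachable" or "blocked" for port 8006 on HOST.
# "blocked" also covers a host that is down, so confirm the host is up first.
probe_8006() {
  local host="$1"
  if "${CURL_BIN:-curl}" -k -s -o /dev/null --connect-timeout 5 "https://$host:8006/"; then
    echo "reachable"
  else
    echo "blocked"
  fi
}
```

Expected pattern after --apply: "reachable" from an admin workstation on 192.168.11.0/24 and "blocked" from any other network.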
W1-19: Secure validator keys (on Proxmox host as root)
cd /path/to/proxmox
bash scripts/secure-validator-keys.sh --dry-run # review
bash scripts/secure-validator-keys.sh # apply (chmod 600, chown besu)
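The effect noted for this step (chmod 600/700, chown besu) can be illustrated on a single directory. This is a sketch of the permission model only, not the script's actual implementation: secure_key_dir is a hypothetical helper, and the key directory path inside each container is not stated in this runbook.

```shell
# Tightens one key directory: 700 on directories, 600 on files.
# Pass an owner (e.g. besu) as the second argument to also chown; that needs root.
secure_key_dir() {
  local dir="$1" owner="${2:-}"
  [ -d "$dir" ] || { echo "not a directory: $dir"; return 1; }
  find "$dir" -type d -exec chmod 700 {} +
  find "$dir" -type f -exec chmod 600 {} +
  if [ -n "$owner" ]; then
    chown -R "$owner" "$dir"
  fi
}
```

With these modes only the owning user can read the key files or traverse the directory, which is the point of running the real script after the --dry-run review.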
VMIDs 2506, 2507, 2508 — Destroyed 2026-02-08
Containers 2506, 2507, 2508 were removed and destroyed on all Proxmox hosts. Script: scripts/destroy-vmids-2506-2508.sh. Besu RPC range is 2500–2505 only. See MISSING_CONTAINERS_LIST.md.
Dev/Codespaces (76.53.10.40) — Full completion
Single ordered checklist: 04-configuration/DEV_CODESPACES_NEXT_STEPS_CHECKLIST.md — Phases 1–7 (fourth NPMplus, dev VM, UDM port forward, Cloudflare tunnel, NPMplus proxy hosts, projects/dotenv, verification).
Key commands (after fourth NPMplus and dev VM exist):
| Step | Command |
|---|---|
| Create fourth NPMplus LXC (10236 @ 192.168.11.170) | bash scripts/npmplus/create-npmplus-fourth-container.sh |
| Create dev VM (5700 @ 192.168.11.59) | bash scripts/create-dev-vm-5700.sh |
| Setup dev VM users + Gitea | ssh root@192.168.11.11 "pct exec 5700 -- bash -s" < scripts/setup-dev-vm-users-and-gitea.sh |
| Tunnel + DNS (set CLOUDFLARE_TUNNEL_ID_DEV_CODESPACES in .env first) | bash scripts/cloudflare/configure-dev-codespaces-tunnel-and-dns.sh |
| Fourth NPMplus proxy hosts | NPM_URL=https://192.168.11.170:81 NPM_PASSWORD='...' bash scripts/nginx-proxy-manager/update-npmplus-fourth-proxy-hosts.sh |
UDM Pro: add port forward 76.53.10.40 → 192.168.11.170 (80/81/443), optional 22 → 192.168.11.59. See UDM_PRO_DEV_CODESPACES_PORT_FORWARD.md.
Wave 2 & Wave 3 — Full checklist
Use the ordered checklist:
- WAVE2_WAVE3_OPERATOR_CHECKLIST.md — W2-1 (monitoring) through W2-8 (NPMplus HA), then W3-1 (CCIP Fleet), W3-2 (Phase 4 isolation).
Summary:
| Wave | Tasks |
|---|---|
| W2-1 | Monitoring stack (Prometheus, Grafana, Loki, Alertmanager) |
| W2-2 | Grafana via Cloudflare Access; alerts |
| W2-3 | VLAN enablement (UDM Pro, Proxmox bridge) |
| W2-4 | Phase 3 CCIP: Ops/Admin (5400–5401); NAT; scripts |
| W2-5 | Phase 4 sovereign tenant VLANs |
| W2-6 | |
| W2-7 | DBIS services (10100–10151) |
| W2-8 | NPMplus HA (optional) |
| W3-1 | CCIP Fleet (commit/execute/RMN nodes) |
| W3-2 | Phase 4 tenant isolation enforcement |
Explorer SSL (manual)
If explorer.d-bis.org shows "Your connection isn't private":
- Open NPMplus: https://192.168.11.167:81 (credentials: NPM_EMAIL, NPM_PASSWORD from .env).
- SSL Certificates → Add Let's Encrypt for explorer.d-bis.org (DNS Challenge + Cloudflare credential if needed).
- Proxy Hosts → explorer.d-bis.org → SSL tab → assign cert, Force SSL, Save.
See EXPLORER_TROUBLESHOOTING.md.
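Once the certificate is assigned, the served certificate can be inspected from the command line. The sketch below uses the standard openssl s_client and openssl x509 subcommands; the helper names are illustrative, and days_until assumes GNU date for parsing openssl's date format.

```shell
# Prints the issuer and expiry of the certificate served for HOST on port 443.
show_cert() {
  local host="$1"
  echo | openssl s_client -servername "$host" -connect "$host:443" 2>/dev/null \
    | openssl x509 -noout -issuer -enddate
}

# Whole days between NOW (epoch seconds, default: current time) and an
# openssl-style date such as "Jun  1 12:00:00 2026 GMT". Needs GNU date.
days_until() {
  local not_after="$1" now="${2:-$(date +%s)}"
  echo $(( ( $(date -d "$not_after" +%s) - now ) / 86400 ))
}
```

Usage: show_cert explorer.d-bis.org, then pass the value after notAfter= to days_until. An issuer other than Let's Encrypt suggests the proxy host is still serving a default or stale certificate.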
E2E 502s (when public domains return 502)
From LAN (SSH to Proxmox + reach NPMplus):
| Goal | Command |
|---|---|
| Fix all 502 backends + NPMplus proxy + RPC diagnostics | ./scripts/maintenance/address-all-remaining-502s.sh |
| Also Besu config fix + E2E at end | ./scripts/maintenance/address-all-remaining-502s.sh --run-besu-fix --e2e |
| Re-run E2E only | ./scripts/verify/verify-end-to-end-routing.sh |
Runbook: 502_DEEP_DIVE_ROOT_CAUSES_AND_FIXES.md.
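To see which public domains currently return 502 before or after running the fix script, a plain status probe is often enough. This sketch is independent of verify-end-to-end-routing.sh and does not reproduce its logic; the function names and the CURL_BIN override are illustrative.

```shell
# Prints the HTTP status code for URL (curl reports 000 if no response arrives).
http_status() {
  "${CURL_BIN:-curl}" -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Reads "url status" pairs on stdin and prints only the URLs that returned 502.
list_502s() {
  awk '$2 == "502" { print $1 }'
}

# Probes each URL given as an argument and reports the ones returning 502.
report_502s() {
  local url
  for url in "$@"; do
    echo "$url $(http_status "$url")"
  done | list_502s
}
```

Usage: report_502s https://explorer.d-bis.org, adding the other public domains as further arguments. Empty output means none of the probed URLs answered with 502.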
Remaining (operator only)
- W0-2 — sendCrossChain real (when PRIVATE_KEY/LINK ready).
- W1-1 / W1-2 — SSH key auth and firewall 8006 --apply on each Proxmox host (after keys deployed / CIDR decided).
- Cron — ✅ Installed on root@192.168.11.11 (NPMplus 03:00; daily 08:00; weekly Sun 09:00). Re-install if you move repo to a permanent path.
- Validator keys — ✅ Applied on host for 1000–1002; 1003–1004 skipped (not running). Re-run when 1003/1004 are up if needed.
- 2506–2508 — Destroyed 2026-02-08; no action.
- Wave 2 / 3 — Monitoring, VLAN, CCIP, NPMplus HA, Phase 4 per WAVE2_WAVE3_OPERATOR_CHECKLIST.
- Explorer SSL — Let's Encrypt for explorer.d-bis.org in NPMplus UI (see above). One-time (and after NPMplus restore if certs lost).
- Explorer VM 5000 thin pool — If thin1-r630-02 is >85% or full, migrate VMID 5000 to thin5 per BLOCKSCOUT_FIX_RUNBOOK.md § "Fix: Migrate VM 5000 to thin5". Weekly cron now checks thin pool (138a); act when it warns or fails.
- NPMplus cert 134 (cross-all.defi-oracle.io) — If verification reports "cert files missing" for cert ID 134: in NPMplus at https://192.168.11.167:81 → SSL Certificates → find cross-all.defi-oracle.io → re-save or request Let's Encrypt again to restore cert files on disk.
- Dev/Codespaces (76.53.10.40) — Complete all phases in DEV_CODESPACES_NEXT_STEPS_CHECKLIST.md: fourth NPMplus (10236), dev VM (5700), UDM port forward, Cloudflare tunnel, NPMplus fourth proxy hosts, Let's Encrypt, rsync/dotenv, verification.
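For the thin pool item above, the 85% threshold can be checked by hand between weekly cron runs. The sketch below is an assumption-laden illustration: it reads Data% with the standard lvs data_percent field, but the volume group and pool names on r630-02 are not given here and must be supplied by the operator, and both helper names are hypothetical.

```shell
# Succeeds (exit 0) when PERCENT is strictly above THRESHOLD (default 85).
pool_over_threshold() {
  awk -v p="$1" -v t="${2:-85}" 'BEGIN { exit !(p + 0 > t + 0) }'
}

# Prints Data% for a thin pool given as VG/LV; run as root on the Proxmox host.
thin_pool_percent() {
  lvs --noheadings -o data_percent "$1" | tr -d ' '
}
```

Usage, with VG/LV replaced by the real pool: pool_over_threshold "$(thin_pool_percent VG/LV)" && echo "migrate VMID 5000".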
After running "complete all next steps"
- Automated (workspace): bash scripts/run-all-next-steps.sh — report in docs/04-configuration/verification-evidence/NEXT_STEPS_RUN_*.md.
- Validators + tx-pool: bash scripts/fix-all-validators-and-txpool.sh (requires SSH to .10, .11).
- Flush stuck tx (if any): bash scripts/flush-stuck-tx-rpc-and-validators.sh --full (clears RPC 2101 + validators 1000–1004).
- Verify from LAN: From a host on 192.168.11.x run bash scripts/monitoring/monitor-blockchain-health.sh and bash scripts/skip-stuck-transactions.sh. See NEXT_STEPS_COMPLETION_RUN_20260208.md § Verify from LAN.
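Independently of the monitoring script, a node's basic health can be read directly over JSON-RPC using the standard eth_chainId, net_peerCount, and eth_blockNumber methods. This is a minimal sketch, not the repo's monitor: the helper names are illustrative, the RPC URL is assumed to come from RPC_URL_138 in .env, and the sed-based parsing assumes the node returns a simple string result.

```shell
# Converts a 0x-prefixed hex quantity (as returned by JSON-RPC) to decimal.
hex_to_dec() {
  printf '%d\n' "$1"
}

# Extracts the string "result" field from a JSON-RPC response on stdin.
extract_result() {
  sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sends a parameterless JSON-RPC call to URL and prints the result field.
rpc_result() {
  curl -s -X POST -H 'Content-Type: application/json' \
    --data "{\"jsonrpc\":\"2.0\",\"method\":\"$2\",\"params\":[],\"id\":1}" "$1" \
    | extract_result
}

# Prints chain ID, peer count, and current block height for the node at URL.
rpc_summary() {
  local url="$1"
  echo "chain: $(hex_to_dec "$(rpc_result "$url" eth_chainId)")"
  echo "peers: $(hex_to_dec "$(rpc_result "$url" net_peerCount)")"
  echo "block: $(hex_to_dec "$(rpc_result "$url" eth_blockNumber)")"
}
```

Usage: rpc_summary "$RPC_URL_138", run twice a few seconds apart. A healthy node should report chain 138 and a block number that increases between the two runs; a value of 0 usually means the call returned nothing.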
Quick command index
| Goal | Command |
|---|---|
| Run all automated next steps | bash scripts/run-all-next-steps.sh (validation, E2E, explorer check, dry-runs; report in verification-evidence/NEXT_STEPS_RUN_*.md) |
| W0-2 real | bash scripts/bridge/run-send-cross-chain.sh 0.01 |
| W0-3 backup | bash scripts/verify/backup-npmplus.sh |
| W0 from LAN | bash scripts/run-wave0-from-lan.sh |
| W1-1 apply | bash scripts/security/setup-ssh-key-auth.sh --apply (on each host) |
| W1-2 apply | bash scripts/security/firewall-proxmox-8006.sh --apply |
| NPMplus cron | bash scripts/maintenance/schedule-npmplus-backup-cron.sh --install |
| Daily/weekly cron | bash scripts/maintenance/schedule-daily-weekly-cron.sh --install |
| Validator keys | On Proxmox: bash scripts/secure-validator-keys.sh (after --dry-run) |
| Wave 0 via SSH | bash scripts/run-via-proxmox-ssh.sh wave0 --host 192.168.11.11 |
| Request cert (via SSH) | bash scripts/run-via-proxmox-ssh.sh request-cert --host 192.168.11.11 |
| Fourth NPMplus container | bash scripts/npmplus/create-npmplus-fourth-container.sh |
| Dev VM create | bash scripts/create-dev-vm-5700.sh |
| Dev/Codespaces tunnel+DNS | bash scripts/cloudflare/configure-dev-codespaces-tunnel-and-dns.sh (set CLOUDFLARE_TUNNEL_ID_DEV_CODESPACES in .env) |
| Fourth NPMplus proxy hosts | NPM_URL=https://192.168.11.170:81 NPM_PASSWORD='...' bash scripts/nginx-proxy-manager/update-npmplus-fourth-proxy-hosts.sh |
| Address all 502s (LAN) | ./scripts/maintenance/address-all-remaining-502s.sh (use --run-besu-fix --e2e for full flow) |
| E2E routing (after NPMplus/DNS change) | bash scripts/verify/verify-end-to-end-routing.sh |
| Explorer E2E from LAN (after frontend/Blockscout deploy) | bash explorer-monorepo/scripts/e2e-test-explorer.sh |
| Blockscout migrations (version/config change) | On r630-02: bash scripts/fix-blockscout-ssl-and-migrations.sh — see BLOCKSCOUT_FIX_RUNBOOK.md |
| When decommissioning RPC used by explorer | Update Blockscout RPC URL on VM 5000; restart Blockscout — see OPERATIONAL_RUNBOOKS.md § "When decommissioning or changing RPC nodes" |