Immediate Actions Execution Review

Date: 2026-01-20
Review of: Execution results from immediate actions


Executive Summary

Successes

  1. CPU Load Reduction: ml110 CPU usage dropped from 81.5% to 39.2% (a 52% relative reduction, or 42.3 percentage points)
  2. 7 Containers Successfully Migrated to r630-01:
    • besu-validator-1, 2, 3 (containers 1000, 1001, 1002)
    • besu-sentry-1, 2, 3 (containers 1500, 1501, 1502)
    • besu-rpc-core-1 (container 2101)
  3. r630-01 Utilization: CPU usage increased from 8.2% to 12.9% (still very healthy)
  4. All containers running successfully after migration
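
The 52% figure is a relative reduction. As a quick check of that arithmetic from the two ml110 readings reported above (the function name is illustrative, not part of any existing tooling):

```shell
#!/bin/sh
# relative_reduction BEFORE AFTER
# Prints the relative drop from BEFORE to AFTER as a percentage, one decimal.
relative_reduction() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

relative_reduction 81.5 39.2   # ml110 CPU before and after; prints 51.9
```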

⚠️ Issues Encountered

1. Storage Incompatibility on r630-02

Problem: All 7 migrations to r630-02 failed with error:

storage 'local-lvm' is not available on node 'r630-02'

Root Cause:

  • Containers on ml110 use local-lvm storage
  • r630-02 has different storage pools: thin1-r630-02, thin2, thin3, thin4, thin5, thin6
  • The standard pct migrate command does not remap storage on its own, so a volume on local-lvm has no matching storage to land on at r630-02
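
A pre-flight check along these lines could catch the mismatch before a migration is attempted. This is a minimal sketch, assuming the target node's storage list is obtained from `pvesm status`; the helper names and the sample rows (with made-up sizes) are hypothetical and only illustrate the parsing:

```shell
#!/bin/sh
# active_storages: read `pvesm status` output on stdin and print the names of
# storages whose Status column is "active", one per line.
active_storages() {
  awk 'NR > 1 && $3 == "active" { print $1 }'
}

# storage_on_target STORAGE: succeed if STORAGE is one of the names on stdin.
storage_on_target() {
  grep -qx -- "$1"
}

# Illustrative sample with placeholder sizes; real input would come from
# running `pvesm status` on r630-02 (for example over SSH).
sample='Name             Type     Status     Total      Used Available        %
local             dir     active      1000       100       900   10.00%
thin1-r630-02 lvmthin     active      1000        10       990    1.00%
thin2         lvmthin     active      1000       900       100   90.00%'

if printf '%s\n' "$sample" | active_storages | storage_on_target local-lvm; then
  echo "local-lvm: available on target"
else
  echo "local-lvm: NOT available on target, choose a target storage explicitly"
fi
```

Running the same check for each storage a container uses, before calling any migrate command, turns the failure into a planning step rather than a failed task.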

Affected Containers:

  • besu-validator-4, 5 (1003, 1004)
  • besu-sentry-4, besu-sentry-ali (1503, 1504)
  • besu-rpc-public-1 (2201)
  • besu-rpc-ali-0x8a (2303)
  • besu-rpc-thirdweb-0x8a-1 (2401)

2. thin2 Storage Migration Issue

Problem: Container 5000 (blockscout-1) migration failed due to incorrect command syntax:

Unknown option: storage
pct migrate <vmid> <target> [OPTIONS]

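Root Cause: pct migrate does not accept a --storage option, so the command was rejected before any migration started. The target storage has to be supplied another way, for example through the API-based migration.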

Current Status:

  • Container 5000 still on thin2 (200GB disk, 96% used)
  • Container 6200 also on thin2 (50GB disk)
  • thin2 is at 88.86% capacity (210.7GB used of 226.13GB)

Current System State

ml110

  • Before: 23 containers, 81.5% CPU usage
  • After: 16 containers, 39.2% CPU usage
  • Improvement: 52% CPU reduction
  • Remaining High-CPU Containers:
    • besu-validator-4 (95.2% CPU) - Failed to migrate
    • besu-validator-5 (60.9% CPU) - Failed to migrate
    • besu-sentry-4 (96.8% CPU) - Failed to migrate
    • besu-sentry-ali (94.1% CPU) - Failed to migrate
    • besu-rpc-public-1 (80.0% CPU) - Failed to migrate
    • besu-rpc-ali-0x8a (93.3% CPU) - Failed to migrate
    • besu-rpc-thirdweb-0x8a-1 (94.1% CPU) - Failed to migrate

r630-01

  • Before: 50 containers, 8.2% CPU usage
  • After: 57 containers, 12.9% CPU usage
  • Status: Healthy, well within capacity

r630-02

  • Before: 7 containers, 5.3% CPU usage
  • After: 7 containers, 5.3% CPU usage
  • Status: ⚠️ Still underutilized - migrations failed

Solutions Required

1. Fix r630-02 Migrations (High Priority)

Solution: Use API-based migration with storage parameter:

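# Method 1: Use the pvesh API with an explicit target storage.
# The LXC migrate endpoint is expected to name this parameter target-storage, and
# running containers need restart mode because live migration is not available for LXC.
# Confirm both on your release: pvesh usage /nodes/ml110/lxc/<vmid>/migrate --verbose
pvesh create /nodes/ml110/lxc/<vmid>/migrate \
  --target r630-02 \
  --target-storage thin1-r630-02 \
  --restart 1

# Method 2: Stop the container, then migrate it offline with the same storage mapping
pct stop <vmid>
pct migrate <vmid> r630-02 --target-storage thin1-r630-02
pct start <vmid>   # run on r630-02 once the migration task has finished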

Available Storage on r630-02:

  • thin1-r630-02: 0.34% used (225.36 GiB available) (recommended)
  • thin3: 3.11% used (219.10 GiB available)
  • thin4: 22.59% used (175.05 GiB available)
  • thin5: 0.00% used (226.13 GiB available)
  • thin6: 0.00% used (226.13 GiB available)

2. Fix thin2 Capacity Issue (Critical)

Containers Using thin2:

  • CT 5000 (blockscout-1): 200GB disk, 96% used
  • CT 6200: 50GB disk, 10% used
  • Orphaned volume: vm-6201-disk-0 (50GB, 7.72% used) - may be unused
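
The thin2 figures quoted in this report can be cross-checked with simple arithmetic. The sketch below treats GB and GiB as interchangeable for a rough comparison, which is an assumption. Summing the per-volume usage gives roughly the reported 88.86%, while 210.7 of 226.13 works out higher, so one of the two figures may come from a different sample or unit:

```shell
#!/bin/sh
# percent A B: print A as a percentage of B, one decimal place.
percent() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b * 100 }'
}

# Pool-level figures as reported: 210.7 used of 226.13.
percent 210.7 226.13            # prints 93.2

# Sum of the per-volume figures: 200 GB at 96%, 50 GB at 10%, 50 GB at 7.72%.
volumes_used=$(awk 'BEGIN { print 200 * 0.96 + 50 * 0.10 + 50 * 0.0772 }')
percent "$volumes_used" 226.13  # prints 88.8
```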

Solutions:

  1. Migrate containers to free storage:

    • Move CT 5000 off thin2 to thin1-r630-02 or thin3
    • Migrate CT 6200 to available storage
    • Clean up orphaned volumes if not in use
  2. Alternative: Expand thin2 storage if possible

Next Steps


Immediate (Critical)

  1. Complete r630-02 migrations using API-based method with storage parameter
  2. Migrate containers from thin2 to free up capacity
  3. Verify all migrations and check container health
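
The verification step could be scripted by filtering `pct list` output for containers that are not running. This is a minimal sketch; the helper name is hypothetical, and the sample rows (including the stopped state) are invented purely to show the filter, not taken from the cluster:

```shell
#!/bin/sh
# not_running: read `pct list` output on stdin and print the VMID of every
# container whose Status column is anything other than "running".
not_running() {
  awk 'NR > 1 && $2 != "running" { print $1 }'
}

# Illustrative sample; real input would be `pct list` run on the target node.
sample='VMID       Status     Lock         Name
1003       running                 besu-validator-4
1004       stopped                 besu-validator-5
1503       running                 besu-sentry-4'

printf '%s\n' "$sample" | not_running   # prints 1004 for this sample
```

An empty result means every listed container reports a running state; it says nothing about application-level health, which still needs its own check.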

High Priority

  1. Monitor CPU usage on ml110 - should stabilize around 30-40% until the remaining migrations complete
  2. Check container health after migrations
  3. Document storage mapping for future migrations

Medium Priority

  1. Investigate inactive storage pools (data/thin1 on r630-02 are node-restricted)
  2. Optimize storage distribution across all nodes
  3. Set up monitoring alerts for storage >80% and CPU >70%
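
As a starting point for the alert thresholds, the comparison itself is small. The sketch below uses a hypothetical helper fed with figures quoted in this report; how the values are collected and where alerts are sent is left open:

```shell
#!/bin/sh
# over_threshold LIMIT: read "name value" pairs on stdin and print the pairs
# whose numeric value is strictly greater than LIMIT.
over_threshold() {
  awk -v limit="$1" '$2 + 0 > limit + 0 { print $1, $2 }'
}

# Storage usage figures quoted in this report, checked against 80%.
printf '%s\n' 'thin1-r630-02 0.34' 'thin2 88.86' 'thin3 3.11' 'thin4 22.59' |
  over_threshold 80    # prints: thin2 88.86

# Current node CPU figures quoted in this report, checked against 70%.
printf '%s\n' 'ml110 39.2' 'r630-01 12.9' 'r630-02 5.3' |
  over_threshold 70    # prints nothing
```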

Migration Commands for r630-02

Using API-based Migration (Correct Method)

# On ml110 or via SSH
# The LXC migrate endpoint is expected to take target-storage (not storage), and
# running containers need --restart because live migration is not available for LXC.
# Confirm on your release: pvesh usage /nodes/ml110/lxc/1003/migrate --verbose
# For each container, use:

# besu-validator-4 (1003)
pvesh create /nodes/ml110/lxc/1003/migrate \
  --target r630-02 \
  --target-storage thin1-r630-02 \
  --restart 1

# besu-validator-5 (1004)
pvesh create /nodes/ml110/lxc/1004/migrate \
  --target r630-02 \
  --target-storage thin1-r630-02 \
  --restart 1

# besu-sentry-4 (1503)
pvesh create /nodes/ml110/lxc/1503/migrate \
  --target r630-02 \
  --target-storage thin1-r630-02 \
  --restart 1

# besu-sentry-ali (1504)
pvesh create /nodes/ml110/lxc/1504/migrate \
  --target r630-02 \
  --target-storage thin1-r630-02 \
  --restart 1

# besu-rpc-public-1 (2201)
pvesh create /nodes/ml110/lxc/2201/migrate \
  --target r630-02 \
  --target-storage thin1-r630-02 \
  --restart 1

# besu-rpc-ali-0x8a (2303)
pvesh create /nodes/ml110/lxc/2303/migrate \
  --target r630-02 \
  --target-storage thin1-r630-02 \
  --restart 1

# besu-rpc-thirdweb-0x8a-1 (2401)
pvesh create /nodes/ml110/lxc/2401/migrate \
  --target r630-02 \
  --target-storage thin1-r630-02 \
  --restart 1
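
The seven commands differ only in the container ID, so they could be generated in a loop and reviewed before anything is run. The sketch below only prints the commands. `STORAGE_FLAG` and `MODE_FLAG` are placeholders and their values are assumptions about what the installed Proxmox VE release accepts on this endpoint:

```shell
#!/bin/sh
# Assumed flag names; verify against the installed release with
#   pvesh usage /nodes/ml110/lxc/1003/migrate --verbose
STORAGE_FLAG="--target-storage"
MODE_FLAG="--restart 1"

# migrate_cmd VMID: print (do not run) the migration command for one container.
migrate_cmd() {
  echo "pvesh create /nodes/ml110/lxc/$1/migrate --target r630-02 $STORAGE_FLAG thin1-r630-02 $MODE_FLAG"
}

# The seven containers whose migration to r630-02 failed.
for vmid in 1003 1004 1503 1504 2201 2303 2401; do
  migrate_cmd "$vmid"
done
```

Printing first keeps the run auditable; once the output looks right, the same lines can be executed one at a time, checking each container before moving on to the next.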

Migrate thin2 Containers

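# On r630-02
# CT 5000 and CT 6200 already live on r630-02, so this is a volume move between
# storages rather than a node migration. Assumes each container's disk is its rootfs.
# Containers must be stopped while their volume is moved.

# Move CT 5000 (blockscout-1) to thin1-r630-02
pct stop 5000
pct move-volume 5000 rootfs thin1-r630-02 --delete 1   # --delete removes the thin2 copy after a successful move
pct start 5000

# Move CT 6200 to thin1-r630-02
pct stop 6200
pct move-volume 6200 rootfs thin1-r630-02 --delete 1
pct start 6200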

Expected Results After Completion

ml110

  • CPU Usage: ~15-20% (down from 81.5%)
  • Container Count: ~9 containers (down from 23)
  • Status: Optimally loaded for management/light workloads

r630-01

  • CPU Usage: ~15-20% (up from 8.2%)
  • Container Count: ~57 containers
  • Status: Well-balanced workload distribution

r630-02

  • CPU Usage: ~15-20% (up from 5.3%)
  • Container Count: ~14 containers (up from 7)
  • Status: Better utilization of high-core CPU
  • Storage: thin2 below 50% usage

Lessons Learned

  1. Storage Compatibility: Always check available storage on target node before migration
  2. API vs CLI: Use pvesh API for migrations when storage conversion is needed
  3. Migration Strategy: Consider two-step migration (node first, then storage) for complex scenarios
  4. Verification: Always verify migrations and check container health after completion

Report Generated: 2026-01-20
Status: Partial Success - 7/14 migrations completed successfully