Immediate Actions Execution Review
Date: 2026-01-20
Review of: Execution results from immediate actions
Executive Summary
✅ Successes
- CPU Load Reduction: ml110 CPU usage dropped from 81.5% to 39.2% (52% reduction!)
- 7 Containers Successfully Migrated to r630-01:
  - besu-validator-1, 2, 3 (containers 1000, 1001, 1002)
  - besu-sentry-1, 2, 3 (containers 1500, 1501, 1502)
  - besu-rpc-core-1 (container 2101)
- r630-01 Utilization: CPU usage increased from 8.2% to 12.9% (still very healthy)
- All containers running successfully after migration
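The 52% figure is a relative reduction, not a drop of 52 percentage points. A minimal sketch of the arithmetic, using a hypothetical helper name (relative_reduction) and the before/after CPU values reported above:

```shell
# Relative reduction in percent: (before - after) / before * 100.
# relative_reduction is an illustrative helper, not part of any Proxmox tooling.
relative_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

relative_reduction 81.5 39.2   # prints 51.9
```

In absolute terms the same change is 42.3 percentage points (81.5 minus 39.2).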
⚠️ Issues Encountered
1. Storage Incompatibility on r630-02
Problem: All 7 migrations to r630-02 failed with error:
storage 'local-lvm' is not available on node 'r630-02'
Root Cause:
- Containers on ml110 use local-lvm storage
- r630-02 has different storage pools: thin1-r630-02, thin2, thin3, thin4, thin5, thin6
- The standard pct migrate command doesn't automatically handle storage conversion
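A pre-flight check along these lines could catch the mismatch before a migration is attempted. This is a sketch only: storage_is_active is a hypothetical helper, it assumes the usual column layout of pvesm status (Name, Type, Status, ...), and the size columns in the sample are placeholders rather than measured values.

```shell
# Reads `pvesm status` output on stdin and succeeds only if the named
# storage is listed with status "active".
# Hypothetical helper; assumes columns: Name Type Status Total Used Available %
storage_is_active() {
    local wanted="$1"
    awk -v name="$wanted" '
        $1 == name && $3 == "active" { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Sample input using pool names from this review (size columns are placeholders).
sample_status='Name             Type     Status     Total      Used  Available      %
thin1-r630-02 lvmthin     active   1000000      3400     996600  0.34%
thin2         lvmthin     active   1000000    888600     111400 88.86%'

for storage in local-lvm thin1-r630-02; do
    if printf '%s\n' "$sample_status" | storage_is_active "$storage"; then
        echo "$storage: available on target"
    else
        echo "$storage: not available on target"
    fi
done
```

Against a live node the same function would read real output, for example: ssh r630-02 pvesm status | storage_is_active thin1-r630-02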
Affected Containers:
- besu-validator-4, 5 (1003, 1004)
- besu-sentry-4, ali (1503, 1504)
- besu-rpc-public-1 (2201)
- besu-rpc-ali-0x8a (2303)
- besu-rpc-thirdweb-0x8a-1 (2401)
2. thin2 Storage Migration Issue
Problem: Container 5000 (blockscout-1) migration failed due to incorrect command syntax:
Unknown option: storage
pct migrate <vmid> <target> [OPTIONS]
Root Cause: pct migrate does not accept a --storage option, so the command was rejected at argument parsing. The storage change has to be requested another way, such as API-based migration.
Current Status:
- Container 5000 still on thin2 (200GB disk, 96% used)
- Container 6200 also on thin2 (50GB disk)
- thin2 is at 88.86% capacity (210.7GB used of 226.13GB)
Current System State
ml110
- Before: 23 containers, 81.5% CPU usage
- After: 16 containers, 39.2% CPU usage
- Improvement: ✅ 52% CPU reduction
- Remaining High-CPU Containers:
  - besu-validator-4 (95.2% CPU) - Failed to migrate
  - besu-validator-5 (60.9% CPU) - Failed to migrate
  - besu-sentry-4 (96.8% CPU) - Failed to migrate
  - besu-sentry-ali (94.1% CPU) - Failed to migrate
  - besu-rpc-public-1 (80.0% CPU) - Failed to migrate
  - besu-rpc-ali-0x8a (93.3% CPU) - Failed to migrate
  - besu-rpc-thirdweb-0x8a-1 (94.1% CPU) - Failed to migrate
r630-01
- Before: 50 containers, 8.2% CPU usage
- After: 57 containers, 12.9% CPU usage
- Status: ✅ Healthy, well within capacity
r630-02
- Before: 7 containers, 5.3% CPU usage
- After: 7 containers, 5.3% CPU usage
- Status: ⚠️ Still underutilized - migrations failed
Solutions Required
1. Fix r630-02 Migrations (High Priority)
Solution: Use API-based migration with storage parameter:
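# Method 1: Use pvesh API
# Assumes a Proxmox VE release where the migrate endpoint takes target-storage
# (it has no storage parameter); confirm with: pct help migrate
# Containers cannot be live-migrated, so restart mode replaces --online 1.
pvesh create /nodes/ml110/lxc/<vmid>/migrate \
--target r630-02 \
--target-storage thin1-r630-02 \
--restart 1
# Method 2: Stop container, then migrate offline with the same storage mapping
pct stop <vmid>
pct migrate <vmid> r630-02 --target-storage thin1-r630-02
pct start <vmid> # run on r630-02 once the migration has finished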
Available Storage on r630-02:
- thin1-r630-02: 0.34% used (225.36 GiB available) ✅ Recommended
- thin3: 3.11% used (219.10 GiB available)
- thin4: 22.59% used (175.05 GiB available)
- thin5: 0.00% used (226.13 GiB available)
- thin6: 0.00% used (226.13 GiB available)
2. Fix thin2 Capacity Issue (Critical)
Containers Using thin2:
- CT 5000 (blockscout-1): 200GB disk, 96% used
- CT 6200: 50GB disk, 10% used
- Orphaned volume: vm-6201-disk-0 (50GB, 7.72% used) - may be unused
Solutions:
- Migrate containers to free storage:
  - Use pvesh API to migrate CT 5000 to thin1-r630-02 or thin3
  - Migrate CT 6200 to available storage
  - Clean up orphaned volumes if not in use
- Alternative: Expand thin2 storage if possible
Recommended Next Steps
Immediate (Critical)
- ✅ Complete r630-02 migrations using API-based method with storage parameter
- ✅ Migrate containers from thin2 to free up capacity
- ✅ Verify all migrations and check container health
High Priority
- ✅ Monitor CPU usage on ml110 - should stabilize around 30-40%
- ✅ Check container health after migrations
- ✅ Document storage mapping for future migrations
Medium Priority
- ✅ Investigate inactive storage pools (data/thin1 on r630-02 are node-restricted)
- ✅ Optimize storage distribution across all nodes
- ✅ Set up monitoring alerts for storage >80% and CPU >70%
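As a starting point for the storage alert, a small filter over pvesm status output could flag pools above the 80% mark. This is a sketch under assumptions: storages_over is a hypothetical helper, the usage percentage is assumed to be the last column, and the size columns in the sample are placeholders.

```shell
# Prints "name percent" for each storage whose usage exceeds the given limit.
# Hypothetical helper; expects `pvesm status` output on stdin.
storages_over() {
    local limit="$1"
    awk -v limit="$limit" '
        NR == 1 { next }            # skip the header row
        {
            used = $NF              # last column, e.g. "88.86%"
            sub(/%$/, "", used)
            if (used + 0 > limit + 0) print $1, used
        }
    '
}

# Sample input: percentages come from this review, size columns are placeholders.
sample_status='Name             Type     Status     Total      Used  Available      %
thin1-r630-02 lvmthin     active   1000000      3400     996600  0.34%
thin2         lvmthin     active   1000000    888600     111400 88.86%
thin4         lvmthin     active   1000000    225900     774100 22.59%'

printf '%s\n' "$sample_status" | storages_over 80
```

A cron job or monitoring agent could run pvesm status | storages_over 80 on each node and raise an alert whenever the output is non-empty.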
Migration Commands for r630-02
Using API-based Migration (Correct Method)
# On ml110 or via SSH
# Assumes a Proxmox VE release where the migrate endpoint takes target-storage
# (it has no storage parameter); confirm with: pct help migrate
# Containers cannot be live-migrated, so restart mode is used instead of --online 1.
# For each container, use:
# besu-validator-4 (1003)
pvesh create /nodes/ml110/lxc/1003/migrate \
--target r630-02 \
--target-storage thin1-r630-02 \
--restart 1
# besu-validator-5 (1004)
pvesh create /nodes/ml110/lxc/1004/migrate \
--target r630-02 \
--target-storage thin1-r630-02 \
--restart 1
# besu-sentry-4 (1503)
pvesh create /nodes/ml110/lxc/1503/migrate \
--target r630-02 \
--target-storage thin1-r630-02 \
--restart 1
# besu-sentry-ali (1504)
pvesh create /nodes/ml110/lxc/1504/migrate \
--target r630-02 \
--target-storage thin1-r630-02 \
--restart 1
# besu-rpc-public-1 (2201)
pvesh create /nodes/ml110/lxc/2201/migrate \
--target r630-02 \
--target-storage thin1-r630-02 \
--restart 1
# besu-rpc-ali-0x8a (2303)
pvesh create /nodes/ml110/lxc/2303/migrate \
--target r630-02 \
--target-storage thin1-r630-02 \
--restart 1
# besu-rpc-thirdweb-0x8a-1 (2401)
pvesh create /nodes/ml110/lxc/2401/migrate \
--target r630-02 \
--target-storage thin1-r630-02 \
--restart 1
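The seven commands differ only in the container ID, so they could be generated from a list. The sketch below is a dry run that prints the commands rather than executing them. print_migrate_cmds is a hypothetical helper, and the --target-storage and --restart parameter names are assumptions to confirm against the installed Proxmox VE release (for example with pct help migrate).

```shell
# Dry run: print one migrate command per container instead of executing it.
# Arguments: source node, target node, target storage, then the VMIDs.
# Flag names (--target-storage, --restart) are assumptions; verify before use.
print_migrate_cmds() {
    local source_node="$1" target_node="$2" target_storage="$3"
    shift 3
    local vmid
    for vmid in "$@"; do
        printf 'pvesh create /nodes/%s/lxc/%s/migrate --target %s --target-storage %s --restart 1\n' \
            "$source_node" "$vmid" "$target_node" "$target_storage"
    done
}

# VMIDs of the containers that failed to migrate to r630-02.
print_migrate_cmds ml110 r630-02 thin1-r630-02 1003 1004 1503 1504 2201 2303 2401
```

Reviewing the printed commands before running them one at a time keeps a single failure from cascading across all seven containers.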
Migrate thin2 Containers
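# On r630-02
# CT 5000 and CT 6200 already run on r630-02, so this is a volume move between
# storages on the same node; a migrate call with the local node as target is rejected.
# Assumes the disk on thin2 is each container's rootfs (check with: pct config <vmid>).
# --delete 1 removes the source volume after a successful copy; omit it to keep
# the original as an unused volume until the move has been verified.
# Move CT 5000 (blockscout-1) to thin1-r630-02
pct stop 5000
pct move-volume 5000 rootfs thin1-r630-02 --delete 1
pct start 5000
# Move CT 6200 to thin1-r630-02
pct stop 6200
pct move-volume 6200 rootfs thin1-r630-02 --delete 1
pct start 6200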
Expected Results After Completion
ml110
- CPU Usage: ~15-20% (down from 81.5%)
- Container Count: ~9 containers (down from 23)
- Status: ✅ Optimally loaded for management/light workloads
r630-01
- CPU Usage: ~15-20% (up from 8.2%)
- Container Count: ~57 containers
- Status: ✅ Well-balanced workload distribution
r630-02
- CPU Usage: ~15-20% (up from 5.3%)
- Container Count: ~14 containers (up from 7)
- Status: ✅ Better utilization of high-core CPU
- Storage: thin2 below 50% usage
Lessons Learned
- Storage Compatibility: Always check available storage on target node before migration
- API vs CLI: Use pvesh API for migrations when storage conversion is needed
- Migration Strategy: Consider two-step migration (node first, then storage) for complex scenarios
- Verification: Always verify migrations and check container health after completion
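The verification step can be partly automated by comparing the expected container IDs against pct list on the target node. This is a sketch: not_running is a hypothetical helper, and it assumes pct list prints the VMID in the first column and the status in the second.

```shell
# Reads `pct list` output on stdin and prints every expected VMID that is
# missing from the listing or not in the "running" state.
# Hypothetical helper; assumes columns: VMID Status Lock Name
not_running() {
    awk -v expected="$*" '
        BEGIN { n = split(expected, ids, " ") }
        NR > 1 { status[$1] = $2 }          # skip the header row
        END {
            for (i = 1; i <= n; i++)
                if (status[ids[i]] != "running") print ids[i]
        }
    '
}

# Illustrative listing only; it does not reflect the actual state of r630-02.
sample_list='VMID       Status     Lock         Name
1003       running                 besu-validator-4
1004       stopped                 besu-validator-5
1503       running                 besu-sentry-4'

printf '%s\n' "$sample_list" | not_running 1003 1004 1503 1504
```

With the sample above, 1004 is reported because it is stopped and 1504 because it is absent. On the target node the call would be: pct list | not_running 1003 1004 1503 1504 2201 2303 2401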
Report Generated: 2026-01-20
Status: Partial Success - 7/14 migrations completed successfully