Stability Remediation Execution Plan
Last Updated: 2026-01-31
Document Version: 1.0
Status: Active Documentation
Date: 2025-01-20
Status: 📋 READY FOR EXECUTION
Priority: 🔴 CRITICAL
Immediate Execution Steps
Step 1: Deploy Enhanced Systemd Services (30 minutes)
Action: Update all validator systemd services with enhanced restart policies
# For each validator, update systemd service
# Use enhanced-besu-validator.service as template
# Deploy check-validator-prerequisites.sh and verify-validator-started.sh
Expected Outcome: Validators auto-restart on failure with health checks
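The restart policy described in this step could be expressed as a systemd drop-in. The following is a minimal sketch, not the contents of enhanced-besu-validator.service (which is not reproduced here): the drop-in file name, the /usr/local/bin install paths for the two helper scripts, and all timing values are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: write a systemd drop-in that adds restart-on-failure behaviour.
# Assumptions (not from this runbook): the drop-in file name, the script
# install paths under /usr/local/bin, and every timing value below.

write_restart_dropin() {
  local dropin_dir="$1"   # e.g. /etc/systemd/system/<unit>.service.d
  mkdir -p "$dropin_dir"
  cat > "$dropin_dir/restart.conf" <<'EOF'
[Unit]
# Give up after 5 failed starts within 5 minutes instead of looping forever.
StartLimitIntervalSec=300
StartLimitBurst=5

[Service]
Restart=on-failure
RestartSec=10
# Refuse to start if prerequisites are missing; verify after start.
ExecStartPre=/usr/local/bin/check-validator-prerequisites.sh
ExecStartPost=/usr/local/bin/verify-validator-started.sh
EOF
}
```

After writing the drop-in for a given unit, `systemctl daemon-reload` followed by a restart of that unit would be needed for the change to take effect.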
Step 2: Deploy Configuration Auto-Fix (15 minutes)
Action: Run auto-fix script on all validators
cd /home/intlc/projects/proxmox
./scripts/monitoring/auto-fix-validator-config.sh
Expected Outcome: All validators have consistent, correct configuration
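One way to confirm the "consistent configuration" outcome is to compare checksums of the configuration files after the auto-fix has run. This is a hedged sketch only: it assumes the files have already been copied to one machine, and it is only meaningful for files that are expected to be byte-identical across validators (node-specific settings would have to be excluded first). The function name is hypothetical and is not part of auto-fix-validator-config.sh.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if every file passed in has the same SHA-256 digest.
# Assumption: the files are meant to be identical across validators.

configs_consistent() {
  local distinct
  distinct="$(sha256sum "$@" | awk '{print $1}' | sort -u | wc -l)"
  [ "$distinct" -eq 1 ]
}
```

A call such as `configs_consistent collected/*.toml` (the directory name is illustrative) returns a non-zero status as soon as one file differs.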
Step 3: Deploy Health Monitoring (30 minutes)
Action: Set up health checks on all validators
# Deploy monitoring scripts
./scripts/monitoring/setup-validator-monitoring.sh
# Test health checks
./scripts/monitoring/check-validator-health.sh
Expected Outcome: Continuous health monitoring active
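For orientation, a health check of this kind typically probes the node over JSON-RPC. The sketch below is not the content of check-validator-health.sh; it assumes the node exposes the standard `eth_blockNumber` and `net_peerCount` methods over HTTP, that `curl` is installed, and that one peer is an acceptable minimum. The endpoint URL is supplied by the caller.

```shell
#!/usr/bin/env bash
# Sketch of a single-node health probe over JSON-RPC.
# Assumptions: RPC endpoint URL, minimum peer count, curl available.

# Convert a 0x-prefixed hex quantity (as returned by JSON-RPC) to decimal.
hex_to_dec() {
  printf '%d\n' "$1"
}

# Extract the string "result" field from a JSON-RPC response without jq.
rpc_result() {
  sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Call a parameterless JSON-RPC method and print its raw result.
rpc_call() {
  local url="$1" method="$2"
  curl -s -m 5 -X POST -H 'Content-Type: application/json' \
    --data "{\"jsonrpc\":\"2.0\",\"method\":\"$method\",\"params\":[],\"id\":1}" \
    "$url" | rpc_result
}

# Healthy when the node answers and has at least min_peers peers.
check_node() {
  local url="$1" min_peers="${2:-1}" peers_hex block_hex
  peers_hex="$(rpc_call "$url" net_peerCount)"
  block_hex="$(rpc_call "$url" eth_blockNumber)"
  if [ -z "$peers_hex" ] || [ -z "$block_hex" ]; then
    echo "UNHEALTHY: no RPC response from $url"
    return 1
  fi
  if [ "$(hex_to_dec "$peers_hex")" -lt "$min_peers" ]; then
    echo "UNHEALTHY: peers=$(hex_to_dec "$peers_hex")"
    return 1
  fi
  echo "OK: block=$(hex_to_dec "$block_hex") peers=$(hex_to_dec "$peers_hex")"
}
```

It would be invoked as `check_node http://<validator-host>:<rpc-port> 3`, with the host, port, and peer threshold chosen for the deployment.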
Step 4: Deploy Block Production Monitor (15 minutes)
Action: Start continuous block production monitoring
# Start as background service
nohup ./scripts/monitoring/monitor-block-production.sh > /var/log/block-monitor.log 2>&1 &
Expected Outcome: Continuous block production monitoring with alerts
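The core of a block production monitor is a stall test: has the chain height advanced since the last poll, and if not, for how long? The sketch below illustrates that decision only. The 30-second threshold and the message wording are assumptions, and the real monitor-block-production.sh may work differently.

```shell
#!/usr/bin/env bash
# Sketch of stall detection for a block production monitor.
# Assumptions: the 30-second default threshold and the alert wording.
# The caller supplies block heights (for example from eth_blockNumber).

STALL_THRESHOLD_SECS="${STALL_THRESHOLD_SECS:-30}"

# Prints a status line; returns 1 when the height has not advanced
# for at least STALL_THRESHOLD_SECS seconds.
evaluate_progress() {
  local prev_block="$1" cur_block="$2" secs_since_advance="$3"
  if [ "$cur_block" -gt "$prev_block" ]; then
    echo "OK: advanced $((cur_block - prev_block)) block(s) to $cur_block"
    return 0
  fi
  if [ "$secs_since_advance" -ge "$STALL_THRESHOLD_SECS" ]; then
    echo "ALERT: no new block for ${secs_since_advance}s (height $cur_block)"
    return 1
  fi
  echo "WAIT: height $cur_block unchanged for ${secs_since_advance}s"
  return 0
}
```

A polling loop would call this once per interval and forward any `ALERT` line to whatever alerting channel is configured.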
Step 5: Deploy Transaction Pool Monitor (15 minutes)
Action: Start transaction pool monitoring
# Start as background service
nohup ./scripts/monitoring/monitor-transaction-pool.sh > /var/log/txpool-monitor.log 2>&1 &
Expected Outcome: Continuous transaction pool monitoring
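As an illustration of what a transaction pool check might measure, the sketch below computes pool utilisation from a statistics response. It assumes the node is Besu with the `TXPOOL` RPC API enabled, so that `txpool_besuStatistics` returns `maxSize`, `localCount`, and `remoteCount`; the 80% alert threshold is also an assumption. None of this is taken from monitor-transaction-pool.sh.

```shell
#!/usr/bin/env bash
# Sketch: transaction pool utilisation from a statistics response.
# Assumptions: Besu's txpool_besuStatistics method is enabled and returns
# maxSize/localCount/remoteCount; the 80% threshold is illustrative.

# Fetch raw pool statistics from the given RPC endpoint.
fetch_pool_stats() {
  curl -s -m 5 -X POST -H 'Content-Type: application/json' \
    --data '{"jsonrpc":"2.0","method":"txpool_besuStatistics","params":[],"id":1}' "$1"
}

# Extract a numeric field from a JSON response read on stdin (no jq needed).
json_number() {
  local field="$1"
  sed -n "s/.*\"$field\"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}

# Integer percentage of the pool in use.
pool_usage_percent() {
  echo $(( ($1 + $2) * 100 / $3 ))
}

# Print a status line; return 1 when usage reaches the threshold.
report_pool() {
  local stats="$1" threshold="${2:-80}" local_n remote_n max_n pct
  local_n="$(echo "$stats" | json_number localCount)"
  remote_n="$(echo "$stats" | json_number remoteCount)"
  max_n="$(echo "$stats" | json_number maxSize)"
  pct="$(pool_usage_percent "$local_n" "$remote_n" "$max_n")"
  if [ "$pct" -ge "$threshold" ]; then
    echo "ALERT: pool ${pct}% full ($((local_n + remote_n))/$max_n)"
    return 1
  fi
  echo "OK: pool ${pct}% full ($((local_n + remote_n))/$max_n)"
}
```

Typical use would be `report_pool "$(fetch_pool_stats http://<validator-host>:<rpc-port>)"`.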
Step 6: Deploy Master Monitor (15 minutes)
Action: Start master stability monitor
# Start as systemd service or background process
nohup ./scripts/monitoring/master-stability-monitor.sh > /var/log/stability-monitor.log 2>&1 &
Expected Outcome: Comprehensive stability monitoring active
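If the systemd option is preferred over `nohup`, a unit along the following lines would keep the monitor running across reboots and crashes. This is a sketch under assumptions: the unit file name and restart timing are illustrative, the script is assumed to run in the foreground from the project directory used in Step 2, and `StandardOutput=append:` requires systemd 240 or newer.

```shell
#!/usr/bin/env bash
# Sketch: generate a systemd unit for the master stability monitor.
# Assumptions: unit contents and timing values are illustrative; the script
# path is derived from the project directory used in Step 2.

write_monitor_unit() {
  local unit_path="$1"   # e.g. /etc/systemd/system/stability-monitor.service
  cat > "$unit_path" <<'EOF'
[Unit]
Description=Master stability monitor
After=network-online.target
Wants=network-online.target

[Service]
WorkingDirectory=/home/intlc/projects/proxmox
ExecStart=/home/intlc/projects/proxmox/scripts/monitoring/master-stability-monitor.sh
Restart=always
RestartSec=5
StandardOutput=append:/var/log/stability-monitor.log
StandardError=append:/var/log/stability-monitor.log

[Install]
WantedBy=multi-user.target
EOF
}
```

After writing the unit, `systemctl daemon-reload` and `systemctl enable --now` with the chosen unit name would start it and register it for boot.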
Validation Steps
After Deployment
- Verify Health Checks:
./scripts/monitoring/check-validator-health.sh
- Verify Block Production:
./scripts/monitoring/monitor-block-production.sh --once
- Verify Configuration:
./scripts/monitoring/validate-all-configs.sh
- Check Monitoring Logs:
tail -f /var/log/block-monitor.log
tail -f /var/log/txpool-monitor.log
tail -f /var/log/stability-monitor.log
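The one-shot validation commands can be run as a batch with a pass/fail summary. The helper below is a hypothetical convenience wrapper, not part of the existing scripts; it assumes each check signals failure through a non-zero exit status.

```shell
#!/usr/bin/env bash
# Sketch: run a list of check commands and summarise the results.
# Assumption: each check exits non-zero on failure.

run_checks() {
  local failed=0 cmd
  for cmd in "$@"; do
    if bash -c "$cmd" >/dev/null 2>&1; then
      echo "PASS  $cmd"
    else
      echo "FAIL  $cmd"
      failed=$((failed + 1))
    fi
  done
  echo "$failed of $# checks failed"
  [ "$failed" -eq 0 ]
}
```

From the project directory it could be called as `run_checks "./scripts/monitoring/check-validator-health.sh" "./scripts/monitoring/monitor-block-production.sh --once" "./scripts/monitoring/validate-all-configs.sh"`. The `tail -f` log checks are left out because they never exit.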
Success Criteria
Immediate (Day 1)
- ✅ All validators have enhanced systemd services
- ✅ Auto-fix scripts deployed
- ✅ Health monitoring active
- ✅ Block production monitoring active
Short-term (Week 1)
- ✅ All monitoring scripts running
- ✅ Alerting configured
- ✅ No configuration issues
- ✅ Block production stable
Long-term (Month 1)
- ✅ 99.9% block production uptime
- ✅ < 5 minute MTTR
- ✅ Automated recovery working
- ✅ Comprehensive monitoring coverage
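To make the long-term targets concrete, the uptime percentage can be converted into a downtime budget. The sketch below assumes a 30-day month and treats the 5-minute MTTR as the duration of every incident, which is a simplification since MTTR is a mean.

```shell
#!/usr/bin/env bash
# Sketch: translate an uptime target into a downtime budget.
# Assumptions: a 30-day month; every incident lasts exactly the MTTR.

# Allowed downtime in minutes for an uptime percentage over N days.
downtime_budget_minutes() {
  awk -v pct="$1" -v days="$2" \
    'BEGIN { printf "%.1f\n", days * 24 * 60 * (100 - pct) / 100 }'
}

# Whole incidents that fit in the budget at a given recovery time.
max_incidents() {
  awk -v budget="$1" -v mttr="$2" 'BEGIN { print int(budget / mttr) }'
}
```

`downtime_budget_minutes 99.9 30` gives 43.2 minutes, and `max_incidents 43.2 5` gives 8, so the two targets together leave room for roughly eight full-length incidents per month.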
Maintenance Schedule
Daily
- Review monitoring logs
- Check for alerts
- Verify block production
Weekly
- Run comprehensive health audit
- Review configuration consistency
- Update documentation
Monthly
- Performance review
- Process improvements
- Capacity planning
Status: Ready for immediate execution
Estimated Time: 2-3 hours for full deployment
Priority: Execute immediately