# Orchestration Deployment Guide - Enterprise-Grade

**Navigation:** [Home](/docs/01-getting-started/README.md) > [Architecture](/docs/01-getting-started/README.md) > Orchestration Deployment Guide

**Sankofa / Phoenix / PanTel · ChainID 138 · Proxmox + Cloudflare Zero Trust + Dual ISP + 6×/28**

**Last Updated:** 2025-01-20
**Document Version:** 1.1
**Status:** 🟢 Active Documentation

---

## Overview

This is the **complete orchestration technical plan** for your environment, using your actual **Spectrum /28 #1** and **placeholders for the other five /28 blocks**, explicitly mapping to your hardware:

- **2× ER605** (edge + HA/failover design)
- **3× ES216G switches**
- **1× ML110 Gen9** (management / seed / bootstrap)
- **4× Dell R630** (compute cluster; 512GB RAM each; 2×600GB boot; 6×250GB SSD)

This guide provides a **buildable blueprint**: network, VLANs, Proxmox cluster, IPAM, CCIP next-phase matrix, Cloudflare Zero Trust, and operational runbooks.

---

## Table of Contents

**Estimated Reading Time:** 45 minutes
**Progress:** Use this TOC to track your reading progress

1. ✅ [Core Principles](#core-principles) - *Foundation concepts*
2. ✅ [Physical Topology & Roles](#physical-topology--roles) - *Hardware layout*
3. ✅ [ISP & Public IP Plan](#isp--public-ip-plan) - *Public IP allocation*
4. ✅ [Layer-2 & VLAN Orchestration](#layer-2--vlan-orchestration) - *VLAN configuration*
5. ✅ [Routing, NAT, and Egress Segmentation](#routing-nat-and-egress-segmentation) - *Network routing*
6. ✅ [Proxmox Cluster Orchestration](#proxmox-cluster-orchestration) - *Proxmox setup*
7. ✅ [Cloudflare Zero Trust Orchestration](#cloudflare-zero-trust-orchestration) - *Cloudflare integration*
8. ✅ [VMID Allocation Registry](#vmid-allocation-registry) - *VMID planning*
9. ✅ [CCIP Fleet Deployment Matrix](#ccip-fleet-deployment-matrix) - *CCIP deployment*
10. ✅ [Deployment Orchestration Workflow](#deployment-orchestration-workflow) - *Deployment process*
11. ✅ [Operational Runbooks](#operational-runbooks) - *Operations guide*

---

## Core Principles

1. **No public IPs on Proxmox hosts or LXCs/VMs** (default)
2. **Inbound access = Cloudflare Zero Trust + cloudflared** (primary)
3. **Public IPs are used for:**
   - ER605 WAN addressing
   - **Egress NAT pools** (role-based allowlisting)
   - **Break-glass** emergency endpoints only
4. **Segmentation by VLAN/VRF**: consensus vs services vs sovereign tenants vs ops
5. **Deterministic VMID registry** with a matching IPAM plan

---

## Physical Topology & Roles

> **Reference:** For complete hardware role assignments, physical topology, and detailed specifications, see **[NETWORK_ARCHITECTURE.md](NETWORK_ARCHITECTURE.md#1-physical-topology--hardware-roles)**.

> **Hardware Inventory:** For complete physical hardware inventory including IP addresses, credentials, hostnames, and detailed specifications, see **[PHYSICAL_HARDWARE_INVENTORY.md](PHYSICAL_HARDWARE_INVENTORY.md)** ⭐⭐⭐.

**Summary:**

- **2× ER605** (edge + HA/failover design)
- **3× ES216G switches** (core, compute, mgmt)
- **1× ML110 Gen9** (management / seed / bootstrap) - IP: 192.168.11.10
- **4× Dell R630** (compute cluster; 512GB RAM each; 2×600GB boot; 6×250GB SSD)

---

## ISP & Public IP Plan

> **Reference:** For complete public IP block plan, usage policy, and NAT pool assignments, see **[NETWORK_ARCHITECTURE.md](NETWORK_ARCHITECTURE.md#2-isp--public-ip-plan-6--28)**.

**Summary:**

- **Block #1** (76.53.10.32/28): Router WAN + break-glass VIPs ✅ Configured
- **Blocks #2-6**: Placeholders for CCIP Commit, Execute, RMN, Service, and Sovereign tenant egress NAT pools

---

## Layer-2 & VLAN Orchestration

> **Reference:** For complete VLAN orchestration plan, subnet allocations, and switching configuration, see **[NETWORK_ARCHITECTURE.md](NETWORK_ARCHITECTURE.md#3-layer-2--vlan-orchestration-plan)**.

**Summary:**

- **19 VLANs** defined with complete subnet plan
- **VLAN 11**: MGMT-LAN (192.168.11.0/24) - Current flat LAN
- **VLANs 110-203**: Service-specific VLANs (10.x.0.0/24 or /20 or /22)
- **Migration path**: From flat LAN to VLANs while maintaining compatibility

---

## Routing, NAT, and Egress Segmentation

> **Reference:** For complete routing configuration, NAT policies, and egress segmentation details, see **[NETWORK_ARCHITECTURE.md](NETWORK_ARCHITECTURE.md#4-routing-nat-and-egress-segmentation-er605)**.

**Summary:**

- **Inbound NAT**: Default none (Cloudflare Tunnel primary)
- **Outbound NAT**: Role-based pools using /28 blocks #2-6
- **Egress Segmentation**: CCIP Commit → Block #2, Execute → Block #3, RMN → Block #4, Services → Block #5, Sovereign → Block #6

---

## Proxmox Cluster Orchestration

> **Reference:** For complete Proxmox cluster orchestration, networking, and storage details, see **[NETWORK_ARCHITECTURE.md](NETWORK_ARCHITECTURE.md#5-proxmox-cluster-orchestration)**.

**Summary:**

- **Node Layout**: ml110 (mgmt) + r630-01..04 (compute)
- **Networking**: VLAN-aware bridge `vmbr0` with native VLAN 11
- **Storage**: ZFS recommended for R630 data SSDs

---

## Cloudflare Zero Trust Orchestration

> **Reference:** For complete Cloudflare Zero Trust orchestration, cloudflared gateway pattern, and tunnel configuration, see **[NETWORK_ARCHITECTURE.md](NETWORK_ARCHITECTURE.md#6-cloudflare-zero-trust-orchestration)**.

**Summary:**

- **2 cloudflared LXCs** for redundancy (ML110 + R630)
- **Tunnels for**: Blockscout, FireFly, Gitea, internal admin dashboards
- **Proxmox UI**: LAN-only (publish via Cloudflare Access if needed)

For detailed Cloudflare configuration guides, see:

- **[../04-configuration/cloudflare/CLOUDFLARE_ZERO_TRUST_GUIDE.md](../04-configuration/cloudflare/CLOUDFLARE_ZERO_TRUST_GUIDE.md)**
- **[../04-configuration/cloudflare/CLOUDFLARE_DNS_TO_CONTAINERS.md](../04-configuration/cloudflare/CLOUDFLARE_DNS_TO_CONTAINERS.md)**

---

## VMID Allocation Registry

> **Reference:** For complete VMID allocation registry with detailed breakdowns, see **[VMID_ALLOCATION_FINAL.md](VMID_ALLOCATION_FINAL.md)**.

**Summary:**

- **Total Allocated**: 11,000 VMIDs (1000-13999)
- **Besu Network**: 4,000 VMIDs (1000-4999)
- **CCIP**: 200 VMIDs (5400-5599)
- **Sovereign Cloud Band**: 4,000 VMIDs (10000-13999)

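The bands summarized above can be expressed as a small lookup, which may help when sanity-checking a VMID before creating a guest. This is a minimal Python sketch under stated assumptions: it encodes only the three bands listed in this summary (the full registry lives in VMID_ALLOCATION_FINAL.md), and the names `VMID_BANDS` and `band_for_vmid` are illustrative, not part of any existing tooling.

```python
from typing import Optional

# Illustrative sketch: only the bands listed in the summary above are encoded.
# VMID_BANDS and band_for_vmid are hypothetical names, not existing tooling.
VMID_BANDS = {
    "Besu Network": range(1000, 5000),            # 1000-4999 (4,000 VMIDs)
    "CCIP": range(5400, 5600),                    # 5400-5599 (200 VMIDs)
    "Sovereign Cloud Band": range(10000, 14000),  # 10000-13999 (4,000 VMIDs)
}


def band_for_vmid(vmid: int) -> Optional[str]:
    """Return the name of the band containing vmid, or None if it is not listed here."""
    for name, band in VMID_BANDS.items():
        if vmid in band:
            return name
    return None


if __name__ == "__main__":
    for vmid in (1000, 5410, 13999, 9000):
        print(vmid, "->", band_for_vmid(vmid))
```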
See also **[NETWORK_ARCHITECTURE.md](NETWORK_ARCHITECTURE.md#7-complete-vmid-and-network-allocation-table)** for VMID-to-VLAN mapping.

---

## CCIP Fleet Deployment Matrix

### Lane A — Minimum Production Fleet

**Total new CCIP nodes:** 41 (or 43 if you add 2 monitoring nodes)

### VMIDs + Hostnames

| Group | Count | VMIDs | Hostname Pattern |
|-------|------:|------:|------------------|
| Ops/Admin | 2 | 5400–5401 | `ccip-ops-01..02` |
| Monitoring (optional) | 2 | 5402–5403 | `ccip-mon-01..02` |
| Commit Oracles | 16 | 5410–5425 | `ccip-commit-01..16` |
| Execute Oracles | 16 | 5440–5455 | `ccip-exec-01..16` |
| RMN | 7 | 5470–5476 | `ccip-rmn-01..07` |

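The table above is regular enough to expand mechanically into per-node VMID and hostname pairs. The following Python sketch shows one way to do that and to confirm the 41 / 43 node totals. The names `LANE_A`, `expand_group`, and `lane_a_inventory` are hypothetical; only the prefixes, starting VMIDs, and counts come from the table.

```python
# Illustrative sketch: expands the Lane A table into (VMID, hostname) pairs.
# LANE_A, expand_group, and lane_a_inventory are hypothetical names.
LANE_A = [
    # (hostname prefix, first VMID, node count)
    ("ccip-ops", 5400, 2),
    ("ccip-mon", 5402, 2),       # optional monitoring nodes
    ("ccip-commit", 5410, 16),
    ("ccip-exec", 5440, 16),
    ("ccip-rmn", 5470, 7),
]


def expand_group(prefix, first_vmid, count):
    """Pair consecutive VMIDs with zero-padded hostnames, e.g. (5410, 'ccip-commit-01')."""
    return [(first_vmid + i, f"{prefix}-{i + 1:02d}") for i in range(count)]


def lane_a_inventory(include_monitoring=False):
    """Return every Lane A node, skipping the monitoring group unless requested."""
    inventory = []
    for prefix, first_vmid, count in LANE_A:
        if prefix == "ccip-mon" and not include_monitoring:
            continue
        inventory.extend(expand_group(prefix, first_vmid, count))
    return inventory


if __name__ == "__main__":
    nodes = lane_a_inventory()
    print(len(nodes))    # 41
    print(nodes[2])      # (5410, 'ccip-commit-01')
    print(nodes[-1])     # (5476, 'ccip-rmn-07')
```

Calling `lane_a_inventory(include_monitoring=True)` yields 43 entries, matching the optional monitoring case.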
### Private IP Assignments (VLAN-based)

Once VLANs are active, assign:

| Role | VLAN | Subnet |
|------|-----:|--------|
| Ops/Admin | 130 | 10.130.0.0/24 |
| Commit | 132 | 10.132.0.0/24 |
| Execute | 133 | 10.133.0.0/24 |
| RMN | 134 | 10.134.0.0/24 |

> **Interim Plan:** While still on the flat LAN, use 192.168.11.170-212 (cleared 2026-02-01). Migrate to VLANs when ready.

### Egress NAT Mapping (Public blocks placeholder)

- Commit VLAN (10.132.0.0/24) → **Block #2** `<PUBLIC_BLOCK_2>/28`
- Execute VLAN (10.133.0.0/24) → **Block #3** `<PUBLIC_BLOCK_3>/28`
- RMN VLAN (10.134.0.0/24) → **Block #4** `<PUBLIC_BLOCK_4>/28`

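The role, VLAN, subnet, and egress-block relationships above can be kept in one structure so that an address can be traced back to its role. The sketch below is illustrative only: `CCIP_NETWORKS`, `role_for_address`, and `interim_addresses` are hypothetical names, the public block strings are the unresolved placeholders from this guide, and Ops/Admin has no egress block because none is listed here. It also shows that the interim flat-LAN range 192.168.11.170-212 holds 43 addresses, enough for the 43-node case.

```python
import ipaddress

# Illustrative sketch: role -> (VLAN ID, private subnet, public block placeholder).
# Placeholders are kept verbatim because blocks #2-#4 are not assigned yet.
CCIP_NETWORKS = {
    "ops": (130, "10.130.0.0/24", None),  # no egress block listed in this guide
    "commit": (132, "10.132.0.0/24", "<PUBLIC_BLOCK_2>/28"),
    "execute": (133, "10.133.0.0/24", "<PUBLIC_BLOCK_3>/28"),
    "rmn": (134, "10.134.0.0/24", "<PUBLIC_BLOCK_4>/28"),
}


def role_for_address(address):
    """Return the CCIP role whose VLAN subnet contains address, else None."""
    ip = ipaddress.ip_address(address)
    for role, (_vlan, subnet, _block) in CCIP_NETWORKS.items():
        if ip in ipaddress.ip_network(subnet):
            return role
    return None


def interim_addresses(first="192.168.11.170", last="192.168.11.212"):
    """List the interim flat-LAN addresses, inclusive of both ends."""
    start = int(ipaddress.ip_address(first))
    end = int(ipaddress.ip_address(last))
    return [str(ipaddress.ip_address(n)) for n in range(start, end + 1)]


if __name__ == "__main__":
    print(role_for_address("10.132.0.17"))  # commit
    print(len(interim_addresses()))         # 43
```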
See **[CCIP_DEPLOYMENT_SPEC.md](../07-ccip/CCIP_DEPLOYMENT_SPEC.md)** for complete specification.

---

## Deployment Orchestration Workflow

### Deployment Workflow Diagram

```mermaid
flowchart TD
    Start[Start Deployment] --> Phase0[Phase 0: Validate Foundation]
    Phase0 --> Check1{Foundation Valid?}
    Check1 -->|No| Fix1[Fix Issues]
    Fix1 --> Phase0
    Check1 -->|Yes| Phase1[Phase 1: Enable VLANs]
    Phase1 --> Verify1{VLANs Working?}
    Verify1 -->|No| FixVLAN[Fix VLAN Config]
    FixVLAN --> Phase1
    Verify1 -->|Yes| Phase2[Phase 2: Deploy Observability]
    Phase2 --> Verify2{Monitoring Active?}
    Verify2 -->|No| FixMonitor[Fix Monitoring]
    FixMonitor --> Phase2
    Verify2 -->|Yes| Phase3[Phase 3: Deploy CCIP Fleet]
    Phase3 --> Verify3{CCIP Nodes Running?}
    Verify3 -->|No| FixCCIP[Fix CCIP Config]
    FixCCIP --> Phase3
    Verify3 -->|Yes| Phase4[Phase 4: Deploy Sovereign Tenants]
    Phase4 --> Verify4{Tenants Operational?}
    Verify4 -->|No| FixTenants[Fix Tenant Config]
    FixTenants --> Phase4
    Verify4 -->|Yes| Complete[Deployment Complete]
```

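The diagram describes a gated sequence: each phase is applied, verified, and repeated until its check passes before the next phase starts. A minimal Python sketch of that control flow follows. It is not the project's deployment tooling: `run_phases` and the callables are hypothetical, and the `max_attempts` cap is an added assumption (the diagram itself loops without a limit).

```python
from typing import Callable, List, Tuple

# Illustrative sketch of the verify-and-retry loop in the diagram above.
# Each phase is (name, apply_step, verify); all names here are hypothetical.
Phase = Tuple[str, Callable[[], None], Callable[[], bool]]


def run_phases(phases: List[Phase], max_attempts: int = 3) -> List[str]:
    """Run phases in order, repeating a phase until its verification passes.

    Returns the names of completed phases. Raises RuntimeError if a phase
    still fails verification after max_attempts tries (an assumed safety cap).
    """
    completed = []
    for name, apply_step, verify in phases:
        for _attempt in range(max_attempts):
            apply_step()
            if verify():
                completed.append(name)
                break
        else:
            # The inner loop finished without a successful verification.
            raise RuntimeError(f"{name} failed verification after {max_attempts} attempts")
    return completed


if __name__ == "__main__":
    state = {"vlan_tries": 0}

    def enable_vlans() -> None:
        state["vlan_tries"] += 1

    def vlans_ok() -> bool:
        return state["vlan_tries"] >= 2  # passes on the second attempt

    demo = [
        ("Phase 0: Validate Foundation", lambda: None, lambda: True),
        ("Phase 1: Enable VLANs", enable_vlans, vlans_ok),
    ]
    print(run_phases(demo))
```

Stopping at the first phase that cannot be verified keeps later phases (CCIP, sovereign tenants) from being deployed onto an unvalidated foundation.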
### Phase 0 — Validate Foundation

1. ✅ Confirm ER605-A WAN1 static: **76.53.10.34/28**, GW **76.53.10.33**
2. ⏳ Confirm WAN2 on ER605-A (ISP #2) failover
3. ⏳ Confirm ES216G trunks and native VLAN 11 mgmt access is stable
4. ⏳ Confirm Proxmox mgmt reachable only from trusted admin endpoints

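The WAN1 values in step 1 can be cross-checked against Block #1 (76.53.10.32/28) with Python's standard `ipaddress` module. The sketch below derives the network, usable range, and broadcast address from the interface address and confirms that the gateway sits inside the block. `describe_block` is a hypothetical helper written for this illustration, not part of any router tooling.

```python
import ipaddress

# Illustrative sketch: derive the /28 layout from the WAN1 values in step 1.
# describe_block is a hypothetical helper, not part of any router tooling.
def describe_block(interface_cidr, gateway):
    iface = ipaddress.ip_interface(interface_cidr)
    net = iface.network
    hosts = list(net.hosts())  # usable addresses (network/broadcast excluded)
    gw = ipaddress.ip_address(gateway)
    return {
        "network": str(net),
        "first_usable": str(hosts[0]),
        "last_usable": str(hosts[-1]),
        "broadcast": str(net.broadcast_address),
        "usable_count": len(hosts),
        "gateway_in_block": gw in net,
        "address_is_usable": iface.ip in hosts,
    }


if __name__ == "__main__":
    for key, value in describe_block("76.53.10.34/28", "76.53.10.33").items():
        print(f"{key}: {value}")
```

The same helper could be run against each of the other five /28 blocks once they are known, which gives the Network / Gateway / Usable / Broadcast details in a consistent form.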
### Phase 1 — VLAN Enablement

1. ⏳ Configure ES216G trunk ports
2. ⏳ Enable VLAN-aware bridge `vmbr0` on Proxmox nodes
3. ⏳ Create VLAN interfaces on ER605 for routing + DHCP (where appropriate)
4. ⏳ Move services one domain at a time (start with monitoring)

### Phase 2 — Observability First

1. ⏳ Deploy monitoring stack (Prometheus/Grafana/Loki/Alertmanager)
2. ⏳ Publish Grafana via Cloudflare Access (not public IPs)
3. ⏳ Set alerts for node health, disk, latency, chain metrics

### Phase 3 — CCIP Fleet (Lane A)

1. ⏳ Deploy CCIP Ops/Admin
2. ⏳ Deploy 16 commit nodes (VLAN 132)
3. ⏳ Deploy 16 execute nodes (VLAN 133)
4. ⏳ Deploy 7 RMN nodes (VLAN 134)
5. ⏳ Apply ER605 outbound NAT pools per VLAN using /28 blocks #2–#4 placeholders
6. ⏳ Verify node egress identity by role (allowlisting ready)

### Phase 4 — Sovereign Tenant Rollout

1. ⏳ Stand up Phoenix Sovereign Cloud Band VLANs 200–203
2. ⏳ Apply Block #6 egress NAT
3. ⏳ Enforce tenant isolation (ACLs, deny east-west)

---

## Operational Runbooks

### Network Operations

- **[../04-configuration/ER605_ROUTER_CONFIGURATION.md](/docs/04-configuration/ER605_ROUTER_CONFIGURATION.md)** - Router configuration guide
- **[../06-besu/BESU_ALLOWLIST_RUNBOOK.md](../06-besu/BESU_ALLOWLIST_RUNBOOK.md)** - Besu allowlist management
- **[../04-configuration/cloudflare/CLOUDFLARE_ZERO_TRUST_GUIDE.md](../04-configuration/cloudflare/CLOUDFLARE_ZERO_TRUST_GUIDE.md)** - Cloudflare Zero Trust setup

### Deployment Operations

- **[VALIDATED_SET_DEPLOYMENT_GUIDE.md](../03-deployment/VALIDATED_SET_DEPLOYMENT_GUIDE.md)** - Validated set deployment
- **[CCIP_DEPLOYMENT_SPEC.md](../07-ccip/CCIP_DEPLOYMENT_SPEC.md)** - CCIP fleet deployment
- **[DEPLOYMENT_READINESS.md](../03-deployment/DEPLOYMENT_READINESS.md)** - Pre-deployment validation

### Troubleshooting

- **[../09-troubleshooting/TROUBLESHOOTING_FAQ.md](/docs/09-troubleshooting/TROUBLESHOOTING_FAQ.md)** - Common issues and solutions
- **[../09-troubleshooting/QBFT_TROUBLESHOOTING.md](/docs/09-troubleshooting/QBFT_TROUBLESHOOTING.md)** - QBFT consensus troubleshooting

---

## Deliverables

### Completed ✅

- ✅ Authoritative VLAN and subnet plan
- ✅ Public block usage model (with placeholders for 5 blocks)
- ✅ Proxmox cluster topology plan
- ✅ CCIP fleet deployment matrix
- ✅ Stepwise orchestration workflow

### Pending ⏳

- ⏳ Exact NAT/VIP rules (requires public blocks #2-6)
- ⏳ ER605-B role decision (standby edge vs dedicated sovereign edge)
- ⏳ VLAN migration execution
- ⏳ CCIP fleet deployment

---

## Next Steps

### To Finalize Placeholders

Provide the other five /28 blocks in the same format as Block #1:

- Network / Gateway / Usable / Broadcast

And specify:

- ER605-B usage: **standby edge** OR **dedicated sovereign edge**

With that information, the following can be produced:

- **Exact NAT pool assignment sheet** per role
- **Break-glass VIP table**
- **Complete ER605 configuration**

---

## Related Documentation

### Prerequisites

- **[../01-getting-started/PREREQUISITES.md](/docs/01-getting-started/PREREQUISITES.md)** - System requirements and prerequisites
- **[../03-deployment/DEPLOYMENT_READINESS.md](../03-deployment/DEPLOYMENT_READINESS.md)** - Pre-deployment validation checklist

### Architecture

- **[NETWORK_ARCHITECTURE.md](NETWORK_ARCHITECTURE.md)** ⭐⭐⭐ - Complete network architecture (authoritative reference)
- **[PHYSICAL_HARDWARE_INVENTORY.md](PHYSICAL_HARDWARE_INVENTORY.md)** ⭐⭐⭐ - Physical hardware inventory and specifications
- **[VMID_ALLOCATION_FINAL.md](VMID_ALLOCATION_FINAL.md)** ⭐⭐⭐ - VMID allocation registry
- **[DOMAIN_STRUCTURE.md](DOMAIN_STRUCTURE.md)** ⭐⭐ - Domain structure and DNS assignments
- **[CCIP_DEPLOYMENT_SPEC.md](../07-ccip/CCIP_DEPLOYMENT_SPEC.md)** - CCIP deployment specification

### Configuration

- **[../04-configuration/ER605_ROUTER_CONFIGURATION.md](/docs/04-configuration/ER605_ROUTER_CONFIGURATION.md)** - Router configuration
- **[../04-configuration/cloudflare/CLOUDFLARE_ZERO_TRUST_GUIDE.md](../04-configuration/cloudflare/CLOUDFLARE_ZERO_TRUST_GUIDE.md)** - Cloudflare Zero Trust setup

### Operations

- **[../03-deployment/OPERATIONAL_RUNBOOKS.md](../03-deployment/OPERATIONAL_RUNBOOKS.md)** - Operational procedures
- **[../03-deployment/DEPLOYMENT_STATUS_CONSOLIDATED.md](../03-deployment/DEPLOYMENT_STATUS_CONSOLIDATED.md)** - Deployment status
- **[../09-troubleshooting/TROUBLESHOOTING_FAQ.md](/docs/09-troubleshooting/TROUBLESHOOTING_FAQ.md)** - Troubleshooting guide

### Best Practices

- **[../10-best-practices/RECOMMENDATIONS_AND_SUGGESTIONS.md](../10-best-practices/RECOMMENDATIONS_AND_SUGGESTIONS.md)** - Comprehensive recommendations
- **[../10-best-practices/IMPLEMENTATION_CHECKLIST.md](../10-best-practices/IMPLEMENTATION_CHECKLIST.md)** - Implementation checklist

### Reference

- **[MASTER_INDEX.md](../MASTER_INDEX.md)** - Complete documentation index

---

**Document Status:** Complete (v1.1)
**Maintained By:** Infrastructure Team
**Review Cycle:** Monthly
**Last Updated:** 2025-01-20

---

## Change Log

### Version 1.1 (2025-01-20)

- Removed duplicate network architecture content
- Added references to NETWORK_ARCHITECTURE.md
- Added deployment workflow Mermaid diagram
- Added ASCII art process flow
- Added breadcrumb navigation
- Added status indicators

### Version 1.0 (2024-12-15)

- Initial version
- Complete deployment orchestration guide