# Cloudflare PoP to Physical Infrastructure Mapping Strategy

## Overview

This document outlines the strategy for mapping Cloudflare Points of Presence (PoPs) as regional gateways and tunneling traffic to physical hardware infrastructure across the global Phoenix network.

## Architecture Principles

1. **Cloudflare PoPs as Edge Gateways**: Use Cloudflare's 300+ global PoPs as the entry point for all user traffic
2. **Zero Trust Tunneling**: All traffic from PoPs to physical infrastructure via Cloudflare Tunnels (cloudflared)
3. **Regional Aggregation**: Map multiple PoPs to regional datacenters
4. **Latency Optimization**: Route traffic to nearest physical infrastructure
5. **High Availability**: Multiple PoP paths to physical infrastructure

## Cloudflare PoP Mapping Strategy

### Tier 1: Core Datacenter Mapping

**Mapping Logic**:
- Each Core Datacenter (10-15 locations) serves as a regional hub
- Multiple Cloudflare PoPs in the region route to the nearest Core Datacenter
- Primary and backup tunnel paths for redundancy

**Example Mapping**:
```
Core Datacenter: US-East (Virginia)
├── Cloudflare PoPs:
│   ├── Washington, DC (primary)
│   ├── New York, NY (primary)
│   ├── Boston, MA (backup)
│   └── Philadelphia, PA (backup)
└── Tunnel Configuration:
    ├── Primary: cloudflared tunnel to VA datacenter
    └── Backup: Failover to alternate path
```

### Tier 2: Regional Datacenter Mapping

**Mapping Logic**:
- Regional Datacenters (50-75 locations) aggregate PoP traffic
- PoPs route to nearest Regional Datacenter
- Load balancing across multiple regional paths

**Example Mapping**:
```
Regional Datacenter: US-West (California)
├── Cloudflare PoPs:
│   ├── San Francisco, CA
│   ├── Los Angeles, CA
│   ├── San Jose, CA
│   └── Seattle, WA
└── Tunnel Configuration:
    ├── Load balanced across multiple tunnels
    └── Health-check based routing
```

### Tier 3: Edge Site Mapping

**Mapping Logic**:
- Edge Sites (250+ locations) connect to nearest PoP
- Direct PoP-to-Edge tunneling for low latency
- Edge sites can serve as backup paths

**Example Mapping**:
```
Edge Site: Denver, CO
├── Cloudflare PoP: Denver, CO
└── Tunnel Configuration:
    ├── Direct tunnel to edge site
    └── Backup via regional datacenter
```

## Implementation Architecture

### 1. PoP-to-Region Mapping Service

```typescript
interface PoPMapping {
  popId: string
  popLocation: {
    city: string
    country: string
    coordinates: { lat: number; lng: number }
  }
  primaryDatacenter: {
    id: string
    type: 'CORE' | 'REGIONAL' | 'EDGE'
    location: Location
    tunnelEndpoint: string
  }
  backupDatacenters: Array<{
    id: string
    priority: number
    tunnelEndpoint: string
  }>
  routingRules: {
    latencyThreshold: number // ms
    failoverThreshold: number // ms
    loadBalancing: 'ROUND_ROBIN' | 'LEAST_CONNECTIONS' | 'GEOGRAPHIC'
  }
}
```

### 2. Tunnel Management Service

```typescript
interface TunnelConfiguration {
  tunnelId: string
  popId: string
  targetDatacenter: string
  tunnelType: 'PRIMARY' | 'BACKUP' | 'LOAD_BALANCED'
  healthCheck: {
    endpoint: string
    interval: number
    timeout: number
    failureThreshold: number
  }
  routing: {
    path: string
    service: string
    loadBalancing: LoadBalancingConfig
  }
}
```

### 3. Geographic Routing Service

**Distance Calculation**:
- Calculate distance from PoP to all available datacenters
- Select nearest datacenter within latency threshold
- Consider network path, not just geographic distance

**Latency-Based Routing**:
- Measure actual latency from PoP to datacenter
- Route to lowest latency path
- Dynamic rerouting based on real-time latency

## Cloudflare Tunnel Configuration

### Tunnel Architecture

```
User Request
    ↓
Cloudflare PoP (Edge)
    ↓
Cloudflare Tunnel (cloudflared)
    ↓
Physical Infrastructure (Proxmox/K8s)
    ↓
Application
```

### Tunnel Setup Process

1. **Tunnel Creation**:
   - Create Cloudflare Tunnel via API
   - Generate tunnel token
   - Deploy cloudflared agent on physical infrastructure

2. **Route Configuration**:
   - Configure DNS records to point to tunnel
   - Set up ingress rules for routing
   - Configure load balancing

3. **Health Monitoring**:
   - Monitor tunnel health
   - Automatic failover on tunnel failure
   - Alert on tunnel degradation

### Multi-Tunnel Strategy

**Primary Tunnel**:
- Direct path from PoP to primary datacenter
- Lowest latency path
- Active traffic routing

**Backup Tunnel**:
- Alternative path via backup datacenter
- Activated on primary failure
- Pre-established for fast failover

**Load Balanced Tunnels**:
- Multiple tunnels for high availability
- Load distribution across tunnels
- Health-based routing

## Regional Gateway Mapping

### Region Definition

```typescript
interface Region {
  id: string
  name: string
  type: 'CORE' | 'REGIONAL' | 'EDGE'
  location: {
    city: string
    country: string
    coordinates: { lat: number; lng: number }
  }
  cloudflarePoPs: string[] // PoP IDs
  physicalInfrastructure: {
    datacenterId: string
    tunnelEndpoints: string[]
    capacity: {
      compute: number
      storage: number
      network: number
    }
  }
  routing: {
    primaryPath: string
    backupPaths: string[]
    loadBalancing: LoadBalancingConfig
  }
}
```

### PoP-to-Region Assignment Algorithm

1. **Geographic Proximity**:
   - Calculate distance from PoP to all regions
   - Assign to nearest region within threshold

2. **Capacity Consideration**:
   - Check region capacity
   - Distribute PoPs to balance load
   - Avoid overloading single region

3. **Network Topology**:
   - Consider network paths
   - Optimize for latency
   - Minimize hops

4. **Failover Planning**:
   - Ensure backup regions available
   - Geographic diversity for resilience
   - Multiple paths for redundancy

## Implementation Components

### 1. PoP Mapping Service

**File**: `api/src/services/pop-mapping.ts`

```typescript
class PoPMappingService {
  async mapPoPToRegion(popId: string): Promise
  async getOptimalDatacenter(popId: string): Promise
  async configureTunnel(popId: string, datacenterId: string): Promise
  async updateRouting(popId: string, routing: RoutingConfig): Promise
}
```

### 2. Tunnel Orchestration Service

**File**: `api/src/services/tunnel-orchestration.ts`

```typescript
class TunnelOrchestrationService {
  async createTunnel(config: TunnelConfiguration): Promise
  async monitorTunnel(tunnelId: string): Promise
  async failoverTunnel(tunnelId: string, backupTunnelId: string): Promise
  async loadBalanceTunnels(tunnelIds: string[]): Promise
}
```

### 3. Geographic Routing Engine

**File**: `api/src/services/geographic-routing.ts`

```typescript
class GeographicRoutingService {
  async findNearestDatacenter(popLocation: Location): Promise
  async calculateLatency(popId: string, datacenterId: string): Promise
  async optimizeRouting(popId: string): Promise
}
```

## Database Schema

### PoP Mappings Table

```sql
CREATE TABLE pop_mappings (
  id UUID PRIMARY KEY,
  pop_id VARCHAR(255) UNIQUE NOT NULL,
  pop_location JSONB NOT NULL,
  primary_datacenter_id UUID REFERENCES datacenters(id),
  region_id UUID REFERENCES regions(id),
  tunnel_configuration JSONB,
  routing_rules JSONB,
  created_at TIMESTAMP,
  updated_at TIMESTAMP
);
```

### Tunnel Configurations Table

```sql
CREATE TABLE tunnel_configurations (
  id UUID PRIMARY KEY,
  tunnel_id VARCHAR(255) UNIQUE NOT NULL,
  pop_id VARCHAR(255) REFERENCES pop_mappings(pop_id),
  datacenter_id UUID REFERENCES datacenters(id),
  tunnel_type VARCHAR(50),
  health_status VARCHAR(50),
  configuration JSONB,
  created_at TIMESTAMP,
  updated_at TIMESTAMP
);
```

## Monitoring and Observability

### Key Metrics

1. **Tunnel Health**:
   - Tunnel uptime
   - Latency from PoP to datacenter
   - Packet loss
   - Throughput

2. **Routing Performance**:
   - Request routing time
   - Failover time
   - Load distribution

3. **Geographic Distribution**:
   - PoP-to-datacenter mapping distribution
   - Regional load balancing
   - Capacity utilization

### Alerting

- Tunnel failure alerts
- High latency alerts
- Capacity threshold alerts
- Routing anomaly alerts

## Security Considerations

1. **Zero Trust Architecture**:
   - All traffic authenticated
   - No public IPs on physical infrastructure
   - Encrypted tunnel connections

2. **Access Control**:
   - PoP-based access policies
   - Geographic restrictions
   - IP allowlisting

3. **Audit Logging**:
   - All tunnel connections logged
   - Routing decisions logged
   - Access attempts logged

## Deployment Strategy

### Phase 1: Core Datacenter Mapping (30 days)

- Map top 50 Cloudflare PoPs to Core Datacenters
- Deploy primary tunnels
- Implement basic routing

### Phase 2: Regional Expansion (60 days)

- Map remaining PoPs to Regional Datacenters
- Deploy backup tunnels
- Implement failover

### Phase 3: Edge Integration (90 days)

- Integrate Edge Sites
- Optimize routing algorithms
- Full monitoring and alerting
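
## Example: PoP-to-Region Assignment Sketch

The first, second, and fourth steps of the PoP-to-Region Assignment Algorithm (geographic proximity, capacity consideration, failover planning) could be sketched roughly as follows. This is a minimal illustration, not the Phoenix implementation: `RegionCandidate`, `assignPoPToRegion`, `maxDistanceKm`, and the PoP-count capacity model are all hypothetical names and assumptions introduced here. It uses great-circle (haversine) distance as a stand-in for proximity, whereas the strategy calls for considering network path and measured latency as well.

```typescript
// Hypothetical types for illustration only; not taken from the Phoenix codebase.
interface Coordinates { lat: number; lng: number }

interface RegionCandidate {
  id: string
  coordinates: Coordinates
  assignedPoPs: number // PoPs already assigned to this region
  maxPoPs: number      // assumed capacity limit, expressed as a PoP count
}

interface Assignment {
  primary: string | null // null when no region is in range with spare capacity
  backups: string[]
}

const EARTH_RADIUS_KM = 6371

// Great-circle distance between two coordinates (haversine formula).
function haversineKm(a: Coordinates, b: Coordinates): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180
  const dLat = toRad(b.lat - a.lat)
  const dLng = toRad(b.lng - a.lng)
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLng / 2) ** 2
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h))
}

// Pick the nearest region within the distance threshold that still has
// capacity, then list the next-nearest eligible regions as backups.
function assignPoPToRegion(
  pop: Coordinates,
  regions: RegionCandidate[],
  maxDistanceKm: number,
  backupCount = 2
): Assignment {
  const eligible = regions
    .map(region => ({ region, distanceKm: haversineKm(pop, region.coordinates) }))
    .filter(c => c.distanceKm <= maxDistanceKm)            // proximity threshold
    .filter(c => c.region.assignedPoPs < c.region.maxPoPs) // capacity check
    .sort((x, y) => x.distanceKm - y.distanceKm)           // nearest first

  if (eligible.length === 0) return { primary: null, backups: [] }

  const [first, ...rest] = eligible
  return {
    primary: first.region.id,
    backups: rest.slice(0, backupCount).map(c => c.region.id),
  }
}
```

A production version would likely replace `haversineKm` with measured PoP-to-datacenter latency and add a geographic-diversity check when choosing backups; the distance-only ranking here is just the simplest starting point.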