# NPMplus Network Routing Issue - Root Cause Analysis
**Last Updated:** 2026-01-31
**Document Version:** 1.0
**Status:** Active Documentation

---

**Date:** 2025-01-20
**Container:** 10233 (NPMplus)
**IP:** 192.168.11.166
**Issue:** Container cannot reach backend services on 192.168.11.0/24

---
## Current Status
### ✅ What's Working
- Container has correct IP address: `192.168.11.166/24`
- Container can reach gateway: `192.168.11.1` (UDM Pro)
- Routing table is correct: `192.168.11.0/24 dev eth0`
- Proxmox host CAN reach backend services
- Backend services are running and responding
### ❌ What's Not Working
- Container CANNOT ping backend services (all 7 services fail)
- All HTTPS domains return 502 errors
- Network connectivity from container to 192.168.11.0/24 is blocked
---
## Root Cause Analysis
### Finding 1: Proxmox Bridge VLAN Configuration
- **Container veth interface:** `veth10233i0` is configured with VLAN 1 (PVID), not VLAN 11
- **Container config:** Shows `tag=11` but veth interface doesn't reflect this
- **Bridge status:** `vmbr0` has VLAN 11 sub-interface (`vmbr0v11`) but container veth is on VLAN 1
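
To make the mismatch described in Finding 1 easy to re-check, the PVID can be pulled out of the `bridge vlan show` output. The following is a minimal sketch and not part of the existing scripts. The helper name `pvid_of` is hypothetical, and it assumes the usual iproute2 text layout in which the PVID entry is followed by the word `PVID`.

```bash
# Hypothetical helper: print the VLAN ID flagged as PVID in
# `bridge vlan show` output read from stdin.
pvid_of() {
  awk '{
    for (i = 2; i <= NF; i++)
      if ($i == "PVID") { print $(i - 1); exit }
  }'
}

# Example (run on the Proxmox host while the container is running):
#   bridge vlan show dev veth10233i0 | pvid_of
```

A result of `1` rather than `11` would be consistent with the finding above.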
### Finding 2: Network Isolation
- Container is on VLAN 11 network (192.168.11.166)
- Backend services are on VLAN 11 network (192.168.11.0/24)
- Both should be on same VLAN, but connectivity fails
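- This suggests one of the following:
  1. A UDM Pro firewall or ACL rule is blocking traffic within VLAN 11. This would be unusual, since same-subnet traffic is normally switched at layer 2 and never reaches the gateway's firewall.
  2. Proxmox bridge VLAN tagging is not working correctly
  3. ARP/neighbor discovery is failing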
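
One way to narrow down these hypotheses is to look at the container's neighbor table right after a failed ping. The sketch below classifies a single line of `ip neigh show <ip>` output. The helper name `classify_neigh` and its labels are hypothetical, and it assumes `ip neigh` is available inside the container.

```bash
# Hypothetical helper: classify one line of `ip neigh show <ip>` output
# (read from stdin) to separate layer-2 failures from filtering above ARP.
classify_neigh() {
  read -r line || true
  case "$line" in
    "")                    echo "no-entry" ;;
    *FAILED*|*INCOMPLETE*) echo "arp-unresolved" ;;
    *lladdr*)              echo "arp-resolved" ;;
    *)                     echo "unknown" ;;
  esac
}

# Example (on the Proxmox host; ping first so the kernel attempts resolution):
#   pct exec 10233 -- ping -c 1 192.168.11.140
#   pct exec 10233 -- ip neigh show 192.168.11.140 | classify_neigh
```

An `arp-unresolved` result would point toward VLAN tagging or ARP problems (hypotheses 2 and 3), while `arp-resolved` combined with a failing ping would point toward filtering above layer 2.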
### Finding 3: Proxmox Host Can Reach Backends
- Proxmox host (192.168.11.11) CAN ping backend services
- This confirms backend services are reachable
- Issue is container-specific networking
---
## Diagnostic Commands
### Check Container Network
```bash
ssh root@192.168.11.11 "pct exec 10233 -- ip addr show eth0"
ssh root@192.168.11.11 "pct exec 10233 -- ip route show"
ssh root@192.168.11.11 "pct exec 10233 -- ping -c 2 192.168.11.1"
ssh root@192.168.11.11 "pct exec 10233 -- ping -c 2 192.168.11.140"
```
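
The individual pings above can be wrapped in a small loop so that all backends are checked in one pass. This is a sketch only. `check_backends` and the `RUN` variable are hypothetical names, and only `192.168.11.1` and `192.168.11.140` are taken from this document; the remaining backend addresses would need to be supplied.

```bash
# Command prefix used to run a command inside the container. The default
# mirrors the commands above; override RUN to test or to run locally.
RUN=${RUN:-"ssh root@192.168.11.11 pct exec 10233 --"}

# Ping each address once from inside the container and print one
# "<ip> ok" or "<ip> FAIL" line per address.
check_backends() {
  for ip in "$@"; do
    if $RUN ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
      echo "$ip ok"
    else
      echo "$ip FAIL"
    fi
  done
}

# Example:
#   check_backends 192.168.11.1 192.168.11.140
```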
### Check Proxmox Bridge VLAN
```bash
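# Show VLAN membership for the bridge itself and for the container's veth port
# (`bridge vlan show` only filters by device when the `dev` keyword is given)
ssh root@192.168.11.11 "bridge vlan show dev vmbr0"
ssh root@192.168.11.11 "bridge vlan show dev veth10233i0"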
```
### Check UDM Pro Firewall Rules
```bash
# Via API
curl -k -X GET "https://192.168.11.1/proxy/network/integration/v1/sites/88f7af54-98f8-306a-a1c7-c9349722b1f6/acl-rules" \
-H "X-API-KEY: <API_KEY>" \
-H 'Accept: application/json' | jq '.data[] | select(.enabled == true)'
```
---
## Potential Solutions
### Solution 1: Fix Proxmox Bridge VLAN Tagging (Recommended)
The container's veth interface needs to be properly configured for VLAN 11:
```bash
# Run these while the container is running: the host-side veth10233i0
# interface only exists while the container is up, so stopping it first
# would make the commands below fail.
# Remove VLAN 1 from veth interface
ssh root@192.168.11.11 "bridge vlan del vid 1 dev veth10233i0"
# Add VLAN 11 as PVID
ssh root@192.168.11.11 "bridge vlan add vid 11 pvid untagged dev veth10233i0"
# Verify the result
ssh root@192.168.11.11 "bridge vlan show dev veth10233i0"
```
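**Note:** These commands only take effect if `vmbr0` is VLAN-aware, and the change does not persist across container restarts because the veth interface is recreated each time the container starts. A persistent fix needs to be made in the Proxmox network configuration.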
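
Because the host-side veth interface is recreated whenever the container starts, one possible way to reapply the VLAN settings is a Proxmox hookscript that runs in the `post-start` phase. The following is a sketch under assumptions, not a tested fix: the function name `apply_vlan` and the `BRIDGE` override are hypothetical, it assumes `vmbr0` is VLAN-aware, and it assumes the host-side interface is named `veth<vmid>i0` as elsewhere in this document.

```bash
#!/bin/sh
# Hypothetical hookscript sketch. Proxmox invokes hookscripts with the
# guest ID as $1 and the phase as $2.
BRIDGE=${BRIDGE:-bridge}   # override (e.g. BRIDGE=echo) for a dry run

apply_vlan() {
  vmid=$1
  phase=$2
  # Only act once the container is up and its veth interface exists.
  [ "$phase" = "post-start" ] || return 0
  dev="veth${vmid}i0"
  $BRIDGE vlan del vid 1 dev "$dev"
  $BRIDGE vlan add vid 11 pvid untagged dev "$dev"
}

apply_vlan "$@"
```

Such a script would be stored on a storage that allows snippets and attached with something like `pct set 10233 -hookscript local:snippets/npmplus-vlan.sh`, where the storage and file name are placeholders.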
### Solution 2: Check UDM Pro Firewall Rules
UDM Pro may have firewall rules blocking traffic even within the same VLAN:
1. Access UDM Pro web UI: `https://192.168.11.1`
2. Navigate to: **Settings → Firewall & Security → Firewall Rules**
3. Check for rules blocking:
   - Source: `192.168.11.166` or `192.168.11.0/24`
   - Destination: `192.168.11.0/24`
4. Ensure there's an ALLOW rule for same-VLAN communication
### Solution 3: Use Proxmox Network Configuration
Instead of manual bridge VLAN configuration, reconfigure container network:
```bash
# Remove current network config
ssh root@192.168.11.11 "pct set 10233 -delete net0"
# Add network with proper VLAN tagging
ssh root@192.168.11.11 "pct set 10233 -net0 name=eth0,bridge=vmbr0,tag=11,firewall=1,ip=192.168.11.166/24,gw=192.168.11.1"
# Restart container
ssh root@192.168.11.11 "pct stop 10233 && pct start 10233"
```
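
After reconfiguring, it may be worth confirming that the tag actually landed in the container config before comparing it with the bridge state. The sketch below extracts the tag from `pct config` output. The helper name `net0_tag` is hypothetical, and it assumes the `net0:` line uses the comma-separated `key=value` form shown in the command above.

```bash
# Hypothetical helper: print the VLAN tag from the net0 line of
# `pct config` output read from stdin (prints nothing if no tag is set).
net0_tag() {
  sed -n 's/^net0:.*[ ,]tag=\([0-9][0-9]*\).*/\1/p'
}

# Example (on the Proxmox host):
#   pct config 10233 | net0_tag
```

If this prints `11` while the veth port still reports VLAN 1 as its PVID, the config and the bridge state disagree, as described in Finding 1.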
### Solution 4: Check ARP/Neighbor Discovery
Container may not be able to resolve MAC addresses:
```bash
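# Check ARP table in container (use `ip neigh show` if `arp` is not installed)
ssh root@192.168.11.11 "pct exec 10233 -- arp -a"
# Test with a static ARP entry for the gateway (replace <GATEWAY_MAC> with the real MAC first)
ssh root@192.168.11.11 "pct exec 10233 -- arp -s 192.168.11.1 <GATEWAY_MAC>"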
```
---
## Next Steps
1. **Immediate:** Check UDM Pro firewall rules via web UI
2. **If firewall is OK:** Fix Proxmox bridge VLAN configuration
3. **Verify:** Test connectivity after fixes
4. **Document:** Update configuration documentation
---
## Related Files
- `scripts/check-npmplus-network-connectivity.sh` - Diagnostic script
- `scripts/diagnose-npmplus-backend-services.sh` - Backend service check
- `docs/04-configuration/NPMPLUS_BACKEND_SERVICES_RESOLUTION.md` - Related documentation
---
**Status:** 🔴 **BLOCKED** - Network routing issue preventing backend connectivity