Infrastructure Monitoring
Comprehensive monitoring solutions for all infrastructure components in Sankofa Phoenix.
Overview
This directory contains monitoring components including custom Prometheus exporters, Grafana dashboards, and alerting rules for infrastructure monitoring.
Components
Exporters (exporters/)
Custom Prometheus exporters for:
- Proxmox VE metrics
- TP-Link Omada metrics
- Network switch/router metrics
- Infrastructure health checks
Dashboards (dashboards/)
Grafana dashboards for:
- Infrastructure overview
- Proxmox cluster health
- Network performance
- Omada controller status
- Site-level monitoring
Exporters
Proxmox Exporter
The Proxmox exporter (pve_exporter) provides metrics for:
- VM status and resource usage
- Node health and performance
- Storage pool utilization
- Network interface statistics
- Cluster status
Installation:
pip install prometheus-pve-exporter
Configuration:
exporter:
  listen_address: 0.0.0.0:9221
proxmox:
  endpoint: https://pve1.sankofa.nexus:8006
  username: monitoring@pam
  password: ${PROXMOX_PASSWORD}
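The ${PROXMOX_PASSWORD} placeholder suggests the password is supplied through the environment, but this README does not say whether the exporter expands it on its own. If it does not, one option is to render the file before deployment. The sketch below is an assumption-laden illustration: render_config is a hypothetical helper, and the two-level YAML layout is inferred from the configuration above.

```python
import os
from string import Template
from typing import Mapping, Optional

# Template mirrors the configuration shown above; the nesting is assumed.
CONFIG_TEMPLATE = """\
exporter:
  listen_address: 0.0.0.0:9221
proxmox:
  endpoint: https://pve1.sankofa.nexus:8006
  username: monitoring@pam
  password: ${PROXMOX_PASSWORD}
"""


def render_config(template: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${VAR} placeholders with values from env (defaults to os.environ).

    Raises RuntimeError if a referenced variable is missing, so a config
    with an empty password is never written silently.
    """
    values = os.environ if env is None else env
    try:
        return Template(template).substitute(values)
    except KeyError as exc:
        raise RuntimeError(f"missing environment variable: {exc.args[0]}") from exc
```

Calling render_config(CONFIG_TEMPLATE) with PROXMOX_PASSWORD exported returns the final YAML text, which can then be written to a file or a Kubernetes Secret.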
Omada Exporter
Custom exporter for TP-Link Omada Controller metrics:
- Access point status
- Client device counts
- Network throughput
- Controller health
See: exporters/omada_exporter/ for implementation
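The contents of exporters/omada_exporter/ are not shown here, so the following is only a minimal sketch of what a custom exporter does: collect values, render them in the Prometheus text exposition format, and serve them at /metrics. The collect_omada function, its hard-coded values, and the site and ap labels are hypothetical; the metric names come from the Metrics section, and port 9222 matches the Omada scrape target in the Prometheus configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def escape_label(value: str) -> str:
    """Escape a label value as required by the text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(families) -> str:
    """Render (name, help, type, [(labels, value), ...]) tuples as exposition text."""
    lines = []
    for name, help_text, metric_type, series in families:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
        for labels, value in series:
            label_str = ",".join(
                f'{key}="{escape_label(str(val))}"' for key, val in sorted(labels.items())
            )
            suffix = f"{{{label_str}}}" if label_str else ""
            lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"


def collect_omada():
    """Hypothetical collector: a real one would query the Omada Controller."""
    return [
        ("omada_controller_status", "Controller health", "gauge", [({}, 1)]),
        ("omada_clients_total", "Total client count", "gauge", [({"site": "example"}, 42)]),
        ("omada_ap_status", "Access point status", "gauge",
         [({"site": "example", "ap": "ap-01"}, 1)]),
    ]


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect_omada()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9222) -> None:
    """Block forever, serving /metrics on the given port."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Running serve() starts the endpoint. A production exporter would more likely use the official prometheus_client library than hand-rendered text.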
Network Exporter
SNMP-based exporter for network devices:
- Switch port statistics
- Router interface metrics
- VLAN utilization
- Network topology changes
See: exporters/network_exporter/ for implementation
Dashboards
Infrastructure Overview
Comprehensive dashboard showing:
- All sites status
- Resource utilization
- Health scores
- Alert summary
Location: dashboards/infrastructure-overview.json
Proxmox Cluster
Dashboard for Proxmox clusters:
- Cluster health
- Node performance
- VM resource usage
- Storage utilization
Location: dashboards/proxmox-cluster.json
Network Performance
Network performance dashboard:
- Bandwidth utilization
- Latency metrics
- Error rates
- Top talkers
Location: dashboards/network-performance.json
Omada Controller
Omada-specific dashboard:
- Controller status
- Access point health
- Client statistics
- Network policies
Location: dashboards/omada-controller.json
Installation
Deploy Exporters
# Deploy all exporters
kubectl apply -f exporters/manifests/
# Or deploy individually
kubectl apply -f exporters/manifests/proxmox-exporter.yaml
kubectl apply -f exporters/manifests/omada-exporter.yaml
Import Dashboards
# Import all dashboards to Grafana
./scripts/import-dashboards.sh
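# Or import individually through the Grafana HTTP API
# (GRAFANA_URL and GRAFANA_TOKEN are placeholders for your instance and API token)
jq '{dashboard: (. + {id: null}), overwrite: true}' dashboards/infrastructure-overview.json \
  | curl -s -X POST -H "Authorization: Bearer $GRAFANA_TOKEN" -H "Content-Type: application/json" -d @- "$GRAFANA_URL/api/dashboards/db"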
Configuration
Prometheus Scrape Configuration
scrape_configs:
  - job_name: 'proxmox'
    static_configs:
      - targets:
          - 'pve-exporter.monitoring.svc.cluster.local:9221'
  - job_name: 'omada'
    static_configs:
      - targets:
          - 'omada-exporter.monitoring.svc.cluster.local:9222'
  - job_name: 'network'
    static_configs:
      - targets:
          - 'network-exporter.monitoring.svc.cluster.local:9223'
Alerting Rules
Alert rules are defined in exporters/alert-rules/:
- proxmox-alerts.yaml: Proxmox cluster alerts
- omada-alerts.yaml: Omada controller alerts
- network-alerts.yaml: Network infrastructure alerts
Metrics
Proxmox Metrics
- pve_node_status: Node status (0=offline, 1=online)
- pve_vm_status: VM status
- pve_storage_used_bytes: Storage usage
- pve_network_rx_bytes: Network receive bytes
- pve_network_tx_bytes: Network transmit bytes
Omada Metrics
- omada_ap_status: Access point status
- omada_clients_total: Total client count
- omada_throughput_bytes: Network throughput
- omada_controller_status: Controller health
Network Metrics
- network_port_status: Switch port status
- network_port_rx_bytes: Port receive bytes
- network_port_tx_bytes: Port transmit bytes
- network_vlan_utilization: VLAN utilization
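The byte metrics above are most useful as rates. The sketch below assumes, without confirmation from the exporters themselves, that metrics such as network_port_rx_bytes are monotonically increasing counters that may reset to zero when a device restarts. It shows how two samples turn into a throughput figure, which is roughly what the PromQL rate() function does inside Prometheus. The function names are illustrative.

```python
def rate_per_second(prev_value: float, prev_ts: float,
                    curr_value: float, curr_ts: float) -> float:
    """Average per-second increase of a counter between two samples.

    Timestamps are in seconds. A drop in value is treated as a counter
    reset, so only the growth since the reset is counted.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_value - prev_value
    if delta < 0:
        # Counter reset: the current value is everything counted since the reset.
        delta = curr_value
    return delta / elapsed


def bytes_to_megabits(bytes_per_second: float) -> float:
    """Convert bytes/s to megabits/s (1 Mbit = 1,000,000 bits)."""
    return bytes_per_second * 8 / 1_000_000
```

For example, a port that moves from 1,000 to 4,000 received bytes over 30 seconds averages 100 bytes per second.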
Alerts
Critical Alerts
- Proxmox cluster node down
- Omada controller unreachable
- Network switch offline
- High resource utilization (>90%)
Warning Alerts
- High resource utilization (>80%)
- Network latency spikes
- Access point offline
- Storage pool >80% full
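The two utilization thresholds overlap, so a value above 90% should be reported once as critical and not also as a warning. The sketch below illustrates that layering; the function name and the "ok" label are assumptions, and the actual rules live in the alert rule files.

```python
CRITICAL_THRESHOLD = 90.0  # percent, from "High resource utilization (>90%)"
WARNING_THRESHOLD = 80.0   # percent, from "High resource utilization (>80%)"


def classify_utilization(percent: float) -> str:
    """Map a utilization percentage to a single severity level.

    The critical check runs first so a value never matches both levels.
    Comparisons are strict, matching the ">" in the alert descriptions.
    """
    if percent > CRITICAL_THRESHOLD:
        return "critical"
    if percent > WARNING_THRESHOLD:
        return "warning"
    return "ok"
```

With strict comparisons, exactly 90% is still a warning and exactly 80% raises no alert.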
Troubleshooting
Exporter Issues
# Check exporter status
kubectl get pods -n monitoring -l app=proxmox-exporter
# View exporter logs
kubectl logs -n monitoring -l app=proxmox-exporter
# Test exporter endpoint
curl http://proxmox-exporter.monitoring.svc.cluster.local:9221/metrics
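The same endpoint check can be scripted when the raw curl output is too long to scan by eye. The sketch below fetches /metrics, parses the text format into a dictionary, and lists series reporting pve_node_status as 0 (offline, per the Metrics section). The helper names are hypothetical, and the node label in the usage note is an assumption about how the exporter labels its series.

```python
import urllib.request


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Download the exposition text from an exporter's /metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_samples(text: str) -> dict:
    """Parse exposition text into {series: value}, skipping comments.

    The series key keeps its label set, e.g. 'pve_node_status{node="pve1"}'.
    An optional trailing timestamp is ignored.
    """
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            end = line.rindex("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if rest:
            samples[series] = float(rest[0])
    return samples


def offline_nodes(samples: dict) -> list:
    """Return pve_node_status series whose value is 0 (offline)."""
    return sorted(
        series for series, value in samples.items()
        if series.startswith("pve_node_status") and value == 0
    )
```

A typical call is offline_nodes(parse_samples(fetch_metrics(url))) with the URL from the curl command above; an empty list means every reported node is online.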
Dashboard Issues
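# Verify dashboard import through the Grafana HTTP API
# (GRAFANA_URL and GRAFANA_TOKEN are placeholders for your instance and API token)
curl -s -H "Authorization: Bearer $GRAFANA_TOKEN" "$GRAFANA_URL/api/search?type=dash-db"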
# Check dashboard data sources
# In Grafana UI: Configuration > Data Sources