Apply Composer changes: comprehensive API updates, migrations, middleware, and infrastructure improvements

- Add comprehensive database migrations (001-024) for schema evolution
- Enhance API schema with expanded type definitions and resolvers
- Add new middleware: audit logging, rate limiting, MFA enforcement, security, tenant auth
- Implement new services: AI optimization, billing, blockchain, compliance, marketplace
- Add adapter layer for cloud integrations (Cloudflare, Kubernetes, Proxmox, storage)
- Update Crossplane provider with enhanced VM management capabilities
- Add comprehensive test suite for API endpoints and services
- Update frontend components with improved GraphQL subscriptions and real-time updates
- Enhance security configurations and headers (CSP, CORS, etc.)
- Update documentation and configuration files
- Add new CI/CD workflows and validation scripts
- Implement design system improvements and UI enhancements
This commit is contained in:
defiQUG
2025-12-12 18:01:35 -08:00
parent e01131efaf
commit 9daf1fd378
968 changed files with 160890 additions and 1092 deletions

scripts/PROXMOX_README.md Normal file

@@ -0,0 +1,186 @@
# Proxmox Review and Planning Scripts
This directory contains scripts for connecting to Proxmox instances, reviewing configurations, checking status, and generating deployment plans.
## Scripts
### 1. `proxmox-review-and-plan.sh` (Bash)
A comprehensive bash script that:
- Connects to both Proxmox instances
- Reviews current configurations
- Checks cluster and node status
- Generates deployment plans
- Creates detailed task lists
**Requirements**:
- `curl` (required)
- `jq` (optional, for better JSON parsing)
**Usage**:
```bash
./scripts/proxmox-review-and-plan.sh
```
**Output**: Files are generated in `docs/proxmox-review/`
### 2. `proxmox-review-and-plan.py` (Python)
A Python script with more detailed API interactions:
- Better error handling
- More comprehensive status gathering
- Detailed node information collection
**Requirements**:
- Python 3.6+
- `requests` library: `pip install requests`
- `proxmoxer` (optional): `pip install proxmoxer`
**Usage**:
```bash
python3 ./scripts/proxmox-review-and-plan.py
```
**Output**: Files are generated in `docs/proxmox-review/`
## Environment Setup
Before running the scripts, ensure your `.env` file contains:
```env
# Proxmox Instance 1
PROXMOX_1_API_URL=https://192.168.11.10:8006
PROXMOX_1_USER=root
PROXMOX_1_PASS=your-password
PROXMOX_1_API_TOKEN= # Optional, preferred over password
PROXMOX_1_INSECURE_SKIP_TLS_VERIFY=false
# Proxmox Instance 2
PROXMOX_2_API_URL=https://192.168.11.11:8006
PROXMOX_2_USER=root
PROXMOX_2_PASS=your-password
PROXMOX_2_API_TOKEN= # Optional, preferred over password
PROXMOX_2_INSECURE_SKIP_TLS_VERIFY=false
```
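Outside the provided scripts, the same variables can be loaded into an interactive shell session. A minimal sketch (the demo file path and values below are placeholders mirroring the example above):

```bash
# Write a throwaway .env-style file, then source it with auto-export.
cat > /tmp/demo.env <<'EOF'
# Proxmox Instance 1
PROXMOX_1_API_URL=https://192.168.11.10:8006
PROXMOX_1_USER=root
EOF

set -a            # export every assignment made while sourcing
. /tmp/demo.env   # comments and blank lines are valid shell, so they pass through
set +a

echo "$PROXMOX_1_API_URL"
```

This works for simple `KEY=VALUE` pairs; values containing spaces or shell metacharacters would need quoting in the file.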
## Output Files
After running either script, the following files will be generated in `docs/proxmox-review/`:
1. **Configuration Review** (`configuration-review-{timestamp}.md`)
- Current environment configuration
- Crossplane provider configuration
- Cloudflare tunnel configurations
2. **Deployment Plan** (`deployment-plan-{timestamp}.md`)
- Phased deployment approach
- Current status summary
- Deployment phases and steps
3. **Task List** (`task-list-{timestamp}.md`)
- Detailed task breakdown
- Priority levels
- Task descriptions and actions
4. **Status JSONs** (`proxmox-{1|2}-status-{timestamp}.json`)
- Cluster status
- Node information
- Storage configuration
- VM listings
## Quick Start
1. **Set up environment**:
```bash
cp ENV_EXAMPLES.md .env
# Edit .env with your Proxmox credentials
```
2. **Run review script**:
```bash
./scripts/proxmox-review-and-plan.sh
```
3. **Review output**:
```bash
ls -la docs/proxmox-review/
cat docs/proxmox-review/task-list-*.md
```
4. **Start with high-priority tasks**:
- Verify connectivity (TASK-001, TASK-002)
- Test authentication (TASK-003, TASK-004)
- Review configurations (TASK-005, TASK-006, TASK-007)
## Troubleshooting
### Connection Issues
If you can't connect to Proxmox instances:
1. **Check network connectivity**:
```bash
ping 192.168.11.10
ping 192.168.11.11
```
2. **Test API endpoint**:
```bash
curl -k https://192.168.11.10:8006/api2/json/version
```
3. **Verify firewall rules**:
- Port 8006 should be accessible
- Check if IP is whitelisted in Proxmox
### Authentication Issues
1. **Verify credentials**:
- Check username format: `user@realm` (e.g., `root@pam`)
- Verify password is correct
- Check if account is locked
2. **Use API tokens** (recommended):
- Create token in Proxmox Web UI
- Use format: `user@realm!token-name=token-secret`
- Set in `.env` as `PROXMOX_*_API_TOKEN`
3. **Test authentication manually**:
```bash
curl -k -X POST \
-d "username=root@pam&password=your-password" \
https://192.168.11.10:8006/api2/json/access/ticket
```
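The password flow above can be replaced by a token header. The header value joins the token ID and secret with `=`; a sketch with hypothetical placeholder values (the real ID and secret come from the Proxmox web UI):

```bash
# Proxmox API token header: PVEAPIToken=<user>@<realm>!<token-name>=<secret>
PVE_TOKEN_ID="root@pam!automation"                        # hypothetical token name
PVE_TOKEN_SECRET="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"   # placeholder secret
AUTH_HEADER="Authorization: PVEAPIToken=${PVE_TOKEN_ID}=${PVE_TOKEN_SECRET}"
echo "$AUTH_HEADER"

# Used with curl:
#   curl -k -H "$AUTH_HEADER" https://192.168.11.10:8006/api2/json/version
```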
### Script Errors
1. **Missing dependencies**:
- Install `jq`: `sudo apt install jq` (Debian/Ubuntu)
- Install Python packages: `pip install requests proxmoxer`
2. **Permission errors**:
- Make scripts executable: `chmod +x scripts/*.sh`
3. **JSON parsing errors**:
- Install `jq` for better parsing
- Check if API responses are valid JSON
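As a quick check for the last point, `jq -e` exits non-zero for invalid JSON or a null result, which makes it easy to gate parsing in scripts. A local sketch with a mocked response body (the payload shape is an assumption based on typical Proxmox responses):

```bash
# A well-formed Proxmox response has a top-level "data" key.
RESPONSE='{"data":{"version":"8.1.4","release":"8.1"}}'
if echo "$RESPONSE" | jq -e '.data' >/dev/null; then
  echo "valid: version $(echo "$RESPONSE" | jq -r '.data.version')"
else
  echo "invalid or empty response" >&2
fi
```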
## Related Documentation
- [Proxmox Review Summary](../docs/proxmox/PROXMOX_REVIEW_SUMMARY.md)
- [Task List](../docs/proxmox/TASK_LIST.md)
- [Configuration Guide](../CONFIGURATION_GUIDE.md)
- [Environment Examples](../ENV_EXAMPLES.md)
## Next Steps
After running the review scripts:
1. Review the generated task list
2. Start with high-priority connection tasks
3. Update configurations based on findings
4. Proceed with Crossplane provider deployment
5. Set up monitoring and infrastructure
For detailed task information, see [TASK_LIST.md](../docs/proxmox/TASK_LIST.md).


@@ -11,6 +11,8 @@ scripts/
├── setup-proxmox-agents.sh # Proxmox site agent setup
├── configure-cloudflare.sh # Cloudflare tunnel configuration
├── validate.sh # Post-install validation
├── enable-guest-agent-existing-vms.sh # Enable guest agent on all VMs
├── verify-guest-agent.sh # Verify guest agent status on all VMs
└── ansible/ # Ansible playbooks
├── site-playbook.yml # Multi-site deployment
├── inventory.example # Inventory template
@@ -29,7 +31,7 @@ scripts/
./install-components.sh
# 3. Setup Proxmox agents (run on each Proxmox node)
./setup-proxmox-agents.sh --site us-east-1 --node pve1
./setup-proxmox-agents.sh --site us-sfvalley --node ML110-01
# 4. Configure Cloudflare tunnels
./configure-cloudflare.sh
@@ -81,6 +83,92 @@ Installs all control plane components:
Configures Proxmox nodes:
- cloudflared installation
- Prometheus exporter installation
### enable-guest-agent-existing-vms.sh
Enables QEMU guest agent on all existing VMs:
- Automatically discovers all nodes on each Proxmox site
- Discovers all VMs on each node
- Checks if guest agent is already enabled
- Enables guest agent on VMs that need it
- Provides summary statistics
**Usage:**
```bash
./scripts/enable-guest-agent-existing-vms.sh
```
**Features:**
- Dynamic node and VM discovery (no hardcoded VMIDs)
- Supports API token and password authentication
- Skips VMs that already have guest agent enabled
- Provides detailed progress and summary reports
### verify-guest-agent.sh
Verifies guest agent status on all VMs:
- Lists all VMs with their guest agent status
- Shows which VMs have guest agent enabled/disabled
- Provides per-node and per-site summaries
**Usage:**
```bash
./scripts/verify-guest-agent.sh
```
**Note:** New VMs created with the updated Crossplane provider automatically have guest agent enabled in Proxmox configuration (`agent=1`). The guest agent package is also automatically installed via cloud-init userData.
### setup-dns-records.sh
Creates DNS records for Proxmox instances via Cloudflare API:
- A records for primary FQDNs
- CNAME records for API and metrics endpoints
- Automated record creation and verification
### create-proxmox-secret.sh
Creates Kubernetes secrets for Proxmox credentials:
- Interactive credential input
- Secret creation in crossplane-system namespace
- Verification of secret creation
### verify-provider-deployment.sh
Verifies Crossplane provider deployment:
- CRD existence check
- Provider deployment status
- Pod health and logs
- ProviderConfig status
- Credentials secret verification
### test-proxmox-connectivity.sh
Tests Proxmox instance connectivity:
- DNS resolution testing
- HTTP connectivity testing
- Authentication testing (with credentials)
- Version information retrieval
### deploy-crossplane-provider.sh
Automated deployment of Crossplane provider:
- Builds provider (optional)
- Installs CRDs
- Deploys provider to Kubernetes
- Verifies deployment status
### deploy-test-vms.sh
Deploys test VMs to both Proxmox instances:
- Deploys VM to Instance 1 (ML110-01)
- Deploys VM to Instance 2 (R630-01)
- Waits for VM creation
- Displays VM status
### setup-monitoring.sh
Sets up Prometheus and Grafana for Proxmox:
- Creates ServiceMonitor for Prometheus
- Configures scrape targets
- Creates alert rules
- Imports Grafana dashboards
### quick-deploy.sh
Interactive quick deployment script:
- Guides through all deployment steps
- Runs all deployment scripts in sequence
- Interactive prompts for each step
- Custom agent installation
- Service configuration


@@ -0,0 +1,42 @@
#!/bin/bash
set -euo pipefail
# Add Cloud-Init to All VMs
# This script adds cloud-init drives to all SMOM-DBIS-138 VMs
echo "=========================================="
echo "Adding Cloud-Init to All VMs"
echo "=========================================="
echo ""
# Site 1 VMs
echo "=== Site 1 (ml110-01) ==="
echo "Connecting to root@192.168.11.10..."
ssh root@192.168.11.10 << 'EOF'
for vmid in 136 139 141 142 145 146 150 151; do
echo "Adding cloud-init to VMID $vmid..."
qm set $vmid --ide2 local-lvm:cloudinit --ciuser admin --ipconfig0 ip=dhcp
echo "✅ VMID $vmid done"
done
EOF
echo ""
echo "=== Site 2 (r630-01) ==="
echo "Connecting to root@192.168.11.11..."
ssh root@192.168.11.11 << 'EOF'
for vmid in 137 138 144 148 101 102 103 104; do
echo "Adding cloud-init to VMID $vmid..."
qm set $vmid --ide2 local-lvm:cloudinit --ciuser admin --ipconfig0 ip=dhcp
echo "✅ VMID $vmid done"
done
EOF
echo ""
echo "=========================================="
echo "✅ Cloud-init drives added to all VMs"
echo "=========================================="
echo ""
echo "Next step: Write userData to VMs via Proxmox Web UI"
echo " - Use all-vm-userdata.txt for the userData content"
echo " - Or use: ./scripts/get-all-userdata.sh"


@@ -2,9 +2,9 @@
# Copy to inventory and customize with your hosts
[proxmox_site_1]
-pve1 ansible_host=10.1.0.10 site=us-east-1
-pve2 ansible_host=10.1.0.11 site=us-east-1
-pve3 ansible_host=10.1.0.12 site=us-east-1
+pve1 ansible_host=10.1.0.10 site=us-sfvalley
+pve2 ansible_host=10.1.0.11 site=us-sfvalley
+pve3 ansible_host=10.1.0.12 site=us-sfvalley
[proxmox_site_2]
pve4 ansible_host=10.2.0.10 site=eu-west-1


@@ -0,0 +1,47 @@
#!/bin/bash
# apply-enhancements.sh
# Apply enhancements to remaining VM files using sed
set -euo pipefail
apply_enhancements() {
local file=$1
if grep -q "chrony" "$file"; then
echo " ⚠️ Already enhanced, skipping"
return 0
fi
# Create backup
cp "$file" "${file}.backup3"
# Add packages after lsb-release
sed -i '/- lsb-release$/a\ - chrony\n - unattended-upgrades\n - apt-listchanges' "$file"
# Add NTP configuration after package_upgrade
sed -i '/package_upgrade: true/a\ \n # Time synchronization (NTP)\n ntp:\n enabled: true\n ntp_client: chrony\n servers:\n - 0.pool.ntp.org\n - 1.pool.ntp.org\n - 2.pool.ntp.org\n - 3.pool.ntp.org' "$file"
# Update package verification
sed -i 's/for pkg in qemu-guest-agent curl wget net-tools; do/for pkg in qemu-guest-agent curl wget net-tools chrony unattended-upgrades; do/' "$file"
# Add security config before final_message (complex, will do manually for key files)
# This requires careful insertion
echo " ✅ Enhanced (partial - manual final_message update needed)"
}
echo "Applying enhancements to remaining files..."
echo ""
# Process remaining SMOM-DBIS-138 files
for file in examples/production/smom-dbis-138/{sentry-{02,03,04},rpc-node-{01,02,03,04},services,blockscout,monitoring,management}.yaml; do
if [ -f "$file" ]; then
echo "Processing $(basename $file)..."
apply_enhancements "$file"
fi
done
echo ""
echo "Note: final_message and security configs need manual update"
echo "Use sentry-01.yaml as template"
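The `sed '/pattern/a\ ...'` insertions used above can be rehearsed on a throwaway file before touching real manifests. A sketch of the same append-after-match technique (GNU sed assumed; the scratch file and package entry are illustrative):

```bash
# Reproduce the package-list insertion on a scratch file.
cat > /tmp/pkgs.yaml <<'EOF'
packages:
 - curl
 - lsb-release
EOF

# Append a chrony entry after the lsb-release line, as in apply_enhancements().
sed -i '/- lsb-release$/a\ - chrony' /tmp/pkgs.yaml
cat /tmp/pkgs.yaml
```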


@@ -0,0 +1,155 @@
#!/bin/bash
# Automated Database Backup Script
# Runs as a Kubernetes CronJob for daily backups
set -e
# Configuration
BACKUP_DIR="${BACKUP_DIR:-/backups/postgres}"
DB_NAME="${DB_NAME:-sankofa}"
DB_HOST="${DB_HOST:-postgres}"
DB_PORT="${DB_PORT:-5432}"
DB_USER="${DB_USER:-postgres}"
RETENTION_DAYS="${RETENTION_DAYS:-7}"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="${BACKUP_DIR}/${DB_NAME}_${TIMESTAMP}.sql"
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"
# Perform backup
perform_backup() {
log_info "Starting database backup..."
log_info "Database: ${DB_NAME}"
log_info "Backup file: ${BACKUP_FILE}"
# Check if pg_dump is available
if ! command -v pg_dump &> /dev/null; then
log_error "pg_dump not found. Install PostgreSQL client tools."
exit 1
fi
# Perform backup
if pg_dump -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" \
-F p -f "$BACKUP_FILE" 2>&1; then
log_info "Backup completed successfully"
# Compress backup
log_info "Compressing backup..."
gzip "$BACKUP_FILE"
BACKUP_FILE="${BACKUP_FILE}.gz"
# Get backup size
BACKUP_SIZE=$(du -h "$BACKUP_FILE" | cut -f1)
log_info "Backup size: ${BACKUP_SIZE}"
# Verify backup integrity
if gzip -t "$BACKUP_FILE" 2>/dev/null; then
log_info "Backup integrity verified"
else
log_error "Backup integrity check failed"
exit 1
fi
return 0
else
log_error "Backup failed"
exit 1
fi
}
# Clean up old backups
cleanup_old_backups() {
log_info "Cleaning up backups older than ${RETENTION_DAYS} days..."
local deleted_count=0
while IFS= read -r -d '' backup_file; do
log_info "Deleting old backup: $(basename "$backup_file")"
rm -f "$backup_file"
deleted_count=$((deleted_count + 1))  # ((var++)) returns the pre-increment value, so it would abort under set -e when the count is 0
done < <(find "$BACKUP_DIR" -name "${DB_NAME}_*.sql.gz" -type f -mtime +$RETENTION_DAYS -print0)
if [ $deleted_count -gt 0 ]; then
log_info "Deleted ${deleted_count} old backup(s)"
else
log_info "No old backups to delete"
fi
}
# Upload to S3 (optional)
upload_to_s3() {
if [ -n "${S3_BUCKET:-}" ] && command -v aws &> /dev/null; then
log_info "Uploading backup to S3..."
S3_PATH="s3://${S3_BUCKET}/postgres-backups/$(basename "$BACKUP_FILE")"
if aws s3 cp "$BACKUP_FILE" "$S3_PATH"; then
log_info "Backup uploaded to S3: ${S3_PATH}"
else
log_warn "Failed to upload backup to S3"
fi
fi
}
# Send notification (optional)
send_notification() {
if [ -n "${WEBHOOK_URL:-}" ]; then
local status="$1"
local message="Database backup ${status}: ${BACKUP_FILE}"
curl -X POST "$WEBHOOK_URL" \
-H "Content-Type: application/json" \
-d "{\"text\": \"${message}\"}" \
> /dev/null 2>&1 || true
fi
}
# Main execution
main() {
log_info "=========================================="
log_info "Database Backup Automation"
log_info "=========================================="
log_info ""
# Perform backup
if perform_backup; then
# Clean up old backups
cleanup_old_backups
# Upload to S3 if configured
upload_to_s3
# Send success notification
send_notification "succeeded"
log_info "Backup process completed successfully"
exit 0
else
# Send failure notification
send_notification "failed"
log_error "Backup process failed"
exit 1
fi
}
main "$@"
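The retention window in `cleanup_old_backups` relies on `find -mtime +N`. The same expression can be rehearsed on disposable files before trusting it with real backups (GNU `touch -d` assumed for backdating; paths and names are illustrative):

```bash
# Rehearse the age-based cleanup: one "old" and one "fresh" backup file,
# then delete anything older than 7 days.
DEMO_DIR=$(mktemp -d)
touch -d "10 days ago" "$DEMO_DIR/sankofa_20240101_000000.sql.gz"
touch "$DEMO_DIR/sankofa_20990101_000000.sql.gz"

find "$DEMO_DIR" -name "sankofa_*.sql.gz" -type f -mtime +7 -delete
ls "$DEMO_DIR"   # only the fresh file remains
```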

scripts/backup-database.sh Executable file

@@ -0,0 +1,62 @@
#!/bin/bash
set -euo pipefail
# Database Backup Script
# Creates automated backups of PostgreSQL database
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
BACKUP_DIR="${BACKUP_DIR:-/backups}"
RETENTION_DAYS="${RETENTION_DAYS:-30}"
NAMESPACE="${NAMESPACE:-api}"
log_info() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] [INFO] $1"
}
log_error() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] [ERROR] $1" >&2
}
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Get database credentials
DB_NAME="${DB_NAME:-sankofa}"
DB_USER="${DB_USER:-sankofa}"
# Generate backup filename
BACKUP_FILE="$BACKUP_DIR/sankofa-backup-$(date +%Y%m%d-%H%M%S).sql.gz"
log_info "Starting database backup..."
# Create backup
if kubectl get deployment postgres -n "$NAMESPACE" &>/dev/null; then
# Backup from Kubernetes deployment
kubectl exec -n "$NAMESPACE" deployment/postgres -- \
pg_dump -U "$DB_USER" "$DB_NAME" | gzip > "$BACKUP_FILE"
elif kubectl get statefulset postgres -n "$NAMESPACE" &>/dev/null; then
# Backup from StatefulSet
kubectl exec -n "$NAMESPACE" statefulset/postgres -- \
pg_dump -U "$DB_USER" "$DB_NAME" | gzip > "$BACKUP_FILE"
else
log_error "PostgreSQL deployment not found in namespace $NAMESPACE"
exit 1
fi
# Verify backup
if [ ! -f "$BACKUP_FILE" ] || [ ! -s "$BACKUP_FILE" ]; then
log_error "Backup file is missing or empty"
exit 1
fi
BACKUP_SIZE=$(du -h "$BACKUP_FILE" | cut -f1)
log_info "Backup completed: $BACKUP_FILE ($BACKUP_SIZE)"
# Cleanup old backups
log_info "Cleaning up backups older than $RETENTION_DAYS days..."
find "$BACKUP_DIR" -name "sankofa-backup-*.sql.gz" -mtime +$RETENTION_DAYS -delete
log_info "Backup process completed successfully"
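The dump-compress-verify sequence above can be reproduced locally with a dummy SQL stream; `gzip -t` plus a non-empty-file test (`-s`) catches truncated dumps (file path below is illustrative):

```bash
# Simulate the dump → compress → verify pipeline.
BACKUP_FILE="/tmp/sankofa-backup-demo.sql.gz"
echo "SELECT 1;" | gzip > "$BACKUP_FILE"

if [ -s "$BACKUP_FILE" ] && gzip -t "$BACKUP_FILE"; then
  echo "backup ok ($(du -h "$BACKUP_FILE" | cut -f1))"
else
  echo "backup missing, empty, or corrupt" >&2
fi
```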


@@ -0,0 +1,43 @@
#!/bin/bash
# batch-enhance-all-vms.sh
# Batch enhance all VM YAML files with Python script
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
ENHANCE_SCRIPT="$SCRIPT_DIR/enhance-vm-yaml.py"
log() {
echo -e "\033[0;34m[$(date +'%Y-%m-%d %H:%M:%S')]\033[0m $*"
}
log_success() {
echo -e "\033[0;32m[$(date +'%Y-%m-%d %H:%M:%S')] ✅\033[0m $*"
}
echo "=========================================="
echo "Batch Enhancing All VM YAML Files"
echo "=========================================="
echo ""
# Find all VM YAML files (exclude backups and templates if needed)
VM_FILES=$(find "$PROJECT_ROOT/examples/production" -name "*.yaml" -type f | grep -v '\.backup' | sort)
TOTAL=$(echo "$VM_FILES" | wc -l)
log "Found $TOTAL VM YAML files to process"
echo ""
# Process each file
for file in $VM_FILES; do
python3 "$ENHANCE_SCRIPT" "$file"
done
echo ""
echo "=========================================="
log_success "Batch enhancement complete!"
echo "=========================================="
echo ""
log "Backup files created with .backup extension"
log "Review changes and remove backups when satisfied"


@@ -0,0 +1,70 @@
#!/bin/bash
# Build and push Crossplane provider
# DEPLOY-020: Build and deploy Crossplane provider
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
PROVIDER_DIR="$PROJECT_ROOT/crossplane-provider-proxmox"
echo "=== Building Crossplane Provider ==="
echo ""
# Check if Go is installed
if ! command -v go &> /dev/null; then
echo "✗ Go is not installed or not in PATH"
echo " Please install Go 1.21 or later"
exit 1
fi
GO_VERSION=$(go version | awk '{print $3}' | sed 's/go//')
echo "✓ Found Go $GO_VERSION"
echo ""
cd "$PROVIDER_DIR"
# Verify module path
echo "Verifying Go module..."
go mod verify
go mod tidy
# Generate code
echo ""
echo "Generating code and CRDs..."
make generate
make manifests
# Run tests
echo ""
echo "Running tests..."
make test || echo "⚠ Some tests failed, but continuing..."
# Build binary
echo ""
echo "Building provider binary..."
make build
# Build Docker image (if Docker is available)
if command -v docker &> /dev/null; then
echo ""
echo "Building Docker image..."
IMG="${IMG:-ghcr.io/sankofa/crossplane-provider-proxmox:latest}"
make docker-build IMG="$IMG"
echo ""
echo "✓ Docker image built: $IMG"
echo ""
echo "To push the image:"
echo " docker push $IMG"
echo " or"
echo " make docker-push IMG=$IMG"
else
echo ""
echo "⚠ Docker is not available, skipping Docker build"
echo " Binary built in: $PROVIDER_DIR/bin/provider"
fi
echo ""
echo "=== Build Complete ==="

scripts/check-all-vm-ips.sh Executable file

@@ -0,0 +1,173 @@
#!/bin/bash
# check-all-vm-ips.sh
# Check IP addresses for all VMs via guest agent and ARP table
set -euo pipefail
PROXMOX_1_HOST="192.168.11.10"
PROXMOX_2_HOST="192.168.11.11"
PROXMOX_PASS="${PROXMOX_PASS:?Set PROXMOX_PASS in the environment}"  # never hardcode credentials in scripts
SITE1_VMS="136 139 141 142 145 146 150 151"
SITE2_VMS="101 104 137 138 144 148"
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
check_vm_ip() {
local host=$1
local vmid=$2
local vmname=$3
printf "%-8s %-30s " "$vmid" "$vmname"
# Method 1: Try guest agent
guest_ip=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"qm guest exec $vmid -- sh -c 'hostname -I' 2>&1" 2>/dev/null || true)
if echo "$guest_ip" | grep -qE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+'; then
ips=$(echo "$guest_ip" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | tr '\n' ',' | sed 's/,$//')
echo -e "${GREEN}$ips${NC} (via guest agent)"
return 0
fi
# Method 2: Try ARP table lookup
mac=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"qm config $vmid | grep -oP 'virtio=\K[^,]+' | head -1 | tr '[:upper:]' '[:lower:]'" 2>/dev/null || echo "")
if [ -n "$mac" ]; then
arp_ip=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"ip neigh show | grep -i \"$mac\" | awk '{print \$1}' | head -1" 2>/dev/null || echo "")
if [ -n "$arp_ip" ] && [ "$arp_ip" != "NOT_FOUND" ]; then
echo -e "${YELLOW}⚠️ $arp_ip${NC} (via ARP table - guest agent not running)"
return 1
fi
fi
# Check guest agent status
if echo "$guest_ip" | grep -q "QEMU guest agent is not running"; then
echo -e "${YELLOW}⚠️ Guest agent not running${NC}"
elif echo "$guest_ip" | grep -q "No QEMU guest agent configured"; then
echo -e "${RED}❌ Guest agent not configured${NC}"
else
echo -e "${YELLOW}⚠️ No IP found${NC}"
fi
return 1
}
echo "=========================================="
echo "VM IP Address Check - All Methods"
echo "=========================================="
echo ""
printf "%-8s %-30s %s\n" "VMID" "Name" "IP Address(es) & Method"
printf "%-8s %-30s %s\n" "----" "----" "----------------------"
echo ""
echo "Site 1 (ml110-01):"
site1_guest=0
site1_arp=0
site1_none=0
for vmid in $SITE1_VMS; do
name=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$PROXMOX_1_HOST \
"qm config $vmid | grep '^name:' | cut -d' ' -f2" || echo "unknown")
result=$(check_vm_ip "$PROXMOX_1_HOST" "$vmid" "$name" 2>&1)
echo "$result"
if echo "$result" | grep -q "via guest agent"; then
site1_guest=$((site1_guest + 1))
elif echo "$result" | grep -q "via ARP table"; then
site1_arp=$((site1_arp + 1))
else
site1_none=$((site1_none + 1))
fi
done
echo ""
echo "Site 2 (r630-01):"
site2_guest=0
site2_arp=0
site2_none=0
for vmid in $SITE2_VMS; do
name=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$PROXMOX_2_HOST \
"qm config $vmid | grep '^name:' | cut -d' ' -f2" || echo "unknown")
result=$(check_vm_ip "$PROXMOX_2_HOST" "$vmid" "$name" 2>&1)
echo "$result"
if echo "$result" | grep -q "via guest agent"; then
site2_guest=$((site2_guest + 1))
elif echo "$result" | grep -q "via ARP table"; then
site2_arp=$((site2_arp + 1))
else
site2_none=$((site2_none + 1))
fi
done
echo ""
echo "=========================================="
echo "Summary"
echo "=========================================="
total_guest=$((site1_guest + site2_guest))
total_arp=$((site1_arp + site2_arp))
total_none=$((site1_none + site2_none))
total=$((total_guest + total_arp + total_none))
echo "Total VMs: $total"
echo -e "${GREEN}VMs with IP via guest agent: $total_guest${NC}"
echo -e "${YELLOW}VMs with IP via ARP table: $total_arp${NC}"
echo -e "${RED}VMs without IP: $total_none${NC}"
echo ""
# Check Proxmox IP assignment capability
echo "=========================================="
echo "Proxmox IP Assignment Capability"
echo "=========================================="
if [ $total_guest -gt 0 ]; then
echo -e "${GREEN}✅ Proxmox CAN assign IP addresses via guest agent${NC}"
echo " Guest agent is working for $total_guest VM(s)"
else
echo -e "${YELLOW}⚠️ Proxmox CANNOT assign IP addresses via guest agent yet${NC}"
echo " Guest agent service is not running in VMs"
fi
if [ $total_arp -gt 0 ]; then
echo -e "${YELLOW}⚠️ $total_arp VM(s) have IPs visible in ARP table${NC}"
echo " These VMs have network connectivity but guest agent not running"
fi
if [ $total_none -gt 0 ]; then
echo -e "${RED}$total_none VM(s) have no IP addresses${NC}"
echo " These VMs may not be fully booted or have network issues"
fi
echo ""
echo "=========================================="
echo "Recommendations"
echo "=========================================="
if [ $total_guest -eq 0 ]; then
echo "1. Guest agent package needs to be installed/started in VMs"
echo "2. Wait for cloud-init to complete, or manually install:"
echo " sudo apt-get install -y qemu-guest-agent"
echo " sudo systemctl enable qemu-guest-agent"
echo " sudo systemctl start qemu-guest-agent"
echo "3. Once guest agent is running, Proxmox will automatically"
echo " detect and assign IP addresses"
fi
if [ $total_arp -gt 0 ]; then
echo "4. VMs with ARP table IPs have network connectivity"
echo " but need guest agent for Proxmox to manage them properly"
fi
echo ""
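The MAC extraction used for the ARP fallback can be checked offline against a sample `qm config` line (GNU `grep -P` assumed; the config value below is a made-up example):

```bash
# Extract and normalize the MAC from a Proxmox net0 config value.
CONFIG_LINE="net0: virtio=DE:AD:BE:EF:00:01,bridge=vmbr0,firewall=1"
MAC=$(echo "$CONFIG_LINE" | grep -oP 'virtio=\K[^,]+' | tr '[:upper:]' '[:lower:]')
echo "$MAC"   # de:ad:be:ef:00:01
```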

scripts/check-cluster-status.sh Executable file

@@ -0,0 +1,196 @@
#!/bin/bash
# check-cluster-status.sh
# Checks the status of a Proxmox cluster
set -euo pipefail
# Load environment variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
set -a
source <(grep -v '^#' "${SCRIPT_DIR}/../.env" | grep -v '^$' | sed 's/^/export /')
set +a
fi
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
# Configuration
CLUSTER_NAME="${1:-sankofa-sfv-01}"
NODE1_IP="192.168.11.10"
NODE1_NAME="ML110-01"
NODE1_TOKEN="${PROXMOX_TOKEN_ML110_01:-}"
NODE2_IP="192.168.11.11"
NODE2_NAME="R630-01"
NODE2_TOKEN="${PROXMOX_TOKEN_R630_01:-}"
log() {
echo -e "${GREEN}[INFO]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
check_cluster_status() {
local endpoint=$1
local token=$2
local node_name=$3
echo ""
info "=== ${node_name} Cluster Status ==="
# Check cluster status
local status_response=$(curl -k -s -H "Authorization: PVEAPIToken=${token}" \
"${endpoint}/api2/json/cluster/status" 2>/dev/null)
if echo "$status_response" | jq -e '.data' >/dev/null 2>&1; then
local cluster_name=$(echo "$status_response" | jq -r '.data[0].name // "unknown"')
local node_count=$(echo "$status_response" | jq -r '.data | length')
if [ "$cluster_name" != "null" ] && [ -n "$cluster_name" ]; then
log "✓ Cluster found: ${cluster_name}"
log " Node count: ${node_count}"
echo "$status_response" | jq -r '.data[] | " • \(.name) - Type: \(.type) - Status: \(.status // "unknown")"'
else
warn "Cluster status endpoint accessible but no cluster data"
fi
else
local error_msg=$(echo "$status_response" | jq -r '.message // "Unknown error"')
if echo "$error_msg" | grep -q "Permission check failed"; then
warn "Permission denied (may need Sys.Audit permission)"
else
warn "Not in cluster or cluster not accessible: ${error_msg}"
fi
fi
}
check_cluster_nodes() {
local endpoint=$1
local token=$2
local node_name=$3
echo ""
info "=== ${node_name} Cluster Nodes ==="
local nodes_response=$(curl -k -s -H "Authorization: PVEAPIToken=${token}" \
"${endpoint}/api2/json/cluster/config/nodes" 2>/dev/null)
if echo "$nodes_response" | jq -e '.data' >/dev/null 2>&1; then
local node_count=$(echo "$nodes_response" | jq -r '.data | length')
if [ "$node_count" -gt 0 ]; then
log "✓ Cluster nodes found: ${node_count}"
echo "$nodes_response" | jq -r '.data[] | " • Node: \(.node) - Node ID: \(.nodeid) - Votes: \(.votes)"'
else
warn "No nodes found in cluster configuration"
fi
else
local error_msg=$(echo "$nodes_response" | jq -r '.message // "Unknown error"')
if echo "$error_msg" | grep -q "Permission check failed"; then
warn "Permission denied (may need Sys.Audit permission)"
else
warn "Cluster nodes not accessible: ${error_msg}"
fi
fi
}
check_cluster_config() {
local endpoint=$1
local token=$2
local node_name=$3
echo ""
info "=== ${node_name} Cluster Configuration ==="
local config_response=$(curl -k -s -H "Authorization: PVEAPIToken=${token}" \
"${endpoint}/api2/json/cluster/config" 2>/dev/null)
if echo "$config_response" | jq -e '.data' >/dev/null 2>&1; then
local cluster_name=$(echo "$config_response" | jq -r '.data.clustername // "unknown"')
if [ "$cluster_name" != "null" ] && [ -n "$cluster_name" ]; then
log "✓ Cluster name: ${cluster_name}"
if [ "$cluster_name" = "$CLUSTER_NAME" ]; then
log " ✓ Matches expected cluster name: ${CLUSTER_NAME}"
else
warn " ⚠ Cluster name mismatch. Expected: ${CLUSTER_NAME}, Found: ${cluster_name}"
fi
else
warn "Cluster config accessible but no cluster name found"
fi
else
local error_msg=$(echo "$config_response" | jq -r '.message // "Unknown error"')
if echo "$error_msg" | grep -q "Permission check failed"; then
warn "Permission denied (may need Sys.Audit permission)"
else
warn "Cluster config not accessible: ${error_msg}"
fi
fi
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Proxmox Cluster Status Check ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
info "Checking cluster: ${CLUSTER_NAME}"
echo ""
# Check Node 1
check_cluster_status "https://${NODE1_IP}:8006" "${NODE1_TOKEN}" "${NODE1_NAME}"
check_cluster_nodes "https://${NODE1_IP}:8006" "${NODE1_TOKEN}" "${NODE1_NAME}"
check_cluster_config "https://${NODE1_IP}:8006" "${NODE1_TOKEN}" "${NODE1_NAME}"
# Check Node 2
check_cluster_status "https://${NODE2_IP}:8006" "${NODE2_TOKEN}" "${NODE2_NAME}"
check_cluster_nodes "https://${NODE2_IP}:8006" "${NODE2_TOKEN}" "${NODE2_NAME}"
check_cluster_config "https://${NODE2_IP}:8006" "${NODE2_TOKEN}" "${NODE2_NAME}"
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Summary ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
# Try to determine overall status
local node1_status=$(curl -k -s -H "Authorization: PVEAPIToken=${NODE1_TOKEN}" \
"https://${NODE1_IP}:8006/api2/json/cluster/status" 2>/dev/null | \
jq -r 'if .data then "in-cluster" else "standalone" end' 2>/dev/null || echo "unknown")
local node2_status=$(curl -k -s -H "Authorization: PVEAPIToken=${NODE2_TOKEN}" \
"https://${NODE2_IP}:8006/api2/json/cluster/status" 2>/dev/null | \
jq -r 'if .data then "in-cluster" else "standalone" end' 2>/dev/null || echo "unknown")
if [ "$node1_status" = "in-cluster" ] && [ "$node2_status" = "in-cluster" ]; then
log "✓ Both nodes appear to be in a cluster"
elif [ "$node1_status" = "standalone" ] && [ "$node2_status" = "standalone" ]; then
warn "Both nodes are standalone (not clustered)"
else
warn "Mixed status: Node 1: ${node1_status}, Node 2: ${node2_status}"
fi
echo ""
info "Note: Some checks may require additional API permissions (Sys.Audit)"
info "For full cluster status, use Proxmox web UI or SSH: pvecm status"
echo ""
}
main "$@"
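The standalone-vs-cluster detection in the summary hinges on a single jq expression, which can be sanity-checked against a mocked `/cluster/status` payload (the payload shape is an assumption for illustration):

```bash
# Mock of a clustered response: data[] holds a cluster entry plus node entries.
CLUSTERED='{"data":[{"type":"cluster","name":"sankofa-sfv-01","nodes":2}]}'
echo "$CLUSTERED" | jq -r 'if .data then "in-cluster" else "standalone" end'

# A null data key maps to standalone (null is falsy in jq conditionals):
echo '{"data":null}' | jq -r 'if .data then "in-cluster" else "standalone" end'
```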

scripts/check-dependencies.sh Executable file

@@ -0,0 +1,87 @@
#!/bin/bash
# check-dependencies.sh
# Checks if all required dependencies are installed
set -euo pipefail
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
NC='\033[0m'
MISSING=0
OPTIONAL_MISSING=0
check_required() {
local cmd=$1
local name=$2
if command -v "$cmd" &> /dev/null; then
echo -e "${GREEN}✓${NC} $name"
return 0
else
echo -e "${RED}✗${NC} $name (REQUIRED)"
MISSING=$((MISSING + 1))  # ((var++)) would exit under set -e when the counter is 0
return 1
fi
}
check_optional() {
local cmd=$1
local name=$2
if command -v "$cmd" &> /dev/null; then
echo -e "${GREEN}✓${NC} $name (optional)"
return 0
else
echo -e "${YELLOW}⚠${NC} $name (optional, not installed)"
OPTIONAL_MISSING=$((OPTIONAL_MISSING + 1))  # avoids set -e exit from ((var++)) at 0
return 1
fi
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Dependency Check ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
echo "Required Dependencies:"
echo "----------------------"
check_required "kubectl" "kubectl (Kubernetes CLI)"
check_required "curl" "curl"
check_required "jq" "jq (JSON processor)"
echo ""
echo "Optional Dependencies:"
echo "----------------------"
check_optional "go" "Go (for building provider)"
check_optional "make" "make (for building)"
check_optional "docker" "Docker (for container builds)"
check_optional "kind" "kind (for local Kubernetes)"
check_optional "terraform" "Terraform (for infrastructure)"
check_optional "yamllint" "yamllint (for YAML validation)"
check_optional "dig" "dig (for DNS testing)"
check_optional "nslookup" "nslookup (for DNS testing)"
echo ""
if [ $MISSING -eq 0 ]; then
echo -e "${GREEN}✓ All required dependencies are installed${NC}"
if [ $OPTIONAL_MISSING -gt 0 ]; then
echo -e "${YELLOW}⚠ Some optional dependencies are missing (not critical)${NC}"
fi
exit 0
else
echo -e "${RED}✗ Missing $MISSING required dependency/dependencies${NC}"
echo ""
echo "Install missing dependencies:"
echo " Ubuntu/Debian: sudo apt-get install -y kubectl curl jq"
echo " macOS: brew install kubectl curl jq"
exit 1
fi
}
main "$@"
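Counting failures under `set -euo pipefail` deserves care: the `((MISSING++))` form evaluates to the pre-increment value, so the very first increment from 0 reads as a failed command and kills the script. A self-contained sketch of the safe pattern:

```shell
set -e
MISSING=0
# ((MISSING++)) here would abort the script: it yields 0 (false) on the
# first bump. A plain arithmetic assignment always exits 0.
MISSING=$((MISSING + 1))
echo "missing=$MISSING"
```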


@@ -0,0 +1,138 @@
#!/bin/bash
# Check if qemu-guest-agent is installed inside VM 100
# Run on Proxmox node: root@ml110-01
VMID=100
echo "=========================================="
echo "Checking qemu-guest-agent in VM $VMID"
echo "=========================================="
echo ""
# Step 1: Check VM status
echo "Step 1: VM Status"
echo "--------------------------------------"
VM_STATUS=$(qm status $VMID | awk '{print $2}')
echo "VM Status: $VM_STATUS"
if [ "$VM_STATUS" != "running" ]; then
echo "⚠️ VM is not running. Start it first:"
echo " qm start $VMID"
exit 1
fi
echo ""
# Step 2: Check guest agent config in Proxmox
echo "Step 2: Guest Agent Configuration (Proxmox)"
echo "--------------------------------------"
AGENT_CONFIG=$(qm config $VMID | grep '^agent:' || echo "")
if [ -z "$AGENT_CONFIG" ]; then
echo "❌ Guest agent NOT configured in Proxmox"
echo " This needs to be set first: qm set $VMID --agent 1"
else
echo "✅ Guest agent configured: $AGENT_CONFIG"
fi
echo ""
# Step 3: Try to check if package is installed via guest exec
echo "Step 3: Checking if qemu-guest-agent Package is Installed"
echo "--------------------------------------"
echo "Attempting to check via qm guest exec..."
echo ""
# Run the check inside the guest via sh -c so the grep happens in the VM,
# and capture qm's stderr too (agent errors like "No QEMU guest agent
# configured" are reported there)
PACKAGE_CHECK=$(qm guest exec $VMID -- sh -c 'dpkg -l | grep qemu-guest-agent' 2>&1)
EXEC_EXIT_CODE=$?
if [ $EXEC_EXIT_CODE -eq 0 ] && echo "$PACKAGE_CHECK" | grep -q "qemu-guest-agent"; then
echo "✅ qemu-guest-agent package IS installed"
echo ""
echo "Package details:"
echo "$PACKAGE_CHECK" | grep qemu-guest-agent
echo ""
# Check if service is running
echo "Checking service status..."
SERVICE_STATUS=$(qm guest exec $VMID -- systemctl status qemu-guest-agent --no-pager 2>&1)
if echo "$SERVICE_STATUS" | grep -q "active (running)"; then
echo "✅ qemu-guest-agent service IS running"
elif echo "$SERVICE_STATUS" | grep -q "inactive"; then
echo "⚠️ qemu-guest-agent service is installed but NOT running"
echo ""
echo "To start it:"
echo " qm guest exec $VMID -- systemctl enable --now qemu-guest-agent"
else
echo "⚠️ Could not determine service status"
echo "Service status output:"
echo "$SERVICE_STATUS"
fi
elif echo "$PACKAGE_CHECK" | grep -q "No QEMU guest agent configured"; then
echo "❌ Guest agent not configured in Proxmox"
echo " Run: qm set $VMID --agent 1"
elif echo "$PACKAGE_CHECK" | grep -q "QEMU guest agent is not running"; then
echo "⚠️ Guest agent configured but service not running"
echo " The package may not be installed, or the service isn't started"
echo ""
echo "Try installing via console or SSH:"
echo " sudo apt-get update"
echo " sudo apt-get install -y qemu-guest-agent"
echo " sudo systemctl enable --now qemu-guest-agent"
else
echo "❌ qemu-guest-agent package is NOT installed"
echo ""
echo "Error details:"
echo "$PACKAGE_CHECK"
echo ""
echo "To install, you need to access the VM via:"
echo " 1. SSH (if you have the IP and SSH access)"
echo " 2. Proxmox console (qm terminal $VMID or via web UI)"
echo ""
echo "Then run inside the VM:"
echo " sudo apt-get update"
echo " sudo apt-get install -y qemu-guest-agent"
echo " sudo systemctl enable --now qemu-guest-agent"
fi
echo ""
# Step 4: Alternative check methods
echo "Step 4: Alternative Check Methods"
echo "--------------------------------------"
echo "If qm guest exec doesn't work, try these:"
echo ""
echo "1. Get VM IP address (if guest agent working):"
echo " qm guest exec $VMID -- hostname -I"
echo ""
echo "2. Check via SSH (if you have IP and access):"
echo " ssh admin@<VM_IP> 'dpkg -l | grep qemu-guest-agent'"
echo ""
echo "3. Use Proxmox console:"
echo " - Open Proxmox web UI"
echo " - Go to VM 100 > Console"
echo " - Login and run: dpkg -l | grep qemu-guest-agent"
echo ""
echo "4. Check cloud-init logs (if available):"
echo " qm guest exec $VMID -- cat /var/log/cloud-init-output.log | grep -i 'qemu\|guest'"
echo ""
# Step 5: Summary
echo "=========================================="
echo "Summary"
echo "=========================================="
if [ -n "$AGENT_CONFIG" ]; then
echo "✅ Proxmox config: Guest agent enabled"
else
echo "❌ Proxmox config: Guest agent NOT enabled"
fi
if [ $EXEC_EXIT_CODE -eq 0 ] && echo "$PACKAGE_CHECK" | grep -q "qemu-guest-agent"; then
echo "✅ Package: qemu-guest-agent IS installed"
echo "✅ Status: Ready to use"
else
echo "❌ Package: qemu-guest-agent NOT installed or not accessible"
echo ""
echo "Next steps:"
echo " 1. Ensure agent=1 is set: qm set $VMID --agent 1"
echo " 2. Install package inside VM (via SSH or console)"
echo " 3. Enable and start service inside VM"
fi
echo ""
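`qm guest exec` wraps the guest command's result in a JSON envelope (`exitcode`, `exited`, `out-data`). A sketch of pulling the exit code out of a captured envelope without `jq`; the sample string stands in for real output, and the package line in it is invented:

```shell
# Sample qm guest exec envelope (payload invented for illustration)
sample='{"exitcode":0,"exited":1,"out-data":"ii  qemu-guest-agent  1:6.2 ..."}'
guest_exit=$(echo "$sample" | grep -o '"exitcode":[0-9]*' | cut -d':' -f2)
echo "guest command exit code: $guest_exit"
```

With `jq` installed, `jq -r '.exitcode'` over the same envelope is the cleaner route.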


@@ -0,0 +1,176 @@
#!/bin/bash
# Check Proxmox resources via SSH
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
# Load environment
load_env() {
if [ -f "${PROJECT_ROOT}/.env" ]; then
source "${PROJECT_ROOT}/.env"
fi
PROXMOX_1_HOST="${PROXMOX_1_HOST:-192.168.11.10}"
PROXMOX_2_HOST="${PROXMOX_2_HOST:-192.168.11.11}"
PROXMOX_PASS="${PROXMOX_ROOT_PASS:?PROXMOX_ROOT_PASS must be set (no hardcoded default)}"
}
log() {
echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*"
}
log_success() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] ✅${NC} $*"
}
log_warning() {
echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] ⚠️${NC} $*"
}
log_error() {
echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ❌${NC} $*"
}
log_info() {
echo -e "${CYAN}[$(date +'%Y-%m-%d %H:%M:%S')] ${NC} $*"
}
check_node_via_ssh() {
local host=$1
local node_name=$2
local site_name=$3
local password=$4
log "Checking ${site_name} (${host})..."
if ! command -v sshpass &> /dev/null; then
log_error "sshpass is required. Install with: apt-get install sshpass"
return 1
fi
# Get CPU info
local cpu_total cpu_avail
cpu_total=$(sshpass -p "${password}" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@${host} "nproc" 2>/dev/null || echo "0")
# Get memory info
local mem_total mem_available mem_used
mem_total=$(sshpass -p "${password}" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@${host} "free -g | awk '/^Mem:/ {print \$2}'" 2>/dev/null || echo "0")
mem_available=$(sshpass -p "${password}" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@${host} "free -g | awk '/^Mem:/ {print \$7}'" 2>/dev/null || echo "0")
mem_used=$(sshpass -p "${password}" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@${host} "free -g | awk '/^Mem:/ {print \$3}'" 2>/dev/null || echo "0")
# Get VM resource usage
local vm_cpu_total vm_mem_total
vm_cpu_total=$(sshpass -p "${password}" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@${host} "qm list 2>/dev/null | tail -n +2 | awk '{sum+=\$3} END {print sum+0}'" 2>/dev/null || echo "0")
vm_mem_total=$(sshpass -p "${password}" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@${host} "qm list 2>/dev/null | tail -n +2 | awk '{sum+=\$4} END {print sum+0}'" 2>/dev/null || echo "0")
# Calculate available
local cpu_avail mem_avail_gib
cpu_avail=$((cpu_total - vm_cpu_total))
mem_avail_gib=${mem_available}
log_info "=== Node: ${node_name} ==="
echo " CPU: ${cpu_avail} / ${cpu_total} cores available (${vm_cpu_total} used by VMs)"
echo " Memory: ${mem_avail_gib} GiB / ${mem_total} GiB available (${vm_mem_total} GiB used by VMs, ${mem_used} GiB system used)"
# Check storage
log_info "=== Storage Pools ==="
sshpass -p "${password}" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@${host} "pvesm status 2>/dev/null" | tail -n +2 | while read -r line; do
if [ -n "$line" ]; then
echo " - $line"
fi
done
echo "${cpu_avail}|${cpu_total}|${mem_avail_gib}|${mem_total}"
}
# Required resources
REQUIRED_CPU=72
REQUIRED_RAM=140
REQUIRED_DISK=278
main() {
log "=========================================="
log "Proxmox Resource Quota Check (SSH Method)"
log "=========================================="
log ""
log_info "Required Resources:"
log " CPU: ${REQUIRED_CPU} cores"
log " RAM: ${REQUIRED_RAM} GiB"
log " Disk: ${REQUIRED_DISK} GiB"
log ""
load_env
# Check Site 1
site1_status=0
site1_result=$(check_node_via_ssh "${PROXMOX_1_HOST}" "ml110-01" "Site 1" "${PROXMOX_PASS}" 2>&1) || site1_status=$?
if [ $site1_status -eq 0 ]; then
IFS='|' read -r cpu1_avail cpu1_total mem1_avail mem1_total <<< "$(echo "$site1_result" | tail -1)"
else
log_warning "Could not check Site 1"
cpu1_avail=0
cpu1_total=0
mem1_avail=0
mem1_total=0
fi
echo ""
# Check Site 2
site2_status=0
site2_result=$(check_node_via_ssh "${PROXMOX_2_HOST}" "r630-01" "Site 2" "${PROXMOX_PASS}" 2>&1) || site2_status=$?
if [ $site2_status -eq 0 ]; then
IFS='|' read -r cpu2_avail cpu2_total mem2_avail mem2_total <<< "$(echo "$site2_result" | tail -1)"
else
log_warning "Could not check Site 2"
cpu2_avail=0
cpu2_total=0
mem2_avail=0
mem2_total=0
fi
echo ""
log "=========================================="
log_info "=== Summary ==="
total_cpu_avail=$((cpu1_avail + cpu2_avail))
total_cpu=$((cpu1_total + cpu2_total))
total_mem_avail=$((mem1_avail + mem2_avail))
total_mem=$((mem1_total + mem2_total))
echo " Total CPU Available: ${total_cpu_avail} / ${total_cpu} cores"
echo " Total Memory Available: ${total_mem_avail} / ${total_mem} GiB"
echo ""
# Compare with requirements
log_info "=== Resource Comparison ==="
if [ ${total_cpu_avail} -ge ${REQUIRED_CPU} ]; then
log_success "CPU: ${total_cpu_avail} >= ${REQUIRED_CPU} cores"
else
log_error "CPU: ${total_cpu_avail} < ${REQUIRED_CPU} (shortfall: $((REQUIRED_CPU - total_cpu_avail)) cores)"
fi
if [ ${total_mem_avail} -ge ${REQUIRED_RAM} ]; then
log_success "RAM: ${total_mem_avail} GiB >= ${REQUIRED_RAM} GiB"
else
log_error "RAM: ${total_mem_avail} GiB < ${REQUIRED_RAM} GiB (shortfall: $((REQUIRED_RAM - total_mem_avail)) GiB)"
fi
echo ""
log "=========================================="
log_success "Quota check completed!"
}
main "$@"
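Capturing a helper's exit status under `set -euo pipefail`, as `main` does for each site, needs the `&&`/`||` form; a bare `result=$(cmd)` followed by `status=$?` would abort the script before `$?` is ever read. A self-contained sketch with a stub in place of the SSH probe:

```shell
set -e
# Stand-in for a failing check_node_via_ssh call
probe() { return 3; }
status=0
result=$(probe 2>&1) || status=$?
echo "probe exited with $status"
```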

scripts/check-proxmox-vms.sh Executable file

@@ -0,0 +1,69 @@
#!/bin/bash
# check-proxmox-vms.sh
# Check all VMs in Proxmox
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
# Load environment
if [ -f "${PROJECT_ROOT}/.env" ]; then
set -a
source "${PROJECT_ROOT}/.env"
set +a
fi
PROXMOX_PASS="${PROXMOX_ROOT_PASS:?PROXMOX_ROOT_PASS must be set (no hardcoded default)}"
PROXMOX_1_URL="https://192.168.11.10:8006"
PROXMOX_2_URL="https://192.168.11.11:8006"
# Get ticket
get_ticket() {
local api_url=$1
local response
response=$(curl -k -s -X POST \
-d "username=root@pam&password=${PROXMOX_PASS}" \
"${api_url}/api2/json/access/ticket" 2>/dev/null)
if command -v jq &> /dev/null; then
echo "${response}" | jq -r '.data.ticket // empty' 2>/dev/null
else
echo "${response}" | grep -o '"ticket":"[^"]*' | head -1 | cut -d'"' -f4
fi
}
# List VMs
list_vms() {
local api_url=$1
local node=$2
local ticket=$3
curl -k -s -b "PVEAuthCookie=${ticket}" \
"${api_url}/api2/json/nodes/${node}/qemu" 2>/dev/null | \
jq -r '.data[] | "\(.vmid) - \(.name) - \(.status)"' 2>/dev/null | sort -n
}
main() {
echo "=== Site 1 (ml110-01) ==="
local ticket1
ticket1=$(get_ticket "${PROXMOX_1_URL}")
if [ -n "${ticket1}" ]; then
list_vms "${PROXMOX_1_URL}" "ml110-01" "${ticket1}"
else
echo "Failed to authenticate"
fi
echo ""
echo "=== Site 2 (r630-01) ==="
local ticket2
ticket2=$(get_ticket "${PROXMOX_2_URL}")
if [ -n "${ticket2}" ]; then
list_vms "${PROXMOX_2_URL}" "r630-01" "${ticket2}"
else
echo "Failed to authenticate"
fi
}
main "$@"
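The `jq`-less fallback in `get_ticket` leans on `grep -o` and `cut`. Here it is exercised against a canned login response (the ticket value is invented) so the field extraction can be checked in isolation:

```shell
# Canned /access/ticket response; ticket value is made up
response='{"data":{"ticket":"PVE:root@pam:ABC123","CSRFPreventionToken":"tok"}}'
# Field 4 after splitting on double quotes is the ticket value
ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | head -1 | cut -d'"' -f4)
echo "ticket=$ticket"
```

This fallback breaks on escaped quotes inside the ticket, which is why the script prefers `jq` when present.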


@@ -0,0 +1,159 @@
#!/bin/bash
# Check VM 100 configuration before starting
# Run on Proxmox node: root@ml110-01
VMID=100
echo "=== Pre-Start Check for VM $VMID ==="
echo ""
# 1. VM Status
echo "1. VM Status:"
qm status $VMID 2>&1
echo ""
# 2. Boot Configuration
echo "2. Boot Configuration:"
BOOT_CONFIG=$(qm config $VMID 2>&1 | grep -E "^boot:|^scsi0:|^ide2:")
if [ -z "$BOOT_CONFIG" ]; then
echo " ⚠️ No boot configuration found"
else
echo "$BOOT_CONFIG" | while read line; do
echo " $line"
done
fi
echo ""
# 3. Disk Configuration
echo "3. Disk Configuration:"
DISK_CONFIG=$(qm config $VMID 2>&1 | grep -E "^scsi0:|^scsi1:")
if [ -z "$DISK_CONFIG" ]; then
echo " ⚠️ No disk configuration found"
else
echo "$DISK_CONFIG" | while read line; do
echo " $line"
# Check if disk exists
if echo "$line" | grep -q "local-lvm:vm-$VMID-disk"; then
DISK_NAME=$(echo "$line" | sed -n 's/.*local-lvm:\(vm-[0-9]*-disk-[0-9]*\).*/\1/p')
if [ -n "$DISK_NAME" ]; then
if lvs | grep -q "$DISK_NAME"; then
DISK_SIZE=$(lvs | grep "$DISK_NAME" | awk '{print $4}')
echo " ✅ Disk exists: $DISK_NAME ($DISK_SIZE)"
else
echo " ⚠️ Disk not found: $DISK_NAME"
fi
fi
fi
done
fi
echo ""
# 4. Check for Image Import
echo "4. Checking for imported image:"
IMPORTED_DISKS=$(lvs | grep "vm-$VMID-disk" | awk '{print $1}')
if [ -z "$IMPORTED_DISKS" ]; then
echo " ⚠️ No imported disks found for VM $VMID"
echo " ⚠️ VM may have blank disk - will not boot!"
else
echo " ✅ Found disks:"
echo "$IMPORTED_DISKS" | while read disk; do
SIZE=$(lvs | grep "$disk" | awk '{print $4}')
echo " - $disk ($SIZE)"
done
fi
echo ""
# 5. Cloud-init Configuration
echo "5. Cloud-init Configuration:"
CLOUDINIT_CONFIG=$(qm config $VMID 2>&1 | grep -E "^ide2:|^ciuser:|^ipconfig0:")
if [ -z "$CLOUDINIT_CONFIG" ]; then
echo " ⚠️ No cloud-init configuration found"
else
echo "$CLOUDINIT_CONFIG" | while read line; do
echo " $line"
done
fi
echo ""
# 6. Network Configuration
echo "6. Network Configuration:"
NETWORK_CONFIG=$(qm config $VMID 2>&1 | grep -E "^net0:")
if [ -z "$NETWORK_CONFIG" ]; then
echo " ⚠️ No network configuration found"
else
echo " $NETWORK_CONFIG"
fi
echo ""
# 7. Guest Agent
echo "7. Guest Agent Configuration:"
AGENT_CONFIG=$(qm config $VMID 2>&1 | grep -E "^agent:")
if [ -z "$AGENT_CONFIG" ]; then
echo " ⚠️ Guest agent not configured"
else
echo " $AGENT_CONFIG"
fi
echo ""
# 8. Summary and Recommendations
echo "=== Summary ==="
echo ""
# Check critical issues
ISSUES=0
# Check if boot order is set
if ! qm config $VMID 2>&1 | grep -q "^boot:"; then
echo "⚠️ ISSUE: Boot order not set"
ISSUES=$((ISSUES + 1))
fi
# Check if scsi0 disk exists and has data
SCSI0_DISK=$(qm config $VMID 2>&1 | grep "^scsi0:" | sed -n 's/.*local-lvm:\(vm-[0-9]*-disk-[0-9]*\).*/\1/p')
if [ -n "$SCSI0_DISK" ]; then
if ! lvs | grep -q "$SCSI0_DISK"; then
echo "⚠️ ISSUE: scsi0 disk not found: $SCSI0_DISK"
ISSUES=$((ISSUES + 1))
else
# Check disk size (should be > 0)
DISK_SIZE_RAW=$(lvs --units g | grep "$SCSI0_DISK" | awk '{print $4}' | sed 's/g//')
if [ -n "$DISK_SIZE_RAW" ] && [ "$(echo "$DISK_SIZE_RAW < 1" | bc 2>/dev/null || echo 0)" = "1" ]; then
echo "⚠️ ISSUE: scsi0 disk appears empty (< 1GB)"
ISSUES=$((ISSUES + 1))
fi
fi
else
echo "⚠️ ISSUE: scsi0 disk not configured"
ISSUES=$((ISSUES + 1))
fi
# Check if image was imported
if [ -z "$IMPORTED_DISKS" ]; then
echo "⚠️ ISSUE: No image imported - VM will not boot"
ISSUES=$((ISSUES + 1))
fi
if [ $ISSUES -eq 0 ]; then
echo "✅ No critical issues found"
echo ""
echo "VM appears ready to start. Run:"
echo " qm start $VMID"
else
echo ""
echo "⚠️ Found $ISSUES critical issue(s)"
echo ""
echo "Do NOT start the VM until these are resolved!"
echo ""
echo "Possible fixes:"
echo "1. If image not imported, check if image exists:"
echo " find /var/lib/vz/template/iso -name 'ubuntu-22.04-cloud.img'"
echo ""
echo "2. If boot order not set:"
echo " qm set $VMID --boot order=scsi0"
echo ""
echo "3. If disk not attached (local-lvm volumes are raw, so no format option applies):"
echo " qm set $VMID --scsi0 local-lvm:vm-$VMID-disk-X"
fi
echo ""
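The empty-disk test above shells out to `bc`, which is not always installed; `awk` can do the same floating-point comparison with no extra dependency. A sketch, with a made-up size value standing in for the `lvs` output:

```shell
size_gib="0.50"   # hypothetical lvs size, unit suffix already stripped
# awk handles the float compare that [ ... ] cannot; prints 1 when < 1 GiB
is_empty=$(awk -v s="$size_gib" 'BEGIN { print ((s + 0 < 1) ? 1 : 0) }')
echo "disk empty: $is_empty"
```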

scripts/cleanup-orphaned-vms.sh Executable file

@@ -0,0 +1,163 @@
#!/bin/bash
# Script to clean up orphaned VMs created during the failed VM creation loop
# VMs that were created: 234, 235, 100, 101, 102
set -e
PROXMOX_ENDPOINT="${PROXMOX_ENDPOINT:-https://192.168.11.10:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-ml110-01}"
PROXMOX_USER="${PROXMOX_USER:-}"
PROXMOX_PASS="${PROXMOX_PASS:-}"
# Orphaned VM IDs from the logs
ORPHANED_VMS=(234 235 100 101 102)
if [ -z "$PROXMOX_USER" ] || [ -z "$PROXMOX_PASS" ]; then
echo "Error: PROXMOX_USER and PROXMOX_PASS must be set"
echo "Usage: PROXMOX_USER=user PROXMOX_PASS=pass ./cleanup-orphaned-vms.sh"
exit 1
fi
echo "Connecting to Proxmox at $PROXMOX_ENDPOINT..."
echo "Node: $PROXMOX_NODE"
echo ""
# Get authentication ticket and CSRF token from a single login request
AUTH_RESPONSE=$(curl -s -k -d "username=${PROXMOX_USER}&password=${PROXMOX_PASS}" \
"${PROXMOX_ENDPOINT}/api2/json/access/ticket")
TICKET=$(echo "$AUTH_RESPONSE" | jq -r '.data.ticket // empty')
if [ -z "$TICKET" ]; then
echo "Error: Failed to authenticate with Proxmox"
exit 1
fi
CSRF_TOKEN=$(echo "$AUTH_RESPONSE" | jq -r '.data.CSRFPreventionToken // empty')
echo "Authentication successful"
echo ""
# List all VMs on the node first
echo "Listing VMs on node $PROXMOX_NODE..."
VMS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu" | \
jq -r '.data[] | "\(.vmid) \(.name) \(.status)"')
echo "Current VMs:"
echo "$VMS"
echo ""
# Delete orphaned VMs
for VMID in "${ORPHANED_VMS[@]}"; do
echo "Checking VM $VMID..."
# Check if VM exists
VM_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -n "$VM_EXISTS" ] && [ "$VM_EXISTS" != "null" ]; then
echo " Found VM $VMID, checking status..."
# Get VM status
STATUS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" | \
jq -r '.data.status // "unknown"')
echo " Status: $STATUS"
# Stop VM if running
if [ "$STATUS" = "running" ]; then
echo " Stopping VM $VMID..."
curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/stop" > /dev/null
# Wait for VM to stop (up to 30 seconds)
echo " Waiting for VM to stop..."
for i in {1..30}; do
sleep 1
CURRENT_STATUS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" | \
jq -r '.data.status // "unknown"')
if [ "$CURRENT_STATUS" = "stopped" ]; then
echo " VM stopped"
break
fi
done
fi
# Unlock VM if locked (common issue with failed VM creation)
echo " Unlocking VM $VMID..."
UNLOCK_RESULT=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/unlock" 2>/dev/null)
if echo "$UNLOCK_RESULT" | jq -e '.data // empty' > /dev/null 2>&1; then
echo " VM unlocked"
sleep 1
fi
# Delete VM with purge option to force cleanup
echo " Deleting VM $VMID (with purge)..."
DELETE_RESULT=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X DELETE \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}?purge=1")
# Check if we got a task ID (UPID)
TASK_UPID=$(echo "$DELETE_RESULT" | jq -r '.data // empty' 2>/dev/null)
if [ -n "$TASK_UPID" ] && [ "$TASK_UPID" != "null" ]; then
echo " Delete task started: $TASK_UPID"
echo " Waiting for deletion to complete..."
# Wait for task to complete (up to 60 seconds)
for i in {1..60}; do
sleep 1
TASK_STATUS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/tasks/${TASK_UPID}/status" | \
jq -r '.data.status // "unknown"')
if [ "$TASK_STATUS" = "stopped" ]; then
EXIT_STATUS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/tasks/${TASK_UPID}/status" | \
jq -r '.data.exitstatus // "unknown"')
if [ "$EXIT_STATUS" = "OK" ] || [ "$EXIT_STATUS" = "0" ]; then
echo " ✅ Successfully deleted VM $VMID"
else
echo " ⚠️ Delete task completed with status: $EXIT_STATUS"
fi
break
fi
done
# Verify VM is actually gone
sleep 2
VM_STILL_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -z "$VM_STILL_EXISTS" ] || [ "$VM_STILL_EXISTS" = "null" ]; then
echo " ✅ Verified: VM $VMID is deleted"
else
echo " ⚠️ Warning: VM $VMID may still exist"
fi
else
echo " ⚠️ Warning: May have failed to delete VM $VMID"
echo " Response: $DELETE_RESULT"
fi
else
echo " VM $VMID not found (may have been already deleted)"
fi
echo ""
done
echo "Cleanup complete!"
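The two polling loops above (waiting for the VM to stop, waiting for the delete task) share one shape. Factored out as a sketch, with a stubbed status function so it runs without a Proxmox API; swap `fake_status` for the real `curl | jq` probe:

```shell
polls=0
# Stub: reports "stopped" from the third poll onward
fake_status() { [ "$polls" -ge 3 ] && echo "stopped" || echo "running"; }

wait_until_stopped() {
  local i
  for i in $(seq 1 10); do
    polls=$((polls + 1))
    if [ "$(fake_status)" = "stopped" ]; then
      return 0
    fi
    # sleep 1   # the real loops sleep between probes
  done
  return 1
}
wait_until_stopped && echo "stopped after $polls polls"
```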


@@ -0,0 +1,78 @@
#!/bin/bash
# Automated cleanup script for VM 100 on Proxmox node
# Run this script on the Proxmox node: root@ml110-01
VMID=100
echo "=== Automated Cleanup for VM $VMID ==="
echo ""
# 1. Kill all stuck processes
echo "1. Killing stuck processes..."
pkill -9 -f "task.*$VMID" 2>/dev/null && echo " ✅ Killed task processes" || echo " No task processes found"
pkill -9 -f "qm.*$VMID" 2>/dev/null && echo " ✅ Killed qm processes" || echo " No qm processes found"
pkill -9 -f "qemu.*$VMID" 2>/dev/null && echo " ✅ Killed qemu processes" || echo " No qemu processes found"
# Wait for processes to die
sleep 3
# 2. Remove lock file
echo ""
echo "2. Removing lock file..."
if [ -f "/var/lock/qemu-server/lock-$VMID.conf" ]; then
rm -f "/var/lock/qemu-server/lock-$VMID.conf"
if [ ! -f "/var/lock/qemu-server/lock-$VMID.conf" ]; then
echo " ✅ Lock file removed"
else
echo " ⚠️ Failed to remove lock file"
exit 1
fi
else
echo " Lock file already removed"
fi
# 3. Verify cleanup
echo ""
echo "3. Verifying cleanup..."
REMAINING_PROCS=$(ps aux | grep -E "task.*$VMID|qm.*$VMID|qemu.*$VMID" | grep -v grep)
if [ -z "$REMAINING_PROCS" ]; then
echo " ✅ No processes remaining"
else
echo " ⚠️ Some processes still running:"
echo "$REMAINING_PROCS" | while read line; do
echo " $line"
done
fi
LOCK_EXISTS=$(ls -la /var/lock/qemu-server/lock-$VMID.conf 2>&1)
if echo "$LOCK_EXISTS" | grep -q "No such file"; then
echo " ✅ Lock file confirmed removed"
else
echo " ⚠️ Lock file still exists:"
echo " $LOCK_EXISTS"
fi
# 4. Attempt unlock
echo ""
echo "4. Attempting unlock..."
if qm unlock $VMID 2>&1; then
echo " ✅ VM unlocked successfully"
else
UNLOCK_RESULT=$?
echo " ⚠️ Unlock returned exit code: $UNLOCK_RESULT"
echo " (This may be normal if lock was already cleared)"
fi
# 5. Final status
echo ""
echo "5. Final VM status..."
qm status $VMID 2>&1 | head -5
echo ""
echo "=== Cleanup Complete ==="
echo ""
echo "Next steps:"
echo "1. Monitor from Kubernetes: kubectl get proxmoxvm basic-vm-001 -w"
echo "2. Provider will automatically retry within 1 minute"
echo "3. VM should complete configuration and boot"


@@ -0,0 +1,115 @@
# Complete Enhancement Template
# Copy these sections into each VM YAML file
# 1. Add to packages list (after lsb-release):
- chrony
- unattended-upgrades
- apt-listchanges
# 2. Add NTP configuration (after package_upgrade: true):
# Time synchronization (NTP)
ntp:
enabled: true
ntp_client: chrony
servers:
- 0.pool.ntp.org
- 1.pool.ntp.org
- 2.pool.ntp.org
- 3.pool.ntp.org
# 3. Update package verification (replace the for loop):
for pkg in qemu-guest-agent curl wget net-tools chrony unattended-upgrades; do
# 4. Add before final_message (after guest agent verification):
# Configure automatic security updates
- |
echo "Configuring automatic security updates..."
cat > /etc/apt/apt.conf.d/50unattended-upgrades <<'EOF'
Unattended-Upgrade::Allowed-Origins {
"${distro_id}:${distro_codename}-security";
"${distro_id}ESMApps:${distro_codename}-apps-security";
"${distro_id}ESM:${distro_codename}-infra-security";
};
Unattended-Upgrade::AutoFixInterruptedDpkg "true";
Unattended-Upgrade::MinimalSteps "true";
Unattended-Upgrade::Remove-Unused-Kernel-Packages "true";
Unattended-Upgrade::Remove-Unused-Dependencies "true";
Unattended-Upgrade::Automatic-Reboot "false";
Unattended-Upgrade::Automatic-Reboot-Time "02:00";
EOF
systemctl enable unattended-upgrades
systemctl start unattended-upgrades
echo "Automatic security updates configured"
# Configure NTP (Chrony)
- |
echo "Configuring NTP (Chrony)..."
systemctl enable chrony
systemctl restart chrony
sleep 3
if systemctl is-active --quiet chrony; then
echo "NTP (Chrony) is running"
chronyc tracking | head -1 || true
else
echo "WARNING: NTP (Chrony) may not be running"
fi
# SSH hardening
- |
echo "Hardening SSH configuration..."
if ! grep -q "^PermitRootLogin no" /etc/ssh/sshd_config; then
sed -i 's/^#PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sed -i 's/^PermitRootLogin yes/PermitRootLogin no/' /etc/ssh/sshd_config
fi
if ! grep -q "^PasswordAuthentication no" /etc/ssh/sshd_config; then
sed -i 's/^#PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sed -i 's/^PasswordAuthentication yes/PasswordAuthentication no/' /etc/ssh/sshd_config
fi
if ! grep -q "^PubkeyAuthentication yes" /etc/ssh/sshd_config; then
sed -i 's/^#PubkeyAuthentication.*/PubkeyAuthentication yes/' /etc/ssh/sshd_config
fi
systemctl restart sshd
echo "SSH hardening completed"
# Write files for security configuration
write_files:
- path: /etc/apt/apt.conf.d/20auto-upgrades
content: |
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Download-Upgradeable-Packages "1";
APT::Periodic::AutocleanInterval "7";
APT::Periodic::Unattended-Upgrade "1";
permissions: '0644'
owner: root:root
# Final message
# Note: cloud-init does not run $(...) shell substitution in final_message,
# so the command substitutions below render literally; echo live status from
# a runcmd step if the real service state is needed in the log.
final_message: |
==========================================
System Boot Completed Successfully!
==========================================
Services Status:
- QEMU Guest Agent: $(systemctl is-active qemu-guest-agent)
- NTP (Chrony): $(systemctl is-active chrony)
- Automatic Security Updates: $(systemctl is-active unattended-upgrades)
System Information:
- Hostname: $(hostname)
- IP Address: $(hostname -I | awk '{print $1}')
- Time: $(date)
Packages Installed:
- qemu-guest-agent, curl, wget, net-tools
- chrony (NTP), unattended-upgrades (Security)
Security Configuration:
- SSH: Root login disabled, Password auth disabled
- Automatic security updates: Enabled
- NTP synchronization: Enabled
Next Steps:
1. Verify all services are running
2. Check cloud-init logs: /var/log/cloud-init-output.log
3. Test SSH access
==========================================
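The `50unattended-upgrades` heredoc above quotes its delimiter (`<<'EOF'`), which is what keeps `${distro_id}` literal so unattended-upgrades can expand it itself. The difference in one runnable sketch (the file path is just a scratch location):

```shell
distro_id="should-not-appear"
# Quoted delimiter: no shell expansion happens inside the heredoc body
cat > /tmp/demo-unattended <<'EOF'
"${distro_id}:${distro_codename}-security";
EOF
# The placeholder survives verbatim
grep -c 'distro_id' /tmp/demo-unattended
```

With an unquoted `<<EOF`, the shell would have substituted `should-not-appear` into the config, breaking the origins pattern.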


@@ -0,0 +1,129 @@
#!/bin/bash
# Complete VM 100 deployment process
# This script guides through checking, fixing, and starting VM 100
set -e
VMID=100
VM_NAME="basic-vm-001"
echo "=========================================="
echo "VM 100 Deployment Completion Script"
echo "=========================================="
echo ""
# Step 1: Check current status from Kubernetes
echo "Step 1: Checking Kubernetes status..."
echo "--------------------------------------"
kubectl get proxmoxvm $VM_NAME -o wide 2>/dev/null || echo "VM not found in Kubernetes"
echo ""
# Step 2: Instructions for Proxmox node
echo "Step 2: Run these commands on Proxmox node (root@ml110-01)"
echo "--------------------------------------"
echo ""
echo "Copy and paste this block into your Proxmox SSH session:"
echo ""
echo "--- START PROXMOX COMMANDS ---"
echo "VMID=100"
echo ""
echo "# 1. Check VM status"
echo "qm status \$VMID"
echo ""
echo "# 2. Check full configuration"
echo "qm config \$VMID"
echo ""
echo "# 3. Check boot order"
echo "qm config \$VMID | grep '^boot:' || echo '⚠️ Boot order not set'"
echo ""
echo "# 4. Check disk configuration"
echo "qm config \$VMID | grep '^scsi0:' || echo '⚠️ Disk not configured'"
echo ""
echo "# 5. Check if disk exists"
echo "lvs | grep vm-\$VMID-disk || echo '⚠️ No disk found'"
echo ""
echo "# 6. Check cloud-init"
echo "qm config \$VMID | grep -E '^ide2:|^ciuser:' || echo '⚠️ Cloud-init not configured'"
echo ""
echo "# 7. Check network"
echo "qm config \$VMID | grep '^net0:' || echo '⚠️ Network not configured'"
echo ""
echo "# 8. Check guest agent"
echo "qm config \$VMID | grep '^agent:' || echo '⚠️ Guest agent not enabled'"
echo ""
echo "--- END PROXMOX COMMANDS ---"
echo ""
# Step 3: Fix commands (if needed)
echo "Step 3: Fix commands (run if issues found)"
echo "--------------------------------------"
echo ""
echo "If boot order missing:"
echo " qm set $VMID --boot order=scsi0"
echo ""
echo "If guest agent missing:"
echo " qm set $VMID --agent 1"
echo ""
echo "If cloud-init missing:"
echo " qm set $VMID --ide2 local-lvm:cloudinit"
echo " qm set $VMID --ciuser admin"
echo " qm set $VMID --ipconfig0 ip=dhcp"
echo ""
echo "If disk not configured or blank, image import may have failed."
echo "Check if image exists:"
echo " find /var/lib/vz/template/iso -name 'ubuntu-22.04-cloud.img'"
echo ""
echo ""
# Step 4: Start VM
echo "Step 4: Start VM (after verification)"
echo "--------------------------------------"
echo ""
echo "Once all checks pass, start the VM:"
echo " qm start $VMID"
echo ""
echo ""
# Step 5: Monitoring commands
echo "Step 5: Monitoring commands"
echo "--------------------------------------"
echo ""
echo "From Kubernetes (run in another terminal):"
echo " kubectl get proxmoxvm $VM_NAME -w"
echo ""
echo "From Proxmox node:"
echo " watch -n 2 'qm status $VMID'"
echo ""
echo "Check provider logs:"
echo " kubectl logs -n crossplane-system -l app=crossplane-provider-proxmox --tail=50 -f"
echo ""
echo ""
# Step 6: Verification after start
echo "Step 6: Verification after VM starts"
echo "--------------------------------------"
echo ""
echo "Once VM shows 'running' status:"
echo ""
echo "1. Get VM IP:"
echo " kubectl get proxmoxvm $VM_NAME -o jsonpath='{.status.networkInterfaces[0].ipAddress}'"
echo ""
echo "2. Check cloud-init logs (once IP is available):"
echo " ssh admin@<VM_IP> 'cat /var/log/cloud-init-output.log | tail -50'"
echo ""
echo "3. Verify services:"
echo " ssh admin@<VM_IP> 'systemctl status qemu-guest-agent chrony unattended-upgrades'"
echo ""
echo ""
echo "=========================================="
echo "Next Actions:"
echo "=========================================="
echo "1. SSH to Proxmox node: root@ml110-01"
echo "2. Run the check commands from Step 2"
echo "3. Fix any issues found (Step 3)"
echo "4. Start the VM: qm start $VMID"
echo "5. Monitor from Kubernetes (Step 5)"
echo "6. Verify services once VM is running (Step 6)"
echo ""


@@ -0,0 +1,187 @@
#!/bin/bash
# Complete VM 100 Guest Agent Verification and Fix
# Run on Proxmox node: root@ml110-01
set -euo pipefail
VMID=100
# Check if running on Proxmox node
if ! command -v qm &> /dev/null; then
echo "=========================================="
echo "ERROR: This script must be run on a Proxmox node"
echo "=========================================="
echo ""
echo "The 'qm' command is not available on this machine."
echo ""
echo "This script must be run on the Proxmox node (root@ml110-01)"
echo ""
echo "To run this script:"
echo " 1. SSH into the Proxmox node:"
echo " ssh root@ml110-01"
echo ""
echo " 2. Copy the script to the Proxmox node, or"
echo ""
echo " 3. Run commands directly on Proxmox node:"
echo " VMID=100"
echo " qm status \$VMID"
echo " qm config \$VMID | grep '^agent:'"
echo " qm guest exec \$VMID -- dpkg -l | grep qemu-guest-agent"
echo ""
exit 1
fi
echo "=========================================="
echo "Complete VM 100 Guest Agent Check & Fix"
echo "=========================================="
echo ""
# Step 1: Check VM status
echo "Step 1: Checking VM Status"
echo "--------------------------------------"
VM_STATUS=$(qm status $VMID | awk '{print $2}')
echo "VM Status: $VM_STATUS"
if [ "$VM_STATUS" != "running" ]; then
echo "⚠️ VM is not running. Starting VM..."
qm start $VMID
echo "Waiting 30 seconds for VM to boot..."
sleep 30
VM_STATUS=$(qm status $VMID | awk '{print $2}')
if [ "$VM_STATUS" != "running" ]; then
echo "❌ VM failed to start"
exit 1
fi
fi
echo "✅ VM is running"
echo ""
# Step 2: Check Proxmox guest agent config
echo "Step 2: Checking Proxmox Guest Agent Configuration"
echo "--------------------------------------"
AGENT_CONFIG=$(qm config $VMID | grep '^agent:' || echo "")
if [ -z "$AGENT_CONFIG" ]; then
echo "❌ Guest agent NOT configured in Proxmox"
echo "Setting agent=1..."
qm set $VMID --agent 1
echo "✅ Guest agent configured"
else
echo "✅ Guest agent configured: $AGENT_CONFIG"
fi
echo ""
# Step 3: Check if package is installed
echo "Step 3: Checking qemu-guest-agent Package Installation"
echo "--------------------------------------"
echo "Attempting to check via qm guest exec..."
echo ""
# Capture output and exit code without tripping `set -euo pipefail`
EXEC_EXIT_CODE=0
PACKAGE_CHECK=$(qm guest exec $VMID -- dpkg -l 2>&1 | grep qemu-guest-agent) || EXEC_EXIT_CODE=$?
if [ $EXEC_EXIT_CODE -eq 0 ] && echo "$PACKAGE_CHECK" | grep -q "qemu-guest-agent"; then
echo "✅ qemu-guest-agent package IS installed"
echo ""
echo "Package details:"
echo "$PACKAGE_CHECK" | grep qemu-guest-agent
echo ""
# Step 4: Check service status
echo "Step 4: Checking Service Status"
echo "--------------------------------------"
# `|| true` keeps `set -e` from aborting when the service is not active
SERVICE_STATUS=$(qm guest exec $VMID -- systemctl status qemu-guest-agent --no-pager 2>&1 || true)
if echo "$SERVICE_STATUS" | grep -q "active (running)"; then
echo "✅ qemu-guest-agent service IS running"
echo ""
echo "Service status:"
echo "$SERVICE_STATUS" | head -10
echo ""
echo "=========================================="
echo "✅ SUCCESS: Guest Agent is fully configured and running"
echo "=========================================="
exit 0
elif echo "$SERVICE_STATUS" | grep -q "inactive"; then
echo "⚠️ qemu-guest-agent service is installed but NOT running"
echo "Attempting to start..."
qm guest exec $VMID -- systemctl enable --now qemu-guest-agent
sleep 3
SERVICE_STATUS=$(qm guest exec $VMID -- systemctl status qemu-guest-agent --no-pager 2>&1 || true)
if echo "$SERVICE_STATUS" | grep -q "active (running)"; then
echo "✅ Service started successfully"
exit 0
else
echo "❌ Service failed to start"
echo "$SERVICE_STATUS"
exit 1
fi
else
echo "⚠️ Could not determine service status"
echo "Service status output:"
echo "$SERVICE_STATUS"
exit 1
fi
elif echo "$PACKAGE_CHECK" | grep -q "No QEMU guest agent configured"; then
echo "❌ Guest agent not configured in Proxmox"
echo "This should have been fixed in Step 2. Please check manually:"
echo " qm config $VMID | grep '^agent:'"
exit 1
elif echo "$PACKAGE_CHECK" | grep -q "QEMU guest agent is not running"; then
echo "⚠️ Guest agent configured but service not running"
echo "The package may not be installed, or the service isn't started"
echo ""
echo "=========================================="
echo "PACKAGE INSTALLATION REQUIRED"
echo "=========================================="
echo ""
echo "The qemu-guest-agent package needs to be installed inside the VM."
echo ""
echo "Options:"
echo "1. SSH into the VM (if you have the IP and access)"
echo "2. Use Proxmox console (qm terminal $VMID or via web UI)"
echo ""
echo "Then run inside the VM:"
echo " sudo apt-get update"
echo " sudo apt-get install -y qemu-guest-agent"
echo " sudo systemctl enable --now qemu-guest-agent"
echo ""
# Try to get VM IP
echo "Attempting to get VM IP address..."
VM_IP=$(qm guest exec $VMID -- hostname -I 2>&1 | awk '{print $1}' || echo "")
if [ -n "$VM_IP" ] && [ "$VM_IP" != "QEMU" ] && [ "$VM_IP" != "No" ]; then
echo "VM IP: $VM_IP"
echo ""
echo "You can try SSH:"
echo " ssh admin@$VM_IP"
else
echo "Could not get VM IP via guest exec"
echo "Use Proxmox console or check network configuration"
fi
exit 1
else
echo "❌ qemu-guest-agent package is NOT installed"
echo ""
echo "Error details:"
echo "$PACKAGE_CHECK"
echo ""
echo "=========================================="
echo "PACKAGE INSTALLATION REQUIRED"
echo "=========================================="
echo ""
echo "The qemu-guest-agent package needs to be installed inside the VM."
echo ""
echo "Options:"
echo "1. SSH into the VM (if you have the IP and access)"
echo "2. Use Proxmox console (qm terminal $VMID or via web UI)"
echo ""
echo "Then run inside the VM:"
echo " sudo apt-get update"
echo " sudo apt-get install -y qemu-guest-agent"
echo " sudo systemctl enable --now qemu-guest-agent"
echo ""
exit 1
fi
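The `dpkg -l | grep` check above also matches removed-but-not-purged packages (`rc` state) and similarly named packages. A tighter, reusable check could look like this (a sketch; note that on some Proxmox versions `qm guest exec` wraps command output in JSON, so the piped text may need unwrapping first):

```shell
# Return 0 only when dpkg reports the package as fully installed ("ii" state).
# Reads `dpkg -l` output on stdin, e.g.:
#   qm guest exec 100 -- dpkg -l | pkg_installed qemu-guest-agent
pkg_installed() {
  local pkg=$1
  grep -Eq "^ii[[:space:]]+${pkg}([[:space:]]|$)"
}
```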


@@ -0,0 +1,150 @@
#!/bin/bash
# Complete verification and start for VM 100
# Run on Proxmox node: root@ml110-01
set -e
VMID=100
VM_NAME="basic-vm-001"
echo "=========================================="
echo "VM 100 Complete Verification and Start"
echo "=========================================="
echo ""
# Step 1: Check VM status
echo "Step 1: VM Status"
echo "--------------------------------------"
qm status $VMID
echo ""
# Step 2: Verify all critical configurations
echo "Step 2: Critical Configuration Checks"
echo "--------------------------------------"
# Boot order
BOOT_ORDER=$(qm config $VMID | grep '^boot:' || echo "")
if [ -z "$BOOT_ORDER" ]; then
echo "⚠️ Boot order not set - fixing..."
qm set $VMID --boot order=scsi0
echo "✅ Boot order set"
else
echo "✅ Boot order: $BOOT_ORDER"
fi
# Disk configuration
SCSI0=$(qm config $VMID | grep '^scsi0:' || echo "")
if [ -z "$SCSI0" ]; then
echo "❌ ERROR: scsi0 disk not configured!"
exit 1
else
echo "✅ Disk: $SCSI0"
# Check if disk exists
DISK_NAME=$(echo "$SCSI0" | sed -n 's/.*local-lvm:\(vm-[0-9]*-disk-[0-9]*\).*/\1/p')
if [ -n "$DISK_NAME" ]; then
if lvs | grep -q "$DISK_NAME"; then
DISK_SIZE=$(lvs | grep "$DISK_NAME" | awk '{print $4}')
echo " ✅ Disk exists: $DISK_NAME ($DISK_SIZE)"
else
echo " ❌ ERROR: Disk $DISK_NAME not found!"
exit 1
fi
fi
fi
# Cloud-init
IDE2=$(qm config $VMID | grep '^ide2:' || echo "")
CIUSER=$(qm config $VMID | grep '^ciuser:' || echo "")
if [ -z "$IDE2" ]; then
echo "⚠️ Cloud-init drive not configured - fixing..."
qm set $VMID --ide2 local-lvm:cloudinit
echo "✅ Cloud-init drive configured"
else
echo "✅ Cloud-init drive: $IDE2"
fi
if [ -z "$CIUSER" ]; then
echo "⚠️ Cloud-init user not configured - fixing..."
qm set $VMID --ciuser admin
echo "✅ Cloud-init user configured"
else
echo "✅ Cloud-init user: $CIUSER"
fi
IPCONFIG=$(qm config $VMID | grep '^ipconfig0:' || echo "")
if [ -z "$IPCONFIG" ]; then
echo "⚠️ IP config not set - fixing..."
qm set $VMID --ipconfig0 ip=dhcp
echo "✅ IP config set"
else
echo "✅ IP config: $IPCONFIG"
fi
# Network
NET0=$(qm config $VMID | grep '^net0:' || echo "")
if [ -z "$NET0" ]; then
echo "⚠️ Network not configured - fixing..."
qm set $VMID --net0 virtio,bridge=vmbr0
echo "✅ Network configured"
else
echo "✅ Network: $NET0"
fi
# Guest agent (already fixed, but verify)
AGENT=$(qm config $VMID | grep '^agent:' || echo "")
if [ -z "$AGENT" ]; then
echo "⚠️ Guest agent not enabled - fixing..."
qm set $VMID --agent 1
echo "✅ Guest agent enabled"
else
echo "✅ Guest agent: $AGENT"
fi
echo ""
# Step 3: Final configuration summary
echo "Step 3: Final Configuration Summary"
echo "--------------------------------------"
qm config $VMID | grep -E '^agent:|^boot:|^scsi0:|^ide2:|^net0:|^ciuser:' | while read -r line; do
echo " $line"
done
echo ""
# Step 4: Start VM
echo "Step 4: Starting VM"
echo "--------------------------------------"
CURRENT_STATUS=$(qm status $VMID | awk '{print $2}')
if [ "$CURRENT_STATUS" = "running" ]; then
echo "✅ VM is already running"
else
echo "Current status: $CURRENT_STATUS"
echo "Starting VM..."
qm start $VMID
echo ""
echo "Waiting 5 seconds for initialization..."
sleep 5
echo ""
echo "VM status after start:"
qm status $VMID
fi
echo ""
# Step 5: Monitoring instructions
echo "=========================================="
echo "VM Started - Monitoring Instructions"
echo "=========================================="
echo ""
echo "Monitor from Proxmox node:"
echo " watch -n 2 'qm status $VMID'"
echo ""
echo "Monitor from Kubernetes:"
echo " kubectl get proxmoxvm $VM_NAME -w"
echo ""
echo "Check provider logs:"
echo " kubectl logs -n crossplane-system -l app=crossplane-provider-proxmox --tail=50 -f"
echo ""
echo "Once VM has IP, verify services:"
echo " IP=\$(kubectl get proxmoxvm $VM_NAME -o jsonpath='{.status.networkInterfaces[0].ipAddress}')"
echo " ssh admin@\$IP 'systemctl status qemu-guest-agent chrony unattended-upgrades'"
echo ""
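The repeated check-then-fix blocks in Step 2 all follow the same pattern and could be factored into one idempotent helper. A sketch (the `qm` calls are illustrative; a stub is used in testing):

```shell
# Apply a `qm set` only when the config key is absent, so re-runs are no-ops.
# Usage: ensure_config 100 agent --agent 1
ensure_config() {
  local vmid=$1 key=$2
  shift 2
  if qm config "$vmid" | grep -q "^${key}:"; then
    echo "✅ ${key} already configured"
  else
    qm set "$vmid" "$@"
    echo "✅ ${key} set"
  fi
}
```

With this, each block collapses to a single line, e.g. `ensure_config "$VMID" net0 --net0 virtio,bridge=vmbr0`.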


@@ -0,0 +1,189 @@
#!/bin/bash
# configure-cloudflare-tunnel.sh
# Configuration script for Cloudflare Tunnel VM
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
BLUE='\033[0;34m'
NC='\033[0m'
# Log to stderr so command substitutions capture only a function's real output
log() {
echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*" >&2
}
log_success() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] ✅${NC} $*" >&2
}
log_warning() {
echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] ⚠️${NC} $*" >&2
}
log_error() {
echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ❌${NC} $*" >&2
}
# Get VM IP address
get_vm_ip() {
local vm_name=$1
local ip
ip=$(kubectl get proxmoxvm "${vm_name}" -n default -o jsonpath='{.status.ipAddress}' 2>/dev/null || echo "")
if [ -z "${ip}" ] || [ "${ip}" = "<none>" ]; then
log_warning "VM IP not yet assigned. Waiting..."
return 1
fi
echo "${ip}"
}
# Wait for VM to be ready
wait_for_vm() {
local vm_name=$1
local max_attempts=30
local attempt=0
log "Waiting for ${vm_name} to be ready..."
while [ ${attempt} -lt ${max_attempts} ]; do
local ip
ip=$(get_vm_ip "${vm_name}" 2>/dev/null || echo "")
if [ -n "${ip}" ] && [ "${ip}" != "<none>" ]; then
log_success "${vm_name} is ready at ${ip}"
echo "${ip}"
return 0
fi
attempt=$((attempt + 1))
sleep 10
done
log_error "${vm_name} did not become ready in time"
return 1
}
# Generate Cloudflare Tunnel configuration
generate_tunnel_config() {
local config_file=$1
local tunnel_name=$2
local credentials_file=$3
cat > "${config_file}" <<EOF
# Cloudflare Tunnel Configuration for SMOM-DBIS-138
# Generated: $(date +'%Y-%m-%d %H:%M:%S')
tunnel: ${tunnel_name}
credentials-file: ${credentials_file}
ingress:
# Nginx Proxy
- hostname: nginx-proxy.sankofa.nexus
service: http://nginx-proxy-vm:80
originRequest:
noHappyEyeballs: true
connectTimeout: 30s
tcpKeepAlive: 30s
keepAliveConnections: 100
keepAliveTimeout: 90s
# SMOM-DBIS-138 Services
- hostname: smom-api.sankofa.nexus
service: http://smom-services:8080
originRequest:
noHappyEyeballs: true
connectTimeout: 30s
- hostname: smom-blockscout.sankofa.nexus
service: http://smom-blockscout:4000
originRequest:
noHappyEyeballs: true
connectTimeout: 30s
- hostname: smom-monitoring.sankofa.nexus
service: http://smom-monitoring:3000
originRequest:
noHappyEyeballs: true
connectTimeout: 30s
# RPC Nodes
- hostname: smom-rpc-01.sankofa.nexus
service: http://smom-rpc-node-01:8545
originRequest:
noHappyEyeballs: true
connectTimeout: 30s
- hostname: smom-rpc-02.sankofa.nexus
service: http://smom-rpc-node-02:8545
originRequest:
noHappyEyeballs: true
connectTimeout: 30s
# Catch-all rule (must be last)
- service: http_status:404
# Logging
loglevel: info
logfile: /var/log/cloudflared/tunnel.log
# Metrics
metrics: 0.0.0.0:9090
# Health check
health-probe:
enabled: true
path: /health
port: 8080
EOF
}
main() {
log "=========================================="
log "Cloudflare Tunnel Configuration Script"
log "=========================================="
log ""
# Check if VM exists
if ! kubectl get proxmoxvm cloudflare-tunnel-vm -n default &>/dev/null; then
log_error "cloudflare-tunnel-vm not found. Please deploy it first."
exit 1
fi
# Wait for VM to be ready
local vm_ip
vm_ip=$(wait_for_vm "cloudflare-tunnel-vm") || vm_ip=""
if [ -z "${vm_ip}" ]; then
log_error "Failed to get VM IP address"
exit 1
fi
log_success "Cloudflare Tunnel VM is ready at ${vm_ip}"
log ""
log "Next steps:"
log "1. Create a Cloudflare Tunnel in the Cloudflare dashboard"
log "2. Copy the tunnel token/credentials"
log "3. SSH into the VM: ssh admin@${vm_ip}"
log "4. Place tunnel credentials at: /etc/cloudflared/tunnel-credentials.json"
log "5. Update tunnel configuration at: /etc/cloudflared/config.yaml"
log "6. Start the tunnel service: sudo systemctl start cloudflared"
log "7. Enable auto-start: sudo systemctl enable cloudflared"
log ""
log "Example tunnel configuration:"
log " ${PROJECT_ROOT}/docs/configs/cloudflare/tunnel-config.yaml"
log ""
log "To create a tunnel via API, use:"
log " ${PROJECT_ROOT}/scripts/configure-cloudflare.sh"
log ""
}
main "$@"
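cloudflared requires the catch-all rule to be the final `ingress` entry, as the comment in the generated config notes. Where the binary is available, `cloudflared tunnel ingress validate` is the authoritative check; a dependency-free lint of a generated file could look like this (a sketch):

```shell
# Return 0 when the last `service:` line in a tunnel config is the 404 catch-all.
catchall_is_last() {
  local cfg=$1
  [ "$(grep -E '^[[:space:]]*-?[[:space:]]*service:' "$cfg" \
      | tail -n 1 | awk '{print $NF}')" = "http_status:404" ]
}
```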


@@ -3,9 +3,19 @@ set -euo pipefail
# Cloudflare Tunnel Configuration Script
# Load environment variables from .env if it exists
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
set -a
source <(grep -v '^#' "${SCRIPT_DIR}/../.env" | grep -v '^$' | sed 's/^/export /')
set +a
fi
CLOUDFLARE_API_TOKEN="${CLOUDFLARE_API_TOKEN:-}"
ZONE_ID="${ZONE_ID:-}"
ACCOUNT_ID="${ACCOUNT_ID:-}"
CLOUDFLARE_API_KEY="${CLOUDFLARE_API_KEY:-}"
CLOUDFLARE_EMAIL="${CLOUDFLARE_EMAIL:-}"
ZONE_ID="${CLOUDFLARE_ZONE_ID:-${ZONE_ID:-}}"
ACCOUNT_ID="${CLOUDFLARE_ACCOUNT_ID:-${ACCOUNT_ID:-}}"
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] $*" >&2
@@ -17,16 +27,18 @@ error() {
}
check_prerequisites() {
if [ -z "${CLOUDFLARE_API_TOKEN}" ]; then
error "CLOUDFLARE_API_TOKEN environment variable is required"
# Check authentication method
if [ -z "${CLOUDFLARE_API_TOKEN}" ] && [ -z "${CLOUDFLARE_API_KEY}" ]; then
error "Either CLOUDFLARE_API_TOKEN or CLOUDFLARE_API_KEY must be set"
fi
if [ -z "${ZONE_ID}" ]; then
error "ZONE_ID environment variable is required"
if [ -z "${CLOUDFLARE_API_TOKEN}" ] && [ -z "${CLOUDFLARE_EMAIL}" ]; then
error "If using CLOUDFLARE_API_KEY, CLOUDFLARE_EMAIL must also be set"
fi
if [ -z "${ACCOUNT_ID}" ]; then
error "ACCOUNT_ID environment variable is required"
warn "ACCOUNT_ID not set, attempting to get from API..."
get_account_id
fi
if ! command -v cloudflared &> /dev/null; then
@@ -34,18 +46,53 @@ check_prerequisites() {
fi
}
get_account_id() {
if [ -n "${CLOUDFLARE_API_TOKEN}" ]; then
ACCOUNT_ID=$(curl -s -X GET \
-H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/accounts" | \
jq -r '.result[0].id')
elif [ -n "${CLOUDFLARE_API_KEY}" ] && [ -n "${CLOUDFLARE_EMAIL}" ]; then
ACCOUNT_ID=$(curl -s -X GET \
-H "X-Auth-Email: ${CLOUDFLARE_EMAIL}" \
-H "X-Auth-Key: ${CLOUDFLARE_API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/accounts" | \
jq -r '.result[0].id')
fi
if [ -n "${ACCOUNT_ID}" ] && [ "${ACCOUNT_ID}" != "null" ]; then
log "Account ID: ${ACCOUNT_ID}"
export CLOUDFLARE_ACCOUNT_ID="${ACCOUNT_ID}"
else
error "Failed to get Account ID"
fi
}
create_tunnel() {
local tunnel_name=$1
log "Creating Cloudflare tunnel: ${tunnel_name}"
# Create tunnel via API
TUNNEL_ID=$(curl -s -X POST \
-H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/cfd_tunnel" \
-d "{\"name\":\"${tunnel_name}\",\"config_src\":\"local\"}" \
| jq -r '.result.id')
local auth_header
if [ -n "${CLOUDFLARE_API_TOKEN}" ]; then
TUNNEL_ID=$(curl -s -X POST \
-H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/cfd_tunnel" \
-d "{\"name\":\"${tunnel_name}\",\"config_src\":\"local\"}" \
| jq -r '.result.id')
else
TUNNEL_ID=$(curl -s -X POST \
-H "X-Auth-Email: ${CLOUDFLARE_EMAIL}" \
-H "X-Auth-Key: ${CLOUDFLARE_API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/cfd_tunnel" \
-d "{\"name\":\"${tunnel_name}\",\"config_src\":\"local\"}" \
| jq -r '.result.id')
fi
if [ -z "${TUNNEL_ID}" ] || [ "${TUNNEL_ID}" = "null" ]; then
error "Failed to create tunnel ${tunnel_name}"
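The token-vs-global-key branching in this diff repeats for every endpoint; it could be factored into one wrapper so each caller stays a single `curl`-shaped call. A sketch against the documented Cloudflare v4 scheme (paths beyond those already used in the script are assumptions):

```shell
# Call the Cloudflare v4 API with whichever credential set is configured.
# Usage: cf_api POST "/accounts/${ACCOUNT_ID}/cfd_tunnel" -d '{"name":"t1"}'
cf_api() {
  local method=$1 path=$2
  shift 2
  if [ -n "${CLOUDFLARE_API_TOKEN:-}" ]; then
    curl -s -X "$method" \
      -H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
      -H "Content-Type: application/json" \
      "https://api.cloudflare.com/client/v4${path}" "$@"
  else
    curl -s -X "$method" \
      -H "X-Auth-Email: ${CLOUDFLARE_EMAIL:-}" \
      -H "X-Auth-Key: ${CLOUDFLARE_API_KEY:-}" \
      -H "Content-Type: application/json" \
      "https://api.cloudflare.com/client/v4${path}" "$@"
  fi
}
```

`create_tunnel` and `get_account_id` would then both reduce to `cf_api ... | jq -r ...`.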

scripts/configure-nginx-proxy.sh (new executable file, 170 lines)

@@ -0,0 +1,170 @@
#!/bin/bash
# configure-nginx-proxy.sh
# Configuration script for Nginx Proxy VM
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
BLUE='\033[0;34m'
NC='\033[0m'
# Log to stderr so command substitutions capture only a function's real output
log() {
echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*" >&2
}
log_success() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] ✅${NC} $*" >&2
}
log_warning() {
echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] ⚠️${NC} $*" >&2
}
log_error() {
echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ❌${NC} $*" >&2
}
# Get VM IP address
get_vm_ip() {
local vm_name=$1
local ip
ip=$(kubectl get proxmoxvm "${vm_name}" -n default -o jsonpath='{.status.ipAddress}' 2>/dev/null || echo "")
if [ -z "${ip}" ] || [ "${ip}" = "<none>" ]; then
log_warning "VM IP not yet assigned. Waiting..."
return 1
fi
echo "${ip}"
}
# Wait for VM to be ready
wait_for_vm() {
local vm_name=$1
local max_attempts=30
local attempt=0
log "Waiting for ${vm_name} to be ready..."
while [ ${attempt} -lt ${max_attempts} ]; do
local ip
ip=$(get_vm_ip "${vm_name}" 2>/dev/null || echo "")
if [ -n "${ip}" ] && [ "${ip}" != "<none>" ]; then
log_success "${vm_name} is ready at ${ip}"
echo "${ip}"
return 0
fi
attempt=$((attempt + 1))
sleep 10
done
log_error "${vm_name} did not become ready in time"
return 1
}
# Generate Nginx configuration
generate_nginx_config() {
local config_file=$1
local domain=$2
local backend_ip=$3
local backend_port=${4:-80}
cat > "${config_file}" <<EOF
server {
listen 80;
listen [::]:80;
server_name ${domain};
# Redirect HTTP to HTTPS
return 301 https://\$server_name\$request_uri;
}
server {
listen 443 ssl http2;
listen [::]:443 ssl http2;
server_name ${domain};
# SSL Configuration
ssl_certificate /etc/letsencrypt/live/${domain}/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/${domain}/privkey.pem;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers on;
ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384;
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 10m;
# Security Headers
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;
# Logging
access_log /var/log/nginx/${domain}-access.log;
error_log /var/log/nginx/${domain}-error.log;
# Proxy Settings
location / {
proxy_pass http://${backend_ip}:${backend_port};
proxy_http_version 1.1;
proxy_set_header Upgrade \$http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host \$host;
proxy_set_header X-Real-IP \$remote_addr;
proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto \$scheme;
proxy_cache_bypass \$http_upgrade;
proxy_read_timeout 300s;
proxy_connect_timeout 75s;
}
}
EOF
}
main() {
log "=========================================="
log "Nginx Proxy Configuration Script"
log "=========================================="
log ""
# Check if VM exists
if ! kubectl get proxmoxvm nginx-proxy-vm -n default &>/dev/null; then
log_error "nginx-proxy-vm not found. Please deploy it first."
exit 1
fi
# Wait for VM to be ready
local vm_ip
vm_ip=$(wait_for_vm "nginx-proxy-vm") || vm_ip=""
if [ -z "${vm_ip}" ]; then
log_error "Failed to get VM IP address"
exit 1
fi
log_success "Nginx Proxy VM is ready at ${vm_ip}"
log ""
log "Next steps:"
log "1. SSH into the VM: ssh admin@${vm_ip}"
log "2. Install SSL certificates using certbot:"
log " sudo certbot --nginx -d your-domain.com"
log "3. Configure backend services in /etc/nginx/sites-available/"
log "4. Test configuration: sudo nginx -t"
log "5. Reload nginx: sudo systemctl reload nginx"
log ""
log "Example Nginx configuration files are available in:"
log " ${PROJECT_ROOT}/docs/configs/nginx/"
log ""
}
main "$@"
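The vhost template in `generate_nginx_config` mixes shell expansion (`${domain}`, `${backend_ip}`) with nginx runtime variables (`$host`, `$remote_addr`), which is why the latter are backslash-escaped inside the unquoted heredoc. A minimal demonstration of that escaping behavior:

```shell
# In an unquoted heredoc the shell expands ${domain}, while \$host is written
# to the file as a literal $host for nginx to resolve at request time.
domain="example.test"
tmpconf=$(mktemp)
cat > "$tmpconf" <<EOF
server_name ${domain};
proxy_set_header Host \$host;
EOF
cat "$tmpconf"
rm_later="$tmpconf"
```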


@@ -0,0 +1,91 @@
#!/bin/bash
# Configure ProviderConfig for Crossplane
# DEPLOY-018: Review and update Proxmox configuration
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
echo "=== Configuring ProviderConfig ==="
echo ""
# Check prerequisites
if ! command -v kubectl &> /dev/null; then
echo "✗ kubectl is not installed"
exit 1
fi
if ! kubectl cluster-info &> /dev/null; then
echo "✗ Cannot connect to Kubernetes cluster"
exit 1
fi
# Prompt for credentials
echo "Enter Proxmox credentials:"
read -p "Username (default: root@pam): " USERNAME
USERNAME=${USERNAME:-root@pam}
read -sp "Password or API Token: " PASSWORD
echo ""
read -p "Instance 1 Endpoint (default: https://ml110-01.sankofa.nexus:8006): " INSTANCE1_ENDPOINT
INSTANCE1_ENDPOINT=${INSTANCE1_ENDPOINT:-https://ml110-01.sankofa.nexus:8006}
read -p "Instance 2 Endpoint (default: https://r630-01.sankofa.nexus:8006): " INSTANCE2_ENDPOINT
INSTANCE2_ENDPOINT=${INSTANCE2_ENDPOINT:-https://r630-01.sankofa.nexus:8006}
read -p "Skip TLS verification? (y/N): " SKIP_TLS
SKIP_TLS=${SKIP_TLS:-N}
# Create credentials JSON
CREDS_JSON=$(cat <<EOF
{
"username": "$USERNAME",
"password": "$PASSWORD"
}
EOF
)
# Create or update secret
echo ""
echo "Creating/updating secret..."
kubectl create secret generic proxmox-credentials \
--from-literal=credentials.json="$CREDS_JSON" \
--dry-run=client -o yaml | \
kubectl apply -n crossplane-system -f -
# Create ProviderConfig
echo ""
echo "Creating ProviderConfig..."
cat <<EOF | kubectl apply -f -
apiVersion: proxmox.sankofa.nexus/v1alpha1
kind: ProviderConfig
metadata:
name: proxmox-provider-config
namespace: crossplane-system
spec:
credentials:
source: Secret
secretRef:
name: proxmox-credentials
namespace: crossplane-system
key: credentials.json
sites:
- name: us-sfvalley
endpoint: $INSTANCE1_ENDPOINT
node: ML110-01
insecureSkipTLSVerify: $(case "$SKIP_TLS" in ([yY]*) echo "true";; (*) echo "false";; esac)
- name: us-sfvalley-2
endpoint: $INSTANCE2_ENDPOINT
node: R630-01
insecureSkipTLSVerify: $(case "$SKIP_TLS" in ([yY]*) echo "true";; (*) echo "false";; esac)
EOF
echo ""
echo "=== ProviderConfig configured ==="
echo ""
echo "Verify configuration:"
echo " kubectl get providerconfig proxmox-provider-config -n crossplane-system"
echo " kubectl describe providerconfig proxmox-provider-config -n crossplane-system"
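The `CREDS_JSON` heredoc above interpolates the password directly into JSON, so a password containing `"` or `\` produces an invalid document. Where `jq` is available, `jq -n --arg u "$USERNAME" --arg p "$PASSWORD" '{username:$u,password:$p}'` builds it safely; a dependency-free fallback sketch (sample values are illustrative):

```shell
# Escape backslashes and double quotes for embedding in a JSON string value.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}   # backslash first, so added escapes are not re-escaped
  s=${s//\"/\\\"}
  printf '%s' "$s"
}

CREDS_JSON=$(printf '{"username": "%s", "password": "%s"}' \
  "$(json_escape "root@pam")" "$(json_escape 'p@"ss')")
```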


@@ -0,0 +1,96 @@
#!/bin/bash
# Manual copy script with diagnostic information
# This script provides the exact commands to copy the script manually
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
# Source .env file
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source "$PROJECT_ROOT/.env"
set +a
fi
# Require the root password from the environment — never hard-code credentials
PROXMOX_PASS="${PROXMOX_ROOT_PASS:?PROXMOX_ROOT_PASS must be set (e.g. in .env)}"
# Get hostnames/IPs
PROXMOX_1_HOST="${PROXMOX_1_HOST:-192.168.11.10}"
PROXMOX_2_HOST="${PROXMOX_2_HOST:-192.168.11.11}"
SCRIPT_NAME="complete-vm-100-guest-agent-check.sh"
SCRIPT_PATH="$SCRIPT_DIR/$SCRIPT_NAME"
echo "=========================================="
echo "Manual Copy Instructions"
echo "=========================================="
echo ""
echo "Script: $SCRIPT_NAME"
echo "Source: $SCRIPT_PATH"
echo ""
# Check if script exists
if [ ! -f "$SCRIPT_PATH" ]; then
echo "❌ Error: Script not found: $SCRIPT_PATH"
exit 1
fi
echo "Copy and paste these commands to copy the script:"
echo ""
echo "=========================================="
echo "For ml110-01 (Site 1):"
echo "=========================================="
echo ""
echo "# Copy script to ml110-01"
echo "sshpass -p '$PROXMOX_PASS' scp -o StrictHostKeyChecking=no \\"
echo " $SCRIPT_PATH \\"
echo " root@$PROXMOX_1_HOST:/usr/local/bin/$SCRIPT_NAME"
echo ""
echo "# Make executable"
echo "sshpass -p '$PROXMOX_PASS' ssh -o StrictHostKeyChecking=no root@$PROXMOX_1_HOST \\"
echo " 'chmod +x /usr/local/bin/$SCRIPT_NAME'"
echo ""
echo "# Verify"
echo "sshpass -p '$PROXMOX_PASS' ssh -o StrictHostKeyChecking=no root@$PROXMOX_1_HOST \\"
echo " '/usr/local/bin/$SCRIPT_NAME'"
echo ""
echo "=========================================="
echo "For r630-01 (Site 2):"
echo "=========================================="
echo ""
echo "# Copy script to r630-01"
echo "sshpass -p '$PROXMOX_PASS' scp -o StrictHostKeyChecking=no \\"
echo " $SCRIPT_PATH \\"
echo " root@$PROXMOX_2_HOST:/usr/local/bin/$SCRIPT_NAME"
echo ""
echo "# Make executable"
echo "sshpass -p '$PROXMOX_PASS' ssh -o StrictHostKeyChecking=no root@$PROXMOX_2_HOST \\"
echo " 'chmod +x /usr/local/bin/$SCRIPT_NAME'"
echo ""
echo "# Verify"
echo "sshpass -p '$PROXMOX_PASS' ssh -o StrictHostKeyChecking=no root@$PROXMOX_2_HOST \\"
echo " '/usr/local/bin/$SCRIPT_NAME'"
echo ""
echo "=========================================="
echo "Alternative: Copy script content directly"
echo "=========================================="
echo ""
echo "If SCP doesn't work, you can copy the script content:"
echo ""
echo "1. Display the script:"
echo " cat $SCRIPT_PATH"
echo ""
echo "2. SSH to the node and create the file:"
echo " sshpass -p '$PROXMOX_PASS' ssh -o StrictHostKeyChecking=no root@$PROXMOX_1_HOST"
echo ""
echo "3. Then on the Proxmox node, create the file:"
echo " cat > /usr/local/bin/$SCRIPT_NAME << 'SCRIPT_EOF'"
echo " [paste script content here]"
echo " SCRIPT_EOF"
echo " chmod +x /usr/local/bin/$SCRIPT_NAME"
echo ""


@@ -0,0 +1,139 @@
#!/bin/bash
# Copy complete-vm-100-guest-agent-check.sh to both Proxmox nodes
# Uses password from .env file
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
# Source .env file
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source "$PROJECT_ROOT/.env"
set +a
fi
# Require the root password from the environment — never hard-code credentials
PROXMOX_PASS="${PROXMOX_ROOT_PASS:?PROXMOX_ROOT_PASS must be set (e.g. in .env)}"
# Get hostnames/IPs (with defaults from other scripts)
PROXMOX_1_HOST="${PROXMOX_1_HOST:-192.168.11.10}"
PROXMOX_2_HOST="${PROXMOX_2_HOST:-192.168.11.11}"
# Also try hostnames if IPs don't work
PROXMOX_1_HOSTNAME="${PROXMOX_1_HOSTNAME:-ml110-01}"
PROXMOX_2_HOSTNAME="${PROXMOX_2_HOSTNAME:-r630-01}"
SCRIPT_NAME="complete-vm-100-guest-agent-check.sh"
SCRIPT_PATH="$SCRIPT_DIR/$SCRIPT_NAME"
REMOTE_PATH="/usr/local/bin/$SCRIPT_NAME"
echo "=========================================="
echo "Copying Script to Proxmox Nodes"
echo "=========================================="
echo ""
echo "Script: $SCRIPT_NAME"
echo "Source: $SCRIPT_PATH"
echo "Target: $REMOTE_PATH"
echo ""
# Check if script exists
if [ ! -f "$SCRIPT_PATH" ]; then
echo "❌ Error: Script not found: $SCRIPT_PATH"
exit 1
fi
# Check if sshpass is available
if ! command -v sshpass &> /dev/null; then
echo "❌ Error: sshpass is not installed"
echo "Install it with: sudo apt-get install sshpass"
exit 1
fi
# Function to copy to a node
copy_to_node() {
local host=$1
local node_name=$2
echo "--------------------------------------"
echo "Copying to $node_name ($host)..."
echo "--------------------------------------"
# Test connection first
if ! sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@$host "echo 'Connected'" > /dev/null 2>&1; then
echo "❌ Failed to connect to $host"
return 1
fi
# Copy script
if sshpass -p "$PROXMOX_PASS" scp -o StrictHostKeyChecking=no "$SCRIPT_PATH" root@$host:"$REMOTE_PATH"; then
echo "✅ Script copied successfully"
# Make executable
if sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host "chmod +x $REMOTE_PATH"; then
echo "✅ Script made executable"
# Verify
if sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host "test -x $REMOTE_PATH"; then
echo "✅ Script verified as executable"
echo ""
echo "You can now run on $node_name:"
echo " $REMOTE_PATH"
echo ""
return 0
else
echo "⚠️ Warning: Script may not be executable"
return 1
fi
else
echo "❌ Failed to make script executable"
return 1
fi
else
echo "❌ Failed to copy script"
return 1
fi
}
# Copy to both nodes
SUCCESS_COUNT=0
# Try IP first, then hostname. The $((...)) assignment form is used because
# ((SUCCESS_COUNT++)) returns status 1 when the counter is 0, aborting under set -e.
if copy_to_node "$PROXMOX_1_HOST" "ml110-01 (Site 1)"; then
SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
elif copy_to_node "$PROXMOX_1_HOSTNAME" "ml110-01 (Site 1 - hostname)"; then
SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
fi
if copy_to_node "$PROXMOX_2_HOST" "r630-01 (Site 2)"; then
SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
elif copy_to_node "$PROXMOX_2_HOSTNAME" "r630-01 (Site 2 - hostname)"; then
SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
fi
echo "=========================================="
echo "Summary"
echo "=========================================="
echo "✅ Successfully copied to $SUCCESS_COUNT/2 nodes"
echo ""
if [ $SUCCESS_COUNT -eq 2 ]; then
echo "✅ All nodes updated successfully!"
echo ""
echo "To run the script on ml110-01:"
echo " ssh root@$PROXMOX_1_HOST"
echo " $REMOTE_PATH"
echo ""
echo "To run the script on r630-01:"
echo " ssh root@$PROXMOX_2_HOST"
echo " $REMOTE_PATH"
exit 0
else
echo "⚠️ Some nodes failed. Check the output above."
exit 1
fi
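One bash pitfall worth noting for counters in scripts like this one: under `set -e`, the arithmetic command `((n++))` returns status 1 whenever the pre-increment value is 0 (the command's status tracks the expression's value), silently aborting the script on the very first increment. The assignment form is immune. A minimal reproduction:

```shell
# The assignment form n=$((n + 1)) always succeeds, even under `set -e`,
# whereas a bare ((n++)) in its place would abort this subshell at n=0.
demo_counter() (
  set -e
  n=0
  n=$((n + 1))
  n=$((n + 1))
  echo "count=$n"
)
demo_counter
```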


@@ -0,0 +1,158 @@
#!/bin/bash
# This file contains the exact heredoc command to copy to Proxmox node
# Copy everything from "bash << 'SCRIPT_EOF'" to "SCRIPT_EOF" into your Proxmox SSH session
cat << 'HEREDOC_EOF'
bash << 'SCRIPT_EOF'
#!/bin/bash
# Complete verification and start for VM 100
# Run on Proxmox node: root@ml110-01
set -e
VMID=100
VM_NAME="basic-vm-001"
echo "=========================================="
echo "VM 100 Complete Verification and Start"
echo "=========================================="
echo ""
# Step 1: Check VM status
echo "Step 1: VM Status"
echo "--------------------------------------"
qm status $VMID
echo ""
# Step 2: Verify all critical configurations
echo "Step 2: Critical Configuration Checks"
echo "--------------------------------------"
# Boot order
BOOT_ORDER=$(qm config $VMID | grep '^boot:' || echo "")
if [ -z "$BOOT_ORDER" ]; then
echo "⚠️ Boot order not set - fixing..."
qm set $VMID --boot order=scsi0
echo "✅ Boot order set"
else
echo "✅ Boot order: $BOOT_ORDER"
fi
# Disk configuration
SCSI0=$(qm config $VMID | grep '^scsi0:' || echo "")
if [ -z "$SCSI0" ]; then
echo "❌ ERROR: scsi0 disk not configured!"
exit 1
else
echo "✅ Disk: $SCSI0"
# Check if disk exists
DISK_NAME=$(echo "$SCSI0" | sed -n 's/.*local-lvm:\(vm-[0-9]*-disk-[0-9]*\).*/\1/p')
if [ -n "$DISK_NAME" ]; then
if lvs | grep -q "$DISK_NAME"; then
DISK_SIZE=$(lvs | grep "$DISK_NAME" | awk '{print $4}')
echo " ✅ Disk exists: $DISK_NAME ($DISK_SIZE)"
else
echo " ❌ ERROR: Disk $DISK_NAME not found!"
exit 1
fi
fi
fi
# Cloud-init
IDE2=$(qm config $VMID | grep '^ide2:' || echo "")
CIUSER=$(qm config $VMID | grep '^ciuser:' || echo "")
if [ -z "$IDE2" ]; then
echo "⚠️ Cloud-init drive not configured - fixing..."
qm set $VMID --ide2 local-lvm:cloudinit
echo "✅ Cloud-init drive configured"
else
echo "✅ Cloud-init drive: $IDE2"
fi
if [ -z "$CIUSER" ]; then
echo "⚠️ Cloud-init user not configured - fixing..."
qm set $VMID --ciuser admin
echo "✅ Cloud-init user configured"
else
echo "✅ Cloud-init user: $CIUSER"
fi
IPCONFIG=$(qm config $VMID | grep '^ipconfig0:' || echo "")
if [ -z "$IPCONFIG" ]; then
echo "⚠️ IP config not set - fixing..."
qm set $VMID --ipconfig0 ip=dhcp
echo "✅ IP config set"
else
echo "✅ IP config: $IPCONFIG"
fi
# Network
NET0=$(qm config $VMID | grep '^net0:' || echo "")
if [ -z "$NET0" ]; then
echo "⚠️ Network not configured - fixing..."
qm set $VMID --net0 virtio,bridge=vmbr0
echo "✅ Network configured"
else
echo "✅ Network: $NET0"
fi
# Guest agent (already fixed, but verify)
AGENT=$(qm config $VMID | grep '^agent:' || echo "")
if [ -z "$AGENT" ]; then
echo "⚠️ Guest agent not enabled - fixing..."
qm set $VMID --agent 1
echo "✅ Guest agent enabled"
else
echo "✅ Guest agent: $AGENT"
fi
echo ""
# Step 3: Final configuration summary
echo "Step 3: Final Configuration Summary"
echo "--------------------------------------"
qm config $VMID | grep -E '^agent:|^boot:|^scsi0:|^ide2:|^net0:|^ciuser:' | while read line; do
echo " $line"
done
echo ""
# Step 4: Start VM
echo "Step 4: Starting VM"
echo "--------------------------------------"
CURRENT_STATUS=$(qm status $VMID | awk '{print $2}')
if [ "$CURRENT_STATUS" = "running" ]; then
echo "✅ VM is already running"
else
echo "Current status: $CURRENT_STATUS"
echo "Starting VM..."
qm start $VMID
echo ""
echo "Waiting 5 seconds for initialization..."
sleep 5
echo ""
echo "VM status after start:"
qm status $VMID
fi
echo ""
# Step 5: Monitoring instructions
echo "=========================================="
echo "VM Started - Monitoring Instructions"
echo "=========================================="
echo ""
echo "Monitor from Proxmox node:"
echo " watch -n 2 'qm status $VMID'"
echo ""
echo "Monitor from Kubernetes:"
echo " kubectl get proxmoxvm $VM_NAME -w"
echo ""
echo "Check provider logs:"
echo " kubectl logs -n crossplane-system -l app=crossplane-provider-proxmox --tail=50 -f"
echo ""
echo "Once VM has IP, verify services:"
echo " IP=\$(kubectl get proxmoxvm $VM_NAME -o jsonpath='{.status.networkInterfaces[0].ipAddress}')"
echo " ssh admin@\$IP 'systemctl status qemu-guest-agent chrony unattended-upgrades'"
echo ""
SCRIPT_EOF
HEREDOC_EOF

View File

@@ -0,0 +1,225 @@
#!/bin/bash
# create-proxmox-cluster-ssh.sh
# Creates a Proxmox cluster using SSH access to nodes
set -euo pipefail
# Load environment variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
set -a
source "${SCRIPT_DIR}/../.env"
set +a
fi
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Configuration
CLUSTER_NAME="${CLUSTER_NAME:-sankofa-cluster}"
NODE1_IP="192.168.11.10"
NODE1_NAME="ML110-01"
NODE2_IP="192.168.11.11"
NODE2_NAME="R630-01"
# SSH configuration (if available)
SSH_USER="${SSH_USER:-root}"
SSH_KEY="${SSH_KEY:-}"
log() {
echo -e "${GREEN}[INFO]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
check_ssh_access() {
log "Checking SSH access to nodes..."
local ssh_cmd="ssh"
if [ -n "$SSH_KEY" ]; then
ssh_cmd="ssh -i $SSH_KEY"
fi
# Test Node 1
if $ssh_cmd -o ConnectTimeout=5 -o StrictHostKeyChecking=no ${SSH_USER}@${NODE1_IP} "echo 'Connected'" >/dev/null 2>&1; then
log "✓ SSH access to ${NODE1_IP} working"
else
error "SSH access to ${NODE1_IP} failed. Please ensure SSH is configured."
fi
# Test Node 2
if $ssh_cmd -o ConnectTimeout=5 -o StrictHostKeyChecking=no ${SSH_USER}@${NODE2_IP} "echo 'Connected'" >/dev/null 2>&1; then
log "✓ SSH access to ${NODE2_IP} working"
else
error "SSH access to ${NODE2_IP} failed. Please ensure SSH is configured."
fi
}
create_cluster_node1() {
log "Creating cluster on ${NODE1_NAME}..."
local ssh_cmd="ssh"
if [ -n "$SSH_KEY" ]; then
ssh_cmd="ssh -i $SSH_KEY"
fi
# Check if already in cluster
local cluster_status=$($ssh_cmd ${SSH_USER}@${NODE1_IP} "pvecm status 2>/dev/null || echo 'not-in-cluster'")
if echo "$cluster_status" | grep -q "Cluster name"; then
warn "${NODE1_NAME} is already in a cluster"
echo "$cluster_status"
return 1
fi
# Create cluster
log "Creating cluster '${CLUSTER_NAME}'..."
$ssh_cmd ${SSH_USER}@${NODE1_IP} "pvecm create ${CLUSTER_NAME}" || {
error "Failed to create cluster on ${NODE1_NAME}"
}
log "✓ Cluster created on ${NODE1_NAME}"
# Verify
$ssh_cmd ${SSH_USER}@${NODE1_IP} "pvecm status"
}
add_node2_to_cluster() {
log "Adding ${NODE2_NAME} to cluster..."
local ssh_cmd="ssh"
if [ -n "$SSH_KEY" ]; then
ssh_cmd="ssh -i $SSH_KEY"
fi
# Check if already in cluster
local cluster_status=$($ssh_cmd ${SSH_USER}@${NODE2_IP} "pvecm status 2>/dev/null || echo 'not-in-cluster'")
if echo "$cluster_status" | grep -q "Cluster name"; then
warn "${NODE2_NAME} is already in a cluster"
echo "$cluster_status"
return 1
fi
# Add to cluster (--use_ssh avoids the interactive root-password prompt)
log "Joining ${NODE2_NAME} to cluster..."
$ssh_cmd ${SSH_USER}@${NODE2_IP} "pvecm add ${NODE1_IP} --use_ssh" || {
error "Failed to add ${NODE2_NAME} to cluster"
}
log "✓ ${NODE2_NAME} added to cluster"
# Verify
$ssh_cmd ${SSH_USER}@${NODE2_IP} "pvecm status"
}
configure_quorum() {
log "Configuring quorum for 2-node cluster..."
local ssh_cmd="ssh"
if [ -n "$SSH_KEY" ]; then
ssh_cmd="ssh -i $SSH_KEY"
fi
# Set expected votes to 2
$ssh_cmd ${SSH_USER}@${NODE1_IP} "pvecm expected 2" || {
warn "Failed to set expected votes on ${NODE1_NAME}"
}
$ssh_cmd ${SSH_USER}@${NODE2_IP} "pvecm expected 2" || {
warn "Failed to set expected votes on ${NODE2_NAME}"
}
log "✓ Quorum configured"
}
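# Note: "pvecm expected" only adjusts the runtime vote expectation; it does
# not persist across restarts. For a durable two-node setup, the usual fix
# for split-brain is a QDevice on a third host (hypothetical address shown):
#   pvecm qdevice setup <qdevice-host-ip>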
verify_cluster() {
log "Verifying cluster status..."
local ssh_cmd="ssh"
if [ -n "$SSH_KEY" ]; then
ssh_cmd="ssh -i $SSH_KEY"
fi
echo ""
info "Cluster status on ${NODE1_NAME}:"
$ssh_cmd ${SSH_USER}@${NODE1_IP} "pvecm status && echo '' && pvecm nodes"
echo ""
info "Cluster status on ${NODE2_NAME}:"
$ssh_cmd ${SSH_USER}@${NODE2_IP} "pvecm status && echo '' && pvecm nodes"
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Proxmox Cluster Creation (SSH Method) ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
check_ssh_access
echo ""
info "Cluster Configuration:"
echo " Cluster Name: ${CLUSTER_NAME}"
echo " Node 1: ${NODE1_NAME} (${NODE1_IP})"
echo " Node 2: ${NODE2_NAME} (${NODE2_IP})"
echo ""
read -p "Create cluster? (y/N): " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
log "Cluster creation cancelled"
exit 0
fi
echo ""
# Create cluster on node 1
if create_cluster_node1; then
log "Cluster created successfully"
else
error "Failed to create cluster"
fi
echo ""
# Add node 2
if add_node2_to_cluster; then
log "Node 2 added successfully"
else
error "Failed to add node 2"
fi
echo ""
# Configure quorum
configure_quorum
echo ""
# Verify
verify_cluster
echo ""
log "✓ Cluster creation complete!"
echo ""
info "Next steps:"
info "1. Verify cluster in Proxmox web UI"
info "2. Test VM creation and migration"
info "3. Configure shared storage (if needed)"
}
main "$@"

231
scripts/create-proxmox-cluster.sh Executable file
View File

@@ -0,0 +1,231 @@
#!/bin/bash
# create-proxmox-cluster.sh
# Creates a Proxmox cluster between two instances
set -euo pipefail
# Load environment variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
set -a
source "${SCRIPT_DIR}/../.env"
set +a
fi
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Configuration
CLUSTER_NAME="${CLUSTER_NAME:-sankofa-cluster}"
NODE1_IP="192.168.11.10"
NODE1_NAME="ML110-01"
NODE1_TOKEN="${PROXMOX_TOKEN_ML110_01:-}"
NODE2_IP="192.168.11.11"
NODE2_NAME="R630-01"
NODE2_TOKEN="${PROXMOX_TOKEN_R630_01:-}"
log() {
echo -e "${GREEN}[INFO]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
check_prerequisites() {
if ! command -v curl &> /dev/null || ! command -v jq &> /dev/null; then
error "curl and jq are required but not installed"
fi
if [ -z "$NODE1_TOKEN" ] || [ -z "$NODE2_TOKEN" ]; then
error "Proxmox API tokens not found in .env file"
fi
log "Prerequisites check passed"
}
check_cluster_status() {
log "Checking current cluster status..."
local node1_cluster=$(curl -k -s -H "Authorization: PVEAPIToken=${NODE1_TOKEN}" \
"https://${NODE1_IP}:8006/api2/json/cluster/config/nodes" 2>/dev/null | jq -r '.data // null')
local node2_cluster=$(curl -k -s -H "Authorization: PVEAPIToken=${NODE2_TOKEN}" \
"https://${NODE2_IP}:8006/api2/json/cluster/config/nodes" 2>/dev/null | jq -r '.data // null')
if [ "$node1_cluster" != "null" ] && [ -n "$node1_cluster" ]; then
warn "Node 1 is already in a cluster"
echo "$node1_cluster" | jq '.'
return 1
fi
if [ "$node2_cluster" != "null" ] && [ -n "$node2_cluster" ]; then
warn "Node 2 is already in a cluster"
echo "$node2_cluster" | jq '.'
return 1
fi
log "Both nodes are standalone - ready for clustering"
return 0
}
create_cluster_on_node1() {
log "Creating cluster '${CLUSTER_NAME}' on ${NODE1_NAME}..."
# Create cluster via API
local response=$(curl -k -s -X POST \
-H "Authorization: PVEAPIToken=${NODE1_TOKEN}" \
-H "Content-Type: application/json" \
"https://${NODE1_IP}:8006/api2/json/cluster/config" \
-d "{\"clustername\":\"${CLUSTER_NAME}\",\"link0\":\"${NODE1_IP}\"}" 2>/dev/null)
local success=$(echo "$response" | jq -r '.data // null')
if [ "$success" != "null" ] && [ -n "$success" ]; then
log "✓ Cluster created on ${NODE1_NAME}"
return 0
else
local error_msg=$(echo "$response" | jq -r '.errors[0].message // "Unknown error"')
error "Failed to create cluster: ${error_msg}"
fi
}
get_cluster_fingerprint() {
log "Getting cluster fingerprint from ${NODE1_NAME}..."
local fingerprint=$(curl -k -s -H "Authorization: PVEAPIToken=${NODE1_TOKEN}" \
"https://${NODE1_IP}:8006/api2/json/cluster/config/totem" 2>/dev/null | \
jq -r '.data.fingerprint // empty')
if [ -n "$fingerprint" ]; then
echo "$fingerprint"
return 0
else
warn "Could not get cluster fingerprint"
return 1
fi
}
add_node2_to_cluster() {
log "Adding ${NODE2_NAME} to cluster..."
# Get cluster fingerprint
local fingerprint=$(get_cluster_fingerprint)
if [ -z "$fingerprint" ]; then
warn "Fingerprint not available, trying without it"
fi
# Add node to cluster
local response=$(curl -k -s -X POST \
-H "Authorization: PVEAPIToken=${NODE2_TOKEN}" \
-H "Content-Type: application/json" \
"https://${NODE2_IP}:8006/api2/json/cluster/config/nodes" \
-d "{\"hostname\":\"${NODE1_NAME}\",\"nodeid\":1,\"votes\":1,\"link0\":\"${NODE1_IP}\"}" 2>/dev/null)
local success=$(echo "$response" | jq -r '.data // null')
if [ "$success" != "null" ] && [ -n "$success" ]; then
log "✓ ${NODE2_NAME} added to cluster"
return 0
else
local error_msg=$(echo "$response" | jq -r '.errors[0].message // "Unknown error"')
warn "API method failed: ${error_msg}"
warn "Cluster creation may require SSH access or manual setup"
return 1
fi
}
verify_cluster() {
log "Verifying cluster status..."
sleep 5 # Wait for cluster to stabilize
local node1_nodes=$(curl -k -s -H "Authorization: PVEAPIToken=${NODE1_TOKEN}" \
"https://${NODE1_IP}:8006/api2/json/cluster/config/nodes" 2>/dev/null | \
jq -r '.data | length // 0')
local node2_nodes=$(curl -k -s -H "Authorization: PVEAPIToken=${NODE2_TOKEN}" \
"https://${NODE2_IP}:8006/api2/json/cluster/config/nodes" 2>/dev/null | \
jq -r '.data | length // 0')
if [ "${node1_nodes:-0}" -ge 2 ] && [ "${node2_nodes:-0}" -ge 2 ]; then
log "✓ Cluster verified - both nodes see 2+ members"
return 0
else
warn "Cluster verification incomplete"
warn "Node 1 sees ${node1_nodes} members"
warn "Node 2 sees ${node2_nodes} members"
return 1
fi
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Proxmox Cluster Creation ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
check_prerequisites
echo ""
if ! check_cluster_status; then
error "One or both nodes are already in a cluster"
fi
echo ""
info "Cluster Configuration:"
echo " Cluster Name: ${CLUSTER_NAME}"
echo " Node 1: ${NODE1_NAME} (${NODE1_IP})"
echo " Node 2: ${NODE2_NAME} (${NODE2_IP})"
echo ""
read -p "Create cluster? (y/N): " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
log "Cluster creation cancelled"
exit 0
fi
echo ""
# Try API-based cluster creation
if create_cluster_on_node1; then
log "Cluster created on ${NODE1_NAME}"
else
error "Failed to create cluster via API"
fi
echo ""
# Try to add second node
if add_node2_to_cluster; then
log "Node 2 added to cluster"
else
warn "Could not add node 2 via API"
warn "You may need to add it manually via SSH or web UI"
fi
echo ""
# Verify cluster
verify_cluster
echo ""
log "Cluster creation process complete!"
echo ""
info "Next steps:"
info "1. Verify cluster: Check both nodes in Proxmox web UI"
info "2. Test cluster: Create a VM and verify it's visible on both nodes"
info "3. Configure quorum: For 2-node cluster, set expected votes: pvecm expected 2"
}
main "$@"

125
scripts/create-proxmox-secret.sh Executable file
View File

@@ -0,0 +1,125 @@
#!/bin/bash
# create-proxmox-secret.sh
# Creates Kubernetes secret for Proxmox credentials
set -euo pipefail
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Configuration
NAMESPACE="${NAMESPACE:-crossplane-system}"
SECRET_NAME="${SECRET_NAME:-proxmox-credentials}"
KEY_NAME="${KEY_NAME:-credentials.json}"
log() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
check_prerequisites() {
if ! command -v kubectl &> /dev/null; then
error "kubectl is required but not installed"
fi
if ! kubectl cluster-info &> /dev/null; then
error "Cannot connect to Kubernetes cluster"
fi
}
prompt_credentials() {
echo ""
echo "Enter Proxmox credentials:"
echo ""
read -p "Username (e.g., root@pam): " USERNAME
read -sp "Token (format: user@realm!token-id=token-secret): " TOKEN
echo ""
if [ -z "$USERNAME" ] || [ -z "$TOKEN" ]; then
error "Username and token are required"
fi
CREDENTIALS_JSON=$(cat <<EOF
{
"username": "${USERNAME}",
"token": "${TOKEN}"
}
EOF
)
}
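# Optional sanity check before storing the token (hedged example: replace
# <proxmox-host> with a reachable node; the API token header format is
# "PVEAPIToken=user@realm!token-id=token-secret"):
#   curl -k -s -H "Authorization: PVEAPIToken=${TOKEN}" \
#     "https://<proxmox-host>:8006/api2/json/version"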
create_secret() {
log "Creating Kubernetes secret: ${SECRET_NAME} in namespace ${NAMESPACE}"
# Create namespace if it doesn't exist
kubectl create namespace "${NAMESPACE}" --dry-run=client -o yaml | kubectl apply -f -
# Check if secret already exists
if kubectl get secret "${SECRET_NAME}" -n "${NAMESPACE}" &> /dev/null; then
warn "Secret ${SECRET_NAME} already exists in namespace ${NAMESPACE}"
read -p "Do you want to update it? (y/N): " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
log "Skipping secret creation"
return 0
fi
kubectl delete secret "${SECRET_NAME}" -n "${NAMESPACE}"
fi
# Create secret
echo "${CREDENTIALS_JSON}" | kubectl create secret generic "${SECRET_NAME}" \
--from-file="${KEY_NAME}=/dev/stdin" \
-n "${NAMESPACE}" \
--dry-run=client -o yaml | kubectl apply -f -
log "✓ Secret created successfully"
}
verify_secret() {
log "Verifying secret..."
if kubectl get secret "${SECRET_NAME}" -n "${NAMESPACE}" &> /dev/null; then
log "✓ Secret exists"
# Show secret metadata (not the actual content)
kubectl get secret "${SECRET_NAME}" -n "${NAMESPACE}" -o jsonpath='{.metadata.name}' | xargs echo " Name:"
if command -v jq &> /dev/null; then
kubectl get secret "${SECRET_NAME}" -n "${NAMESPACE}" -o jsonpath='{.data}' | jq -r 'keys[]' | while read key; do
echo " Key: ${key}"
done
fi
else
error "Secret verification failed"
fi
}
main() {
log "Proxmox Credentials Secret Creator"
log "=================================="
check_prerequisites
prompt_credentials
create_secret
verify_secret
log ""
log "Secret created successfully!"
log ""
log "Next steps:"
log "1. Apply ProviderConfig: kubectl apply -f crossplane-provider-proxmox/examples/provider-config.yaml"
log "2. Verify ProviderConfig status: kubectl get providerconfig proxmox-provider-config"
log "3. Check provider logs: kubectl logs -n crossplane-system -l app=crossplane-provider-proxmox"
}
main "$@"

101
scripts/deploy-all.sh Executable file
View File

@@ -0,0 +1,101 @@
#!/bin/bash
# deploy-all.sh
# Deploys all components in parallel where possible
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
echo "=== Sankofa Phoenix - Parallel Deployment ==="
echo ""
# Color codes
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Function to run command and capture output
run_parallel() {
local name=$1
shift
local cmd="$@"
echo -e "${YELLOW}[$name]${NC} Starting..."
if eval "$cmd" > "/tmp/deploy_${name}.log" 2>&1; then
echo -e "${GREEN}[$name]${NC} ✓ Completed"
return 0
else
echo -e "${RED}[$name]${NC} ✗ Failed (check /tmp/deploy_${name}.log)"
return 1
fi
}
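# Example usage of run_parallel (the steps below call the scripts directly so
# their output stays visible; run_parallel is available for quieter runs):
#   run_parallel keycloak "$SCRIPT_DIR/deploy-keycloak.sh" &
#   wait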
# Check prerequisites
echo "Checking prerequisites..."
MISSING_DEPS=()
if ! command -v kubectl &> /dev/null; then MISSING_DEPS+=("kubectl"); fi
if ! command -v go &> /dev/null; then MISSING_DEPS+=("go"); fi
if ! command -v docker &> /dev/null; then
echo "⚠ Docker not found, will skip Docker builds"
fi
if [ ${#MISSING_DEPS[@]} -gt 0 ]; then
echo "✗ Missing dependencies: ${MISSING_DEPS[*]}"
exit 1
fi
echo "✓ All prerequisites met"
echo ""
# Step 1: Build provider (can run independently)
echo "=== Step 1: Building Crossplane Provider ==="
"$SCRIPT_DIR/build-crossplane-provider.sh" &
BUILD_PID=$!
# Step 2: Deploy Keycloak (can run independently)
echo "=== Step 2: Deploying Keycloak ==="
"$SCRIPT_DIR/deploy-keycloak.sh" &
KEYCLOAK_PID=$!
# Wait for builds to complete
echo ""
echo "Waiting for parallel tasks to complete..."
wait $BUILD_PID && echo "✓ Provider build complete" || echo "✗ Provider build failed"
wait $KEYCLOAK_PID && echo "✓ Keycloak deployment complete" || echo "✗ Keycloak deployment failed"
# Step 3: Deploy provider (requires build to complete)
echo ""
echo "=== Step 3: Deploying Crossplane Provider ==="
"$SCRIPT_DIR/deploy-crossplane-provider.sh"
# Step 4: Test connectivity
echo ""
echo "=== Step 4: Testing Proxmox Connectivity ==="
if "$SCRIPT_DIR/test-proxmox-connectivity.sh"; then
echo "✓ Proxmox connectivity verified"
else
echo "⚠ Proxmox connectivity test failed (may be expected if instances are not reachable)"
fi
echo ""
echo "=== Deployment Summary ==="
echo ""
echo "Keycloak:"
kubectl get pods -n keycloak 2>/dev/null || echo " Not deployed"
echo ""
echo "Crossplane Provider:"
kubectl get providers 2>/dev/null || echo " Not deployed"
kubectl get pods -n crossplane-system -l app=crossplane-provider-proxmox 2>/dev/null || echo " Not deployed"
echo ""
echo "=== Next Steps ==="
echo "1. Configure Keycloak clients (see deploy-keycloak.sh output)"
echo "2. Create ProviderConfig with Proxmox credentials"
echo "3. Test VM provisioning via Crossplane"
echo ""
echo "For detailed logs, check: /tmp/deploy_*.log"

View File

@@ -0,0 +1,76 @@
#!/bin/bash
# Deploy Crossplane provider to Kubernetes
# DEPLOY-021: Deploy provider to Kubernetes
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
PROVIDER_DIR="$PROJECT_ROOT/crossplane-provider-proxmox"
echo "=== Deploying Crossplane Provider ==="
echo ""
# Check prerequisites
if ! command -v kubectl &> /dev/null; then
echo "✗ kubectl is not installed or not in PATH"
exit 1
fi
if ! kubectl cluster-info &> /dev/null; then
echo "✗ Cannot connect to Kubernetes cluster"
exit 1
fi
echo "✓ Kubernetes cluster is accessible"
echo ""
# Check if Crossplane is installed
if ! kubectl get namespace crossplane-system &> /dev/null; then
echo "⚠ Crossplane namespace not found"
echo " Installing Crossplane..."
kubectl create namespace crossplane-system || true
echo " Please install Crossplane first:"
echo " helm repo add crossplane-stable https://charts.crossplane.io/stable"
echo " helm install crossplane --namespace crossplane-system crossplane-stable/crossplane"
echo ""
read -p "Press Enter once Crossplane is installed, or Ctrl+C to exit..."
fi
cd "$PROVIDER_DIR"
# Set image (can be overridden)
IMG="${IMG:-ghcr.io/sankofa/crossplane-provider-proxmox:latest}"
# Install CRDs
echo "Installing CRDs..."
make install
# Deploy provider
echo ""
echo "Deploying provider..."
make deploy IMG="$IMG"
# Wait for provider to be ready
echo ""
echo "Waiting for provider to be ready..."
kubectl wait --for=condition=healthy provider.pkg.crossplane.io \
crossplane-provider-proxmox --timeout=300s || true
# Show status
echo ""
echo "=== Provider Status ==="
kubectl get providers
kubectl get pods -n crossplane-system -l app=crossplane-provider-proxmox
echo ""
echo "=== Provider deployed successfully ==="
echo ""
echo "Next steps:"
echo "1. Create ProviderConfig secret:"
echo " kubectl create secret generic proxmox-credentials \\"
echo " --from-file=credentials.json=<path-to-credentials> \\"
echo " -n crossplane-system"
echo ""
echo "2. Apply ProviderConfig:"
echo " kubectl apply -f $PROJECT_ROOT/crossplane-provider-proxmox/examples/provider-config.yaml"

214
scripts/deploy-keycloak.sh Executable file
View File

@@ -0,0 +1,214 @@
#!/bin/bash
set -euo pipefail
# Keycloak Deployment Script
# This script deploys and configures Keycloak for Sankofa Phoenix
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Check prerequisites
check_prerequisites() {
log_info "Checking prerequisites..."
if ! command -v kubectl &> /dev/null; then
log_error "kubectl is not installed"
exit 1
fi
if ! command -v helm &> /dev/null; then
log_error "helm is not installed"
exit 1
fi
if ! kubectl cluster-info &> /dev/null; then
log_error "Cannot connect to Kubernetes cluster"
exit 1
fi
if ! command -v jq &> /dev/null; then
log_error "jq is not installed (required for client configuration)"
exit 1
fi
log_info "Prerequisites check passed"
}
# Generate random password
generate_password() {
openssl rand -base64 32 | tr -d "=+/" | cut -c1-25
}
# Deploy PostgreSQL for Keycloak
deploy_postgres() {
log_info "Deploying PostgreSQL for Keycloak..."
POSTGRES_PASSWORD="${KEYCLOAK_DB_PASSWORD:-$(generate_password)}"
kubectl create namespace keycloak --dry-run=client -o yaml | kubectl apply -f -
kubectl create secret generic keycloak-db-credentials \
--from-literal=username=keycloak \
--from-literal=password="$POSTGRES_PASSWORD" \
--namespace=keycloak \
--dry-run=client -o yaml | kubectl apply -f -
log_info "PostgreSQL secret created"
log_warn "PostgreSQL password saved in secret: keycloak-db-credentials"
}
# Deploy Keycloak
deploy_keycloak() {
log_info "Deploying Keycloak..."
ADMIN_PASSWORD="${KEYCLOAK_ADMIN_PASSWORD:-$(generate_password)}"
kubectl create secret generic keycloak-credentials \
--from-literal=username=admin \
--from-literal=password="$ADMIN_PASSWORD" \
--namespace=keycloak \
--dry-run=client -o yaml | kubectl apply -f -
log_info "Keycloak admin credentials created"
log_warn "Admin password saved in secret: keycloak-credentials"
# Apply Keycloak manifests
kubectl apply -f "$PROJECT_ROOT/gitops/apps/keycloak/namespace.yaml"
kubectl apply -f "$PROJECT_ROOT/gitops/apps/keycloak/postgres.yaml"
kubectl apply -f "$PROJECT_ROOT/gitops/apps/keycloak/deployment.yaml"
log_info "Waiting for Keycloak to be ready..."
kubectl wait --for=condition=available --timeout=300s \
deployment/keycloak -n keycloak || {
log_error "Keycloak deployment failed"
kubectl logs -n keycloak deployment/keycloak --tail=50
exit 1
}
log_info "Keycloak deployed successfully"
}
# Configure Keycloak clients
configure_clients() {
log_info "Configuring Keycloak clients..."
# Wait for Keycloak to be fully ready
log_info "Waiting for Keycloak API to be ready..."
for i in {1..30}; do
if kubectl exec -n keycloak deployment/keycloak -- \
curl -s http://localhost:8080/health/ready &>/dev/null; then
break
fi
sleep 2
done
# Get admin credentials
ADMIN_USER=$(kubectl get secret keycloak-credentials -n keycloak -o jsonpath='{.data.username}' | base64 -d)
ADMIN_PASS=$(kubectl get secret keycloak-credentials -n keycloak -o jsonpath='{.data.password}' | base64 -d)
# Port-forward for client configuration
log_info "Configuring clients via API..."
kubectl port-forward -n keycloak svc/keycloak 8080:8080 &
PF_PID=$!
sleep 3
# Get admin token
TOKEN=$(curl -s -X POST "http://localhost:8080/realms/master/protocol/openid-connect/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "username=$ADMIN_USER" \
-d "password=$ADMIN_PASS" \
-d "grant_type=password" \
-d "client_id=admin-cli" | jq -r '.access_token')
if [ "$TOKEN" == "null" ] || [ -z "$TOKEN" ]; then
log_error "Failed to get admin token"
kill $PF_PID 2>/dev/null || true
exit 1
fi
# Generate client secrets
API_SECRET="${SANKOFA_API_CLIENT_SECRET:-$(generate_password)}"
PORTAL_SECRET="${PORTAL_CLIENT_SECRET:-$(generate_password)}"
# Create API client
log_info "Creating API client..."
curl -s -X POST "http://localhost:8080/admin/realms/master/clients" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"clientId\": \"sankofa-api\",
\"name\": \"Sankofa API Client\",
\"enabled\": true,
\"clientAuthenticatorType\": \"client-secret\",
\"secret\": \"$API_SECRET\",
\"standardFlowEnabled\": false,
\"serviceAccountsEnabled\": true,
\"publicClient\": false,
\"protocol\": \"openid-connect\"
}" > /dev/null
# Create Portal client
log_info "Creating Portal client..."
curl -s -X POST "http://localhost:8080/admin/realms/master/clients" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d "{
\"clientId\": \"portal-client\",
\"name\": \"Sankofa Portal Client\",
\"enabled\": true,
\"clientAuthenticatorType\": \"client-secret\",
\"secret\": \"$PORTAL_SECRET\",
\"standardFlowEnabled\": true,
\"directAccessGrantsEnabled\": true,
\"publicClient\": false,
\"protocol\": \"openid-connect\",
\"redirectUris\": [
\"http://localhost:3000/*\",
\"https://portal.sankofa.nexus/*\"
],
\"webOrigins\": [\"+\"]
}" > /dev/null
# Save secrets
kubectl create secret generic keycloak-client-secrets \
--from-literal=api-client-secret="$API_SECRET" \
--from-literal=portal-client-secret="$PORTAL_SECRET" \
--namespace=keycloak \
--dry-run=client -o yaml | kubectl apply -f -
kill $PF_PID 2>/dev/null || true
log_info "Keycloak clients configured successfully"
log_warn "Client secrets saved in secret: keycloak-client-secrets"
}
# Main deployment
main() {
log_info "Starting Keycloak deployment..."
check_prerequisites
deploy_postgres
deploy_keycloak
configure_clients
log_info "Keycloak deployment completed!"
log_info "Access Keycloak at: https://keycloak.sankofa.nexus"
log_warn "Admin credentials are in secret: keycloak-credentials"
log_warn "Client secrets are in secret: keycloak-client-secrets"
}
# Run main function
main "$@"

107
scripts/deploy-production.sh Executable file
View File

@@ -0,0 +1,107 @@
#!/bin/bash
# Production Deployment Script
# This script handles the complete production deployment process
set -e
echo "🚀 Starting Production Deployment..."
# Configuration
NAMESPACE=${NAMESPACE:-sankofa}
ENVIRONMENT=${ENVIRONMENT:-production}
KUBECONFIG=${KUBECONFIG:-~/.kube/config}
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Function to print colored output
print_status() {
echo -e "${GREEN}✓${NC} $1"
}
print_error() {
echo -e "${RED}✗${NC} $1"
}
print_warning() {
echo -e "${YELLOW}⚠${NC} $1"
}
# Pre-deployment checks
echo "📋 Running pre-deployment checks..."
# Check kubectl
if ! command -v kubectl &> /dev/null; then
print_error "kubectl is not installed"
exit 1
fi
print_status "kubectl found"
# Check Kubernetes connection
if ! kubectl cluster-info &> /dev/null; then
print_error "Cannot connect to Kubernetes cluster"
exit 1
fi
print_status "Kubernetes cluster accessible"
# Check namespace
if ! kubectl get namespace $NAMESPACE &> /dev/null; then
print_warning "Namespace $NAMESPACE does not exist, creating..."
kubectl create namespace $NAMESPACE
fi
print_status "Namespace $NAMESPACE ready"
# Deploy database migrations
echo "📦 Deploying database migrations..."
kubectl apply -f gitops/apps/api/migrations.yaml -n $NAMESPACE
print_status "Database migrations applied"
# Deploy API
echo "📦 Deploying API..."
kubectl apply -f gitops/apps/api/ -n $NAMESPACE
kubectl rollout status deployment/api -n $NAMESPACE --timeout=5m
print_status "API deployed"
# Deploy Frontend
echo "📦 Deploying Frontend..."
kubectl apply -f gitops/apps/frontend/ -n $NAMESPACE
kubectl rollout status deployment/frontend -n $NAMESPACE --timeout=5m
print_status "Frontend deployed"
# Deploy Portal
echo "📦 Deploying Portal..."
kubectl apply -f gitops/apps/portal/ -n $NAMESPACE
kubectl rollout status deployment/portal -n $NAMESPACE --timeout=5m
print_status "Portal deployed"
# Run smoke tests
echo "🧪 Running smoke tests..."
SMOKE_TEST_URL=${SMOKE_TEST_URL:-http://api.sankofa.nexus/health}
if curl -fs "$SMOKE_TEST_URL" > /dev/null 2>&1; then
print_status "Smoke tests passed"
else
print_error "Smoke tests failed"
exit 1
fi
# Verify deployments
echo "🔍 Verifying deployments..."
kubectl get deployments -n $NAMESPACE
kubectl get services -n $NAMESPACE
kubectl get pods -n $NAMESPACE
print_status "✅ Production deployment completed successfully!"
echo ""
echo "📊 Deployment Summary:"
echo " - Namespace: $NAMESPACE"
echo " - Environment: $ENVIRONMENT"
echo " - API: $(kubectl get deployment api -n $NAMESPACE -o jsonpath='{.status.readyReplicas}')/$(kubectl get deployment api -n $NAMESPACE -o jsonpath='{.spec.replicas}') replicas"
echo " - Frontend: $(kubectl get deployment frontend -n $NAMESPACE -o jsonpath='{.status.readyReplicas}')/$(kubectl get deployment frontend -n $NAMESPACE -o jsonpath='{.spec.replicas}') replicas"
echo " - Portal: $(kubectl get deployment portal -n $NAMESPACE -o jsonpath='{.status.readyReplicas}')/$(kubectl get deployment portal -n $NAMESPACE -o jsonpath='{.spec.replicas}') replicas"

View File

@@ -0,0 +1,180 @@
#!/bin/bash
set -euo pipefail
# Deploy Proxmox Crossplane Provider Script
# This script deploys the Crossplane provider to Kubernetes
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
PROVIDER_DIR="${PROJECT_ROOT}/crossplane-provider-proxmox"
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
BLUE='\033[0;34m'
NC='\033[0m'
log() {
echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*"
}
log_success() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] ✅${NC} $*"
}
log_warning() {
echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] ⚠️${NC} $*"
}
log_error() {
echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ❌${NC} $*"
}
error() {
log_error "$*"
exit 1
}
check_prerequisites() {
log "Checking prerequisites..."
if ! command -v kubectl &> /dev/null; then
error "kubectl is required but not installed"
fi
if ! kubectl cluster-info &> /dev/null; then
error "Cannot connect to Kubernetes cluster"
fi
log_success "Prerequisites check passed"
}
deploy_crds() {
log "Deploying CRDs..."
CRD_DIR="${PROVIDER_DIR}/config/crd/bases"
if [ ! -d "${CRD_DIR}" ]; then
log_warning "CRD directory not found, generating CRDs..."
if [ -f "${PROVIDER_DIR}/Makefile" ]; then
cd "${PROVIDER_DIR}"
if command -v make &> /dev/null; then
make manifests || log_warning "Failed to generate CRDs with make"
fi
fi
fi
if [ -d "${CRD_DIR}" ] && [ "$(ls -A ${CRD_DIR}/*.yaml 2>/dev/null)" ]; then
kubectl apply -f "${CRD_DIR}" || error "Failed to apply CRDs"
log_success "CRDs deployed"
else
log_warning "No CRD files found, skipping CRD deployment"
log "Note: CRDs may need to be generated first with 'make manifests'"
fi
}
deploy_provider() {
log "Deploying provider..."
PROVIDER_MANIFEST="${PROVIDER_DIR}/config/provider.yaml"
if [ ! -f "${PROVIDER_MANIFEST}" ]; then
error "Provider manifest not found: ${PROVIDER_MANIFEST}"
fi
kubectl apply -f "${PROVIDER_MANIFEST}" || error "Failed to deploy provider"
log_success "Provider deployed"
}
wait_for_provider() {
log "Waiting for provider to be ready..."
local max_attempts=30
local attempt=0
while [ $attempt -lt $max_attempts ]; do
if kubectl get deployment -n crossplane-system crossplane-provider-proxmox &> /dev/null; then
if kubectl wait --for=condition=available --timeout=60s \
deployment/crossplane-provider-proxmox -n crossplane-system &> /dev/null; then
log_success "Provider is ready"
return 0
fi
fi
attempt=$((attempt + 1))
sleep 2
done
log_warning "Provider may not be ready yet"
return 1
}
check_provider_status() {
log "Checking provider status..."
if kubectl get deployment -n crossplane-system crossplane-provider-proxmox &> /dev/null; then
kubectl get deployment -n crossplane-system crossplane-provider-proxmox
echo ""
kubectl get pods -n crossplane-system -l app=crossplane-provider-proxmox
echo ""
log "Provider logs:"
kubectl logs -n crossplane-system -l app=crossplane-provider-proxmox --tail=20 || true
else
log_warning "Provider deployment not found"
fi
}
create_providerconfig() {
log "Creating ProviderConfig..."
PROVIDER_CONFIG="${PROVIDER_DIR}/examples/provider-config.yaml"
if [ ! -f "${PROVIDER_CONFIG}" ]; then
log_warning "ProviderConfig example not found: ${PROVIDER_CONFIG}"
return
fi
log "Note: ProviderConfig requires credentials secret"
log "Create secret first with:"
echo " kubectl create secret generic proxmox-credentials \\"
echo " --from-literal=credentials.json='{\"username\":\"root@pam\",\"token\":\"...\"}' \\"
echo " -n crossplane-system"
echo ""
log "Then apply ProviderConfig:"
echo " kubectl apply -f ${PROVIDER_CONFIG}"
echo ""
read -p "Apply ProviderConfig now? (y/N) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
kubectl apply -f "${PROVIDER_CONFIG}" || log_warning "Failed to apply ProviderConfig"
log_success "ProviderConfig applied"
else
log "Skipping ProviderConfig creation"
fi
}
main() {
log "Starting Proxmox Provider Deployment..."
log "========================================"
check_prerequisites
deploy_crds
deploy_provider
wait_for_provider
check_provider_status
create_providerconfig
log ""
log "========================================"
log_success "Deployment completed!"
log ""
log "Next steps:"
log "1. Create credentials secret (see above)"
log "2. Apply ProviderConfig"
log "3. Verify provider connectivity"
log "4. Test VM creation"
}
main "$@"
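The polling loop in `wait_for_provider` is a general retry pattern that recurs across these scripts. A minimal standalone sketch (the helper name `retry_until` is illustrative, not part of the provider tooling):

```shell
#!/bin/bash
# retry_until: run a command until it succeeds or the attempt budget is spent.
# Usage: retry_until <max_attempts> <delay_seconds> <command...>
retry_until() {
    local max_attempts=$1 delay=$2
    shift 2
    local attempt=0
    while [ "$attempt" -lt "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `retry_until 30 2 kubectl get deployment -n crossplane-system crossplane-provider-proxmox` would mirror the 30-attempt, 2-second loop above.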

scripts/deploy-test-vms.sh Executable file

@@ -0,0 +1,172 @@
#!/bin/bash
# deploy-test-vms.sh
# Deploys test VMs to both Proxmox instances via Crossplane
set -euo pipefail
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Configuration
PROVIDER_DIR="${PROVIDER_DIR:-./crossplane-provider-proxmox}"
INSTANCE1_MANIFEST="${PROVIDER_DIR}/examples/test-vm-instance-1.yaml"
INSTANCE2_MANIFEST="${PROVIDER_DIR}/examples/test-vm-instance-2.yaml"
WAIT_TIMEOUT="${WAIT_TIMEOUT:-300}"
log() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
check_prerequisites() {
log "Checking prerequisites..."
if ! command -v kubectl &> /dev/null; then
error "kubectl is required but not installed"
fi
if ! kubectl cluster-info &> /dev/null; then
error "Cannot connect to Kubernetes cluster"
fi
# Check if provider is deployed
if ! kubectl get deployment crossplane-provider-proxmox -n crossplane-system &> /dev/null; then
error "Crossplane provider is not deployed. Run ./scripts/deploy-crossplane-provider.sh first"
fi
# Check if ProviderConfig exists
if ! kubectl get providerconfig proxmox-provider-config &> /dev/null; then
error "ProviderConfig not found. Create it first: kubectl apply -f ${PROVIDER_DIR}/examples/provider-config.yaml"
fi
log "✓ Prerequisites check passed"
}
deploy_vm() {
local manifest=$1
local vm_name=$2
if [ ! -f "$manifest" ]; then
error "VM manifest not found: ${manifest}"
fi
log "Deploying VM: ${vm_name}..."
# Apply manifest
if kubectl apply -f "$manifest"; then
log "✓ VM manifest applied: ${vm_name}"
else
error "Failed to apply VM manifest: ${vm_name}"
fi
# Wait for VM to be created
log "Waiting for VM to be ready (timeout: ${WAIT_TIMEOUT}s)..."
local start_time=$(date +%s)
while true; do
local current_time=$(date +%s)
local elapsed=$((current_time - start_time))
if [ $elapsed -gt $WAIT_TIMEOUT ]; then
warn "Timeout waiting for VM: ${vm_name}"
return 1
fi
local state=$(kubectl get proxmoxvm "$vm_name" -o jsonpath='{.status.state}' 2>/dev/null || echo "Unknown")
local vm_id=$(kubectl get proxmoxvm "$vm_name" -o jsonpath='{.status.vmId}' 2>/dev/null || echo "")
if [ "$state" = "running" ] && [ -n "$vm_id" ]; then
log "✓ VM is running: ${vm_name} (ID: ${vm_id})"
return 0
elif [ "$state" = "failed" ] || [ "$state" = "error" ]; then
error "VM deployment failed: ${vm_name} (state: ${state})"
fi
sleep 5
done
}
get_vm_status() {
local vm_name=$1
info "VM Status: ${vm_name}"
info "=================================="
    kubectl get proxmoxvm "$vm_name" -o json 2>/dev/null | jq '{
        name: .metadata.name,
        state: .status.state,
        vmId: .status.vmId,
        ipAddress: .status.ipAddress,
        site: .spec.forProvider.site,
        node: .spec.forProvider.node
    }' || warn "Could not get VM status"
echo ""
}
test_vm_lifecycle() {
local vm_name=$1
log "Testing VM lifecycle operations for: ${vm_name}"
# Test stop (if supported)
info "Note: VM lifecycle operations (start/stop) would be tested here"
info "This requires VM controller implementation"
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Test VM Deployment via Crossplane ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
check_prerequisites
echo ""
# Deploy Instance 1 VM
if [ -f "$INSTANCE1_MANIFEST" ]; then
local vm1_name=$(grep -A1 "^metadata:" "$INSTANCE1_MANIFEST" | grep "name:" | awk '{print $2}')
deploy_vm "$INSTANCE1_MANIFEST" "$vm1_name"
get_vm_status "$vm1_name"
else
warn "Instance 1 manifest not found: ${INSTANCE1_MANIFEST}"
fi
echo ""
# Deploy Instance 2 VM
if [ -f "$INSTANCE2_MANIFEST" ]; then
local vm2_name=$(grep -A1 "^metadata:" "$INSTANCE2_MANIFEST" | grep "name:" | awk '{print $2}')
deploy_vm "$INSTANCE2_MANIFEST" "$vm2_name"
get_vm_status "$vm2_name"
else
warn "Instance 2 manifest not found: ${INSTANCE2_MANIFEST}"
fi
echo ""
log "Test VM deployment complete!"
echo ""
info "View all VMs: kubectl get proxmoxvm"
info "View VM details: kubectl describe proxmoxvm <vm-name>"
info "Delete VM: kubectl delete proxmoxvm <vm-name>"
}
main "$@"
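The `grep -A1` pipeline used in `main` to pull the VM name out of a manifest can be isolated as a helper. This sketch assumes `name:` is the line directly after a top-level `metadata:` key, as in the example manifests; a YAML-aware tool such as `yq` would be more robust:

```shell
#!/bin/bash
# vm_name_from_manifest: print metadata.name from a simple Kubernetes manifest.
# Assumes the name appears on the line directly after the top-level metadata: key.
vm_name_from_manifest() {
    grep -A1 "^metadata:" "$1" | grep "name:" | awk '{print $2}'
}
```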

scripts/dev-setup.sh Executable file

@@ -0,0 +1,86 @@
#!/bin/bash
# Development Environment Setup Script
# This script sets up the development environment for Sankofa Phoenix
set -e
echo "🔥 Setting up Sankofa Phoenix development environment..."
# Check prerequisites
echo "Checking prerequisites..."
command -v node >/dev/null 2>&1 || { echo "Node.js is required but not installed. Aborting." >&2; exit 1; }
command -v pnpm >/dev/null 2>&1 || { echo "pnpm is required but not installed. Aborting." >&2; exit 1; }
# Install root dependencies
echo "Installing root dependencies..."
pnpm install
# Install API dependencies
echo "Installing API dependencies..."
cd api
if [ -f "package.json" ]; then
npm install || pnpm install
fi
cd ..
# Install Portal dependencies
echo "Installing Portal dependencies..."
cd portal
if [ -f "package.json" ]; then
npm install
fi
cd ..
# Create .env.local files if they don't exist
echo "Setting up environment files..."
if [ ! -f ".env.local" ]; then
cat > .env.local << EOF
NEXT_PUBLIC_GRAPHQL_ENDPOINT=http://localhost:4000/graphql
NEXT_PUBLIC_APP_URL=http://localhost:3000
NODE_ENV=development
EOF
echo "Created .env.local"
fi
if [ ! -f "api/.env.local" ]; then
cat > api/.env.local << EOF
DB_HOST=localhost
DB_PORT=5432
DB_NAME=sankofa
DB_USER=postgres
DB_PASSWORD=postgres
JWT_SECRET=dev-secret-change-in-production
NODE_ENV=development
PORT=4000
EOF
echo "Created api/.env.local"
fi
# Setup database (if PostgreSQL is available)
if command -v psql >/dev/null 2>&1; then
echo "Setting up database..."
createdb sankofa 2>/dev/null || echo "Database 'sankofa' may already exist or PostgreSQL not running"
if [ -f "api/src/db/migrate.ts" ]; then
echo "Running database migrations..."
cd api
        npm run db:migrate -- up || pnpm db:migrate up || echo "Migrations may have already run"
cd ..
fi
else
echo "PostgreSQL not found. Skipping database setup."
echo "You can set up PostgreSQL later or use Docker: docker-compose up postgres"
fi
echo ""
echo "✅ Development environment setup complete!"
echo ""
echo "Next steps:"
echo " 1. Start PostgreSQL: docker-compose up -d postgres (or use your own instance)"
echo " 2. Start API: cd api && pnpm dev"
echo " 3. Start Frontend: pnpm dev"
echo " 4. Start Portal: cd portal && npm run dev"
echo ""
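The `.env.local` bootstrap above uses a create-only-if-missing pattern so that re-running setup never clobbers local edits. A reusable sketch (the name `write_env_once` is illustrative):

```shell
#!/bin/bash
# write_env_once: write stdin to the target file only when it does not exist,
# preserving any local edits on subsequent runs.
write_env_once() {
    local target=$1
    if [ -f "$target" ]; then
        echo "kept existing $target"
    else
        cat > "$target"
        echo "created $target"
    fi
}
```

Called as `write_env_once .env.local <<EOF ... EOF`, it behaves like the two heredoc blocks above.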

scripts/discover-proxmox-resources.sh

@@ -0,0 +1,178 @@
#!/bin/bash
# discover-proxmox-resources.sh
# Discovers Proxmox resources (storage, networks, templates, nodes) and outputs JSON
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Configuration
INSTANCE1_ENDPOINT="${PROXMOX_INSTANCE1:-https://192.168.11.10:8006}"
INSTANCE2_ENDPOINT="${PROXMOX_INSTANCE2:-https://192.168.11.11:8006}"
OUTPUT_DIR="${OUTPUT_DIR:-./docs/proxmox-review}"
# Check if pvesh is available
if ! command -v pvesh &> /dev/null; then
echo -e "${YELLOW}Warning: pvesh not found. Install Proxmox VE tools or use API directly.${NC}"
echo "This script requires pvesh CLI tool or can be adapted to use curl/API."
exit 1
fi
# Function to discover resources for an instance
discover_instance() {
local endpoint=$1
local instance_name=$2
local output_file="${OUTPUT_DIR}/resources-${instance_name}.json"
echo -e "${GREEN}Discovering resources for ${instance_name}...${NC}"
# Create output directory
mkdir -p "${OUTPUT_DIR}"
# Discover nodes
echo " - Discovering nodes..."
local nodes
nodes=$(pvesh get /nodes --output-format json 2>/dev/null || echo "[]")
# Discover storage
echo " - Discovering storage pools..."
local storage
storage=$(pvesh get /storage --output-format json 2>/dev/null || echo "[]")
# Discover networks (for each node)
echo " - Discovering network bridges..."
local networks="[]"
if [ -n "$nodes" ] && [ "$nodes" != "[]" ]; then
# Extract node names from nodes JSON (simplified - would need jq for proper parsing)
# For now, we'll try common node names
for node in ML110-01 R630-01; do
local node_networks
node_networks=$(pvesh get "/nodes/${node}/network" --output-format json 2>/dev/null || echo "[]")
if [ "$node_networks" != "[]" ]; then
networks="$node_networks"
break
fi
done
fi
# Discover templates (for each storage pool)
echo " - Discovering OS templates..."
local templates="[]"
if [ -n "$storage" ] && [ "$storage" != "[]" ]; then
# Try to get templates from common storage pools
for storage_pool in local local-lvm; do
for node in ML110-01 R630-01; do
local storage_templates
storage_templates=$(pvesh get "/nodes/${node}/storage/${storage_pool}/content" --output-format json 2>/dev/null || echo "[]")
if [ "$storage_templates" != "[]" ]; then
templates="$storage_templates"
break 2
fi
done
done
fi
# Combine all resources
cat > "${output_file}" <<EOF
{
"instance": "${instance_name}",
"endpoint": "${endpoint}",
"discovered_at": "$(date -u +"%Y-%m-%dT%H:%M:%SZ")",
"nodes": ${nodes},
"storage": ${storage},
"networks": ${networks},
"templates": ${templates}
}
EOF
echo -e "${GREEN}✓ Resources saved to ${output_file}${NC}"
}
# Function to discover via API (if pvesh not available)
discover_via_api() {
local endpoint=$1
local instance_name=$2
local username="${PROXMOX_USERNAME:-}"
local password="${PROXMOX_PASSWORD:-}"
local token="${PROXMOX_TOKEN:-}"
if [ -z "$token" ] && [ -z "$username" ]; then
echo -e "${RED}Error: PROXMOX_TOKEN or PROXMOX_USERNAME/PROXMOX_PASSWORD required${NC}"
return 1
fi
echo -e "${GREEN}Discovering resources for ${instance_name} via API...${NC}"
# Authenticate and get ticket (if using username/password)
local auth_header=""
if [ -n "$token" ]; then
        auth_header="Authorization: PVEAPIToken=${token}"
else
# Get ticket (simplified - would need proper auth flow)
echo -e "${YELLOW}Note: Username/password auth requires ticket-based authentication${NC}"
return 1
fi
# Discover resources via API
local output_file="${OUTPUT_DIR}/resources-${instance_name}.json"
mkdir -p "${OUTPUT_DIR}"
# Get nodes
local nodes
nodes=$(curl -s -k -H "${auth_header}" "${endpoint}/api2/json/nodes" | jq -r '.data // []' || echo "[]")
# Get storage
local storage
storage=$(curl -s -k -H "${auth_header}" "${endpoint}/api2/json/storage" | jq -r '.data // []' || echo "[]")
# Combine resources
cat > "${output_file}" <<EOF
{
"instance": "${instance_name}",
"endpoint": "${endpoint}",
"discovered_at": "$(date -u +"%Y-%m-%dT%H:%M:%SZ")",
"nodes": ${nodes},
"storage": ${storage}
}
EOF
echo -e "${GREEN}✓ Resources saved to ${output_file}${NC}"
}
# Main execution
main() {
echo -e "${GREEN}=== Proxmox Resource Discovery ===${NC}"
echo ""
# Try pvesh first, fall back to API
if command -v pvesh &> /dev/null; then
echo "Using pvesh CLI..."
discover_instance "${INSTANCE1_ENDPOINT}" "instance-1" || true
discover_instance "${INSTANCE2_ENDPOINT}" "instance-2" || true
elif command -v curl &> /dev/null && command -v jq &> /dev/null; then
echo "Using API with curl..."
discover_via_api "${INSTANCE1_ENDPOINT}" "instance-1" || true
discover_via_api "${INSTANCE2_ENDPOINT}" "instance-2" || true
else
echo -e "${RED}Error: pvesh, curl, or jq required${NC}"
exit 1
fi
echo ""
echo -e "${GREEN}=== Discovery Complete ===${NC}"
echo "Output files:"
ls -lh "${OUTPUT_DIR}"/resources-*.json 2>/dev/null || echo " (No files generated)"
echo ""
echo "Next steps:"
echo " 1. Review generated JSON files"
echo " 2. Update docs/proxmox/RESOURCE_INVENTORY.md with actual values"
echo " 3. Update example manifests with verified resource names"
}
# Run main
main "$@"
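Both discovery paths splice sub-results into the output JSON through a heredoc, which yields unparseable output if any fragment comes back empty. A defensive sketch that defaults empty fragments to `[]` (function name illustrative):

```shell
#!/bin/bash
# write_inventory: assemble the per-instance inventory document, substituting
# an empty JSON array for any missing fragment so the result stays valid JSON.
write_inventory() {
    local instance=$1 endpoint=$2
    local nodes=${3:-[]} storage=${4:-[]}
    cat <<EOF
{
  "instance": "${instance}",
  "endpoint": "${endpoint}",
  "nodes": ${nodes},
  "storage": ${storage}
}
EOF
}
```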


@@ -0,0 +1,76 @@
#!/bin/bash
# Download Ubuntu 22.04 Cloud Image to Proxmox
# Run this on the Proxmox node or from a machine with SSH access
set -euo pipefail
PROXMOX_NODE="${1:-192.168.11.10}"
PROXMOX_PASS="${PROXMOX_PASS:-L@kers2010}"
STORAGE="${STORAGE:-local}"
IMAGE_NAME="ubuntu-22.04-server-cloudimg-amd64.img"
IMAGE_URL="https://cloud-images.ubuntu.com/releases/22.04/release/${IMAGE_NAME}"
echo "=========================================="
echo "Downloading Ubuntu 22.04 Cloud Image"
echo "=========================================="
echo ""
echo "Node: $PROXMOX_NODE"
echo "Storage: $STORAGE"
echo "Image: $IMAGE_NAME"
echo ""
# Check if image already exists
echo "Checking if image already exists..."
if sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$PROXMOX_NODE "pvesm list $STORAGE | grep -q '$IMAGE_NAME'"; then
echo "✅ Image already exists in $STORAGE"
exit 0
fi
# Download image to Proxmox node
echo "Downloading image to Proxmox node..."
sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$PROXMOX_NODE "
cd /tmp
if [ ! -f $IMAGE_NAME ]; then
echo 'Downloading from $IMAGE_URL...'
wget -q --show-progress $IMAGE_URL -O $IMAGE_NAME
echo '✅ Download complete'
else
echo 'Image already downloaded locally'
fi
"
# Upload to Proxmox storage
echo ""
echo "Uploading image to Proxmox storage ($STORAGE)..."
sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$PROXMOX_NODE "
# Create storage path if needed
mkdir -p /var/lib/vz/template/iso
# Copy to storage
if [ -f /tmp/$IMAGE_NAME ]; then
echo 'Copying to storage...'
cp /tmp/$IMAGE_NAME /var/lib/vz/template/iso/$IMAGE_NAME
echo '✅ Image uploaded to /var/lib/vz/template/iso/$IMAGE_NAME'
# Verify
if pvesm list $STORAGE | grep -q '$IMAGE_NAME'; then
echo '✅ Image verified in storage'
else
echo '⚠️ Image copied but not showing in storage list'
echo 'You may need to refresh storage or use full path: local:$IMAGE_NAME'
fi
else
echo '❌ Image file not found after download'
exit 1
fi
"
echo ""
echo "=========================================="
echo "✅ Ubuntu 22.04 Cloud Image Ready"
echo "=========================================="
echo ""
echo "Image location: /var/lib/vz/template/iso/$IMAGE_NAME"
echo "Use in templates as: local:$IMAGE_NAME"
echo "Or: $STORAGE:$IMAGE_NAME"
echo ""
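The remote blocks above are passed to `ssh` inside double quotes, so `$IMAGE_NAME` and `$IMAGE_URL` expand locally before the command is sent. The same check-then-fetch step, runnable locally with `bash -c` standing in for `ssh` (names illustrative):

```shell
#!/bin/bash
# fetch_if_missing: download a file only when absent, mirroring the remote
# block above; $name and $fetch expand locally, as they do in the ssh payload.
fetch_if_missing() {
    local name=$1 fetch=$2
    bash -c "
        if [ ! -f '$name' ]; then
            $fetch > '$name'
            echo fetched
        else
            echo exists
        fi
    "
}
```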

scripts/download-ubuntu-image.sh Executable file

@@ -0,0 +1,110 @@
#!/bin/bash
# download-ubuntu-image.sh
# Downloads Ubuntu 22.04 cloud image to Proxmox storage
set -euo pipefail
# Load environment variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
set -a
source <(grep -v '^#' "${SCRIPT_DIR}/../.env" | grep -v '^$' | sed 's/^/export /')
set +a
fi
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
STORAGE="${STORAGE:-local}"
NODE1_IP="192.168.11.10"
NODE2_IP="192.168.11.11"
IMAGE_NAME="ubuntu-22.04-server-cloudimg-amd64.img"
IMAGE_URL="https://cloud-images.ubuntu.com/releases/22.04/release/${IMAGE_NAME}"
log() {
echo -e "${GREEN}[INFO]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
download_image() {
local node_ip=$1
local node_name=$2
local token=$3
log "Downloading Ubuntu 22.04 cloud image for ${node_name}..."
# Check if image already exists
    local existing
    existing=$(curl -k -s -H "Authorization: PVEAPIToken=${token}" \
        "https://${node_ip}:8006/api2/json/nodes/${node_name}/storage/${STORAGE}/content" 2>/dev/null | \
        jq -r ".data[]? | select(.volid | contains(\"${IMAGE_NAME}\")) | .volid" || true)
if [ -n "$existing" ]; then
warn "Image already exists: ${existing}"
return 0
fi
info "Image not found. Download instructions:"
echo ""
echo "Option 1: Download via SSH (Recommended):"
echo " ssh root@${node_ip}"
echo " wget ${IMAGE_URL}"
echo " mv ${IMAGE_NAME} /var/lib/vz/template/iso/"
echo ""
echo "Option 2: Download via Proxmox Web UI:"
echo " 1. Log in to https://${node_ip}:8006"
echo " 2. Go to: Datacenter → Storage → ${STORAGE} → Content"
echo " 3. Click 'Upload' → Select file → Upload"
echo ""
echo "Option 3: Use pveam (if template available):"
echo " ssh root@${node_ip}"
echo " pveam available | grep ubuntu-22.04"
echo " pveam download ${STORAGE} ubuntu-22.04-standard_22.04-1_amd64.tar.gz"
echo ""
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Ubuntu Image Download Helper ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
info "Target Image: ${IMAGE_NAME}"
info "Storage: ${STORAGE}"
echo ""
if [ -z "${PROXMOX_TOKEN_ML110_01:-}" ] || [ -z "${PROXMOX_TOKEN_R630_01:-}" ]; then
warn "Proxmox API tokens not found in .env file"
warn "Providing manual download instructions instead"
echo ""
else
download_image "${NODE1_IP}" "ML110-01" "${PROXMOX_TOKEN_ML110_01}"
download_image "${NODE2_IP}" "R630-01" "${PROXMOX_TOKEN_R630_01}"
fi
echo ""
info "Image URL: ${IMAGE_URL}"
info "File Size: ~600MB"
echo ""
log "Download complete or instructions provided"
echo ""
}
main "$@"
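The existence check in `download_image` relies on `jq` to filter the storage content listing. When `jq` is unavailable, a plain-text filter over a one-volid-per-line listing behaves the same for this purpose (a sketch, not part of the script):

```shell
#!/bin/bash
# volid_for_image: given a storage content listing on stdin (one volid per
# line), print the first entry containing the image name; non-zero if absent.
volid_for_image() {
    grep -m1 -- "$1"
}
```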

scripts/enable-guest-agent-existing-vms.sh

@@ -0,0 +1,429 @@
#!/bin/bash
# enable-guest-agent-existing-vms.sh
# Enable QEMU guest agent on existing VMs via Proxmox API
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
# Load environment
if [ -f "${PROJECT_ROOT}/.env" ]; then
set -a
source "${PROJECT_ROOT}/.env"
set +a
fi
# Try API tokens first, fall back to password
PROXMOX_1_TOKEN="${PROXMOX_TOKEN_ML110_01:-}"
PROXMOX_2_TOKEN="${PROXMOX_TOKEN_R630_01:-}"
PROXMOX_PASS="${PROXMOX_ROOT_PASS:-L@kers2010}"
PROXMOX_1_URL="https://192.168.11.10:8006"
PROXMOX_2_URL="https://192.168.11.11:8006"
# Colors
GREEN='\033[0;32m'
BLUE='\033[0;34m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Log to stderr so command substitutions of the functions below capture only their results
log() {
    echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*" >&2
}
log_success() {
    echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] ✅${NC} $*" >&2
}
log_error() {
    echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ❌${NC} $*" >&2
}
log_warning() {
    echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] ⚠️${NC} $*" >&2
}
# Get auth - try API token first, fall back to password
get_auth() {
local api_url=$1
local api_token=$2
local response
# Try API token authentication first
if [ -n "${api_token}" ]; then
# Parse token format: root@pam!sankofa-instance-1-api-token=73c7e1a2-c969-409c-ae5b-68e83f012ee9
# For Proxmox API tokens, we use the full token string in Authorization header
        response=$(curl -k -s -X GET \
            -H "Authorization: PVEAPIToken=${api_token}" \
            "${api_url}/api2/json/version" 2>/dev/null)
# If token auth works (we get version info), return token for direct use
if echo "${response}" | grep -q "data\|version"; then
echo "${api_token}|TOKEN"
return 0
fi
fi
# Fall back to password authentication
response=$(curl -k -s -X POST \
-d "username=root@pam&password=${PROXMOX_PASS}" \
"${api_url}/api2/json/access/ticket" 2>/dev/null)
if echo "${response}" | grep -q "authentication failure"; then
echo ""
return 1
fi
local ticket csrf
if command -v jq &> /dev/null; then
ticket=$(echo "${response}" | jq -r '.data.ticket // empty' 2>/dev/null)
csrf=$(echo "${response}" | jq -r '.data.CSRFPreventionToken // empty' 2>/dev/null)
else
ticket=$(echo "${response}" | grep -o '"ticket":"[^"]*' | head -1 | cut -d'"' -f4)
csrf=$(echo "${response}" | grep -o '"CSRFPreventionToken":"[^"]*' | head -1 | cut -d'"' -f4)
fi
if [ -z "${ticket}" ] || [ -z "${csrf}" ]; then
echo ""
return 1
fi
echo "${ticket}|${csrf}"
}
# List all nodes in the cluster
list_nodes() {
local api_url=$1
local auth_token=$2
local auth_type=$3
local response
if [ "${auth_type}" = "TOKEN" ]; then
        response=$(curl -k -s -X GET \
            -H "Authorization: PVEAPIToken=${auth_token}" \
            "${api_url}/api2/json/nodes" 2>/dev/null)
else
local ticket csrf
IFS='|' read -r ticket csrf <<< "${auth_token}"
response=$(curl -k -s -X GET \
-H "CSRFPreventionToken: ${csrf}" \
-b "PVEAuthCookie=${ticket}" \
"${api_url}/api2/json/nodes" 2>/dev/null)
fi
# Extract node names from response
if command -v jq &> /dev/null; then
echo "${response}" | jq -r '.data[]?.node // empty' 2>/dev/null | grep -v '^$' | sort
else
# Fallback: extract node names using grep/sed
echo "${response}" | grep -o '"node":"[^"]*' | cut -d'"' -f4 | sort | uniq
fi
}
# List all VMs on a node
list_vms() {
local api_url=$1
local node=$2
local auth_token=$3
local auth_type=$4
local response
if [ "${auth_type}" = "TOKEN" ]; then
# Use API token directly
        response=$(curl -k -s -X GET \
            -H "Authorization: PVEAPIToken=${auth_token}" \
            "${api_url}/api2/json/nodes/${node}/qemu" 2>/dev/null)
else
# Use ticket and CSRF token
local ticket csrf
IFS='|' read -r ticket csrf <<< "${auth_token}"
response=$(curl -k -s -X GET \
-H "CSRFPreventionToken: ${csrf}" \
-b "PVEAuthCookie=${ticket}" \
"${api_url}/api2/json/nodes/${node}/qemu" 2>/dev/null)
fi
# Extract VMIDs from response
if command -v jq &> /dev/null; then
echo "${response}" | jq -r '.data[]?.vmid // empty' 2>/dev/null | grep -v '^$' | sort -n
else
# Fallback: extract VMIDs using grep/sed
echo "${response}" | grep -o '"vmid":[0-9]*' | grep -o '[0-9]*' | sort -n | uniq
fi
}
# Check if guest agent is already enabled
check_guest_agent() {
local api_url=$1
local node=$2
local vmid=$3
local auth_token=$4
local auth_type=$5
local response
if [ "${auth_type}" = "TOKEN" ]; then
        response=$(curl -k -s -X GET \
            -H "Authorization: PVEAPIToken=${auth_token}" \
            "${api_url}/api2/json/nodes/${node}/qemu/${vmid}/config" 2>/dev/null)
else
local ticket csrf
IFS='|' read -r ticket csrf <<< "${auth_token}"
response=$(curl -k -s -X GET \
-H "CSRFPreventionToken: ${csrf}" \
-b "PVEAuthCookie=${ticket}" \
"${api_url}/api2/json/nodes/${node}/qemu/${vmid}/config" 2>/dev/null)
fi
# Check if agent is already enabled
    if echo "${response}" | grep -q '"agent"[[:space:]]*:[[:space:]]*"\(enabled=\)\{0,1\}1'; then
return 0 # Already enabled
fi
return 1 # Not enabled
}
# Enable guest agent
enable_guest_agent() {
local api_url=$1
local node=$2
local vmid=$3
local auth_token=$4
local auth_type=$5
local response
if [ "${auth_type}" = "TOKEN" ]; then
# Use API token directly
        response=$(curl -k -s -X PUT \
            -H "Authorization: PVEAPIToken=${auth_token}" \
            -d "agent=1" \
            "${api_url}/api2/json/nodes/${node}/qemu/${vmid}/config" 2>/dev/null)
else
# Use ticket and CSRF token
local ticket csrf
IFS='|' read -r ticket csrf <<< "${auth_token}"
response=$(curl -k -s -X PUT \
-H "CSRFPreventionToken: ${csrf}" \
-b "PVEAuthCookie=${ticket}" \
-d "agent=1" \
"${api_url}/api2/json/nodes/${node}/qemu/${vmid}/config" 2>/dev/null)
fi
if echo "${response}" | grep -q '"data":null'; then
return 0
fi
# Check if already enabled
if echo "${response}" | grep -q "already"; then
return 0
fi
return 1
}
process_node() {
local api_url=$1
local node=$2
local auth_token=$3
local auth_type=$4
log "Processing node: ${node}"
# Discover all VMs on this node
local vmids
vmids=$(list_vms "${api_url}" "${node}" "${auth_token}" "${auth_type}")
if [ -z "${vmids}" ]; then
log_warning " No VMs found on ${node}"
return 0
fi
local vm_count=0
local enabled_count=0
local skipped_count=0
local failed_count=0
while IFS= read -r vmid; do
[ -z "${vmid}" ] && continue
vm_count=$((vm_count + 1))
# Check if already enabled
if check_guest_agent "${api_url}" "${node}" "${vmid}" "${auth_token}" "${auth_type}"; then
log " VMID ${vmid}: guest agent already enabled"
skipped_count=$((skipped_count + 1))
continue
fi
log " Enabling guest agent on VMID ${vmid}..."
if enable_guest_agent "${api_url}" "${node}" "${vmid}" "${auth_token}" "${auth_type}"; then
log_success " VMID ${vmid} guest agent enabled"
enabled_count=$((enabled_count + 1))
else
log_error " Failed to enable guest agent on VMID ${vmid}"
failed_count=$((failed_count + 1))
fi
sleep 0.3
done <<< "${vmids}"
log " Summary for ${node}: ${vm_count} total, ${enabled_count} enabled, ${skipped_count} already enabled, ${failed_count} failed"
    # Return counts to the caller via stdout
    echo "${vm_count}|${enabled_count}|${skipped_count}|${failed_count}"
}
process_site() {
local api_url=$1
local site_name=$2
local auth_token=$3
local auth_type=$4
log "=========================================="
log "Site: ${site_name}"
log "=========================================="
# Discover all nodes on this site
local nodes
nodes=$(list_nodes "${api_url}" "${auth_token}" "${auth_type}")
if [ -z "${nodes}" ]; then
log_error "Failed to discover nodes on ${site_name}"
return 1
fi
log_success "Discovered nodes: $(echo "${nodes}" | tr '\n' ' ')"
log ""
local site_vm_count=0
local site_enabled_count=0
local site_skipped_count=0
local site_failed_count=0
# Process each node
while IFS= read -r node; do
[ -z "${node}" ] && continue
local result
result=$(process_node "${api_url}" "${node}" "${auth_token}" "${auth_type}")
if [ -n "${result}" ]; then
IFS='|' read -r vm_count enabled_count skipped_count failed_count <<< "${result}"
site_vm_count=$((site_vm_count + vm_count))
site_enabled_count=$((site_enabled_count + enabled_count))
site_skipped_count=$((site_skipped_count + skipped_count))
site_failed_count=$((site_failed_count + failed_count))
fi
log ""
done <<< "${nodes}"
log "Site Summary for ${site_name}:"
log " Total VMs: ${site_vm_count}"
log " Enabled: ${site_enabled_count}"
log " Already enabled: ${site_skipped_count}"
log " Failed: ${site_failed_count}"
log ""
echo "${site_vm_count}|${site_enabled_count}|${site_skipped_count}|${site_failed_count}"
}
main() {
log "=========================================="
log "Enable QEMU Guest Agent on All VMs"
log "=========================================="
log ""
log "This script will:"
log "1. Discover all nodes on each Proxmox site"
log "2. Discover all VMs on each node"
log "3. Check if guest agent is already enabled"
log "4. Enable guest agent on VMs that need it"
log ""
local total_vm_count=0
local total_enabled=0
local total_skipped=0
local total_failed=0
# Site 1
local auth1
    auth1=$(get_auth "${PROXMOX_1_URL}" "${PROXMOX_1_TOKEN}" || true)
if [ -z "${auth1}" ]; then
log_error "Failed to authenticate to Site 1"
else
IFS='|' read -r auth_token1 auth_type1 <<< "${auth1}"
log_success "Authenticated to Site 1"
log ""
local result1
        result1=$(process_site "${PROXMOX_1_URL}" "Site 1" "${auth_token1}" "${auth_type1}" || true)
if [ -n "${result1}" ]; then
IFS='|' read -r vm_count enabled_count skipped_count failed_count <<< "${result1}"
total_vm_count=$((total_vm_count + vm_count))
total_enabled=$((total_enabled + enabled_count))
total_skipped=$((total_skipped + skipped_count))
total_failed=$((total_failed + failed_count))
fi
fi
# Site 2
local auth2
    auth2=$(get_auth "${PROXMOX_2_URL}" "${PROXMOX_2_TOKEN}" || true)
if [ -z "${auth2}" ]; then
log_error "Failed to authenticate to Site 2"
else
IFS='|' read -r auth_token2 auth_type2 <<< "${auth2}"
log_success "Authenticated to Site 2"
log ""
local result2
        result2=$(process_site "${PROXMOX_2_URL}" "Site 2" "${auth_token2}" "${auth_type2}" || true)
if [ -n "${result2}" ]; then
IFS='|' read -r vm_count enabled_count skipped_count failed_count <<< "${result2}"
total_vm_count=$((total_vm_count + vm_count))
total_enabled=$((total_enabled + enabled_count))
total_skipped=$((total_skipped + skipped_count))
total_failed=$((total_failed + failed_count))
fi
fi
log ""
log "=========================================="
log "Overall Summary"
log "=========================================="
log "Total VMs processed: ${total_vm_count}"
log_success "Guest agent enabled: ${total_enabled}"
log "Already enabled: ${total_skipped}"
if [ "${total_failed}" -gt 0 ]; then
log_error "Failed: ${total_failed}"
else
log_success "Failed: ${total_failed}"
fi
log ""
log "=========================================="
log_success "Guest agent enablement complete!"
log "=========================================="
log ""
log_warning "IMPORTANT: Guest agent must also be installed in the OS."
log ""
log "For existing VMs, you need to:"
log "1. Wait for VMs to get IP addresses"
log "2. SSH into each VM: ssh admin@<vm-ip>"
log "3. Install and enable guest agent:"
log " sudo apt-get update"
log " sudo apt-get install -y qemu-guest-agent"
log " sudo systemctl enable qemu-guest-agent"
log " sudo systemctl start qemu-guest-agent"
log ""
log "Note: New VMs created with the updated Crossplane provider will"
log "automatically have guest agent enabled in Proxmox config."
log ""
}
main "$@"
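`get_auth`, `process_node`, and `process_site` all return several values by packing them into a single `|`-delimited string that callers split with `IFS read`, since bash functions can only hand back text on stdout. The pattern in isolation (names illustrative):

```shell
#!/bin/bash
# summarize: pack per-node counters into one string; the caller unpacks it
# with IFS read, the same convention process_node and process_site use.
summarize() {
    local total=$1 enabled=$2 skipped=$3 failed=$4
    echo "${total}|${enabled}|${skipped}|${failed}"
}

# Unpack on the caller side; the IFS override applies only to this read.
IFS='|' read -r total enabled skipped failed <<< "$(summarize 4 2 1 1)"
```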

scripts/enhance-all-vm-cloudinit.sh

@@ -0,0 +1,210 @@
#!/bin/bash
# enhance-all-vm-cloudinit.sh
# Enhances all VM YAML files with NTP, security hardening, and final_message
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
VM_DIR="$PROJECT_ROOT/examples/production"
log() {
echo -e "\033[0;34m[$(date +'%Y-%m-%d %H:%M:%S')]\033[0m $*"
}
log_success() {
echo -e "\033[0;32m[$(date +'%Y-%m-%d %H:%M:%S')] ✅\033[0m $*"
}
log_warning() {
echo -e "\033[1;33m[$(date +'%Y-%m-%d %H:%M:%S')] ⚠️\033[0m $*"
}
log_error() {
echo -e "\033[0;31m[$(date +'%Y-%m-%d %H:%M:%S')] ❌\033[0m $*"
}
# Function to enhance a single VM YAML file
enhance_vm_file() {
local file=$1
local vm_name=$(basename "$file" .yaml)
log "Enhancing $vm_name..."
# Check if file already has chrony (already enhanced)
if grep -q "chrony" "$file" && grep -q "unattended-upgrades" "$file"; then
log_warning " $vm_name already enhanced, skipping"
return 0
fi
# Create backup
cp "$file" "${file}.backup"
# Add chrony and unattended-upgrades to packages list
if ! grep -q "chrony" "$file"; then
sed -i '/- lsb-release$/a\ - chrony\n - unattended-upgrades\n - apt-listchanges' "$file"
fi
# Add NTP configuration after package_upgrade
if ! grep -q "ntp:" "$file"; then
sed -i '/package_upgrade: true/a\ \n # Time synchronization (NTP)\n ntp:\n enabled: true\n ntp_client: chrony\n servers:\n - 0.pool.ntp.org\n - 1.pool.ntp.org\n - 2.pool.ntp.org\n - 3.pool.ntp.org' "$file"
fi
    # Add security updates, NTP, and SSH hardening to runcmd.
    # This is complex, so we delegate it to an inline Python script.
    # NOTE: the heredoc delimiter must not be a bare EOF, because the embedded
    # cloud-init snippet contains its own EOF heredoc; the delimiter is quoted
    # and the file path is passed as an argument, so nothing inside the Python
    # source is expanded by the shell.
    python3 - "$file" <<'PYEOF'
import re
import sys

file_path = sys.argv[1]
with open(file_path, 'r') as f:
    content = f.read()

# Add security updates configuration before final_message
if "Configure automatic security updates" not in content:
    security_updates = '''  # Configure automatic security updates
  - |
    echo "Configuring automatic security updates..."
    cat > /etc/apt/apt.conf.d/50unattended-upgrades <<'EOF'
    Unattended-Upgrade::Allowed-Origins {
        "${distro_id}:${distro_codename}-security";
        "${distro_id}ESMApps:${distro_codename}-apps-security";
        "${distro_id}ESM:${distro_codename}-infra-security";
    };
    Unattended-Upgrade::AutoFixInterruptedDpkg "true";
    Unattended-Upgrade::MinimalSteps "true";
    Unattended-Upgrade::Remove-Unused-Kernel-Packages "true";
    Unattended-Upgrade::Remove-Unused-Dependencies "true";
    Unattended-Upgrade::Automatic-Reboot "false";
    Unattended-Upgrade::Automatic-Reboot-Time "02:00";
    EOF
    systemctl enable unattended-upgrades
    systemctl start unattended-upgrades
    echo "Automatic security updates configured"

  # Configure NTP (Chrony)
  - |
    echo "Configuring NTP (Chrony)..."
    systemctl enable chrony
    systemctl restart chrony
    sleep 3
    if systemctl is-active --quiet chrony; then
      echo "NTP (Chrony) is running"
      chronyc tracking | head -1 || true
    else
      echo "WARNING: NTP (Chrony) may not be running"
    fi

  # SSH hardening
  - |
    echo "Hardening SSH configuration..."
    if ! grep -q "^PermitRootLogin no" /etc/ssh/sshd_config; then
      sed -i 's/^#PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
      sed -i 's/^PermitRootLogin yes/PermitRootLogin no/' /etc/ssh/sshd_config
    fi
    if ! grep -q "^PasswordAuthentication no" /etc/ssh/sshd_config; then
      sed -i 's/^#PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
      sed -i 's/^PasswordAuthentication yes/PasswordAuthentication no/' /etc/ssh/sshd_config
    fi
    if ! grep -q "^PubkeyAuthentication yes" /etc/ssh/sshd_config; then
      sed -i 's/^#PubkeyAuthentication.*/PubkeyAuthentication yes/' /etc/ssh/sshd_config
    fi
    systemctl restart sshd
    echo "SSH hardening completed"
'''

    # Insert before final_message or at end of runcmd
    if "final_message:" in content:
        content = content.replace("  # Final message", security_updates + "\n  # Final message")
    elif "systemctl status qemu-guest-agent --no-pager || true" in content:
        content = content.replace(
            "systemctl status qemu-guest-agent --no-pager || true",
            "systemctl status qemu-guest-agent --no-pager || true\n" + security_updates)

# Add write_files section if not present
if "write_files:" not in content:
    write_files = '''  # Write files for security configuration
  write_files:
    - path: /etc/apt/apt.conf.d/20auto-upgrades
      content: |
        APT::Periodic::Update-Package-Lists "1";
        APT::Periodic::Download-Upgradeable-Packages "1";
        APT::Periodic::AutocleanInterval "7";
        APT::Periodic::Unattended-Upgrade "1";
      permissions: '0644'
      owner: root:root
'''
    if "final_message:" in content:
        content = content.replace("  # Final message", write_files + "  # Final message")
    else:
        content = content + "\n" + write_files

# Enhance final_message
if "final_message:" in content:
    enhanced_final = '''  # Final message
  final_message: |
    ==========================================
    System Boot Completed Successfully!
    ==========================================
    Services Status:
      - QEMU Guest Agent: $(systemctl is-active qemu-guest-agent)
      - NTP (Chrony): $(systemctl is-active chrony)
      - Automatic Security Updates: $(systemctl is-active unattended-upgrades)
    System Information:
      - Hostname: $(hostname)
      - IP Address: $(hostname -I | awk '{print $1}')
      - Time: $(date)
    Packages Installed:
      - qemu-guest-agent, curl, wget, net-tools
      - chrony (NTP), unattended-upgrades (Security)
    Security Configuration:
      - SSH: Root login disabled, Password auth disabled
      - Automatic security updates: Enabled
      - NTP synchronization: Enabled
    Next Steps:
      1. Verify all services are running
      2. Check cloud-init logs: /var/log/cloud-init-output.log
      3. Test SSH access
    =========================================='''
    # Replace existing final_message
    content = re.sub(r'  # Final message.*?(?=\n  providerConfigRef|\Z)',
                     enhanced_final, content, flags=re.DOTALL)

with open(file_path, 'w') as f:
    f.write(content)
PYEOF
# Update package verification to include new packages
sed -i 's/for pkg in qemu-guest-agent curl wget net-tools; do/for pkg in qemu-guest-agent curl wget net-tools chrony unattended-upgrades; do/' "$file"
log_success " $vm_name enhanced"
}
echo "=========================================="
echo "Enhancing All VM Cloud-Init Configurations"
echo "=========================================="
echo ""
# Find all VM YAML files
VM_FILES=$(find "$VM_DIR" -name "*.yaml" -type f | grep -v ".backup" | sort)
TOTAL=$(echo "$VM_FILES" | wc -l)
COUNT=0
for file in $VM_FILES; do
COUNT=$((COUNT + 1))
enhance_vm_file "$file"
done
echo ""
echo "=========================================="
log_success "Enhanced $COUNT/$TOTAL VM files"
echo "=========================================="
echo ""
log "Backup files created with .backup extension"
log "Review changes and remove backups when satisfied"
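The script above decides whether to touch a file by grepping for both `chrony` and `unattended-upgrades` first. A minimal Python sketch of that idempotency check (the marker strings are taken from the script; the function name is illustrative):

```python
def already_enhanced(user_data: str) -> bool:
    # Same markers the script greps for before re-running an enhancement
    return "chrony" in user_data and "unattended-upgrades" in user_data

# A file that already lists both packages is skipped
print(already_enhanced("packages:\n  - chrony\n  - unattended-upgrades"))
# A plain file still needs enhancement
print(already_enhanced("packages:\n  - curl\n  - wget"))
```

Checking both markers matters: a file that got chrony but not unattended-upgrades (e.g. from a partial run) is re-processed rather than skipped.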

View File

@@ -0,0 +1,195 @@
#!/usr/bin/env python3
"""
Enhance guest agent verification in all VM YAML templates.
Adds detailed verification commands matching the check script.
"""
import re
import sys
from pathlib import Path
from datetime import datetime
# Enhanced verification block
ENHANCED_VERIFICATION = ''' # Verify packages are installed
- |
echo "=========================================="
echo "Verifying required packages are installed..."
echo "=========================================="
for pkg in qemu-guest-agent curl wget net-tools chrony unattended-upgrades; do
if ! dpkg -l | grep -q "^ii.*$pkg"; then
echo "ERROR: Package $pkg is not installed"
exit 1
fi
echo "✅ Package $pkg is installed"
done
echo "All required packages verified"
# Verify qemu-guest-agent package details
- |
echo "=========================================="
echo "Checking qemu-guest-agent package details..."
echo "=========================================="
if dpkg -l | grep -q "^ii.*qemu-guest-agent"; then
echo "✅ qemu-guest-agent package IS installed"
dpkg -l | grep qemu-guest-agent
else
echo "❌ qemu-guest-agent package is NOT installed"
echo "Attempting to install..."
apt-get update
apt-get install -y qemu-guest-agent
fi
# Enable and start QEMU Guest Agent
- |
echo "=========================================="
echo "Enabling and starting QEMU Guest Agent..."
echo "=========================================="
systemctl enable qemu-guest-agent
systemctl start qemu-guest-agent
echo "QEMU Guest Agent enabled and started"
# Verify guest agent service is running
- |
echo "=========================================="
echo "Verifying QEMU Guest Agent service status..."
echo "=========================================="
for i in $(seq 1 30); do
if systemctl is-active --quiet qemu-guest-agent; then
echo "✅ QEMU Guest Agent service IS running"
systemctl status qemu-guest-agent --no-pager -l
exit 0
fi
echo "Waiting for QEMU Guest Agent to start... ($i/30)"
sleep 1
done
echo "⚠️ WARNING: QEMU Guest Agent may not have started properly"
systemctl status qemu-guest-agent --no-pager -l || true
echo "Attempting to restart..."
systemctl restart qemu-guest-agent
sleep 3
if systemctl is-active --quiet qemu-guest-agent; then
echo "✅ QEMU Guest Agent started after restart"
else
echo "❌ QEMU Guest Agent failed to start"
fi'''
def find_verification_block(content):
"""Find the old verification block in the content."""
# Pattern to match from "Verify packages" to end of guest agent verification
pattern = r'( # Verify packages are installed.*? systemctl status qemu-guest-agent --no-pager \|\| true)'
match = re.search(pattern, content, re.DOTALL)
return match
def enhance_file(file_path):
"""Enhance a single YAML file with improved verification."""
print(f"📝 Processing: {file_path}")
# Read file
try:
with open(file_path, 'r', encoding='utf-8') as f:
content = f.read()
except Exception as e:
print(f"❌ Error reading {file_path}: {e}")
return False
# Check if file contains guest agent verification
if "Verifying required packages are installed" not in content:
print(f"⏭️ Skipping {file_path} (no guest agent verification found)")
return None
# Check if already enhanced
if "Checking qemu-guest-agent package details" in content:
print(f"✅ Already enhanced: {file_path}")
return None
# Find and replace
match = find_verification_block(content)
if not match:
print(f"⚠️ Could not find verification block in {file_path}")
return False
# Create backup
backup_path = f"{file_path}.backup-{datetime.now().strftime('%Y%m%d-%H%M%S')}"
try:
with open(backup_path, 'w', encoding='utf-8') as f:
f.write(content)
except Exception as e:
print(f"❌ Error creating backup: {e}")
return False
# Replace
new_content = content[:match.start()] + ENHANCED_VERIFICATION + content[match.end():]
# Write updated content
try:
with open(file_path, 'w', encoding='utf-8') as f:
f.write(new_content)
print(f"✅ Updated: {file_path}")
return True
except Exception as e:
print(f"❌ Error writing {file_path}: {e}")
# Restore from backup
try:
with open(backup_path, 'r', encoding='utf-8') as f:
with open(file_path, 'w', encoding='utf-8') as out:
out.write(f.read())
except Exception:
pass
return False
def main():
"""Main function."""
script_dir = Path(__file__).parent
project_root = script_dir.parent
templates_dir = project_root / "examples" / "production"
if not templates_dir.exists():
print(f"❌ Templates directory not found: {templates_dir}")
sys.exit(1)
print("==========================================")
print("Enhancing Guest Agent Verification")
print("==========================================")
print(f"Target directory: {templates_dir}")
print()
# Find all YAML files (excluding backups)
yaml_files = sorted(templates_dir.rglob("*.yaml"))
yaml_files = [f for f in yaml_files if "backup" not in f.name]
if not yaml_files:
print("No YAML files found")
sys.exit(1)
updated_count = 0
skipped_count = 0
failed_count = 0
for file_path in yaml_files:
result = enhance_file(file_path)
if result is True:
updated_count += 1
elif result is None:
skipped_count += 1
else:
failed_count += 1
print()
print("==========================================")
print("Summary")
print("==========================================")
print(f"✅ Updated: {updated_count} files")
print(f"⏭️ Skipped: {skipped_count} files")
if failed_count > 0:
print(f"❌ Failed: {failed_count} files")
print()
print("Done!")
if __name__ == "__main__":
main()
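The fragile part of this script is `find_verification_block`, which anchors a non-greedy `re.DOTALL` match on the first and last lines of the old block. A self-contained check of that matching strategy on a toy document (the indentation here is illustrative; the real templates may use deeper indents):

```python
import re

# Same anchoring idea as find_verification_block: first line ... last line
pattern = r'( # Verify packages are installed.*? systemctl status qemu-guest-agent --no-pager \|\| true)'

doc = (
    "runcmd:\n"
    " # Verify packages are installed\n"
    " - echo checking\n"
    " systemctl status qemu-guest-agent --no-pager || true\n"
    "final_message: done\n"
)

match = re.search(pattern, doc, re.DOTALL)
print(match is not None)

# Splice in a replacement exactly the way enhance_file does
replaced = doc[:match.start()] + " # NEW BLOCK" + doc[match.end():]
print("NEW BLOCK" in replaced)
```

Because the match is non-greedy, it stops at the first occurrence of the closing line, so a template with two verification blocks would only have the first one replaced.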

View File

@@ -0,0 +1,220 @@
#!/bin/bash
# Enhance guest agent verification in all VM YAML templates
# Adds detailed verification commands matching the check script
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
TEMPLATES_DIR="$PROJECT_ROOT/examples/production"
# Enhanced verification block
ENHANCED_VERIFICATION=' # Verify packages are installed
- |
echo "=========================================="
echo "Verifying required packages are installed..."
echo "=========================================="
for pkg in qemu-guest-agent curl wget net-tools chrony unattended-upgrades; do
if ! dpkg -l | grep -q "^ii.*$pkg"; then
echo "ERROR: Package $pkg is not installed"
exit 1
fi
echo "✅ Package $pkg is installed"
done
echo "All required packages verified"
# Verify qemu-guest-agent package details
- |
echo "=========================================="
echo "Checking qemu-guest-agent package details..."
echo "=========================================="
if dpkg -l | grep -q "^ii.*qemu-guest-agent"; then
echo "✅ qemu-guest-agent package IS installed"
dpkg -l | grep qemu-guest-agent
else
echo "❌ qemu-guest-agent package is NOT installed"
echo "Attempting to install..."
apt-get update
apt-get install -y qemu-guest-agent
fi
# Enable and start QEMU Guest Agent
- |
echo "=========================================="
echo "Enabling and starting QEMU Guest Agent..."
echo "=========================================="
systemctl enable qemu-guest-agent
systemctl start qemu-guest-agent
echo "QEMU Guest Agent enabled and started"
# Verify guest agent service is running
- |
echo "=========================================="
echo "Verifying QEMU Guest Agent service status..."
echo "=========================================="
for i in $(seq 1 30); do
if systemctl is-active --quiet qemu-guest-agent; then
echo "✅ QEMU Guest Agent service IS running"
systemctl status qemu-guest-agent --no-pager -l
exit 0
fi
echo "Waiting for QEMU Guest Agent to start... ($i/30)"
sleep 1
done
echo "⚠️ WARNING: QEMU Guest Agent may not have started properly"
systemctl status qemu-guest-agent --no-pager -l || true
echo "Attempting to restart..."
systemctl restart qemu-guest-agent
sleep 3
if systemctl is-active --quiet qemu-guest-agent; then
echo "✅ QEMU Guest Agent started after restart"
else
echo "❌ QEMU Guest Agent failed to start"
fi'
# Old verification block pattern (to be replaced)
OLD_PATTERN_START=' # Verify packages are installed'
OLD_PATTERN_END=' systemctl status qemu-guest-agent --no-pager || true'
echo "=========================================="
echo "Enhancing Guest Agent Verification"
echo "=========================================="
echo ""
echo "Target directory: $TEMPLATES_DIR"
echo ""
# Find all YAML files (excluding backups)
FILES=$(find "$TEMPLATES_DIR" -name "*.yaml" -type f ! -name "*.backup*" | sort)
if [ -z "$FILES" ]; then
echo "No YAML files found in $TEMPLATES_DIR"
exit 1
fi
UPDATED_COUNT=0
SKIPPED_COUNT=0
for file in $FILES; do
# Check if file contains the old verification pattern
if ! grep -q "Verifying required packages are installed" "$file"; then
echo "⏭️ Skipping $file (no guest agent verification found)"
SKIPPED_COUNT=$((SKIPPED_COUNT + 1))  # avoid ((x++)): it returns status 1 under set -e when x is 0
continue
fi
# Check if already enhanced
if grep -q "Checking qemu-guest-agent package details" "$file"; then
echo "✅ Already enhanced: $file"
continue
fi
echo "📝 Processing: $file"
# Create backup
cp "$file" "${file}.backup-$(date +%Y%m%d-%H%M%S)"
    # Use Python for more reliable manipulation. The delimiter is quoted and
    # the file path is passed as an argument, so the Python source needs no
    # backslash-escaping of $; running it as the `if` condition keeps
    # `set -e` from aborting the loop before we can inspect the exit status.
    if python3 - "$file" <<'PYTHON_SCRIPT'
import re
import sys

file_path = sys.argv[1]
with open(file_path, 'r') as f:
    content = f.read()

# Find the old verification block
old_pattern = r'  # Verify packages are installed.*?  systemctl status qemu-guest-agent --no-pager \|\| true'

# Enhanced verification block (properly indented)
enhanced = '''  # Verify packages are installed
  - |
    echo "=========================================="
    echo "Verifying required packages are installed..."
    echo "=========================================="
    for pkg in qemu-guest-agent curl wget net-tools chrony unattended-upgrades; do
      if ! dpkg -l | grep -q "^ii.*$pkg"; then
        echo "ERROR: Package $pkg is not installed"
        exit 1
      fi
      echo "✅ Package $pkg is installed"
    done
    echo "All required packages verified"

  # Verify qemu-guest-agent package details
  - |
    echo "=========================================="
    echo "Checking qemu-guest-agent package details..."
    echo "=========================================="
    if dpkg -l | grep -q "^ii.*qemu-guest-agent"; then
      echo "✅ qemu-guest-agent package IS installed"
      dpkg -l | grep qemu-guest-agent
    else
      echo "❌ qemu-guest-agent package is NOT installed"
      echo "Attempting to install..."
      apt-get update
      apt-get install -y qemu-guest-agent
    fi

  # Enable and start QEMU Guest Agent
  - |
    echo "=========================================="
    echo "Enabling and starting QEMU Guest Agent..."
    echo "=========================================="
    systemctl enable qemu-guest-agent
    systemctl start qemu-guest-agent
    echo "QEMU Guest Agent enabled and started"

  # Verify guest agent service is running
  - |
    echo "=========================================="
    echo "Verifying QEMU Guest Agent service status..."
    echo "=========================================="
    for i in $(seq 1 30); do
      if systemctl is-active --quiet qemu-guest-agent; then
        echo "✅ QEMU Guest Agent service IS running"
        systemctl status qemu-guest-agent --no-pager -l
        exit 0
      fi
      echo "Waiting for QEMU Guest Agent to start... ($i/30)"
      sleep 1
    done
    echo "⚠️ WARNING: QEMU Guest Agent may not have started properly"
    systemctl status qemu-guest-agent --no-pager -l || true
    echo "Attempting to restart..."
    systemctl restart qemu-guest-agent
    sleep 3
    if systemctl is-active --quiet qemu-guest-agent; then
      echo "✅ QEMU Guest Agent started after restart"
    else
      echo "❌ QEMU Guest Agent failed to start"
    fi'''

# Try to match and replace
if re.search(old_pattern, content, re.DOTALL):
    new_content = re.sub(old_pattern, enhanced, content, flags=re.DOTALL)
    with open(file_path, 'w') as f:
        f.write(new_content)
    print(f"✅ Updated: {file_path}")
    sys.exit(0)
else:
    print(f"⚠️ Pattern not found in {file_path}")
    sys.exit(1)
PYTHON_SCRIPT
    then
        UPDATED_COUNT=$((UPDATED_COUNT + 1))
    else
        echo "⚠️ Failed to update: $file"
    fi
done
echo ""
echo "=========================================="
echo "Summary"
echo "=========================================="
echo "✅ Updated: $UPDATED_COUNT files"
echo "⏭️ Skipped: $SKIPPED_COUNT files"
echo ""
echo "Done!"


@@ -0,0 +1,46 @@
#!/bin/bash
# enhance-vm-template.sh
# Template for enhancing VM YAML files - use as reference
# This script shows the pattern for enhancing VM YAML files
# Apply these changes to each VM file:
# 1. Add packages after lsb-release:
# - chrony
# - unattended-upgrades
# - apt-listchanges
# 2. Add NTP configuration after package_upgrade:
# # Time synchronization (NTP)
# ntp:
# enabled: true
# ntp_client: chrony
# servers:
# - 0.pool.ntp.org
# - 1.pool.ntp.org
# - 2.pool.ntp.org
# - 3.pool.ntp.org
# 3. Update package verification:
# for pkg in qemu-guest-agent curl wget net-tools chrony unattended-upgrades; do
# 4. Add security configuration before final_message:
# # Configure automatic security updates
# # Configure NTP (Chrony)
# # SSH hardening
# 5. Add write_files section before final_message:
# write_files:
# - path: /etc/apt/apt.conf.d/20auto-upgrades
# content: |
# APT::Periodic::Update-Package-Lists "1";
# APT::Periodic::Download-Upgradeable-Packages "1";
# APT::Periodic::AutocleanInterval "7";
# APT::Periodic::Unattended-Upgrade "1";
# permissions: '0644'
# owner: root:root
# 6. Enhance final_message with comprehensive status
echo "This is a template script - use as reference for manual updates"
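Step 2's NTP fragment can be generated instead of hand-typed when applying this template at scale. A small sketch (server list and field names mirror the template above; the two-space base indent is an assumption about the target YAML):

```python
# Build the cloud-init ntp fragment described in step 2 of the template
ntp_servers = [f"{i}.pool.ntp.org" for i in range(4)]

lines = [
    "  # Time synchronization (NTP)",
    "  ntp:",
    "    enabled: true",
    "    ntp_client: chrony",
    "    servers:",
]
lines += [f"      - {server}" for server in ntp_servers]
fragment = "\n".join(lines)
print(fragment)
```

Generating the fragment keeps the four pool servers and their indentation consistent across every VM file that the manual steps are applied to.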

scripts/enhance-vm-yaml.py Executable file

@@ -0,0 +1,258 @@
#!/usr/bin/env python3
"""
Enhance VM YAML files with NTP, security hardening, and enhanced final_message
"""
import os
import sys
import yaml
import re
from pathlib import Path
def enhance_vm_yaml(file_path):
"""Enhance a single VM YAML file with all improvements"""
with open(file_path, 'r') as f:
content = f.read()
# Check if already enhanced
if 'chrony' in content and 'unattended-upgrades' in content and 'SSH hardening' in content:
print(f" ⚠️ {os.path.basename(file_path)} already enhanced, skipping")
return False
# Create backup
backup_path = f"{file_path}.backup"
with open(backup_path, 'w') as f:
f.write(content)
# Parse YAML
try:
data = yaml.safe_load(content)
except yaml.YAMLError as e:
print(f" ❌ Error parsing {file_path}: {e}")
return False
# Get userData
user_data = data['spec']['forProvider']['userData']
# Add chrony and unattended-upgrades to packages if not present
if 'chrony' not in user_data:
# Find packages section and add new packages
packages_pattern = r'packages:\s*\n((?:\s+- .+\n?)+)'
match = re.search(packages_pattern, user_data)
if match:
packages_block = match.group(1)
if 'chrony' not in packages_block:
# Add after lsb-release
user_data = re.sub(
r'(\s+- lsb-release\n)',
r'\1 - chrony\n - unattended-upgrades\n - apt-listchanges\n',
user_data
)
# Add NTP configuration if not present
if 'ntp:' not in user_data:
ntp_config = '''
# Time synchronization (NTP)
ntp:
enabled: true
ntp_client: chrony
servers:
- 0.pool.ntp.org
- 1.pool.ntp.org
- 2.pool.ntp.org
- 3.pool.ntp.org
'''
# Insert after package_upgrade
user_data = re.sub(
r'(package_upgrade: true\n)',
r'\1' + ntp_config,
user_data
)
    # Update package verification to include new packages (the pattern consumes
    # the trailing semicolon, so the replacement must restore it or the "; do"
    # of the loop header is destroyed)
    user_data = re.sub(
        r'for pkg in qemu-guest-agent curl wget net-tools(?: [^;]+)?;',
        r'for pkg in qemu-guest-agent curl wget net-tools chrony unattended-upgrades;',
        user_data
    )
# Add security updates, NTP, and SSH hardening to runcmd if not present
if 'Configure automatic security updates' not in user_data:
security_config = '''
# Configure automatic security updates
- |
echo "Configuring automatic security updates..."
cat > /etc/apt/apt.conf.d/50unattended-upgrades <<'EOF'
Unattended-Upgrade::Allowed-Origins {
"${distro_id}:${distro_codename}-security";
"${distro_id}ESMApps:${distro_codename}-apps-security";
"${distro_id}ESM:${distro_codename}-infra-security";
};
Unattended-Upgrade::AutoFixInterruptedDpkg "true";
Unattended-Upgrade::MinimalSteps "true";
Unattended-Upgrade::Remove-Unused-Kernel-Packages "true";
Unattended-Upgrade::Remove-Unused-Dependencies "true";
Unattended-Upgrade::Automatic-Reboot "false";
Unattended-Upgrade::Automatic-Reboot-Time "02:00";
EOF
systemctl enable unattended-upgrades
systemctl start unattended-upgrades
echo "Automatic security updates configured"
# Configure NTP (Chrony)
- |
echo "Configuring NTP (Chrony)..."
systemctl enable chrony
systemctl restart chrony
sleep 3
if systemctl is-active --quiet chrony; then
echo "NTP (Chrony) is running"
chronyc tracking | head -1 || true
else
echo "WARNING: NTP (Chrony) may not be running"
fi
# SSH hardening
- |
echo "Hardening SSH configuration..."
if ! grep -q "^PermitRootLogin no" /etc/ssh/sshd_config; then
sed -i 's/^#PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sed -i 's/^PermitRootLogin yes/PermitRootLogin no/' /etc/ssh/sshd_config
fi
if ! grep -q "^PasswordAuthentication no" /etc/ssh/sshd_config; then
sed -i 's/^#PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sed -i 's/^PasswordAuthentication yes/PasswordAuthentication no/' /etc/ssh/sshd_config
fi
if ! grep -q "^PubkeyAuthentication yes" /etc/ssh/sshd_config; then
sed -i 's/^#PubkeyAuthentication.*/PubkeyAuthentication yes/' /etc/ssh/sshd_config
fi
systemctl restart sshd
echo "SSH hardening completed"
'''
# Insert before final_message or at end of runcmd
if 'final_message:' in user_data:
user_data = re.sub(
r'(\n # Final message)',
security_config + r'\1',
user_data
)
else:
# Add at end of runcmd
user_data = re.sub(
r'(systemctl status qemu-guest-agent --no-pager \|\| true\n)',
r'\1' + security_config,
user_data
)
# Add write_files section if not present
if 'write_files:' not in user_data:
write_files = '''
# Write files for security configuration
write_files:
- path: /etc/apt/apt.conf.d/20auto-upgrades
content: |
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Download-Upgradeable-Packages "1";
APT::Periodic::AutocleanInterval "7";
APT::Periodic::Unattended-Upgrade "1";
permissions: '0644'
owner: root:root
'''
if 'final_message:' in user_data:
user_data = re.sub(
r'(\n # Final message)',
write_files + r'\1',
user_data
)
else:
user_data += write_files
# Enhance final_message
enhanced_final = ''' # Final message
final_message: |
==========================================
System Boot Completed Successfully!
==========================================
Services Status:
- QEMU Guest Agent: $(systemctl is-active qemu-guest-agent)
- NTP (Chrony): $(systemctl is-active chrony)
- Automatic Security Updates: $(systemctl is-active unattended-upgrades)
System Information:
- Hostname: $(hostname)
- IP Address: $(hostname -I | awk '{print $1}')
- Time: $(date)
Packages Installed:
- qemu-guest-agent, curl, wget, net-tools
- chrony (NTP), unattended-upgrades (Security)
Security Configuration:
- SSH: Root login disabled, Password auth disabled
- Automatic security updates: Enabled
- NTP synchronization: Enabled
Next Steps:
1. Verify all services are running
2. Check cloud-init logs: /var/log/cloud-init-output.log
3. Test SSH access
=========================================='''
# Replace existing final_message
if 'final_message:' in user_data:
user_data = re.sub(
r' # Final message.*?(?=\n providerConfigRef|\Z)',
enhanced_final,
user_data,
flags=re.DOTALL
)
else:
# Add final_message before providerConfigRef
user_data = re.sub(
r'(\n providerConfigRef:)',
'\n' + enhanced_final + r'\1',
user_data
)
# Update data structure
data['spec']['forProvider']['userData'] = user_data
    # Write back.
    # NOTE: yaml.dump re-serializes the manifest, which drops comments and may
    # change block-scalar style; review against the .backup before deleting it.
    with open(file_path, 'w') as f:
        yaml.dump(data, f, default_flow_style=False, sort_keys=False, allow_unicode=True)
return True
def main():
if len(sys.argv) < 2:
print("Usage: python3 enhance-vm-yaml.py <yaml_file> [<yaml_file> ...]")
sys.exit(1)
files_enhanced = 0
files_skipped = 0
for file_path in sys.argv[1:]:
if not os.path.exists(file_path):
print(f"❌ File not found: {file_path}")
continue
print(f"Processing {os.path.basename(file_path)}...")
if enhance_vm_yaml(file_path):
files_enhanced += 1
print(f" ✅ Enhanced")
else:
files_skipped += 1
print(f"\n==========================================")
print(f"Enhanced: {files_enhanced} files")
print(f"Skipped: {files_skipped} files")
print(f"==========================================")
if __name__ == '__main__':
main()
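The package-list substitution in this script is easy to get subtly wrong: because the pattern ends at the semicolon, the replacement string must end with one too, or the `; do` of the loop header loses its separator. A quick check of the substitution (the sample line is illustrative; real templates may indent differently):

```python
import re

line = "    for pkg in qemu-guest-agent curl wget net-tools; do"
new = re.sub(
    r'for pkg in qemu-guest-agent curl wget net-tools(?: [^;]+)?;',
    'for pkg in qemu-guest-agent curl wget net-tools chrony unattended-upgrades;',
    line,
)
print(new)
```

The optional `(?: [^;]+)?` group also lets the same substitution re-run safely on a line that already carries extra package names, since everything up to the semicolon is normalized.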

scripts/extract_userdata.py Executable file

@@ -0,0 +1,42 @@
#!/usr/bin/env python3
import sys
import re
def extract_userdata(file_path, vm_name, vmid):
with open(file_path, 'r') as f:
content = f.read()
# Find the section for this VM
# Pattern: VM: <name> (VMID: <id>) followed by separator, then userData until next separator
pattern = rf'VM: {re.escape(vm_name)}.*?VMID: {vmid}.*?==========================================\n(.*?)=========================================='
match = re.search(pattern, content, re.DOTALL)
if match:
userdata = match.group(1).strip()
# Remove any ANSI color codes if present
userdata = re.sub(r'\x1b\[[0-9;]*m', '', userdata)
return userdata
# Try alternative pattern without strict VM name match
pattern2 = rf'VM:.*?VMID: {vmid}.*?==========================================\n(.*?)=========================================='
match2 = re.search(pattern2, content, re.DOTALL)
if match2:
userdata = match2.group(1).strip()
userdata = re.sub(r'\x1b\[[0-9;]*m', '', userdata)
return userdata
return ""
if __name__ == "__main__":
if len(sys.argv) != 4:
print("Usage: extract_userdata.py <file> <vm_name> <vmid>", file=sys.stderr)
sys.exit(1)
file_path = sys.argv[1]
vm_name = sys.argv[2]
vmid = sys.argv[3]
userdata = extract_userdata(file_path, vm_name, vmid)
    if userdata:
        print(userdata)
    else:
        print(f"No userData found for {vm_name} (VMID {vmid})", file=sys.stderr)
        sys.exit(1)
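The ANSI-stripping step above matters because the userData dump may come from a colorized log capture. A minimal demonstration of the same regex:

```python
import re

# A log line wrapped in ANSI color escape sequences
colored = "\x1b[0;32m✅ userData extracted\x1b[0m"
plain = re.sub(r'\x1b\[[0-9;]*m', '', colored)
print(plain)
```

The pattern only matches SGR color codes (`ESC [ ... m`); other escape sequences, such as cursor movement, would pass through untouched.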

scripts/fix-all-vm-boot.sh Executable file

@@ -0,0 +1,114 @@
#!/bin/bash
# fix-all-vm-boot.sh
# Fix "Nothing to boot" issue for all VMs by importing OS image
set -euo pipefail
PROXMOX_1_HOST="192.168.11.10"
PROXMOX_2_HOST="192.168.11.11"
# Never hard-code credentials in scripts; read the root password from the
# environment (e.g. exported from a local .env excluded from version control).
PROXMOX_PASS="${PROXMOX_PASS:?Set PROXMOX_PASS in the environment}"
SITE1_VMS="136 139 141 142 145 146 150 151"
SITE2_VMS="101 104 137 138 144 148"
IMAGE_PATH="/var/lib/vz/template/iso/ubuntu-22.04-cloud.img"
STORAGE="local-lvm"
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
fix_vm() {
local host=$1
local vmid=$2
local vmname=$3
echo "Processing VMID $vmid ($vmname)..."
# Stop VM
status=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"qm status $vmid 2>&1 | grep -oP 'status: \\K\\w+' || echo 'stopped'")
if [ "$status" = "running" ]; then
echo " Stopping VM..."
sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"qm stop $vmid" 2>&1 | head -1 || true
sleep 3
fi
# Unlock if locked
sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"qm unlock $vmid" 2>&1 | head -1 || true
# Check if disk has data
disk_usage=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"lvs | grep \"vm-${vmid}-disk-0\" | awk '{print \$6}'" 2>/dev/null || echo "0.00")
if [ "$disk_usage" != "0.00" ] && [ -n "$disk_usage" ]; then
echo " ✅ Disk already has data ($disk_usage used)"
echo " Starting VM..."
sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"qm start $vmid" 2>&1 | head -1 || true
return 0
fi
# Import image to a temporary disk first
echo " Importing OS image..."
result=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"qm importdisk $vmid $IMAGE_PATH $STORAGE --format raw 2>&1" || true)
sleep 5
# Find the imported disk (should be disk-1 or higher)
imported_disk=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"lvs | grep \"vm-${vmid}-disk\" | grep -v 'cloudinit' | grep -v 'disk-0' | tail -1 | awk '{print \$1}'" 2>/dev/null || echo "")
if [ -n "$imported_disk" ]; then
echo " Found imported disk: $imported_disk"
echo " Copying to main disk..."
# Copy the imported disk to the main disk
sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"dd if=/dev/pve/${imported_disk} of=/dev/pve/vm-${vmid}-disk-0 bs=4M status=progress 2>&1 | tail -3" || true
echo " Copy complete"
else
echo " ⚠️ No imported disk found, trying direct import..."
fi
# Ensure boot order
sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"qm set $vmid --boot order=scsi0" 2>&1 | head -1 || true
echo " Starting VM..."
sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"qm start $vmid" 2>&1 | head -1 || true
echo " ✅ VM started"
}
echo "=========================================="
echo "Fix VM Boot Issue - Import OS Image"
echo "=========================================="
echo ""
echo "Site 1 (ml110-01):"
for vmid in $SITE1_VMS; do
name=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$PROXMOX_1_HOST \
"qm config $vmid | grep '^name:' | cut -d' ' -f2" || echo "unknown")
fix_vm "$PROXMOX_1_HOST" "$vmid" "$name"
echo ""
done
echo "Site 2 (r630-01):"
for vmid in $SITE2_VMS; do
name=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$PROXMOX_2_HOST \
"qm config $vmid | grep '^name:' | cut -d' ' -f2" || echo "unknown")
fix_vm "$PROXMOX_2_HOST" "$vmid" "$name"
echo ""
done
echo "=========================================="
echo "Boot fix complete!"
echo "=========================================="
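The boot-fix script extracts the VM state from `qm status` output with GNU grep's `\K` and falls back to `stopped`. For reference, the equivalent extraction in Python (the sample strings are assumed, not captured from a real node):

```python
import re

def parse_vm_status(qm_output: str) -> str:
    # Equivalent of: grep -oP 'status: \K\w+' with a "stopped" fallback
    match = re.search(r'status: (\w+)', qm_output)
    return match.group(1) if match else "stopped"

print(parse_vm_status("status: running"))
print(parse_vm_status("unexpected output"))
```

Defaulting to `stopped` on unparseable output is deliberate: the worst case is an unnecessary `qm stop` skip, not a destructive action against a running VM.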


@@ -0,0 +1,91 @@
#!/bin/bash
# Fix guest agent configuration for VM 100
# Run on Proxmox node: root@ml110-01
set -e
VMID=100
echo "=========================================="
echo "Fixing Guest Agent for VM 100"
echo "=========================================="
echo ""
# Step 1: Check current status
echo "Step 1: Current Configuration"
echo "--------------------------------------"
echo "Checking if agent is configured..."
AGENT_CONFIG=$(qm config $VMID | grep '^agent:' || echo "")
if [ -z "$AGENT_CONFIG" ]; then
echo "❌ Guest agent NOT configured"
else
echo "Current: $AGENT_CONFIG"
fi
echo ""
# Step 2: Set guest agent
echo "Step 2: Setting Guest Agent"
echo "--------------------------------------"
echo "Setting agent=1..."
# With `set -e` active, test the command directly: a separate `$?` check
# would never run, because a failure would abort the script first.
if qm set $VMID --agent 1; then
    echo "✅ Guest agent enabled"
else
    echo "❌ Failed to set guest agent"
    exit 1
fi
echo ""
# Step 3: Verify the setting
echo "Step 3: Verification"
echo "--------------------------------------"
AGENT_VERIFY=$(qm config $VMID | grep '^agent:' || echo "")
if [ -z "$AGENT_VERIFY" ]; then
echo "❌ ERROR: Guest agent still not configured!"
echo ""
echo "Trying alternative method..."
# Try setting via config file directly
CONFIG_FILE="/etc/pve/qemu-server/${VMID}.conf"
if [ -f "$CONFIG_FILE" ]; then
if ! grep -q "^agent:" "$CONFIG_FILE"; then
echo "agent: 1" >> "$CONFIG_FILE"
echo "✅ Added agent: 1 to config file"
else
sed -i 's/^agent:.*/agent: 1/' "$CONFIG_FILE"
echo "✅ Updated agent in config file"
fi
fi
else
echo "✅ Verified: $AGENT_VERIFY"
fi
echo ""
# Step 4: Show full agent-related config
echo "Step 4: Full Agent Configuration"
echo "--------------------------------------"
qm config $VMID | grep -E 'agent|qemu' || echo "No agent/qemu config found"
echo ""
# Step 5: Check if VM needs to be restarted
echo "Step 5: VM Status"
echo "--------------------------------------"
VM_STATUS=$(qm status $VMID | awk '{print $2}')
echo "VM Status: $VM_STATUS"
if [ "$VM_STATUS" = "running" ]; then
echo ""
echo "⚠️ VM is running - guest agent setting will take effect after restart"
echo " To apply immediately, restart the VM:"
echo " qm shutdown $VMID && sleep 5 && qm start $VMID"
else
echo "✅ VM is stopped - guest agent will be active on next start"
fi
echo ""
echo "=========================================="
echo "Guest Agent Fix Complete"
echo "=========================================="
echo ""
echo "Final verification:"
echo " qm config $VMID | grep '^agent:'"
echo ""

scripts/fix-vm-lock.sh Executable file

@@ -0,0 +1,78 @@
#!/bin/bash
# Fix VM lock issues on Proxmox nodes
set -e
VMID=${1:-100}
NODE=${2:-ml110-01}
echo "=== Fixing VM Lock for VMID $VMID on $NODE ==="
echo ""
# Load environment variables
if [ -f .env ]; then
source .env
else
echo "ERROR: .env file not found"
exit 1
fi
echo "1. Checking VM status:"
echo "----------------------------------------"
sshpass -p "$PROXMOX_ROOT_PASS" ssh -o StrictHostKeyChecking=no root@$NODE "qm status $VMID" || echo " ⚠️ Could not get VM status"
echo ""
echo "2. Checking for lock file:"
echo "----------------------------------------"
LOCK_FILE="/var/lock/qemu-server/lock-$VMID.conf"
if sshpass -p "$PROXMOX_ROOT_PASS" ssh -o StrictHostKeyChecking=no root@$NODE "test -f $LOCK_FILE"; then
echo " ⚠️ Lock file exists: $LOCK_FILE"
echo " Lock file contents:"
sshpass -p "$PROXMOX_ROOT_PASS" ssh -o StrictHostKeyChecking=no root@$NODE "cat $LOCK_FILE" || echo " Could not read lock file"
else
echo " ✅ No lock file found"
fi
echo ""
echo "3. Checking for stuck processes:"
echo "----------------------------------------"
PROCS=$(sshpass -p "$PROXMOX_ROOT_PASS" ssh -o StrictHostKeyChecking=no root@$NODE "ps aux | grep -E 'qm (start|stop|destroy|set)' | grep -v grep || echo ''")
if [ -n "$PROCS" ]; then
echo " ⚠️ Found processes:"
echo "$PROCS" | sed 's/^/ /'
else
echo " ✅ No stuck processes found"
fi
echo ""
echo "4. Attempting to unlock VM:"
echo "----------------------------------------"
if sshpass -p "$PROXMOX_ROOT_PASS" ssh -o StrictHostKeyChecking=no root@$NODE "qm unlock $VMID" 2>&1; then
echo " ✅ Unlock command executed"
else
echo " ⚠️ Unlock command failed or VM not locked"
fi
echo ""
echo "5. Removing lock file (if exists):"
echo "----------------------------------------"
if sshpass -p "$PROXMOX_ROOT_PASS" ssh -o StrictHostKeyChecking=no root@$NODE "rm -f $LOCK_FILE" 2>&1; then
echo " ✅ Lock file removed (if it existed)"
else
echo " ⚠️ Could not remove lock file"
fi
echo ""
echo "6. Final VM status:"
echo "----------------------------------------"
sshpass -p "$PROXMOX_ROOT_PASS" ssh -o StrictHostKeyChecking=no root@$NODE "qm status $VMID" || echo " ⚠️ Could not get VM status"
echo ""
echo "=== Done ==="
echo ""
echo "If issues persist, you may need to:"
echo "1. Stop the VM: qm stop $VMID"
echo "2. Wait a few seconds"
echo "3. Remove lock file: rm -f $LOCK_FILE"
echo "4. Retry the operation"

scripts/force-cleanup-vms.sh Executable file

@@ -0,0 +1,117 @@
#!/bin/bash
# Force cleanup script for orphaned VMs with lock files
# This script attempts to remove lock files and force delete VMs
set -e
PROXMOX_ENDPOINT="${PROXMOX_ENDPOINT:-https://192.168.11.10:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-ml110-01}"
PROXMOX_USER="${PROXMOX_USER:-}"
PROXMOX_PASS="${PROXMOX_PASS:-}"
PROXMOX_SSH_HOST="${PROXMOX_SSH_HOST:-192.168.11.10}"
PROXMOX_SSH_USER="${PROXMOX_SSH_USER:-root}"
ORPHANED_VMS=(234 235 100 101 102)
if [ -z "$PROXMOX_USER" ] || [ -z "$PROXMOX_PASS" ]; then
echo "Error: PROXMOX_USER and PROXMOX_PASS must be set"
exit 1
fi
echo "Force cleanup of orphaned VMs with lock files"
echo "=============================================="
echo ""
# Get authentication ticket and CSRF token from a single login request
# (both fields come back in the same /access/ticket response)
AUTH_RESPONSE=$(curl -s -k -d "username=${PROXMOX_USER}&password=${PROXMOX_PASS}" \
    "${PROXMOX_ENDPOINT}/api2/json/access/ticket")
TICKET=$(echo "$AUTH_RESPONSE" | jq -r '.data.ticket // empty')
CSRF_TOKEN=$(echo "$AUTH_RESPONSE" | jq -r '.data.CSRFPreventionToken // empty')
if [ -z "$TICKET" ] || [ -z "$CSRF_TOKEN" ]; then
    echo "Error: Failed to authenticate with Proxmox"
    exit 1
fi
echo "Attempting to remove lock files and delete VMs..."
echo ""
for VMID in "${ORPHANED_VMS[@]}"; do
echo "Processing VM $VMID..."
# Check if VM exists
VM_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -n "$VM_EXISTS" ] && [ "$VM_EXISTS" != "null" ]; then
echo " VM $VMID exists"
# Try multiple unlock attempts
for i in {1..3}; do
echo " Unlock attempt $i..."
curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/unlock" > /dev/null
sleep 2
done
# Try to remove lock file via API (if supported) or provide manual instructions
echo " Attempting to delete VM $VMID..."
# Try delete with different parameters
DELETE_RESULT=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X DELETE \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}?purge=1&skiplock=1" 2>&1)
TASK_UPID=$(echo "$DELETE_RESULT" | jq -r '.data // empty' 2>/dev/null)
if [ -n "$TASK_UPID" ] && [ "$TASK_UPID" != "null" ]; then
echo " Delete task started: $TASK_UPID"
sleep 5
# Check if VM still exists
VM_STILL_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -z "$VM_STILL_EXISTS" ] || [ "$VM_STILL_EXISTS" = "null" ]; then
echo " ✅ VM $VMID deleted successfully"
else
echo " ⚠️ VM $VMID still exists - lock file may need manual removal"
echo " Manual cleanup required:"
echo " ssh ${PROXMOX_SSH_USER}@${PROXMOX_SSH_HOST}"
echo " rm -f /var/lock/qemu-server/lock-${VMID}.conf"
echo " qm destroy ${VMID} --purge"
fi
else
echo " ⚠️ Failed to start delete task"
echo " Manual cleanup required:"
echo " ssh ${PROXMOX_SSH_USER}@${PROXMOX_SSH_HOST}"
echo " rm -f /var/lock/qemu-server/lock-${VMID}.conf"
echo " qm destroy ${VMID} --purge"
fi
else
echo " VM $VMID not found (already deleted)"
fi
echo ""
done
echo "Cleanup attempt complete!"
echo ""
echo "If VMs still exist, manual cleanup is required:"
echo "1. SSH into the Proxmox node:"
echo " ssh ${PROXMOX_SSH_USER}@${PROXMOX_SSH_HOST}"
echo ""
echo "2. For each VM, run:"
for VMID in "${ORPHANED_VMS[@]}"; do
echo " rm -f /var/lock/qemu-server/lock-${VMID}.conf"
echo " qm destroy ${VMID} --purge"
done
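The ticket and CSRFPreventionToken used throughout these scripts come back in a single `/access/ticket` response, so both can be parsed from one login. A minimal sketch of the parsing step using plain `sed` so it runs without `jq`; the helper name `pve_parse_field` is our own, and it assumes the field value contains no escaped quotes:

```shell
#!/bin/bash
# Extract a string field from a one-line /access/ticket JSON response.
# Illustrative helper, not part of the scripts above.
pve_parse_field() {   # usage: pve_parse_field <json> <field>
    echo "$1" | sed -n "s/.*\"$2\":\"\([^\"]*\)\".*/\1/p"
}

AUTH_RESPONSE='{"data":{"ticket":"PVE:root@pam:SAMPLE","CSRFPreventionToken":"1234:ABCD"}}'
TICKET=$(pve_parse_field "$AUTH_RESPONSE" ticket)
CSRF_TOKEN=$(pve_parse_field "$AUTH_RESPONSE" CSRFPreventionToken)
echo "$TICKET"      # PVE:root@pam:SAMPLE
echo "$CSRF_TOKEN"  # 1234:ABCD
```

In the real scripts the sample JSON would be the captured `curl` output; `jq -r '.data.ticket // empty'` remains the more robust choice when `jq` is available.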


@@ -0,0 +1,129 @@
#!/bin/bash
# Aggressively force delete all VMs
PROXMOX_PASS="${PROXMOX_PASS:?PROXMOX_PASS must be set in the environment}"
SITE1="192.168.11.10"
SITE2="192.168.11.11"
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log() {
echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*"
}
log_success() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] ✅${NC} $*"
}
log_warning() {
echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] ⚠️${NC} $*"
}
force_delete_host() {
local host=$1
local site_name=$2
log "=========================================="
log "Force deleting all VMs on ${site_name} (${host})"
log "=========================================="
# Kill all qm processes
log "Killing all qm processes..."
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${host} "pkill -9 -f 'qm '"
sleep 2
# Remove all lock files
log "Removing all lock files..."
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${host} "rm -rf /var/lock/qemu-server/lock-*.conf"
sleep 1
# Get VM list
log "Getting VM list..."
local vm_list
vm_list=$(sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${host} "qm list 2>/dev/null | tail -n +2 | awk '{print \$1}'")
if [ -z "${vm_list}" ]; then
log_warning "No VMs found"
return 0
fi
# Force delete each VM
for vmid in $vm_list; do
log "Force deleting VM ${vmid}..."
# Remove lock file
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${host} "rm -f /var/lock/qemu-server/lock-${vmid}.conf" 2>/dev/null
# Stop VM forcefully
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${host} "qm stop ${vmid} --skiplock" 2>/dev/null || true
sleep 1
# Delete VM config
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${host} "rm -f /etc/pve/qemu-server/${vmid}.conf" 2>/dev/null
# Delete VM with purge and skip lock
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${host} "qm destroy ${vmid} --purge --skiplock" 2>&1 | grep -v "trying to acquire lock" || true
# If still exists, remove config and disks manually
sleep 2
        local still_exists
        # grep -c prints 0 AND exits 1 on no match; guard with || true so the
        # captured value is a single number, and anchor the ID on both sides
        # so VMID 10 does not also match VMs 100-109
        still_exists=$(sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${host} "qm list 2>/dev/null | grep -c \"^[[:space:]]*${vmid}[[:space:]]\" || true")
        if [ "${still_exists:-0}" -gt 0 ]; then
log_warning "VM ${vmid} still exists, removing manually..."
# Remove config
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${host} "rm -f /etc/pve/qemu-server/${vmid}.conf" 2>/dev/null
# Remove lock
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${host} "rm -f /var/lock/qemu-server/lock-${vmid}.conf" 2>/dev/null
            # Find and remove disks (lv_path gives the full /dev/<vg>/<lv> path
            # that lvremove needs; match vm-<vmid>- to avoid prefix collisions)
            sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${host} "lvs --noheadings -o lv_path 2>/dev/null | grep \"/vm-${vmid}-\" | xargs -r lvremove -f" 2>/dev/null || true
else
log_success "VM ${vmid} deleted"
fi
done
log_success "Completed force deletion on ${site_name}"
}
main() {
log "=========================================="
log "Aggressive Force Delete All VMs"
log "=========================================="
log ""
log_warning "This will forcefully delete ALL VMs on both hosts"
log ""
# Force delete Site 1
force_delete_host "${SITE1}" "Site 1 (ml110-01)"
echo ""
# Force delete Site 2
force_delete_host "${SITE2}" "Site 2 (r630-01)"
echo ""
log "=========================================="
log "Verifying deletion..."
echo ""
log "Site 1 VMs remaining:"
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${SITE1} "qm list"
echo ""
log "Site 2 VMs remaining:"
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${SITE2} "qm list"
echo ""
log "=========================================="
log_success "Force deletion completed!"
}
main "$@"
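Matching a VMID against `qm list` output needs an anchor on both sides of the ID column so that VMID 10 does not also match VMs 100-109. A small sketch of an exact-match predicate (`vm_in_list` is an illustrative name, not part of the scripts):

```shell
#!/bin/bash
# Return success if the given VMID appears as an exact ID in a
# `qm list`-style listing fed on stdin (header already stripped).
vm_in_list() {   # usage: qm list | tail -n +2 | vm_in_list <vmid>
    grep -qE "^[[:space:]]*$1[[:space:]]"
}

listing=$'  100 vm-a running\n  101 vm-b stopped'
echo "$listing" | vm_in_list 100 && echo "100 present"   # 100 present
echo "$listing" | vm_in_list 10 || echo "10 absent"      # 10 absent
```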


@@ -0,0 +1,152 @@
#!/bin/bash
# Aggressive force removal of all remaining VMs
set -e
PROXMOX_ENDPOINT="${PROXMOX_ENDPOINT:-https://192.168.11.10:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-ml110-01}"
PROXMOX_USER="${PROXMOX_USER:-}"
PROXMOX_PASS="${PROXMOX_PASS:-}"
if [ -z "$PROXMOX_USER" ] || [ -z "$PROXMOX_PASS" ]; then
echo "Error: PROXMOX_USER and PROXMOX_PASS must be set"
exit 1
fi
echo "FORCE REMOVING ALL REMAINING VMs"
echo "================================"
echo ""
# Get authentication ticket and CSRF token from a single login request
AUTH_RESPONSE=$(curl -s -k -d "username=${PROXMOX_USER}&password=${PROXMOX_PASS}" \
    "${PROXMOX_ENDPOINT}/api2/json/access/ticket")
TICKET=$(echo "$AUTH_RESPONSE" | jq -r '.data.ticket // empty')
CSRF_TOKEN=$(echo "$AUTH_RESPONSE" | jq -r '.data.CSRFPreventionToken // empty')
if [ -z "$TICKET" ] || [ -z "$CSRF_TOKEN" ]; then
    echo "Error: Failed to authenticate"
    exit 1
fi
# Get list of all VMs
echo "Fetching list of all VMs..."
ALL_VMS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu" | \
jq -r '.data[] | .vmid' | sort -n)
if [ -z "$ALL_VMS" ]; then
echo "No VMs found. All clean!"
exit 0
fi
VM_COUNT=$(echo "$ALL_VMS" | wc -l)
echo "Found $VM_COUNT VM(s) to remove"
echo ""
SUCCESS_COUNT=0
FAILED_COUNT=0
for VMID in $ALL_VMS; do
echo "=== Processing VM $VMID ==="
# Aggressive unlock - multiple attempts with longer delays
echo " Unlocking VM (multiple attempts)..."
    for i in {1..10}; do
curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/unlock" > /dev/null 2>&1
sleep 1
done
# Ensure stopped
echo " Ensuring VM is stopped..."
curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/stop" > /dev/null 2>&1
sleep 3
# Try delete with all possible parameters
echo " Attempting force delete..."
DELETE_RESULT=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X DELETE \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}?purge=1&skiplock=1" 2>&1)
TASK_UPID=$(echo "$DELETE_RESULT" | jq -r '.data // empty' 2>/dev/null)
if [ -n "$TASK_UPID" ] && [ "$TASK_UPID" != "null" ]; then
echo " Delete task started: $TASK_UPID"
echo " Waiting for completion (up to 60 seconds)..."
DELETED=false
        for i in {1..60}; do
sleep 1
            # Fetch the task status once and extract both fields from it
            TASK_JSON=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
                "${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/tasks/${TASK_UPID}/status" 2>/dev/null)
            TASK_STATUS=$(echo "$TASK_JSON" | jq -r '.data.status // "unknown"')
            if [ "$TASK_STATUS" = "stopped" ]; then
                EXIT_STATUS=$(echo "$TASK_JSON" | jq -r '.data.exitstatus // "unknown"')
if [ "$EXIT_STATUS" = "OK" ] || [ "$EXIT_STATUS" = "0" ]; then
sleep 3
VM_STILL_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -z "$VM_STILL_EXISTS" ] || [ "$VM_STILL_EXISTS" = "null" ]; then
echo " ✅ VM $VMID deleted successfully"
SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
DELETED=true
else
echo " ⚠️ Task completed but VM still exists"
fi
else
echo " ⚠️ Task failed with status: $EXIT_STATUS"
if echo "$EXIT_STATUS" | grep -q "lock"; then
echo " ❌ Lock file issue - requires manual cleanup"
fi
fi
break
fi
done
if [ "$DELETED" = "false" ]; then
FAILED_COUNT=$((FAILED_COUNT + 1))
fi
else
echo " ❌ Failed to start delete task"
echo " Response: $DELETE_RESULT"
FAILED_COUNT=$((FAILED_COUNT + 1))
fi
echo ""
done
echo "========================================="
echo "Summary:"
echo " Successfully deleted: $SUCCESS_COUNT"
echo " Failed: $FAILED_COUNT"
echo " Total processed: $VM_COUNT"
echo ""
if [ $FAILED_COUNT -gt 0 ]; then
echo "⚠️ Some VMs failed to delete due to lock file issues."
echo "Manual cleanup required via SSH:"
echo " ssh root@192.168.11.10"
echo ""
echo "For each failed VM, run:"
echo " rm -f /var/lock/qemu-server/lock-<VMID>.conf"
echo " qm destroy <VMID> --purge"
fi
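Each deletion above polls `/tasks/<UPID>/status` until `status` is `stopped` and then inspects `exitstatus`. That decision can be isolated into a small predicate over the returned JSON. A sketch with plain `sed` parsing; `pve_task_ok` is our own name, and real responses may encode `exitstatus` differently for some task types:

```shell
#!/bin/bash
# Decide whether a Proxmox task-status JSON means "finished successfully":
# the task must have status "stopped" and exitstatus "OK".
pve_task_ok() {   # usage: pve_task_ok <status-json>
    local status exit_status
    status=$(echo "$1" | sed -n 's/.*"status":"\([^"]*\)".*/\1/p')
    exit_status=$(echo "$1" | sed -n 's/.*"exitstatus":"\([^"]*\)".*/\1/p')
    [ "$status" = "stopped" ] && [ "$exit_status" = "OK" ]
}

pve_task_ok '{"data":{"status":"stopped","exitstatus":"OK"}}' && echo "done"   # done
pve_task_ok '{"data":{"status":"running"}}' || echo "still running"            # still running
```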

scripts/force-remove-vms-batch.sh Executable file

@@ -0,0 +1,147 @@
#!/bin/bash
# Force remove VMs in batches
set -e
PROXMOX_ENDPOINT="${PROXMOX_ENDPOINT:-https://192.168.11.10:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-ml110-01}"
PROXMOX_USER="${PROXMOX_USER:-}"
PROXMOX_PASS="${PROXMOX_PASS:-}"
PROXMOX_SSH_HOST="${PROXMOX_SSH_HOST:-192.168.11.10}"
PROXMOX_SSH_USER="${PROXMOX_SSH_USER:-root}"
# Generate VM IDs: 103-145 and 211-233
ORPHANED_VMS=()
for i in {103..145}; do
ORPHANED_VMS+=($i)
done
for i in {211..233}; do
ORPHANED_VMS+=($i)
done
if [ -z "$PROXMOX_USER" ] || [ -z "$PROXMOX_PASS" ]; then
echo "Error: PROXMOX_USER and PROXMOX_PASS must be set"
exit 1
fi
echo "FORCE REMOVING VMs: 103-145 and 211-233"
echo "========================================="
echo "Total VMs to process: ${#ORPHANED_VMS[@]}"
echo ""
# Get authentication ticket and CSRF token from a single login request
AUTH_RESPONSE=$(curl -s -k -d "username=${PROXMOX_USER}&password=${PROXMOX_PASS}" \
    "${PROXMOX_ENDPOINT}/api2/json/access/ticket")
TICKET=$(echo "$AUTH_RESPONSE" | jq -r '.data.ticket // empty')
CSRF_TOKEN=$(echo "$AUTH_RESPONSE" | jq -r '.data.CSRFPreventionToken // empty')
if [ -z "$TICKET" ] || [ -z "$CSRF_TOKEN" ]; then
    echo "Error: Failed to authenticate"
    exit 1
fi
SUCCESS_COUNT=0
FAILED_COUNT=0
NOT_FOUND_COUNT=0
echo "Starting batch deletion..."
echo ""
for VMID in "${ORPHANED_VMS[@]}"; do
echo -n "VM $VMID: "
# Check if VM exists
VM_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -z "$VM_EXISTS" ] || [ "$VM_EXISTS" = "null" ]; then
echo "not found (already deleted)"
        NOT_FOUND_COUNT=$((NOT_FOUND_COUNT + 1))  # ((x++)) returns 1 when x was 0 and would abort under set -e
continue
fi
# Multiple unlock attempts
for i in {1..3}; do
curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/unlock" > /dev/null 2>&1
sleep 1
done
# Ensure stopped
curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/stop" > /dev/null 2>&1
sleep 2
# Force delete with purge
DELETE_RESULT=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X DELETE \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}?purge=1&skiplock=1" 2>&1)
TASK_UPID=$(echo "$DELETE_RESULT" | jq -r '.data // empty' 2>/dev/null)
if [ -n "$TASK_UPID" ] && [ "$TASK_UPID" != "null" ]; then
# Wait for task completion (shorter timeout for batch processing)
for i in {1..30}; do
sleep 1
            TASK_JSON=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
                "${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/tasks/${TASK_UPID}/status" 2>/dev/null)
            TASK_STATUS=$(echo "$TASK_JSON" | jq -r '.data.status // "unknown"')
            if [ "$TASK_STATUS" = "stopped" ]; then
                EXIT_STATUS=$(echo "$TASK_JSON" | jq -r '.data.exitstatus // "unknown"')
if [ "$EXIT_STATUS" = "OK" ] || [ "$EXIT_STATUS" = "0" ]; then
# Verify deletion (wait a bit longer for cleanup)
sleep 2
VM_STILL_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -z "$VM_STILL_EXISTS" ] || [ "$VM_STILL_EXISTS" = "null" ]; then
echo "✅ deleted"
                            SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
else
echo "⚠️ task completed but VM still exists (may need retry)"
                            FAILED_COUNT=$((FAILED_COUNT + 1))
fi
else
echo "⚠️ task failed (status: $EXIT_STATUS)"
                        FAILED_COUNT=$((FAILED_COUNT + 1))
fi
break
fi
done
else
echo "❌ failed to start delete task"
        FAILED_COUNT=$((FAILED_COUNT + 1))
fi
done
echo ""
echo "========================================="
echo "Batch deletion complete!"
echo " Successfully deleted: $SUCCESS_COUNT"
echo " Failed: $FAILED_COUNT"
echo " Not found (already deleted): $NOT_FOUND_COUNT"
echo " Total processed: ${#ORPHANED_VMS[@]}"
echo ""
if [ $FAILED_COUNT -gt 0 ]; then
echo "Some VMs failed to delete. Manual cleanup may be required:"
echo "ssh ${PROXMOX_SSH_USER}@${PROXMOX_SSH_HOST}"
echo "For each failed VM:"
echo " rm -f /var/lock/qemu-server/lock-<VMID>.conf"
echo " qm destroy <VMID> --purge"
fi

scripts/force-remove-vms-fast.sh Executable file

@@ -0,0 +1,133 @@
#!/bin/bash
# Fast force remove VMs - optimized for batch processing
set -e
PROXMOX_ENDPOINT="${PROXMOX_ENDPOINT:-https://192.168.11.10:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-ml110-01}"
PROXMOX_USER="${PROXMOX_USER:-}"
PROXMOX_PASS="${PROXMOX_PASS:-}"
# Generate VM IDs: 103-145 and 211-233
ORPHANED_VMS=()
for i in {103..145}; do
ORPHANED_VMS+=($i)
done
for i in {211..233}; do
ORPHANED_VMS+=($i)
done
if [ -z "$PROXMOX_USER" ] || [ -z "$PROXMOX_PASS" ]; then
echo "Error: PROXMOX_USER and PROXMOX_PASS must be set"
exit 1
fi
echo "FORCE REMOVING VMs: 103-145 and 211-233"
echo "Total VMs: ${#ORPHANED_VMS[@]}"
echo ""
# Get authentication ticket and CSRF token from a single login request
AUTH_RESPONSE=$(curl -s -k -d "username=${PROXMOX_USER}&password=${PROXMOX_PASS}" \
    "${PROXMOX_ENDPOINT}/api2/json/access/ticket")
TICKET=$(echo "$AUTH_RESPONSE" | jq -r '.data.ticket // empty')
CSRF_TOKEN=$(echo "$AUTH_RESPONSE" | jq -r '.data.CSRFPreventionToken // empty')
if [ -z "$TICKET" ] || [ -z "$CSRF_TOKEN" ]; then
    echo "Error: Failed to authenticate"
    exit 1
fi
SUCCESS_COUNT=0
FAILED_COUNT=0
NOT_FOUND_COUNT=0
PROCESSED=0
for VMID in "${ORPHANED_VMS[@]}"; do
PROCESSED=$((PROCESSED + 1))
echo -n "[$PROCESSED/${#ORPHANED_VMS[@]}] VM $VMID: "
# Quick check if VM exists
VM_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -z "$VM_EXISTS" ] || [ "$VM_EXISTS" = "null" ]; then
echo "not found"
NOT_FOUND_COUNT=$((NOT_FOUND_COUNT + 1))
continue
fi
# Quick unlock (single attempt)
curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/unlock" > /dev/null 2>&1
# Stop if running
curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/stop" > /dev/null 2>&1
sleep 1
# Delete with purge
DELETE_RESULT=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X DELETE \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}?purge=1&skiplock=1" 2>&1)
TASK_UPID=$(echo "$DELETE_RESULT" | jq -r '.data // empty' 2>/dev/null)
if [ -n "$TASK_UPID" ] && [ "$TASK_UPID" != "null" ]; then
# Wait for completion (shorter timeout)
DELETED=false
        for i in {1..20}; do
sleep 1
            TASK_JSON=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
                "${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/tasks/${TASK_UPID}/status" 2>/dev/null)
            TASK_STATUS=$(echo "$TASK_JSON" | jq -r '.data.status // "unknown"')
            if [ "$TASK_STATUS" = "stopped" ]; then
                EXIT_STATUS=$(echo "$TASK_JSON" | jq -r '.data.exitstatus // "unknown"')
if [ "$EXIT_STATUS" = "OK" ] || [ "$EXIT_STATUS" = "0" ]; then
sleep 1
VM_STILL_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -z "$VM_STILL_EXISTS" ] || [ "$VM_STILL_EXISTS" = "null" ]; then
echo "✅"
SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
DELETED=true
fi
fi
break
fi
done
if [ "$DELETED" = "false" ]; then
echo "⚠️"
FAILED_COUNT=$((FAILED_COUNT + 1))
fi
else
echo "❌"
FAILED_COUNT=$((FAILED_COUNT + 1))
fi
done
echo ""
echo "========================================="
echo "Summary:"
echo " Successfully deleted: $SUCCESS_COUNT"
echo " Failed: $FAILED_COUNT"
echo " Not found: $NOT_FOUND_COUNT"
echo " Total: ${#ORPHANED_VMS[@]}"
echo ""

scripts/force-remove-vms.sh Executable file

@@ -0,0 +1,147 @@
#!/bin/bash
# Force remove orphaned VMs by removing lock files and using destructive methods
set -e
PROXMOX_ENDPOINT="${PROXMOX_ENDPOINT:-https://192.168.11.10:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-ml110-01}"
PROXMOX_USER="${PROXMOX_USER:-}"
PROXMOX_PASS="${PROXMOX_PASS:-}"
PROXMOX_SSH_HOST="${PROXMOX_SSH_HOST:-192.168.11.10}"
PROXMOX_SSH_USER="${PROXMOX_SSH_USER:-root}"
ORPHANED_VMS=(234 235 100 101 102)
if [ -z "$PROXMOX_USER" ] || [ -z "$PROXMOX_PASS" ]; then
echo "Error: PROXMOX_USER and PROXMOX_PASS must be set"
exit 1
fi
echo "FORCE REMOVING ORPHANED VMs"
echo "==========================="
echo ""
# Get authentication ticket and CSRF token from a single login request
AUTH_RESPONSE=$(curl -s -k -d "username=${PROXMOX_USER}&password=${PROXMOX_PASS}" \
    "${PROXMOX_ENDPOINT}/api2/json/access/ticket")
TICKET=$(echo "$AUTH_RESPONSE" | jq -r '.data.ticket // empty')
CSRF_TOKEN=$(echo "$AUTH_RESPONSE" | jq -r '.data.CSRFPreventionToken // empty')
if [ -z "$TICKET" ] || [ -z "$CSRF_TOKEN" ]; then
    echo "Error: Failed to authenticate"
    exit 1
fi
# Try to remove lock files via storage API or direct file manipulation
echo "Attempting to remove lock files and force delete VMs..."
echo ""
for VMID in "${ORPHANED_VMS[@]}"; do
echo "=== Processing VM $VMID ==="
# Check if VM exists
VM_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -z "$VM_EXISTS" ] || [ "$VM_EXISTS" = "null" ]; then
echo " VM $VMID not found (already deleted)"
echo ""
continue
fi
echo " VM $VMID exists, attempting force removal..."
# Method 1: Try multiple unlock attempts with delays
for i in {1..5}; do
echo " Unlock attempt $i/5..."
UNLOCK_RESULT=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/unlock" 2>&1)
sleep 3
done
# Method 2: Try to stop any running processes first
echo " Ensuring VM is stopped..."
curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/stop" > /dev/null 2>&1
sleep 5
# Method 3: Try delete with all possible parameters
echo " Attempting force delete with purge..."
# Try with purge and skiplock
DELETE_RESULT=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X DELETE \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}?purge=1&skiplock=1" 2>&1)
TASK_UPID=$(echo "$DELETE_RESULT" | jq -r '.data // empty' 2>/dev/null)
if [ -n "$TASK_UPID" ] && [ "$TASK_UPID" != "null" ]; then
echo " Delete task started: $TASK_UPID"
echo " Waiting up to 60 seconds for completion..."
for i in {1..60}; do
sleep 1
            TASK_JSON=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
                "${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/tasks/${TASK_UPID}/status" 2>/dev/null)
            TASK_STATUS=$(echo "$TASK_JSON" | jq -r '.data.status // "unknown"')
            if [ "$TASK_STATUS" = "stopped" ]; then
                EXIT_STATUS=$(echo "$TASK_JSON" | jq -r '.data.exitstatus // "unknown"')
if [ "$EXIT_STATUS" = "OK" ] || [ "$EXIT_STATUS" = "0" ]; then
echo " ✅ VM $VMID deleted successfully"
break
else
echo " ⚠️ Task completed with status: $EXIT_STATUS"
fi
fi
done
# Verify deletion
sleep 3
VM_STILL_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -z "$VM_STILL_EXISTS" ] || [ "$VM_STILL_EXISTS" = "null" ]; then
echo " ✅ Verified: VM $VMID is deleted"
else
echo " ❌ VM $VMID still exists - requires manual intervention"
echo ""
echo " MANUAL CLEANUP REQUIRED:"
echo " ssh ${PROXMOX_SSH_USER}@${PROXMOX_SSH_HOST}"
echo " rm -f /var/lock/qemu-server/lock-${VMID}.conf"
echo " qm destroy ${VMID} --purge"
fi
else
echo " ❌ Failed to start delete task"
echo " Response: $DELETE_RESULT"
echo ""
echo " MANUAL CLEANUP REQUIRED:"
echo " ssh ${PROXMOX_SSH_USER}@${PROXMOX_SSH_HOST}"
echo " rm -f /var/lock/qemu-server/lock-${VMID}.conf"
echo " qm destroy ${VMID} --purge"
fi
echo ""
done
echo "Force removal attempt complete!"
echo ""
echo "If any VMs still exist, they require manual cleanup via SSH:"
echo "ssh ${PROXMOX_SSH_USER}@${PROXMOX_SSH_HOST}"
for VMID in "${ORPHANED_VMS[@]}"; do
echo "rm -f /var/lock/qemu-server/lock-${VMID}.conf && qm destroy ${VMID} --purge"
done

scripts/force-restart-vm-100.sh Executable file

@@ -0,0 +1,116 @@
#!/bin/bash
# Force restart VM 100 to apply guest agent fix
# Run on Proxmox node: root@ml110-01
set -e
VMID=100
echo "=========================================="
echo "Force Restart VM 100 for Guest Agent Fix"
echo "=========================================="
echo ""
# Step 1: Check current status
echo "Step 1: Current VM Status"
echo "--------------------------------------"
qm status $VMID
echo ""
# Step 2: Verify guest agent config
echo "Step 2: Guest Agent Configuration"
echo "--------------------------------------"
AGENT_CONFIG=$(qm config $VMID | grep '^agent:' || echo "")
if [ -z "$AGENT_CONFIG" ]; then
echo "⚠️ Guest agent NOT configured"
echo "Setting agent=1..."
qm set $VMID --agent 1
echo "✅ Guest agent enabled"
else
echo "✅ Guest agent configured: $AGENT_CONFIG"
fi
echo ""
# Step 3: Force stop VM
echo "Step 3: Force Stopping VM"
echo "--------------------------------------"
CURRENT_STATUS=$(qm status $VMID | awk '{print $2}')
if [ "$CURRENT_STATUS" = "running" ]; then
echo "VM is running, attempting graceful shutdown first..."
qm shutdown $VMID &
SHUTDOWN_PID=$!
# Wait up to 30 seconds for graceful shutdown
for i in {1..30}; do
sleep 1
        STATUS=$(qm status $VMID 2>/dev/null | awk '{print $2}' || echo "unknown")
if [ "$STATUS" != "running" ]; then
echo "✅ VM shut down gracefully"
break
fi
if [ $i -eq 30 ]; then
echo "⚠️ Graceful shutdown timed out, forcing stop..."
kill $SHUTDOWN_PID 2>/dev/null || true
qm stop $VMID
echo "✅ VM force stopped"
fi
done
else
echo "VM is already stopped (status: $CURRENT_STATUS)"
fi
echo ""
# Step 4: Wait and verify stopped
echo "Step 4: Verifying VM is Stopped"
echo "--------------------------------------"
sleep 3
FINAL_STATUS=$(qm status $VMID | awk '{print $2}')
if [ "$FINAL_STATUS" = "stopped" ]; then
echo "✅ VM is stopped"
else
echo "⚠️ VM status: $FINAL_STATUS"
echo "Waiting a bit longer..."
sleep 5
FINAL_STATUS=$(qm status $VMID | awk '{print $2}')
echo "Final status: $FINAL_STATUS"
fi
echo ""
# Step 5: Start VM
echo "Step 5: Starting VM"
echo "--------------------------------------"
if [ "$FINAL_STATUS" = "stopped" ]; then
echo "Starting VM..."
qm start $VMID
echo ""
echo "Waiting 10 seconds for initialization..."
sleep 10
echo ""
echo "VM status after start:"
qm status $VMID
else
echo "⚠️ Cannot start VM - current status: $FINAL_STATUS"
echo "Please check VM state manually"
fi
echo ""
# Step 6: Instructions for guest agent verification
echo "=========================================="
echo "VM Restarted - Next Steps"
echo "=========================================="
echo ""
echo "Wait 30-60 seconds for VM to fully boot, then verify guest agent:"
echo ""
echo "1. Check guest agent service (once VM has booted):"
echo " qm guest exec $VMID -- systemctl status qemu-guest-agent"
echo ""
echo "2. If guest agent service is not running, install it inside VM:"
echo " (You'll need to SSH into the VM or use console)"
echo ""
echo "3. Monitor VM status:"
echo " watch -n 2 \"qm status $VMID\""
echo ""
echo "4. Check VM IP address (once booted):"
echo " qm guest exec $VMID -- hostname -I"
echo ""
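The graceful-shutdown-then-force-stop sequence in Step 3 is a wait-with-timeout pattern; the timeout logic can be factored into a standalone loop. A sketch with a hypothetical `wait_until` helper (in the script above, the condition would be a `qm status` check):

```shell
#!/bin/bash
# Wait up to $2 seconds for a condition command ($1) to succeed.
# Returns 0 if the condition became true in time, 1 on timeout.
wait_until() {   # usage: wait_until "<command>" <timeout-seconds>
    local i
    for (( i = 0; i < $2; i++ )); do
        if eval "$1"; then
            return 0
        fi
        sleep 1
    done
    return 1
}

# Example with a condition that succeeds immediately:
wait_until "true" 5 && echo "condition met"   # condition met
```

In Step 3 this would read roughly `wait_until '[ "$(qm status $VMID | awk "{print \$2}")" != running ]' 30 || qm stop $VMID`.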


@@ -0,0 +1,96 @@
#!/bin/bash
# Force unlock Proxmox VM when qm unlock times out
# Usage: Run on Proxmox node: bash force-unlock-vm-proxmox.sh <VMID>
VMID="${1:-}"
if [ -z "$VMID" ]; then
    echo "Usage: $0 <VMID>"
    echo "Example: $0 100"
    exit 1
fi
echo "=== Force Unlocking VM $VMID ==="
echo ""
# 1. Check for stuck processes
echo "1. Checking for stuck processes..."
STUCK_PROCS=$(ps aux | grep -E "qm|qemu" | grep "$VMID" | grep -v grep)
if [ -n "$STUCK_PROCS" ]; then
echo " Found stuck processes:"
echo "$STUCK_PROCS" | while read line; do
echo " $line"
done
else
echo " ✅ No stuck processes found"
fi
echo ""
# 2. Check lock file
echo "2. Checking lock file..."
if [ -f "/var/lock/qemu-server/lock-$VMID.conf" ]; then
echo " Lock file exists: /var/lock/qemu-server/lock-$VMID.conf"
echo " Lock file contents:"
cat "/var/lock/qemu-server/lock-$VMID.conf" 2>/dev/null || echo " (unreadable)"
echo ""
else
echo " ✅ No lock file found"
echo ""
fi
# 3. Kill stuck processes
echo "3. Killing stuck processes..."
pkill -9 -f "qm.*$VMID" 2>/dev/null && echo " ✅ Killed qm processes" || echo " No qm processes to kill"
pkill -9 -f "qemu.*$VMID" 2>/dev/null && echo " ✅ Killed qemu processes" || echo " No qemu processes to kill"
sleep 2
echo ""
# 4. Force remove lock file
echo "4. Force removing lock file..."
if [ -f "/var/lock/qemu-server/lock-$VMID.conf" ]; then
rm -f "/var/lock/qemu-server/lock-$VMID.conf"
if [ ! -f "/var/lock/qemu-server/lock-$VMID.conf" ]; then
echo " ✅ Lock file removed"
else
echo " ⚠️ Failed to remove lock file (may need root)"
exit 1
fi
else
echo " Lock file already removed"
fi
echo ""
# 5. Verify lock is gone
echo "5. Verifying lock is cleared..."
if [ ! -f "/var/lock/qemu-server/lock-$VMID.conf" ]; then
echo " ✅ Lock file confirmed removed"
else
echo " ⚠️ Lock file still exists"
exit 1
fi
echo ""
# 6. Check VM status
echo "6. Checking VM status..."
qm status "$VMID" 2>&1
echo ""
# 7. Try unlock again (should work now)
echo "7. Attempting unlock again..."
qm unlock "$VMID" 2>&1
UNLOCK_RESULT=$?
if [ $UNLOCK_RESULT -eq 0 ]; then
echo " ✅ VM unlocked successfully"
else
echo " ⚠️ Unlock still failed (exit code: $UNLOCK_RESULT)"
echo " This may indicate the VM is in use or another issue exists"
fi
echo ""
echo "=== Force Unlock Complete ==="
echo ""
echo "Next steps:"
echo "1. Check VM status: qm status $VMID"
echo "2. Check VM config: qm config $VMID"
echo "3. If needed, restart VM: qm start $VMID"

scripts/gather-proxmox-info.sh Executable file

@@ -0,0 +1,142 @@
#!/bin/bash
# gather-proxmox-info.sh
# Gathers comprehensive information from both Proxmox instances
set -euo pipefail
# Load environment variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
set -a
source <(grep -v '^#' "${SCRIPT_DIR}/../.env" | grep -v '^$' | sed 's/^/export /')
set +a
fi
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
OUTPUT_FILE="${1:-docs/proxmox/INSTANCE_INVENTORY.md}"
log() {
echo -e "${GREEN}[INFO]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
get_node_info() {
local endpoint=$1
local token=$2
local node_name=$3
echo "### Node: ${node_name}" >> "$OUTPUT_FILE"
echo "" >> "$OUTPUT_FILE"
# Node status
local node_data=$(curl -k -s -H "Authorization: PVEAPIToken=${token}" "${endpoint}/api2/json/nodes/${node_name}/status" 2>/dev/null)
if [ -n "$node_data" ]; then
echo "$node_data" | jq -r '.data | "**CPU Usage**: \(.cpu * 100 | floor)%\n**Memory**: \((.memory.used // 0) / 1073741824 | floor)GB used / \((.memory.total // 0) / 1073741824 | floor)GB total\n**Uptime**: \(.uptime // 0) seconds\n**CPUs**: \(.cpuinfo.cpus)\n**PVE Version**: \(.pveversion // "unknown")"' >> "$OUTPUT_FILE"
echo "" >> "$OUTPUT_FILE"
fi
}
get_storage_info() {
local endpoint=$1
local token=$2
local node_name=$3
echo "#### Storage Pools" >> "$OUTPUT_FILE"
echo "" >> "$OUTPUT_FILE"
curl -k -s -H "Authorization: PVEAPIToken=${token}" "${endpoint}/api2/json/nodes/${node_name}/storage" 2>/dev/null | \
jq -r '.data[] | "- **\(.storage)**: Type: \(.type), Content: \(.content), Available: \((.avail // 0) / 1073741824 | floor)GB / Total: \((.total // 0) / 1073741824 | floor)GB"' >> "$OUTPUT_FILE"
echo "" >> "$OUTPUT_FILE"
}
get_network_info() {
local endpoint=$1
local token=$2
local node_name=$3
echo "#### Network Interfaces" >> "$OUTPUT_FILE"
echo "" >> "$OUTPUT_FILE"
curl -k -s -H "Authorization: PVEAPIToken=${token}" "${endpoint}/api2/json/nodes/${node_name}/network" 2>/dev/null | \
jq -r '.data[] | select(.type == "bridge" or .type == "bond") | "- **\(.iface)**: Type: \(.type), Ports: \(.bridge_ports // "N/A"), Address: \(.address // "N/A")"' >> "$OUTPUT_FILE"
echo "" >> "$OUTPUT_FILE"
}
get_vm_info() {
local endpoint=$1
local token=$2
local node_name=$3
echo "#### Virtual Machines" >> "$OUTPUT_FILE"
echo "" >> "$OUTPUT_FILE"
local vm_data=$(curl -k -s -H "Authorization: PVEAPIToken=${token}" "${endpoint}/api2/json/nodes/${node_name}/qemu" 2>/dev/null)
local vm_count=$(echo "$vm_data" | jq -r '.data | length // 0')
if [ "$vm_count" -gt 0 ]; then
echo "**Total VMs**: ${vm_count}" >> "$OUTPUT_FILE"
echo "" >> "$OUTPUT_FILE"
echo "$vm_data" | jq -r '.data[] | "- **VMID \(.vmid)**: \(.name) - Status: \(.status) - CPU: \(.cpus) - Memory: \(.maxmem | tonumber / 1024 / 1024 / 1024)GB"' >> "$OUTPUT_FILE"
else
echo "No VMs found" >> "$OUTPUT_FILE"
fi
echo "" >> "$OUTPUT_FILE"
}
main() {
log "Gathering information from Proxmox instances..."
# Create output file
cat > "$OUTPUT_FILE" <<EOF
# Proxmox Instance Inventory
**Generated**: $(date +'%Y-%m-%d %H:%M:%S')
**Source**: Automated inventory from Proxmox API
## Instance 1: ML110-01
**IP**: 192.168.11.10
**FQDN**: ml110-01.sankofa.nexus
**Site**: us-sfvalley
**Endpoint**: https://ml110-01.sankofa.nexus:8006
EOF
get_node_info "https://192.168.11.10:8006" "${PROXMOX_TOKEN_ML110_01}" "ML110-01"
get_storage_info "https://192.168.11.10:8006" "${PROXMOX_TOKEN_ML110_01}" "ML110-01"
get_network_info "https://192.168.11.10:8006" "${PROXMOX_TOKEN_ML110_01}" "ML110-01"
get_vm_info "https://192.168.11.10:8006" "${PROXMOX_TOKEN_ML110_01}" "ML110-01"
cat >> "$OUTPUT_FILE" <<EOF
## Instance 2: R630-01
**IP**: 192.168.11.11
**FQDN**: r630-01.sankofa.nexus
**Site**: us-sfvalley-2
**Endpoint**: https://r630-01.sankofa.nexus:8006
EOF
get_node_info "https://192.168.11.11:8006" "${PROXMOX_TOKEN_R630_01}" "R630-01"
get_storage_info "https://192.168.11.11:8006" "${PROXMOX_TOKEN_R630_01}" "R630-01"
get_network_info "https://192.168.11.11:8006" "${PROXMOX_TOKEN_R630_01}" "R630-01"
get_vm_info "https://192.168.11.11:8006" "${PROXMOX_TOKEN_R630_01}" "R630-01"
log "Inventory saved to: ${OUTPUT_FILE}"
}
main "$@"
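The curl calls above assume a full API token string of the form `USER@REALM!TOKENID=SECRET`, sent as a single `Authorization: PVEAPIToken=...` header. A minimal sketch of building that header (the token value here is a placeholder):

```shell
# Build the Proxmox API token header; the token value below is a placeholder.
pve_auth_header() {
    printf 'Authorization: PVEAPIToken=%s' "$1"
}

# Example use: curl -k -s -H "$(pve_auth_header "$PROXMOX_TOKEN_ML110_01")" "${endpoint}/api2/json/version"
pve_auth_header 'root@pam!inventory=00000000-0000-0000-0000-000000000000'
```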

scripts/get-cloudflare-info.sh Executable file

@@ -0,0 +1,178 @@
#!/bin/bash
# get-cloudflare-info.sh
# Gets Cloudflare Zone ID and Account ID using credentials from .env
set -euo pipefail
# Load environment variables from .env if it exists
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
set -a
source <(grep -v '^#' "${SCRIPT_DIR}/../.env" | grep -v '^$' | sed 's/^/export /')
set +a
fi
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
DOMAIN="${DOMAIN:-sankofa.nexus}"
API_TOKEN="${CLOUDFLARE_API_TOKEN:-}"
API_KEY="${CLOUDFLARE_API_KEY:-}"
API_EMAIL="${CLOUDFLARE_EMAIL:-}"
# Log to stderr so that command substitution (e.g. $(get_zone_id)) captures only data output
log() {
echo -e "${GREEN}[INFO]${NC} $1" >&2
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1" >&2
}
info() {
echo -e "${BLUE}[INFO]${NC} $1" >&2
}
check_auth() {
if [ -z "$API_TOKEN" ] && [ -z "$API_KEY" ]; then
error "Either CLOUDFLARE_API_TOKEN or CLOUDFLARE_API_KEY must be set"
fi
if [ -z "$API_TOKEN" ] && [ -z "$API_EMAIL" ]; then
error "If using CLOUDFLARE_API_KEY, CLOUDFLARE_EMAIL must also be set"
fi
}
get_zone_id() {
log "Getting Zone ID for ${DOMAIN}..."
local zone_id
if [ -n "$API_TOKEN" ]; then
zone_id=$(curl -s -X GET \
-H "Authorization: Bearer ${API_TOKEN}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/zones?name=${DOMAIN}" | \
jq -r '.result[0].id')
else
zone_id=$(curl -s -X GET \
-H "X-Auth-Email: ${API_EMAIL}" \
-H "X-Auth-Key: ${API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/zones?name=${DOMAIN}" | \
jq -r '.result[0].id')
fi
if [ "$zone_id" != "null" ] && [ -n "$zone_id" ]; then
echo "CLOUDFLARE_ZONE_ID=${zone_id}"
log "✓ Zone ID: ${zone_id}"
return 0
else
warn "Could not get Zone ID for ${DOMAIN}"
return 1
fi
}
get_account_id() {
log "Getting Account ID..."
local account_id
if [ -n "$API_TOKEN" ]; then
account_id=$(curl -s -X GET \
-H "Authorization: Bearer ${API_TOKEN}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/accounts" | \
jq -r '.result[0].id')
else
account_id=$(curl -s -X GET \
-H "X-Auth-Email: ${API_EMAIL}" \
-H "X-Auth-Key: ${API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/accounts" | \
jq -r '.result[0].id')
fi
if [ "$account_id" != "null" ] && [ -n "$account_id" ]; then
echo "CLOUDFLARE_ACCOUNT_ID=${account_id}"
log "✓ Account ID: ${account_id}"
return 0
else
warn "Could not get Account ID"
return 1
fi
}
update_env_file() {
local zone_id=$1
local account_id=$2
local env_file="${SCRIPT_DIR}/../.env"
if [ ! -f "$env_file" ]; then
warn ".env file not found, creating it..."
touch "$env_file"
fi
# Update or add Zone ID
if grep -q "^CLOUDFLARE_ZONE_ID=" "$env_file"; then
sed -i "s/^CLOUDFLARE_ZONE_ID=.*/CLOUDFLARE_ZONE_ID=${zone_id}/" "$env_file"
else
echo "CLOUDFLARE_ZONE_ID=${zone_id}" >> "$env_file"
fi
# Update or add Account ID
if grep -q "^CLOUDFLARE_ACCOUNT_ID=" "$env_file"; then
sed -i "s/^CLOUDFLARE_ACCOUNT_ID=.*/CLOUDFLARE_ACCOUNT_ID=${account_id}/" "$env_file"
else
echo "CLOUDFLARE_ACCOUNT_ID=${account_id}" >> "$env_file"
fi
log "✓ Updated .env file"
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Cloudflare Information Retrieval ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
check_auth
local zone_id_output=$(get_zone_id)
local account_id_output=$(get_account_id)
echo ""
info "Retrieved Information:"
echo "$zone_id_output"
echo "$account_id_output"
# Extract values
local zone_id=$(echo "$zone_id_output" | grep '^CLOUDFLARE_ZONE_ID=' | cut -d'=' -f2)
local account_id=$(echo "$account_id_output" | grep '^CLOUDFLARE_ACCOUNT_ID=' | cut -d'=' -f2)
if [ -n "$zone_id" ] && [ "$zone_id" != "null" ] && [ -n "$account_id" ] && [ "$account_id" != "null" ]; then
echo ""
read -p "Update .env file with these values? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
update_env_file "$zone_id" "$account_id"
else
info "Values not saved. Add them manually to .env:"
echo "$zone_id_output"
echo "$account_id_output"
fi
else
warn "Some values could not be retrieved. Check your credentials and domain."
fi
}
main "$@"
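`get_zone_id` and `get_account_id` read `.result[0].id` without first consulting the API's `success` flag. A hedged sketch of a stricter jq extraction, with a sample response inlined so it runs standalone:

```shell
# Prefer the Cloudflare "success" flag over assuming .result is populated (sample payload inlined).
resp='{"success":true,"errors":[],"result":[{"id":"abc123"}]}'
echo "$resp" | jq -r 'if .success then (.result[0].id // "not found") else "api error" end'
# prints abc123
```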

scripts/get-smom-vm-ips.sh Executable file

@@ -0,0 +1,212 @@
#!/bin/bash
# get-smom-vm-ips.sh
# Get all SMOM-DBIS-138 VM IP addresses and export to SMOM-DBIS-138 project
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
SMOM_PROJECT="${SMOM_PROJECT:-$HOME/projects/smom-dbis-138}"
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
BLUE='\033[0;34m'
NC='\033[0m'
log() {
echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*"
}
log_success() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] ✅${NC} $*"
}
log_warning() {
echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] ⚠️${NC} $*"
}
log_error() {
echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ❌${NC} $*"
}
get_vm_ip() {
local vm_name=$1
local ip
ip=$(kubectl get proxmoxvm "${vm_name}" -n default -o jsonpath='{.status.ipAddress}' 2>/dev/null || echo "")
if [ -z "${ip}" ] || [ "${ip}" = "<none>" ]; then
echo ""
return 1
fi
echo "${ip}"
}
main() {
log "=========================================="
log "SMOM-DBIS-138 VM IP Address Collector"
log "=========================================="
log ""
local output_file="${PROJECT_ROOT}/smom-vm-ips.txt"
local smom_config_file="${SMOM_PROJECT}/config/vm-ips.txt"
log "Collecting VM IP addresses..."
log ""
# Infrastructure VMs
log "Infrastructure VMs:"
{
echo "# SMOM-DBIS-138 VM IP Addresses"
echo "# Generated: $(date -Iseconds)"
echo "#"
echo "# Infrastructure VMs"
echo ""
local nginx_ip sentry_ip
nginx_ip=$(get_vm_ip "nginx-proxy-vm" 2>/dev/null || echo "")
if [ -n "${nginx_ip}" ]; then
echo "NGINX_PROXY_IP=${nginx_ip}"
log_success "nginx-proxy-vm: ${nginx_ip}"
else
echo "# NGINX_PROXY_IP= (not yet assigned)"
log_warning "nginx-proxy-vm: IP not yet assigned"
fi
local tunnel_ip
tunnel_ip=$(get_vm_ip "cloudflare-tunnel-vm" 2>/dev/null || echo "")
if [ -n "${tunnel_ip}" ]; then
echo "CLOUDFLARE_TUNNEL_IP=${tunnel_ip}"
log_success "cloudflare-tunnel-vm: ${tunnel_ip}"
else
echo "# CLOUDFLARE_TUNNEL_IP= (not yet assigned)"
log_warning "cloudflare-tunnel-vm: IP not yet assigned"
fi
echo ""
echo "# Application VMs"
echo ""
echo "# Validators"
# Validators
for i in 01 02 03 04; do
local ip
ip=$(get_vm_ip "smom-validator-${i}" 2>/dev/null || echo "")
if [ -n "${ip}" ]; then
echo "VALIDATOR_${i}_IP=${ip}"
log_success "smom-validator-${i}: ${ip}"
else
echo "# VALIDATOR_${i}_IP= (not yet assigned)"
log_warning "smom-validator-${i}: IP not yet assigned"
fi
done
echo ""
echo "# Sentries"
# Sentries
for i in 01 02 03 04; do
local ip
ip=$(get_vm_ip "smom-sentry-${i}" 2>/dev/null || echo "")
if [ -n "${ip}" ]; then
echo "SENTRY_${i}_IP=${ip}"
log_success "smom-sentry-${i}: ${ip}"
else
echo "# SENTRY_${i}_IP= (not yet assigned)"
log_warning "smom-sentry-${i}: IP not yet assigned"
fi
done
echo ""
echo "# RPC Nodes"
# RPC Nodes
for i in 01 02 03 04; do
local ip
ip=$(get_vm_ip "smom-rpc-node-${i}" 2>/dev/null || echo "")
if [ -n "${ip}" ]; then
echo "RPC_NODE_${i}_IP=${ip}"
log_success "smom-rpc-node-${i}: ${ip}"
else
echo "# RPC_NODE_${i}_IP= (not yet assigned)"
log_warning "smom-rpc-node-${i}: IP not yet assigned"
fi
done
echo ""
echo "# Services"
# Services
local services_ip
services_ip=$(get_vm_ip "smom-services" 2>/dev/null || echo "")
if [ -n "${services_ip}" ]; then
echo "SERVICES_IP=${services_ip}"
log_success "smom-services: ${services_ip}"
else
echo "# SERVICES_IP= (not yet assigned)"
log_warning "smom-services: IP not yet assigned"
fi
# Blockscout
local blockscout_ip
blockscout_ip=$(get_vm_ip "smom-blockscout" 2>/dev/null || echo "")
if [ -n "${blockscout_ip}" ]; then
echo "BLOCKSCOUT_IP=${blockscout_ip}"
log_success "smom-blockscout: ${blockscout_ip}"
else
echo "# BLOCKSCOUT_IP= (not yet assigned)"
log_warning "smom-blockscout: IP not yet assigned"
fi
# Monitoring
local monitoring_ip
monitoring_ip=$(get_vm_ip "smom-monitoring" 2>/dev/null || echo "")
if [ -n "${monitoring_ip}" ]; then
echo "MONITORING_IP=${monitoring_ip}"
log_success "smom-monitoring: ${monitoring_ip}"
else
echo "# MONITORING_IP= (not yet assigned)"
log_warning "smom-monitoring: IP not yet assigned"
fi
# Management
local management_ip
management_ip=$(get_vm_ip "smom-management" 2>/dev/null || echo "")
if [ -n "${management_ip}" ]; then
echo "MANAGEMENT_IP=${management_ip}"
log_success "smom-management: ${management_ip}"
else
echo "# MANAGEMENT_IP= (not yet assigned)"
log_warning "smom-management: IP not yet assigned"
fi
} > "${output_file}"
log ""
log_success "VM IPs saved to: ${output_file}"
# Copy to SMOM-DBIS-138 project if it exists
if [ -d "${SMOM_PROJECT}" ]; then
mkdir -p "${SMOM_PROJECT}/config"
cp "${output_file}" "${smom_config_file}"
log_success "VM IPs copied to: ${smom_config_file}"
log ""
log "To use in SMOM-DBIS-138 project:"
log " cd ${SMOM_PROJECT}"
log " source config/vm-ips.txt"
else
log_warning "SMOM-DBIS-138 project not found at: ${SMOM_PROJECT}"
log " Set SMOM_PROJECT environment variable to override"
fi
log ""
log "=========================================="
log_success "Complete!"
log ""
}
main "$@"
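Because the generated file is plain `KEY=VALUE` lines with unassigned entries commented out, it can be sourced directly, as the "To use" hint above suggests. A self-contained illustration with sample content:

```shell
# Source a generated vm-ips file; comment lines are ignored by the shell.
cat > /tmp/vm-ips.sample <<'EOF'
# SMOM-DBIS-138 VM IP Addresses
NGINX_PROXY_IP=192.168.11.50
# CLOUDFLARE_TUNNEL_IP= (not yet assigned)
EOF

set -a                       # export everything defined while sourcing
. /tmp/vm-ips.sample
set +a

echo "${NGINX_PROXY_IP}"                    # prints 192.168.11.50
echo "${CLOUDFLARE_TUNNEL_IP:-unassigned}"  # prints unassigned
```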

scripts/hosts-entries.txt Normal file

@@ -0,0 +1,19 @@
# /etc/hosts entries for Proxmox instances
# Add these entries to /etc/hosts for local DNS resolution
# (Useful for testing before DNS records are configured)
# Proxmox Instance 1 (ML110-01)
192.168.11.10 ml110-01.sankofa.nexus
192.168.11.10 ml110-01-api.sankofa.nexus
192.168.11.10 ml110-01-metrics.sankofa.nexus
# Proxmox Instance 2 (R630-01)
192.168.11.11 r630-01.sankofa.nexus
192.168.11.11 r630-01-api.sankofa.nexus
192.168.11.11 r630-01-metrics.sankofa.nexus
# To add these entries, run (the redirect itself needs root, so use tee rather
# than "sudo cat ... >> /etc/hosts", where the append runs unprivileged):
# cat scripts/hosts-entries.txt | sudo tee -a /etc/hosts > /dev/null
#
# Or manually add to /etc/hosts on each system that needs to resolve these names
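Appending the file blindly adds duplicate entries on every re-run. A hedged sketch of an idempotent variant, written against a scratch file so it runs without root (point `HOSTS_FILE` at /etc/hosts, with sudo, for real use):

```shell
# Append a hosts entry only when the hostname is not already present.
HOSTS_FILE="${HOSTS_FILE:-/tmp/hosts.sample}"
: > "$HOSTS_FILE"

add_host() {  # add_host IP FQDN
    grep -qF "$2" "$HOSTS_FILE" || echo "$1 $2" >> "$HOSTS_FILE"
}

add_host 192.168.11.10 ml110-01.sankofa.nexus
add_host 192.168.11.10 ml110-01.sankofa.nexus   # duplicate call is a no-op
add_host 192.168.11.11 r630-01.sankofa.nexus
wc -l < "$HOSTS_FILE"                           # 2 entries
```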


@@ -0,0 +1,108 @@
#!/usr/bin/env tsx
/**
* Convert CSV export to JSON format for infrastructure documentation
* Usage: tsx scripts/infrastructure/convert-csv-to-json.ts
*/
import * as fs from 'fs'
import * as path from 'path'
const PROJECT_ROOT = path.resolve(__dirname, '../..')
const DATA_DIR = path.join(PROJECT_ROOT, 'docs/infrastructure/data')
interface Country {
name: string
region: 'Africa (Sub-Saharan)' | 'Middle East & North Africa' | 'Americas' | 'Asia-Pacific' | 'Europe'
relationshipType: 'Full Diplomatic Relations' | 'Official (Non-Diplomatic)' | 'Ambassador Level' | 'Full Diplomatic Relations (Special Mission)'
priority: 'Critical' | 'High' | 'Medium' | 'Low'
cloudflareCoverage: boolean
networkInfrastructurePriority: string
notes?: string
coordinates?: { lat: number; lng: number }
}
// Country coordinates (approximate, can be enhanced with geocoding)
const COUNTRY_COORDINATES: Record<string, { lat: number; lng: number }> = {
'Italy': { lat: 41.9028, lng: 12.4964 },
'Germany': { lat: 51.1657, lng: 10.4515 },
'France': { lat: 46.2276, lng: 2.2137 },
'Spain': { lat: 40.4637, lng: -3.7492 },
'Brazil': { lat: -14.2350, lng: -51.9253 },
'Argentina': { lat: -38.4161, lng: -63.6167 },
'Philippines': { lat: 12.8797, lng: 121.7740 },
'Kenya': { lat: -0.0236, lng: 37.9062 },
'Ethiopia': { lat: 9.1450, lng: 38.7667 },
'Lebanon': { lat: 33.8547, lng: 35.8623 },
'Holy See (Vatican City)': { lat: 41.9029, lng: 12.4534 },
}
function parseCSV(csvContent: string): Country[] {
// NOTE: naive comma split; assumes no quoted fields containing embedded commas
const lines = csvContent.trim().split('\n')
return lines.slice(1).map(line => {
const values = line.split(',').map(v => v.trim())
const country: Country = {
name: values[0],
region: values[1] as Country['region'],
relationshipType: values[2] as Country['relationshipType'],
priority: values[3] as Country['priority'],
cloudflareCoverage: values[4] === 'Yes',
networkInfrastructurePriority: values[5],
notes: values[6] || undefined,
coordinates: COUNTRY_COORDINATES[values[0]] || undefined,
}
return country
})
}
function main() {
// Find the most recent CSV file
const files = fs.readdirSync(DATA_DIR)
.filter(f => f.startsWith('smom_countries_full_') && f.endsWith('.csv'))
.sort()
.reverse()
if (files.length === 0) {
console.error('No CSV files found. Please run export-smom-countries.sh first.')
process.exit(1)
}
const csvFile = path.join(DATA_DIR, files[0])
console.log(`Reading CSV file: ${csvFile}`)
const csvContent = fs.readFileSync(csvFile, 'utf-8')
const countries = parseCSV(csvContent)
// Write JSON file
const jsonFile = path.join(DATA_DIR, 'smom_countries.json')
fs.writeFileSync(jsonFile, JSON.stringify(countries, null, 2))
console.log(`✓ Created ${jsonFile} with ${countries.length} countries`)
// Create summary statistics
const summary = {
total: countries.length,
byRegion: countries.reduce((acc, c) => {
acc[c.region] = (acc[c.region] || 0) + 1
return acc
}, {} as Record<string, number>),
byPriority: countries.reduce((acc, c) => {
acc[c.priority] = (acc[c.priority] || 0) + 1
return acc
}, {} as Record<string, number>),
byRelationshipType: countries.reduce((acc, c) => {
acc[c.relationshipType] = (acc[c.relationshipType] || 0) + 1
return acc
}, {} as Record<string, number>),
lastUpdated: new Date().toISOString(),
}
const summaryFile = path.join(DATA_DIR, 'smom_countries_summary.json')
fs.writeFileSync(summaryFile, JSON.stringify(summary, null, 2))
console.log(`✓ Created ${summaryFile}`)
}
if (require.main === module) {
main()
}


@@ -0,0 +1,153 @@
#!/bin/bash
# Export SMOM Country Relationships to CSV
# Generates CSV files for Sovereign Order of Hospitallers country relationships
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/../.." && pwd)"
OUTPUT_DIR="${PROJECT_ROOT}/docs/infrastructure/data"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
mkdir -p "${OUTPUT_DIR}"
# Create comprehensive CSV with all countries
cat > "${OUTPUT_DIR}/smom_countries_full_${TIMESTAMP}.csv" << 'EOF'
Country,Region,Relationship Type,Priority,Cloudflare Coverage,Network Infrastructure Priority,Notes
Angola,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Benin,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Burundi,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Burkina Faso,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Cameroon,Africa (Sub-Saharan),Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Cape Verde,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Central African Republic,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Chad,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Comoros,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Democratic Republic of the Congo,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Republic of the Congo,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Côte d'Ivoire,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Equatorial Guinea,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Eritrea,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Ethiopia,Africa (Sub-Saharan),Full Diplomatic Relations,High,Yes,Regional Datacenter,
Gabon,Africa (Sub-Saharan),Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
The Gambia,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Guinea,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Guinea-Bissau,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Kenya,Africa (Sub-Saharan),Full Diplomatic Relations,High,Yes,Regional Datacenter,
Lesotho,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Liberia,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Madagascar,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Mali,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Mauritania,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Mauritius,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Mozambique,Africa (Sub-Saharan),Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Namibia,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Niger,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
São Tomé and Príncipe,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Senegal,Africa (Sub-Saharan),Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Seychelles,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Sierra Leone,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Somalia,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
South Sudan,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Sudan,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Togo,Africa (Sub-Saharan),Full Diplomatic Relations,Low,Yes,Edge/CDN,
Egypt,Middle East & North Africa,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Jordan,Middle East & North Africa,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Lebanon,Middle East & North Africa,Full Diplomatic Relations,High,Yes,Regional Datacenter,
Morocco,Middle East & North Africa,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Antigua and Barbuda,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Argentina,Americas,Full Diplomatic Relations,High,Yes,Regional Datacenter,
The Bahamas,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Belize,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Bolivia,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Brazil,Americas,Full Diplomatic Relations,High,Yes,Core Datacenter,
Chile,Americas,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Colombia,Americas,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Costa Rica,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Cuba,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Dominican Republic,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Ecuador,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
El Salvador,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Grenada,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Guatemala,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Guyana,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Haiti,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Honduras,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Nicaragua,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Panama,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Paraguay,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Peru,Americas,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Saint Lucia,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Saint Vincent and the Grenadines,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Suriname,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Uruguay,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Venezuela,Americas,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Afghanistan,Asia-Pacific,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Armenia,Asia-Pacific,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Cambodia,Asia-Pacific,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Georgia,Asia-Pacific,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Kazakhstan,Asia-Pacific,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Kiribati,Asia-Pacific,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Marshall Islands,Asia-Pacific,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Micronesia (Federated States of),Asia-Pacific,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Nauru,Asia-Pacific,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Philippines,Asia-Pacific,Full Diplomatic Relations,High,Yes,Regional Datacenter,
Tajikistan,Asia-Pacific,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Thailand,Asia-Pacific,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Timor-Leste,Asia-Pacific,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Turkmenistan,Asia-Pacific,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Albania,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Andorra,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Austria,Europe,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Belarus,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Bosnia & Herzegovina,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Bulgaria,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Croatia,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Cyprus,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Czechia,Europe,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Estonia,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Germany,Europe,Full Diplomatic Relations,High,Yes,Regional Datacenter,
Greece,Europe,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Holy See (Vatican City),Europe,Full Diplomatic Relations,Critical,Yes,Core Datacenter,
Hungary,Europe,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Italy,Europe,Full Diplomatic Relations,Critical,Yes,Core Datacenter,
Latvia,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Liechtenstein,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Lithuania,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Malta,Europe,Full Diplomatic Relations,High,Yes,Regional Datacenter,
Moldova,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Monaco,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Montenegro,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
North Macedonia,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Poland,Europe,Full Diplomatic Relations,High,Yes,Regional Datacenter,
Portugal,Europe,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Romania,Europe,Full Diplomatic Relations,Medium,Yes,Regional Datacenter,
Russian Federation,Europe,Full Diplomatic Relations (Special Mission),Medium,Yes,Edge/CDN,Special Mission Status
San Marino,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Serbia,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Slovakia,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Slovenia,Europe,Full Diplomatic Relations,Low,Yes,Edge/CDN,
Spain,Europe,Full Diplomatic Relations,High,Yes,Regional Datacenter,
Ukraine,Europe,Full Diplomatic Relations,Medium,Yes,Edge/CDN,
Belgium,Europe,Official (Non-Diplomatic),High,Yes,Regional Datacenter,
France,Europe,Official (Non-Diplomatic),High,Yes,Regional Datacenter,
Canada,Americas,Official (Non-Diplomatic),High,Yes,Regional Datacenter,
United Kingdom,Europe,Official (Non-Diplomatic),High,Yes,Regional Datacenter,
State of Palestine,Middle East & North Africa,Ambassador Level,Medium,Yes,Edge/CDN,Ambassador-level relations
EOF
# Create regional summary CSV
cat > "${OUTPUT_DIR}/smom_regions_summary_${TIMESTAMP}.csv" << 'EOF'
Region,Total Countries,Full Diplomatic Relations,Official Relations,Ambassador Level,Core Datacenters,Regional Datacenters,Edge/CDN Only
Africa (Sub-Saharan),36,36,0,0,0,4,32
Middle East & North Africa,4,4,0,0,0,4,0
Americas,27,26,1,0,1,3,23
Asia-Pacific,14,14,0,0,0,5,9
Europe,39,35,4,0,2,10,27
Total,120,115,5,1,3,26,91
EOF
echo "CSV files generated:"
echo " - ${OUTPUT_DIR}/smom_countries_full_${TIMESTAMP}.csv"
echo " - ${OUTPUT_DIR}/smom_regions_summary_${TIMESTAMP}.csv"
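The regional summary above is maintained by hand; its counts can be cross-checked against the full CSV with a short awk pass (sample rows are inlined here so the snippet is standalone):

```shell
# Count countries per region (column 2), skipping the CSV header row.
cat > /tmp/countries.sample <<'EOF'
Country,Region,Relationship Type
Kenya,Africa (Sub-Saharan),Full Diplomatic Relations
Ethiopia,Africa (Sub-Saharan),Full Diplomatic Relations
Italy,Europe,Full Diplomatic Relations
EOF

awk -F',' 'NR > 1 { n[$2]++ } END { for (r in n) print r ": " n[r] }' /tmp/countries.sample
```

Splitting on commas is safe here because the Region column contains no embedded commas.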


@@ -0,0 +1,185 @@
#!/usr/bin/env tsx
/**
* Generate network topology JSON from entity registry
* Usage: tsx scripts/infrastructure/generate-topology-data.ts
*/
import * as fs from 'fs'
import * as path from 'path'
import type { NetworkTopology, TopologyNode, TopologyEdge } from '../../src/lib/types/infrastructure'
const PROJECT_ROOT = path.resolve(__dirname, '../..')
const DATA_DIR = path.join(PROJECT_ROOT, 'docs/infrastructure/data')
const ENTITY_REGISTRY = path.join(PROJECT_ROOT, 'docs/infrastructure/ENTITY_REGISTRY.md')
// Generate topology data based on entity registry structure
function generateTopologyData(): NetworkTopology[] {
const topologies: NetworkTopology[] = []
// Cloudflare as root node
const cloudflareNode: TopologyNode = {
id: 'cloudflare',
type: 'region',
label: 'Cloudflare Global Network',
region: 'Global',
entity: 'Cloudflare',
position: { x: 400, y: 50 },
metadata: {
asn: 'AS13335',
dataCenters: 300,
},
}
// VM 137 - Cloudflare Tunnel
const vm137Node: TopologyNode = {
id: 'vm-137',
type: 'vm',
label: 'VM 137 (cloudflare-tunnel-vm)',
region: 'Site 2',
entity: 'Sankofa Phoenix',
position: { x: 300, y: 200 },
metadata: {
vmid: 137,
host: 'r630-01',
ip: '192.168.11.11',
function: 'Cloudflare Tunnel Agent',
},
}
// VM 136 - Nginx Proxy
const vm136Node: TopologyNode = {
id: 'vm-136',
type: 'vm',
label: 'VM 136 (nginx-proxy-vm)',
region: 'Site 1',
entity: 'Sankofa Phoenix',
position: { x: 500, y: 200 },
metadata: {
vmid: 136,
host: 'ml110-01',
ip: '192.168.11.10',
function: 'Reverse Proxy, SSL Termination',
},
}
// Regional topologies for SMOM
const smomRegions = [
{ name: 'Europe', x: 400, y: 350, countries: 35 },
{ name: 'Americas', x: 200, y: 350, countries: 26 },
{ name: 'Asia-Pacific', x: 600, y: 350, countries: 14 },
{ name: 'Africa', x: 400, y: 500, countries: 36 },
{ name: 'Middle East', x: 500, y: 500, countries: 4 },
]
smomRegions.forEach((region, index) => {
const regionNode: TopologyNode = {
id: `region-${region.name.toLowerCase()}`,
type: 'region',
label: `${region.name} Region`,
region: region.name,
entity: 'Sovereign Order of Hospitallers',
position: { x: region.x, y: region.y },
metadata: {
countries: region.countries,
priority: index < 2 ? 'High' : 'Medium',
},
}
const datacenterNode: TopologyNode = {
id: `dc-${region.name.toLowerCase()}`,
type: 'datacenter',
label: `${region.name} Datacenter`,
region: region.name,
entity: 'Sovereign Order of Hospitallers',
position: { x: region.x, y: region.y + 100 },
metadata: {
tier: index < 2 ? 'Tier 1' : 'Tier 2',
},
}
const tunnelNode: TopologyNode = {
id: `tunnel-${region.name.toLowerCase()}`,
type: 'tunnel',
label: `${region.name} Tunnel`,
region: region.name,
entity: 'Sovereign Order of Hospitallers',
position: { x: region.x + 100, y: region.y },
metadata: {
tunnelId: `hospitallers-${region.name.toLowerCase()}-tunnel`,
networkRoute: `10.${10 + index}.0.0/16`,
},
}
const edges: TopologyEdge[] = [
{
id: `edge-cloudflare-${region.name.toLowerCase()}`,
source: 'cloudflare',
target: regionNode.id,
type: 'tunnel',
metadata: { bandwidth: '10Gbps' },
},
{
id: `edge-${region.name.toLowerCase()}-dc`,
source: regionNode.id,
target: datacenterNode.id,
type: 'network-route',
metadata: {},
},
{
id: `edge-${region.name.toLowerCase()}-tunnel`,
source: regionNode.id,
target: tunnelNode.id,
type: 'tunnel',
metadata: {},
},
]
topologies.push({
nodes: [cloudflareNode, vm137Node, vm136Node, regionNode, datacenterNode, tunnelNode],
edges: [
{
id: 'edge-cloudflare-vm137',
source: 'cloudflare',
target: 'vm-137',
type: 'tunnel',
metadata: {},
},
{
id: 'edge-vm137-vm136',
source: 'vm-137',
target: 'vm-136',
type: 'network-route',
metadata: {},
},
...edges,
],
region: region.name,
entity: 'Sovereign Order of Hospitallers',
lastUpdated: new Date().toISOString(),
})
})
return topologies
}
function main() {
fs.mkdirSync(DATA_DIR, { recursive: true }) // ensure the output directory exists
const topologies = generateTopologyData()
// Write individual topology files
topologies.forEach(topology => {
const filename = `topology-${topology.region.toLowerCase().replace(/\s+/g, '-')}.json`
const filepath = path.join(DATA_DIR, filename)
fs.writeFileSync(filepath, JSON.stringify(topology, null, 2))
console.log(`✓ Created ${filepath}`)
})
// Write combined topology file
const combinedFile = path.join(DATA_DIR, 'network_topology.json')
fs.writeFileSync(combinedFile, JSON.stringify(topologies, null, 2))
console.log(`✓ Created network_topology.json with ${topologies.length} regional topologies`)
}
if (require.main === module) {
main()
}

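A note on the `networkRoute` field in the topology generator above: each region is assigned a distinct /16 derived from its index (`10.${10 + index}.0.0/16`). The same arithmetic in plain shell (region names here are illustrative, not the repo's actual list):

```shell
#!/bin/bash
# Sketch of the per-region /16 allocation mirrored by the topology
# script's networkRoute metadata. Region names are placeholders.
index=0
for region in "US-East" "US-West"; do
  echo "${region} 10.$((10 + index)).0.0/16"
  index=$((index + 1))
done
# prints:
#   US-East 10.10.0.0/16
#   US-West 10.11.0.0/16
```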
309
scripts/install-ceph.sh Executable file
View File

@@ -0,0 +1,309 @@
#!/bin/bash
# install-ceph.sh
# Installs and configures Ceph on Proxmox nodes
set -euo pipefail
# Load environment variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
source "${SCRIPT_DIR}/../.env"
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Configuration
CEPH_VERSION="${CEPH_VERSION:-quincy}"
DEPLOYMENT_NODE="${DEPLOYMENT_NODE:-192.168.11.10}"
DEPLOYMENT_HOSTNAME="${DEPLOYMENT_HOSTNAME:-ml110-01}"
NODES=("192.168.11.10" "192.168.11.11")
NODE_HOSTNAMES=("ml110-01" "r630-01")
SSH_KEY="${SSH_KEY:-~/.ssh/sankofa_proxmox}"
log() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
check_requirements() {
log "Checking requirements..."
# Check SSH access
for node in "${NODES[@]}"; do
if ! ssh -i "${SSH_KEY}" -o ConnectTimeout=5 -o StrictHostKeyChecking=no root@"${node}" 'echo "SSH OK"' &>/dev/null; then
error "Cannot SSH to ${node}"
fi
done
# Check if ceph-deploy is installed
if ! command -v ceph-deploy &> /dev/null; then
warn "ceph-deploy not found, will install"
fi
}
install_ceph_deploy() {
log "Installing ceph-deploy..."
if command -v ceph-deploy &> /dev/null; then
info "ceph-deploy already installed"
return
fi
pip3 install ceph-deploy --break-system-packages || pip3 install ceph-deploy
}
prepare_nodes() {
log "Preparing nodes..."
for i in "${!NODES[@]}"; do
node="${NODES[$i]}"
hostname="${NODE_HOSTNAMES[$i]}"
log "Preparing ${hostname} (${node})..."
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${node}" << EOF
set -e
# Update system
apt update && apt upgrade -y
# Install prerequisites
apt install -y chrony python3-pip || true
# Configure hostname
hostnamectl set-hostname ${hostname}
# Update /etc/hosts
if ! grep -q "192.168.11.10 ml110-01" /etc/hosts; then
echo "192.168.11.10 ml110-01 ml110-01.sankofa.nexus" >> /etc/hosts
fi
if ! grep -q "192.168.11.11 r630-01" /etc/hosts; then
echo "192.168.11.11 r630-01 r630-01.sankofa.nexus" >> /etc/hosts
fi
# Sync time
systemctl enable chronyd || true
systemctl start chronyd || true
chronyd -q 'server time.nist.gov iburst' || true
# Add Ceph repository (using new method without apt-key)
mkdir -p /etc/apt/keyrings
wget -q -O /etc/apt/keyrings/ceph-release.asc 'https://download.ceph.com/keys/release.asc'
echo "deb [signed-by=/etc/apt/keyrings/ceph-release.asc] https://download.ceph.com/debian-${CEPH_VERSION}/ bullseye main" > /etc/apt/sources.list.d/ceph.list
# Update (ignore enterprise repo errors)
apt update || apt update --allow-releaseinfo-change || true
# Install Ceph
apt install -y ceph ceph-common ceph-mds || {
# If installation fails, try with no-subscription repo
echo "deb http://download.proxmox.com/debian/ceph-quincy bullseye no-subscription" > /etc/apt/sources.list.d/ceph-no-sub.list
apt update
apt install -y ceph ceph-common ceph-mds
}
# Create ceph user
if ! id ceph &>/dev/null; then
useradd -d /home/ceph -m -s /bin/bash ceph
echo "ceph ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/ceph
chmod 0440 /etc/sudoers.d/ceph
fi
EOF
done
}
setup_ssh_keys() {
log "Setting up SSH keys for ceph user..."
# Generate key on deployment node
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${DEPLOYMENT_NODE}" << EOF
set -e
su - ceph << 'CEPH_USER'
if [ ! -f ~/.ssh/id_rsa ]; then
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
fi
CEPH_USER
EOF
# Copy key to other nodes
PUB_KEY=$(ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${DEPLOYMENT_NODE}" 'cat /home/ceph/.ssh/id_rsa.pub')
for node in "${NODES[@]}"; do
if [ "${node}" != "${DEPLOYMENT_NODE}" ]; then
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${node}" << EOF
set -e
mkdir -p /home/ceph/.ssh
echo "${PUB_KEY}" >> /home/ceph/.ssh/authorized_keys
chown -R ceph:ceph /home/ceph/.ssh
chmod 700 /home/ceph/.ssh
chmod 600 /home/ceph/.ssh/authorized_keys
EOF
fi
done
}
initialize_cluster() {
log "Initializing Ceph cluster..."
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${DEPLOYMENT_NODE}" << EOF
set -e
su - ceph << 'CEPH_USER'
cd ~
mkdir -p ceph-cluster
cd ceph-cluster
# Create cluster configuration
ceph-deploy new ${NODE_HOSTNAMES[@]}
# Add configuration for 2-node setup
cat >> ceph.conf << 'CEPH_CONF'
[global]
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 128
osd pool default pgp num = 128
public network = 192.168.11.0/24
cluster network = 192.168.11.0/24
CEPH_CONF
# Install Ceph on all nodes
ceph-deploy install ${NODE_HOSTNAMES[@]}
# Create initial monitors
ceph-deploy mon create-initial
# Deploy admin key
ceph-deploy admin ${NODE_HOSTNAMES[@]}
# Set permissions
sudo chmod +r /etc/ceph/ceph.client.admin.keyring
CEPH_USER
EOF
}
add_osds() {
log "Adding OSDs..."
info "Using /dev/sdb on both nodes (unused disk)"
for i in "${!NODES[@]}"; do
node_ip="${NODES[$i]}"
node_hostname="${NODE_HOSTNAMES[$i]}"
log "Listing disks on ${node_hostname} (${node_ip})..."
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${node_ip}" 'lsblk -d -o NAME,SIZE,TYPE | grep -E "NAME|disk"'
DISK="/dev/sdb"
log "Creating OSD on ${node_hostname} using ${DISK}..."
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${DEPLOYMENT_NODE}" << EOF
set -e
su - ceph << 'CEPH_USER'
cd ~/ceph-cluster
# Zap disk
ceph-deploy disk zap ${node_hostname} ${DISK}
# Create OSD
ceph-deploy osd create --data ${DISK} ${node_hostname}
CEPH_USER
EOF
done
}
deploy_manager() {
log "Deploying Ceph Manager..."
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${DEPLOYMENT_NODE}" << EOF
set -e
su - ceph << 'CEPH_USER'
cd ~/ceph-cluster
# Deploy manager
ceph-deploy mgr create ${NODE_HOSTNAMES[@]}
CEPH_USER
EOF
}
verify_cluster() {
log "Verifying Ceph cluster..."
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${DEPLOYMENT_NODE}" << EOF
set -e
su - ceph << 'CEPH_USER'
cd ~/ceph-cluster
echo "=== Cluster Status ==="
ceph -s
echo ""
echo "=== OSD Tree ==="
ceph osd tree
echo ""
echo "=== Health ==="
ceph health
CEPH_USER
EOF
}
create_rbd_pool() {
log "Creating RBD pool for Proxmox..."
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${DEPLOYMENT_NODE}" << EOF
set -e
su - ceph << 'CEPH_USER'
cd ~/ceph-cluster
# Create RBD pool
ceph osd pool create rbd 128 128
# Initialize pool
rbd pool init rbd
echo "RBD pool created and initialized"
CEPH_USER
EOF
}
main() {
log "Starting Ceph installation..."
check_requirements
install_ceph_deploy
prepare_nodes
setup_ssh_keys
initialize_cluster
add_osds
deploy_manager
verify_cluster
create_rbd_pool
log "Ceph installation complete!"
info "Next steps:"
info " 1. Configure Proxmox storage pools"
info " 2. Enable Ceph dashboard"
info " 3. Set up monitoring"
}
if [ "${BASH_SOURCE[0]}" == "${0}" ]; then
main "$@"
fi

90
scripts/install-guest-agent-via-proxmox-console.sh Executable file
View File

@@ -0,0 +1,90 @@
#!/bin/bash
# install-guest-agent-via-proxmox-console.sh
# Instructions for installing guest agent via Proxmox console
set -euo pipefail
PROXMOX_1_HOST="192.168.11.10"
PROXMOX_2_HOST="192.168.11.11"
PROXMOX_PASS="L@kers2010"
SITE1_VMS="136 139 141 142 145 146 150 151"
SITE2_VMS="101 104 137 138 144 148"
echo "=========================================="
echo "Guest Agent Installation via Proxmox Console"
echo "=========================================="
echo ""
echo "Since VMs are not accessible via SSH, use Proxmox Web UI console:"
echo ""
echo "METHOD 1: Proxmox Web UI Console (Recommended)"
echo "-----------------------------------------------"
echo "1. Open Proxmox Web UI: https://192.168.11.10:8006 or https://192.168.11.11:8006"
echo "2. For each VM, click on the VM → Console"
echo "3. Login as 'admin' user"
echo "4. Run these commands:"
echo ""
echo " sudo apt-get update"
echo " sudo apt-get install -y qemu-guest-agent"
echo " sudo systemctl enable qemu-guest-agent"
echo " sudo systemctl start qemu-guest-agent"
echo " sudo systemctl status qemu-guest-agent"
echo ""
echo "METHOD 2: Check if Package Already Installed"
echo "-----------------------------------------------"
echo "Checking if qemu-guest-agent package is already installed..."
echo ""
check_package_installed() {
local host=$1
local vmid=$2
local vmname=$3
echo -n "VMID $vmid ($vmname): "
# Try to check via guest agent (if it becomes available)
result=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$host \
"qm guest exec $vmid -- 'dpkg -l | grep qemu-guest-agent' 2>&1" 2>/dev/null || echo "agent_not_running")
if echo "$result" | grep -q "qemu-guest-agent"; then
echo "✅ Package installed (but service may not be running)"
return 0
elif echo "$result" | grep -q "agent_not_running"; then
echo "⚠️ Cannot check (guest agent not running - chicken-and-egg)"
return 1
else
echo "❌ Package not found or cannot verify"
return 1
fi
}
echo "Site 1 (ml110-01):"
for vmid in $SITE1_VMS; do
name=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$PROXMOX_1_HOST \
"qm config $vmid | grep '^name:' | cut -d' ' -f2" || echo "unknown")
check_package_installed "$PROXMOX_1_HOST" "$vmid" "$name"
done
echo ""
echo "Site 2 (r630-01):"
for vmid in $SITE2_VMS; do
name=$(sshpass -p "$PROXMOX_PASS" ssh -o StrictHostKeyChecking=no root@$PROXMOX_2_HOST \
"qm config $vmid | grep '^name:' | cut -d' ' -f2" || echo "unknown")
check_package_installed "$PROXMOX_2_HOST" "$vmid" "$name"
done
echo ""
echo "=========================================="
echo "Installation Instructions"
echo "=========================================="
echo ""
echo "Since we cannot verify package installation without guest agent running,"
echo "and VMs are not accessible via SSH, please use Proxmox Web UI Console:"
echo ""
echo "1. Access Proxmox Web UI"
echo "2. Open each VM's console"
echo "3. Login and install/start guest agent"
echo ""
echo "OR wait for cloud-init to complete (may take 5-10 minutes after VM boot)"
echo ""

View File

@@ -0,0 +1,69 @@
#!/bin/bash
# Install qemu-guest-agent in VM 100 via Proxmox console
# This script provides commands to run on the Proxmox node
set -euo pipefail
VMID=100
PROXMOX_NODE="192.168.11.10"
echo "=========================================="
echo "Install Guest Agent in VM 100"
echo "=========================================="
echo ""
echo "Since qm guest exec doesn't work (guest agent not installed),"
echo "we need to access the VM via console or SSH."
echo ""
echo "Option 1: Via Proxmox Web Console"
echo "--------------------------------------"
echo "1. Open Proxmox web UI: https://$PROXMOX_NODE:8006"
echo "2. Navigate to: VM 100 -> Console"
echo "3. Login to the VM"
echo "4. Run these commands:"
echo ""
echo " sudo apt-get update"
echo " sudo apt-get install -y qemu-guest-agent"
echo " sudo systemctl enable qemu-guest-agent"
echo " sudo systemctl start qemu-guest-agent"
echo " sudo systemctl status qemu-guest-agent"
echo ""
echo "Option 2: Via SSH (if VM has network access)"
echo "--------------------------------------"
echo "1. Get VM IP address (if available):"
echo " ssh root@$PROXMOX_NODE 'qm config $VMID | grep net0'"
echo ""
echo "2. Try to find IP via ARP:"
echo " ssh root@$PROXMOX_NODE 'qm config $VMID | grep -oP \"mac=\\\\K[^,]+\" | xargs -I {} arp -a | grep {}'"
echo ""
echo "3. If IP found, SSH to VM:"
echo " ssh admin@<VM_IP>"
echo ""
echo "4. Then run the installation commands above"
echo ""
echo "Option 3: Via Proxmox Shell (qm terminal)"
echo "--------------------------------------"
echo "Note: qm terminal requires guest agent, so this won't work"
echo ""
echo "Option 4: Force Restart VM (if cloud-init should install it)"
echo "--------------------------------------"
echo "If VM was created with cloud-init that includes qemu-guest-agent,"
echo "a restart might trigger cloud-init to install it:"
echo ""
echo " ssh root@$PROXMOX_NODE 'qm shutdown $VMID'"
echo " # Wait for shutdown, or force stop:"
echo " ssh root@$PROXMOX_NODE 'qm stop $VMID'"
echo " ssh root@$PROXMOX_NODE 'qm start $VMID'"
echo ""
echo "=========================================="
echo "Verification"
echo "=========================================="
echo ""
echo "After installation, verify:"
echo ""
echo " ssh root@$PROXMOX_NODE 'qm guest exec $VMID -- systemctl status qemu-guest-agent'"
echo ""
echo "Or run the full check script:"
echo ""
echo " ssh root@$PROXMOX_NODE '/usr/local/bin/complete-vm-100-guest-agent-check.sh'"
echo ""

141
scripts/integrate-ceph-proxmox.sh Executable file
View File

@@ -0,0 +1,141 @@
#!/bin/bash
# integrate-ceph-proxmox.sh
# Integrates Ceph storage with Proxmox
set -euo pipefail
# Load environment variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
source "${SCRIPT_DIR}/../.env"
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Configuration
NODES=("192.168.11.10" "192.168.11.11")
NODE_HOSTNAMES=("ml110-01" "r630-01")
SSH_KEY="${SSH_KEY:-~/.ssh/sankofa_proxmox}"
CEPH_POOL="${CEPH_POOL:-rbd}"
CEPH_FS="${CEPH_FS:-cephfs}"
log() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
check_ceph() {
log "Checking Ceph cluster..."
if ! ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${NODES[0]}" 'ceph -s' &>/dev/null; then
error "Ceph cluster not accessible"
fi
info "Ceph cluster is accessible"
}
add_rbd_storage() {
log "Adding RBD storage to Proxmox..."
MON_HOSTS=$(IFS=','; echo "${NODES[*]}")
for i in "${!NODES[@]}"; do
node_ip="${NODES[$i]}"
node_hostname="${NODE_HOSTNAMES[$i]}"
log "Configuring RBD storage on ${node_hostname} (${node_ip})..."
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${node_ip}" << EOF
set -e
# Check if storage already exists
if pvesm status | grep -q "ceph-rbd"; then
warn "RBD storage already exists on ${node_hostname}"
else
# Add RBD storage
pvesm add rbd ceph-rbd \\
--pool ${CEPH_POOL} \\
--monhost ${MON_HOSTS} \\
--username admin \\
--content images,rootdir \\
--krbd 1
info "RBD storage added on ${node_hostname}"
fi
EOF
done
}
add_cephfs_storage() {
log "Adding CephFS storage to Proxmox..."
MON_HOSTS=$(IFS=','; echo "${NODES[*]}")
for i in "${!NODES[@]}"; do
node_ip="${NODES[$i]}"
node_hostname="${NODE_HOSTNAMES[$i]}"
log "Configuring CephFS storage on ${node_hostname} (${node_ip})..."
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${node_ip}" << EOF
set -e
# Check if storage already exists
if pvesm status | grep -q "ceph-fs"; then
warn "CephFS storage already exists on ${node_hostname}"
else
# Add CephFS storage
pvesm add cephfs ceph-fs \\
--monhost ${MON_HOSTS} \\
--username admin \\
--fsname ${CEPH_FS} \\
--content iso,backup,snippets
info "CephFS storage added on ${node_hostname}"
fi
EOF
done
}
verify_storage() {
log "Verifying storage configuration..."
for i in "${!NODES[@]}"; do
node_ip="${NODES[$i]}"
node_hostname="${NODE_HOSTNAMES[$i]}"
log "Storage on ${node_hostname} (${node_ip}):"
ssh -i "${SSH_KEY}" -o StrictHostKeyChecking=no root@"${node_ip}" 'pvesm status | grep ceph || echo "No Ceph storage found"'
done
}
main() {
log "Integrating Ceph with Proxmox..."
check_ceph
add_rbd_storage
add_cephfs_storage
verify_storage
log "Ceph-Proxmox integration complete!"
info "Storage pools are now available in Proxmox Web UI"
}
if [ "${BASH_SOURCE[0]}" == "${0}" ]; then
main "$@"
fi

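The `MON_HOSTS` construction used in both storage functions above relies on a bash idiom worth isolating: setting `IFS` inside a command-substitution subshell makes `"${NODES[*]}"` expand with commas instead of spaces, without leaking the changed `IFS` into the rest of the script.

```shell
#!/bin/bash
# Minimal sketch of the MON_HOSTS join: IFS only affects the "${arr[*]}"
# expansion inside the subshell, so the caller's IFS is untouched.
NODES=("192.168.11.10" "192.168.11.11")
MON_HOSTS=$(IFS=','; echo "${NODES[*]}")
echo "$MON_HOSTS"   # prints: 192.168.11.10,192.168.11.11
```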
161
scripts/k6-load-test.js Normal file
View File

@@ -0,0 +1,161 @@
// k6 Load Testing Configuration
// Comprehensive load test for Sankofa Phoenix API
// Usage: k6 run k6-load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend, Counter } from 'k6/metrics';
// Custom metrics
const errorRate = new Rate('errors');
const apiLatency = new Trend('api_latency');
const graphqlLatency = new Trend('graphql_latency');
const requestCount = new Counter('requests_total');
// Configuration
const API_URL = __ENV.API_URL || 'https://api.sankofa.nexus';
const TEST_DURATION = __ENV.TEST_DURATION || '5m';
const VUS = parseInt(__ENV.VUS || '10');
export const options = {
stages: [
// Ramp up to 10 VUs over 1 minute
{ duration: '1m', target: 10 },
// Stay at 10 VUs for 2 minutes
{ duration: '2m', target: 10 },
// Ramp up to 50 VUs over 2 minutes
{ duration: '2m', target: 50 },
// Stay at 50 VUs for 2 minutes
{ duration: '2m', target: 50 },
// Ramp up to 100 VUs over 2 minutes
{ duration: '2m', target: 100 },
// Stay at 100 VUs for 2 minutes
{ duration: '2m', target: 100 },
// Ramp down to 0 VUs over 2 minutes
{ duration: '2m', target: 0 },
],
thresholds: {
// 95% of requests should be below 200ms
'http_req_duration': ['p(95)<200', 'p(99)<500'],
// Error rate should be less than 1%
'http_req_failed': ['rate<0.01'],
'errors': ['rate<0.01'],
// API latency should be below 200ms
'api_latency': ['p(95)<200'],
// GraphQL latency should be below 300ms
'graphql_latency': ['p(95)<300'],
},
};
// Test data
const graphqlQueries = [
{
query: '{ __typename }',
},
{
query: `
query {
me {
id
email
name
}
}
`,
},
{
query: `
query {
sites {
id
name
region
status
}
}
`,
},
];
// Helper function to get random query
function getRandomQuery() {
return graphqlQueries[Math.floor(Math.random() * graphqlQueries.length)];
}
export default function () {
// Test 1: Health endpoint
const healthStart = Date.now();
const healthRes = http.get(`${API_URL}/health`, {
tags: { name: 'HealthCheck' },
});
const healthDuration = Date.now() - healthStart;
const healthCheck = check(healthRes, {
'health status is 200': (r) => r.status === 200,
'health response time < 100ms': (r) => r.timings.duration < 100,
});
errorRate.add(!healthCheck);
apiLatency.add(healthDuration);
requestCount.add(1, { endpoint: 'health' });
sleep(0.5);
// Test 2: GraphQL endpoint
const graphqlQuery = getRandomQuery();
const graphqlStart = Date.now();
const graphqlRes = http.post(
`${API_URL}/graphql`,
JSON.stringify(graphqlQuery),
{
headers: { 'Content-Type': 'application/json' },
tags: { name: 'GraphQL' },
}
);
const graphqlDuration = Date.now() - graphqlStart;
const graphqlCheck = check(graphqlRes, {
'graphql status is 200': (r) => r.status === 200,
'graphql response time < 300ms': (r) => r.timings.duration < 300,
'graphql has data': (r) => {
try {
const body = JSON.parse(r.body);
return body.data !== undefined && body.errors === undefined;
} catch (e) {
return false;
}
},
});
errorRate.add(!graphqlCheck);
graphqlLatency.add(graphqlDuration);
requestCount.add(1, { endpoint: 'graphql' });
sleep(1);
}
export function handleSummary(data) {
return {
'stdout': textSummary(data, { indent: ' ', enableColors: true }),
'summary.json': JSON.stringify(data),
};
}
function textSummary(data, options) {
const indent = options.indent || '';
const enableColors = options.enableColors || false;
let summary = '';
summary += `${indent}Test Summary\n`;
summary += `${indent}============\n\n`;
// Metrics summary
summary += `${indent}Metrics:\n`;
summary += `${indent} - Total Requests: ${data.metrics.requests_total.values.count}\n`;
summary += `${indent} - Failed Requests: ${(data.metrics.http_req_failed.values.rate * 100).toFixed(2)}%\n`;
summary += `${indent} - p95 Latency: ${data.metrics.http_req_duration.values['p(95)'].toFixed(1)}ms\n`;
summary += `${indent} - p99 Latency: ${data.metrics.http_req_duration.values['p(99)'].toFixed(1)}ms\n`;
return summary;
}

91
scripts/kill-stuck-proxmox-tasks.sh Executable file
View File

@@ -0,0 +1,91 @@
#!/bin/bash
# Kill stuck Proxmox task processes for a specific VM
# Usage: bash kill-stuck-proxmox-tasks.sh <VMID>
VMID="${1:-100}"
if [ -z "$VMID" ]; then
echo "Usage: $0 <VMID>"
echo "Example: $0 100"
exit 1
fi
echo "=== Killing Stuck Proxmox Tasks for VM $VMID ==="
echo ""
# 1. Find all task processes for this VM
echo "1. Finding stuck processes..."
TASK_PROCS=$(ps aux | grep -E "task.*$VMID|qm.*$VMID|qemu.*$VMID" | grep -v grep)
if [ -z "$TASK_PROCS" ]; then
echo " ✅ No stuck processes found"
else
echo " Found processes:"
echo "$TASK_PROCS" | while read line; do
PID=$(echo "$line" | awk '{print $2}')
CMD=$(echo "$line" | awk '{for(i=11;i<=NF;i++) printf "%s ", $i; print ""}')
echo " PID $PID: $CMD"
done
echo ""
# Extract PIDs
PIDS=$(echo "$TASK_PROCS" | awk '{print $2}' | tr '\n' ' ')
# 2. Kill all task processes
echo "2. Killing stuck processes..."
for PID in $PIDS; do
if kill -9 "$PID" 2>/dev/null; then
echo " ✅ Killed PID $PID"
else
echo " ⚠️ Failed to kill PID $PID (may already be gone)"
fi
done
echo ""
fi
# 3. Also try pkill for any remaining processes
echo "3. Cleaning up any remaining processes..."
pkill -9 -f "task.*$VMID" 2>/dev/null && echo " ✅ Killed remaining task processes" || echo " No remaining task processes"
pkill -9 -f "qm.*$VMID" 2>/dev/null && echo " ✅ Killed remaining qm processes" || echo " No remaining qm processes"
sleep 2
echo ""
# 4. Verify no processes remain
echo "4. Verifying no processes remain..."
REMAINING=$(ps aux | grep -E "task.*$VMID|qm.*$VMID|qemu.*$VMID" | grep -v grep)
if [ -z "$REMAINING" ]; then
echo " ✅ No processes remaining"
else
echo " ⚠️ Some processes still running:"
echo "$REMAINING"
fi
echo ""
# 5. Remove lock file
echo "5. Removing lock file..."
if [ -f "/var/lock/qemu-server/lock-$VMID.conf" ]; then
rm -f "/var/lock/qemu-server/lock-$VMID.conf"
if [ ! -f "/var/lock/qemu-server/lock-$VMID.conf" ]; then
echo " ✅ Lock file removed"
else
echo " ⚠️ Failed to remove lock file"
exit 1
fi
else
echo " Lock file already removed"
fi
echo ""
# 6. Final verification
echo "6. Final status check..."
echo " Lock file: $([ ! -f "/var/lock/qemu-server/lock-$VMID.conf" ] && echo "✅ Removed" || echo "⚠️ Still exists")"
echo " Processes: $([ -z "$(ps aux | grep -E 'task.*'$VMID'|qm.*'$VMID'|qemu.*'$VMID | grep -v grep)" ] && echo "✅ None" || echo "⚠️ Some remain")"
echo ""
echo "=== Cleanup Complete ==="
echo ""
echo "Next steps:"
echo "1. Try unlock: qm unlock $VMID"
echo "2. Check VM status: qm status $VMID"
echo "3. Check VM config: qm config $VMID"

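The cleanup script above repeatedly pulls the PID out of `ps aux` output with `awk '{print $2}'`. A self-contained illustration of that extraction (the sample line is fabricated, not real Proxmox output):

```shell
#!/bin/bash
# The second whitespace-separated field of a `ps aux` line is the PID;
# awk splits on runs of whitespace, so column alignment doesn't matter.
ps_line="root      1234  0.0  0.1  12345  678 ?  Ss  10:00  0:00 task UPID:pve1"
pid=$(echo "$ps_line" | awk '{print $2}')
echo "$pid"   # prints: 1234
```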
131
scripts/list-proxmox-images.sh Executable file
View File

@@ -0,0 +1,131 @@
#!/bin/bash
# list-proxmox-images.sh
# Lists all ISO and disk images available on Proxmox instances
set -euo pipefail
# Load environment variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
set -a
source <(grep -v '^#' "${SCRIPT_DIR}/../.env" | grep -v '^$')
set +a
fi
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
NODE1_IP="192.168.11.10"
NODE1_NAME="ML110-01"
NODE1_TOKEN="${PROXMOX_TOKEN_ML110_01:-}"
NODE2_IP="192.168.11.11"
NODE2_NAME="R630-01"
NODE2_TOKEN="${PROXMOX_TOKEN_R630_01:-}"
log() {
echo -e "${GREEN}[INFO]${NC} $1"
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
list_images() {
local endpoint=$1
local token=$2
local node_name=$3
echo ""
info "=== ${node_name} Available Images ==="
echo ""
# Get all storage pools
local storages=$(curl -k -s -H "Authorization: PVEAPIToken ${token}" \
"${endpoint}/api2/json/storage" 2>/dev/null | \
jq -r '.data[]? | select(.content | contains("iso") or contains("vztmpl")) | .storage')
if [ -z "$storages" ]; then
warn "No storage pools found with ISO or template content"
return
fi
local has_images=false
for storage in $storages; do
local content=$(curl -k -s -H "Authorization: PVEAPIToken ${token}" \
"${endpoint}/api2/json/storage/${storage}/content" 2>/dev/null)
# Check for ISO files
local isos=$(echo "$content" | jq -r '.data[]? | select(.content == "iso") | "\(.volid) | \(.size | tonumber / 1024 / 1024 / 1024 | floor)GB"')
if [ -n "$isos" ]; then
has_images=true
echo "📀 ISO Images in ${storage}:"
echo "$isos" | while IFS='|' read -r volid size; do
echo "${volid} (${size})"
done
echo ""
fi
# Check for container templates
local templates=$(echo "$content" | jq -r '.data[]? | select(.content == "vztmpl") | "\(.volid)|\(.size | tonumber / 1024 / 1024 / 1024 | floor)GB"')
if [ -n "$templates" ]; then
has_images=true
echo "📦 Container Templates in ${storage}:"
echo "$templates" | while IFS='|' read -r volid size; do
echo "${volid} (${size})"
done
echo ""
fi
# Check for disk images (raw, qcow2, etc.)
local disk_images=$(echo "$content" | jq -r '.data[]? | select(.volid | contains(".img") or contains(".raw") or contains(".qcow2")) | "\(.volid)|\(.size | tonumber / 1024 / 1024 / 1024 | floor)GB"')
if [ -n "$disk_images" ]; then
has_images=true
echo "💾 Disk Images in ${storage}:"
echo "$disk_images" | while IFS='|' read -r volid size; do
echo "${volid} (${size})"
done
echo ""
fi
done
if [ "$has_images" = false ]; then
warn "No ISO files, templates, or disk images found"
fi
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Proxmox Image Inventory ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
if [ -z "$NODE1_TOKEN" ] || [ -z "$NODE2_TOKEN" ]; then
warn "Proxmox API tokens not found in .env file"
exit 1
fi
list_images "https://${NODE1_IP}:8006" "${NODE1_TOKEN}" "${NODE1_NAME}"
list_images "https://${NODE2_IP}:8006" "${NODE2_TOKEN}" "${NODE2_NAME}"
echo ""
info "Note: This script lists images available via API."
info "Some images may require additional permissions to list."
echo ""
}
main "$@"

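The GB figures printed by list-proxmox-images.sh come from integer division of the API's byte counts inside jq. The same arithmetic works in plain shell, which is handy for spot-checking a size without jq (the byte count below is an illustrative value):

```shell
#!/bin/bash
# Bytes -> whole GiB by integer division, matching the jq
# `tonumber / 1024 / 1024 / 1024 | floor` expression above.
size_bytes=34359738368   # 32 GiB, sample value
size_gb=$(( size_bytes / 1024 / 1024 / 1024 ))
echo "${size_gb}GB"   # prints: 32GB
```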
20
scripts/load-env.sh Executable file
View File

@@ -0,0 +1,20 @@
#!/bin/bash
# load-env.sh
# Helper script to load environment variables from .env file
if [ -f .env ]; then
# Export variables from .env file
# This handles comments and empty lines
set -a
source <(grep -v '^#' .env | grep -v '^$')
set +a
# Also set CLOUDFLARE_API_TOKEN from Global API Key if not set
if [ -z "${CLOUDFLARE_API_TOKEN:-}" ] && [ -n "${CLOUDFLARE_API_KEY:-}" ] && [ -n "${CLOUDFLARE_EMAIL:-}" ]; then
# For scripts that need API Token, we can use Global API Key + Email
# Some scripts may need the token format, so we'll keep both
export CLOUDFLARE_API_KEY
export CLOUDFLARE_EMAIL
fi
fi

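The core of load-env.sh is the `set -a` (allexport) pattern: while allexport is on, every plain `KEY=value` assignment in a sourced file is exported automatically, so no `export` prefix is needed. A minimal sketch using a throwaway file:

```shell
#!/bin/bash
# With allexport on, sourcing a file of KEY=value lines exports each
# variable to child processes; comment lines are valid shell as-is.
env_file=$(mktemp)
printf '# comment\nFOO=bar\n' > "$env_file"
set -a
. "$env_file"
set +a
bash -c 'echo "$FOO"'   # prints: bar
rm -f "$env_file"
```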
70
scripts/monitor-vm-deletion.sh Executable file
View File

@@ -0,0 +1,70 @@
#!/bin/bash
# Monitor VM deletion progress
PROXMOX_PASS="L@kers2010"
SITE1="192.168.11.10"
SITE2="192.168.11.11"
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
echo "=========================================="
echo "VM Deletion Progress Monitor"
echo "=========================================="
echo ""
echo -e "${CYAN}Site 1 (ml110-01) - ${SITE1}${NC}"
echo "VMs remaining:"
VM_COUNT1=$(sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${SITE1} "qm list 2>/dev/null | tail -n +2 | wc -l")
if [ "${VM_COUNT1}" -eq 0 ]; then
echo -e "${GREEN} ✅ All VMs deleted${NC}"
else
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${SITE1} "qm list 2>/dev/null | tail -n +2"
echo -e "${YELLOW}${VM_COUNT1} VMs remaining${NC}"
fi
PROCESS_COUNT1=$(sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${SITE1} "ps aux | grep 'qm destroy' | grep -v grep | wc -l" 2>/dev/null || echo "0")
if [ "${PROCESS_COUNT1}" -gt 0 ]; then
echo -e "${BLUE} 🔄 ${PROCESS_COUNT1} deletion processes running${NC}"
fi
echo ""
echo -e "${CYAN}Site 2 (r630-01) - ${SITE2}${NC}"
echo "VMs remaining:"
VM_COUNT2=$(sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${SITE2} "qm list 2>/dev/null | tail -n +2 | wc -l")
if [ "${VM_COUNT2}" -eq 0 ]; then
echo -e "${GREEN} ✅ All VMs deleted${NC}"
else
sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${SITE2} "qm list 2>/dev/null | tail -n +2"
echo -e "${YELLOW}${VM_COUNT2} VMs remaining${NC}"
fi
PROCESS_COUNT2=$(sshpass -p "${PROXMOX_PASS}" ssh -o StrictHostKeyChecking=no root@${SITE2} "ps aux | grep 'qm destroy' | grep -v grep | wc -l" 2>/dev/null || echo "0")
if [ "${PROCESS_COUNT2}" -gt 0 ]; then
echo -e "${BLUE} 🔄 ${PROCESS_COUNT2} deletion processes running${NC}"
fi
echo ""
TOTAL_VMS=$((VM_COUNT1 + VM_COUNT2))
TOTAL_PROCESSES=$((PROCESS_COUNT1 + PROCESS_COUNT2))
echo "=========================================="
echo "Summary:"
echo " Total VMs remaining: ${TOTAL_VMS}"
echo " Total deletion processes: ${TOTAL_PROCESSES}"
echo "=========================================="
if [ "${TOTAL_VMS}" -eq 0 ]; then
echo -e "${GREEN}✅ All VMs deleted successfully!${NC}"
exit 0
elif [ "${TOTAL_PROCESSES}" -gt 0 ]; then
echo -e "${BLUE}⏳ Deletion in progress...${NC}"
exit 1
else
echo -e "${YELLOW}⚠️ No deletion processes running${NC}"
exit 2
fi

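The monitor above signals progress to callers through exit codes (0 = all VMs deleted, 1 = deletions still running, 2 = VMs remain but nothing is running). That decision logic, isolated into a function for clarity (the function name is ours, not the script's):

```shell
#!/bin/bash
# Exit-code convention used by monitor-vm-deletion.sh, as a function.
deletion_status() {
  local vms=$1 procs=$2
  if [ "$vms" -eq 0 ]; then
    return 0          # all VMs deleted
  elif [ "$procs" -gt 0 ]; then
    return 1          # deletion in progress
  else
    return 2          # VMs remain but no deletion processes running
  fi
}
for args in "0 0" "3 2" "3 0"; do
  set -- $args
  deletion_status "$1" "$2" && rc=0 || rc=$?
  echo "vms=$1 procs=$2 -> exit $rc"
done
# prints:
#   vms=0 procs=0 -> exit 0
#   vms=3 procs=2 -> exit 1
#   vms=3 procs=0 -> exit 2
```

A caller can loop on this in `watch` or a cron job and stop polling once the script exits 0.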
190
scripts/performance-test.sh Executable file
View File

@@ -0,0 +1,190 @@
#!/bin/bash
# Performance Testing Script
# Uses k6, Apache Bench, or curl for load testing
set -e
# Configuration
API_URL="${API_URL:-https://api.sankofa.nexus}"
PORTAL_URL="${PORTAL_URL:-https://portal.sankofa.nexus}"
TEST_DURATION="${TEST_DURATION:-60s}"
VUS="${VUS:-10}" # Virtual Users
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Check for k6
if command -v k6 &> /dev/null; then
log_info "Using k6 for load testing"
USE_K6=true
elif command -v ab &> /dev/null; then
log_info "Using Apache Bench for load testing"
USE_K6=false
USE_AB=true
else
log_error "Neither k6 nor Apache Bench found. Install k6: https://k6.io/docs/getting-started/installation/"
log_info "Falling back to a basic curl-based test."
USE_K6=false
USE_AB=false
fi
# Create k6 test script
create_k6_script() {
cat > /tmp/k6_test.js << 'EOF'
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';
const errorRate = new Rate('errors');
export const options = {
stages: [
{ duration: '30s', target: 10 }, // Ramp up
{ duration: '1m', target: 10 }, // Stay at 10 VUs
{ duration: '30s', target: 20 }, // Ramp up to 20
{ duration: '1m', target: 20 }, // Stay at 20 VUs
{ duration: '30s', target: 0 }, // Ramp down
],
thresholds: {
'http_req_duration': ['p(95)<200'], // 95% of requests should be below 200ms
'http_req_failed': ['rate<0.01'], // Error rate should be less than 1%
'errors': ['rate<0.01'],
},
};
const API_URL = __ENV.API_URL || 'https://api.sankofa.nexus';
export default function () {
// Test health endpoint
const healthRes = http.get(`${API_URL}/health`);
const healthCheck = check(healthRes, {
'health status is 200': (r) => r.status === 200,
'health response time < 100ms': (r) => r.timings.duration < 100,
});
errorRate.add(!healthCheck);
// Test GraphQL endpoint
const graphqlPayload = JSON.stringify({
query: '{ __typename }',
});
const graphqlRes = http.post(`${API_URL}/graphql`, graphqlPayload, {
headers: { 'Content-Type': 'application/json' },
});
const graphqlCheck = check(graphqlRes, {
'graphql status is 200': (r) => r.status === 200,
'graphql response time < 200ms': (r) => r.timings.duration < 200,
'graphql has data': (r) => JSON.parse(r.body).data !== undefined,
});
errorRate.add(!graphqlCheck);
sleep(1);
}
EOF
}
# Run k6 test
run_k6_test() {
log_info "Starting k6 load test..."
log_info "Duration: ${TEST_DURATION}"
log_info "Virtual Users: ${VUS}"
log_info "API URL: ${API_URL}"
echo ""
k6 run --env API_URL="${API_URL}" \
--duration "${TEST_DURATION}" \
--vus "${VUS}" \
/tmp/k6_test.js
}
# Run Apache Bench test
run_ab_test() {
log_info "Starting Apache Bench load test..."
log_info "Requests: 1000"
log_info "Concurrency: ${VUS}"
log_info "API URL: ${API_URL}"
echo ""
# Test health endpoint
log_info "Testing /health endpoint..."
ab -n 1000 -c "${VUS}" -k "${API_URL}/health"
echo ""
log_info "Testing GraphQL endpoint..."
echo '{"query": "{ __typename }"}' > /tmp/graphql_payload.json
ab -n 1000 -c "${VUS}" -k -p /tmp/graphql_payload.json \
-T 'application/json' \
"${API_URL}/graphql"
}
# Simple curl-based test
run_curl_test() {
log_info "Running simple curl-based performance test..."
log_info "This is a basic test. For comprehensive testing, install k6."
echo ""
local total_time=0
local success_count=0
local fail_count=0
local iterations=100
for i in $(seq 1 $iterations); do
start=$(date +%s%N)
if curl -sf "${API_URL}/health" > /dev/null; then
end=$(date +%s%N)
duration=$(( (end - start) / 1000000 )) # Convert to milliseconds
total_time=$((total_time + duration))
success_count=$((success_count + 1))  # avoid ((var++)): it returns nonzero under set -e when var is 0
else
fail_count=$((fail_count + 1))
fi
done
if [ $success_count -gt 0 ]; then
avg_time=$((total_time / success_count))
echo "Results:"
echo " Success: ${success_count}/${iterations}"
echo " Failed: ${fail_count}/${iterations}"
echo " Average response time: ${avg_time}ms"
if [ $avg_time -lt 200 ]; then
log_info "✓ Average response time is below 200ms target"
else
log_error "✗ Average response time exceeds 200ms target"
fi
else
log_error "All requests failed!"
exit 1
fi
}
# Main execution
main() {
echo "=========================================="
echo "Sankofa Phoenix Performance Test"
echo "=========================================="
echo ""
if [ "$USE_K6" = true ]; then
create_k6_script
run_k6_test
elif [ "$USE_AB" = true ]; then
run_ab_test
else
run_curl_test
fi
echo ""
log_info "Performance test completed!"
}
main "$@"


@@ -0,0 +1,251 @@
#!/bin/bash
# Pre-deployment quota check for VM deployments
# Checks quota before applying VM manifests
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
VM_DIR="${PROJECT_ROOT}/examples/production"
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
BLUE='\033[0;34m'
NC='\033[0m'
log() {
echo -e "${BLUE}[$(date +'%H:%M:%S')]${NC} $*"
}
log_success() {
echo -e "${GREEN}[$(date +'%H:%M:%S')] ✅${NC} $*"
}
log_warning() {
echo -e "${YELLOW}[$(date +'%H:%M:%S')] ⚠️${NC} $*"
}
log_error() {
echo -e "${RED}[$(date +'%H:%M:%S')] ❌${NC} $*"
}
# Load environment
load_env() {
if [ -f "${PROJECT_ROOT}/.env" ]; then
source "${PROJECT_ROOT}/.env"
fi
SANKOFA_API_URL="${SANKOFA_API_URL:-http://localhost:4000/graphql}"
SANKOFA_API_TOKEN="${SANKOFA_API_TOKEN:-}"
}
# Extract VM resources from YAML
extract_vm_resources() {
local file=$1
local cpu=$(grep "cpu:" "$file" | head -1 | sed 's/.*cpu: *\([0-9]*\).*/\1/')
local memory=$(grep "memory:" "$file" | head -1 | sed 's/.*memory: *"\(.*\)".*/\1/')
local disk=$(grep "disk:" "$file" | head -1 | sed 's/.*disk: *"\(.*\)".*/\1/')
local tenant=$(grep "tenant.sankofa.nexus/id:" "$file" | head -1 | sed 's/.*tenant.sankofa.nexus\/id: *"\(.*\)".*/\1/' || echo "")
echo "${cpu}|${memory}|${disk}|${tenant}"
}
# Convert memory to GB
memory_to_gb() {
local memory=$1
if [[ "$memory" =~ Gi$ ]]; then
echo "${memory%Gi}"
elif [[ "$memory" =~ Mi$ ]]; then
echo "scale=2; ${memory%Mi} / 1024" | bc
else
echo "$memory"
fi
}
# Convert disk to GB
disk_to_gb() {
local disk=$1
if [[ "$disk" =~ Gi$ ]]; then
echo "${disk%Gi}"
elif [[ "$disk" =~ Ti$ ]]; then
echo "scale=2; ${disk%Ti} * 1024" | bc
else
echo "$disk"
fi
}
# Check quota via API
check_quota_api() {
local tenant_id=$1
local cpu=$2
local memory_gb=$3
local disk_gb=$4
if [[ -z "$SANKOFA_API_TOKEN" ]]; then
log_warning "SANKOFA_API_TOKEN not set, skipping API quota check"
return 0
fi
local query=$(cat <<EOF
mutation {
checkQuota(
tenantId: "$tenant_id"
resourceRequest: {
compute: {
vcpu: $cpu
memory: $memory_gb
instances: 1
}
storage: {
size: $disk_gb
}
}
) {
allowed
exceeded
current {
compute {
instances
vcpu
memory
}
storage {
total
}
}
limits {
compute {
instances
vcpu
memory
}
storage {
total
}
}
}
}
EOF
)
# jq-encode the multi-line query: raw newlines are invalid inside a JSON
# string, so interpolating $query directly would produce a malformed payload
local payload=$(jq -n --arg q "$query" '{query: $q}')
local response=$(curl -s -X POST "$SANKOFA_API_URL" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $SANKOFA_API_TOKEN" \
-d "$payload")
local allowed=$(echo "$response" | jq -r '.data.checkQuota.allowed // false')
local exceeded=$(echo "$response" | jq -r '.data.checkQuota.exceeded // []' | jq -r '.[]' | tr '\n' ',' | sed 's/,$//')
if [[ "$allowed" == "true" ]]; then
log_success "Quota check passed for tenant: $tenant_id"
return 0
else
log_error "Quota check failed for tenant: $tenant_id"
if [[ -n "$exceeded" ]]; then
log_error "Exceeded resources: $exceeded"
fi
return 1
fi
}
# Check Proxmox resources
check_proxmox_resources() {
local file=$1
local cpu=$2
local memory_gb=$3
local disk_gb=$4
log "Checking Proxmox resources..."
# Run the quota check script
if [[ -f "${SCRIPT_DIR}/check-proxmox-quota-ssh.sh" ]]; then
"${SCRIPT_DIR}/check-proxmox-quota-ssh.sh" > /tmp/quota-check-output.txt 2>&1 || true
# Parse output (simplified - would need more robust parsing)
log "Proxmox resource check completed (output: /tmp/quota-check-output.txt)"
else
log_warning "Proxmox quota check script not found"
fi
}
# Check single VM file
check_vm_file() {
local file=$1
log "Checking: $file"
local resources=$(extract_vm_resources "$file")
IFS='|' read -r cpu memory disk tenant <<< "$resources"
if [[ -z "$cpu" || -z "$memory" || -z "$disk" ]]; then
log_error "$file: Missing resource specifications"
return 1
fi
local memory_gb=$(memory_to_gb "$memory")
local disk_gb=$(disk_to_gb "$disk")
log "Resources: CPU=$cpu, Memory=${memory_gb}GB, Disk=${disk_gb}GB, Tenant=${tenant:-none}"
# Check quota if tenant is specified
if [[ -n "$tenant" && "$tenant" != "infrastructure" ]]; then
check_quota_api "$tenant" "$cpu" "$memory_gb" "$disk_gb" || return 1
else
log "Infrastructure VM - skipping tenant quota check"
check_proxmox_resources "$file" "$cpu" "$memory_gb" "$disk_gb"
fi
return 0
}
# Main function
main() {
local files_to_check=("$@")
load_env
log "Pre-deployment quota check"
echo
if [[ ${#files_to_check[@]} -eq 0 ]]; then
log "No files specified. Checking all VM files..."
while IFS= read -r -d '' file; do
files_to_check+=("$file")
done < <(find "$VM_DIR" -name "*.yaml" -type f -print0)
fi
local total_files=${#files_to_check[@]}
local passed=0
local failed=0
for file in "${files_to_check[@]}"; do
if check_vm_file "$file"; then
passed=$((passed + 1))
else
failed=$((failed + 1))
fi
echo
done
# Summary
echo
log "=== Quota Check Summary ==="
echo "Files checked: $total_files"
echo "Passed: $passed"
echo "Failed: $failed"
echo
if [[ $failed -eq 0 ]]; then
log_success "All quota checks passed!"
return 0
else
log_error "$failed quota check(s) failed. Do not proceed with deployment."
return 1
fi
}
main "$@"


@@ -0,0 +1,426 @@
#!/usr/bin/env python3
"""
Proxmox Review and Deployment Planning Script
Connects to both Proxmox instances, reviews configurations, checks status,
and generates a deployment plan with detailed task list.
"""
import os
import sys
import json
import requests
from datetime import datetime
from pathlib import Path
from typing import List, Optional
# Try to import proxmoxer, but fall back to direct API calls if not available
try:
from proxmoxer import ProxmoxAPI
PROXMOXER_AVAILABLE = True
except ImportError:
PROXMOXER_AVAILABLE = False
print("Warning: proxmoxer not installed. Using direct API calls.")
print("Install with: pip install proxmoxer")
# Colors for terminal output
class Colors:
RED = '\033[0;31m'
GREEN = '\033[0;32m'
YELLOW = '\033[1;33m'
BLUE = '\033[0;34m'
CYAN = '\033[0;36m'
NC = '\033[0m' # No Color
def log(message: str, color: str = Colors.BLUE):
"""Print a log message with timestamp."""
timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
print(f"{color}[{timestamp}]{Colors.NC} {message}")
def log_success(message: str):
"""Print a success message."""
log(f"✅ {message}", Colors.GREEN)
def log_warning(message: str):
"""Print a warning message."""
log(f"⚠️ {message}", Colors.YELLOW)
def log_error(message: str):
"""Print an error message."""
log(f"❌ {message}", Colors.RED)
class ProxmoxClient:
"""Proxmox API client."""
def __init__(self, api_url: str, username: str = None, password: str = None,
token: str = None, verify_ssl: bool = True):
self.api_url = api_url.rstrip('/')
self.username = username
self.password = password
self.token = token
self.verify_ssl = verify_ssl
self.session = requests.Session()
self.ticket = None
self.csrf_token = None
if not verify_ssl:
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
def authenticate(self) -> bool:
"""Authenticate to Proxmox API."""
if self.token:
# API token authentication: Proxmox expects
# "Authorization: PVEAPIToken=USER@REALM!TOKENID=UUID",
# not a PVEAuthCookie (cookies are only for ticket-based sessions)
self.session.headers['Authorization'] = f'PVEAPIToken={self.token}'
return True
if not self.username or not self.password:
log_error("Username/password or token required")
return False
# Password authentication
auth_url = f"{self.api_url}/api2/json/access/ticket"
try:
response = self.session.post(
auth_url,
data={'username': self.username, 'password': self.password},
verify=self.verify_ssl,
timeout=10
)
response.raise_for_status()
data = response.json()['data']
self.ticket = data['ticket']
self.csrf_token = data.get('CSRFPreventionToken', '')
self.session.headers['Cookie'] = f'PVEAuthCookie={self.ticket}'
if self.csrf_token:
self.session.headers['CSRFPreventionToken'] = self.csrf_token
return True
except Exception as e:
log_error(f"Authentication failed: {e}")
return False
def api_call(self, endpoint: str, method: str = 'GET', data: dict = None) -> Optional[dict]:
"""Make an API call to Proxmox."""
url = f"{self.api_url}/api2/json{endpoint}"
try:
if method == 'GET':
response = self.session.get(url, verify=self.verify_ssl, timeout=10)
elif method == 'POST':
response = self.session.post(url, json=data, verify=self.verify_ssl, timeout=10)
elif method == 'PUT':
response = self.session.put(url, json=data, verify=self.verify_ssl, timeout=10)
elif method == 'DELETE':
response = self.session.delete(url, verify=self.verify_ssl, timeout=10)
else:
log_error(f"Unsupported HTTP method: {method}")
return None
response.raise_for_status()
return response.json().get('data')
except Exception as e:
log_error(f"API call failed ({endpoint}): {e}")
return None
def get_version(self) -> Optional[dict]:
"""Get Proxmox version information."""
return self.api_call('/version')
def get_cluster_status(self) -> Optional[List[dict]]:
"""Get cluster status."""
return self.api_call('/cluster/status')
def get_nodes(self) -> Optional[List[dict]]:
"""Get list of nodes."""
return self.api_call('/nodes')
def get_node_status(self, node: str) -> Optional[dict]:
"""Get status of a specific node."""
return self.api_call(f'/nodes/{node}/status')
def get_storage(self) -> Optional[List[dict]]:
"""Get storage information."""
return self.api_call('/storage')
def get_vms(self, node: str) -> Optional[List[dict]]:
"""Get VMs on a node."""
return self.api_call(f'/nodes/{node}/qemu')
def get_networks(self, node: str) -> Optional[List[dict]]:
"""Get network configuration for a node."""
return self.api_call(f'/nodes/{node}/network')
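# Helper sketch (illustrative, not part of the original script): assemble
# the Authorization header value for Proxmox API token auth. Proxmox
# expects "PVEAPIToken=USER@REALM!TOKENID=UUID"; factoring the assembly
# into a pure function makes the format easy to verify in isolation.
def pve_token_header(user: str, token_id: str, secret: str) -> str:
    """Build a Proxmox API token Authorization header value."""
    return f"PVEAPIToken={user}!{token_id}={secret}"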
def load_environment():
"""Load environment variables."""
env_file = Path(__file__).parent.parent / '.env'
env_vars = {}
if env_file.exists():
with open(env_file) as f:
for line in f:
line = line.strip()
if line and not line.startswith('#') and '=' in line:
key, value = line.split('=', 1)
env_vars[key.strip()] = value.strip()
# Set defaults
config = {
'proxmox_1': {
'api_url': env_vars.get('PROXMOX_1_API_URL', 'https://192.168.11.10:8006'),
'user': env_vars.get('PROXMOX_1_USER', 'root'),
'password': env_vars.get('PROXMOX_1_PASS', ''),
'token': env_vars.get('PROXMOX_1_API_TOKEN', ''),
'verify_ssl': env_vars.get('PROXMOX_1_INSECURE_SKIP_TLS_VERIFY', 'false').lower() != 'true'
},
'proxmox_2': {
'api_url': env_vars.get('PROXMOX_2_API_URL', 'https://192.168.11.11:8006'),
'user': env_vars.get('PROXMOX_2_USER', 'root'),
'password': env_vars.get('PROXMOX_2_PASS', ''),
'token': env_vars.get('PROXMOX_2_API_TOKEN', ''),
'verify_ssl': env_vars.get('PROXMOX_2_INSECURE_SKIP_TLS_VERIFY', 'false').lower() != 'true'
}
}
return config
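# A dependency-free sketch of the KEY=VALUE parsing performed inside
# load_environment() above, factored into a pure function so the edge
# cases (comments, blank lines, '=' inside values) can be tested in
# isolation. The function name is illustrative only.
def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blanks and '#' comments."""
    env = {}
    for raw in lines:
        line = raw.strip()
        if line and not line.startswith('#') and '=' in line:
            key, value = line.split('=', 1)
            env[key.strip()] = value.strip()
    return env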
def connect_and_review(client: ProxmoxClient, instance_num: int, output_dir: Path) -> Optional[dict]:
"""Connect to Proxmox and gather information."""
log(f"Connecting to Proxmox Instance {instance_num}...")
if not client.authenticate():
log_error(f"Failed to authenticate to Instance {instance_num}")
return None
log_success(f"Authenticated to Instance {instance_num}")
# Gather information
info = {
'instance': instance_num,
'timestamp': datetime.now().isoformat(),
'version': client.get_version(),
'cluster_status': client.get_cluster_status(),
'nodes': client.get_nodes(),
'storage': client.get_storage()
}
# Get detailed node information
if info['nodes']:
log(f" Found {len(info['nodes'])} nodes")
for node_data in info['nodes']:
node = node_data.get('node', 'unknown')
log(f" - {node}")
# Get node status
node_status = client.get_node_status(node)
if node_status:
info[f'node_{node}_status'] = node_status
# Get VMs
vms = client.get_vms(node)
if vms:
info[f'node_{node}_vms'] = vms
log(f" VMs: {len(vms)}")
# Get networks
networks = client.get_networks(node)
if networks:
info[f'node_{node}_networks'] = networks
# Save to file
output_file = output_dir / f"proxmox-{instance_num}-status-{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"
with open(output_file, 'w') as f:
json.dump(info, f, indent=2)
log_success(f"Status saved to {output_file}")
# Display summary
if info.get('version'):
version = info['version'].get('version', 'unknown')
log(f" Version: {version}")
return info
def review_configurations(project_root: Path, output_dir: Path) -> str:
"""Review and document configurations."""
log("Reviewing configurations...")
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
config_file = output_dir / f"configuration-review-{timestamp}.md"
content = []
content.append("# Proxmox Configuration Review\n")
content.append(f"Generated: {datetime.now().isoformat()}\n")
# Environment configuration
content.append("## Environment Configuration\n")
content.append("### Proxmox Instance 1\n")
content.append("- API URL: From .env (PROXMOX_1_API_URL)\n")
content.append("- User: From .env (PROXMOX_1_USER)\n")
content.append("\n### Proxmox Instance 2\n")
content.append("- API URL: From .env (PROXMOX_2_API_URL)\n")
content.append("- User: From .env (PROXMOX_2_USER)\n")
content.append("\n")
# Provider config
provider_config = project_root / "crossplane-provider-proxmox" / "examples" / "provider-config.yaml"
if provider_config.exists():
content.append("## Crossplane Provider Configuration\n")
content.append("```yaml\n")
with open(provider_config) as f:
content.append(f.read())
content.append("```\n\n")
# Cloudflare tunnel configs
tunnel_configs_dir = project_root / "cloudflare" / "tunnel-configs"
if tunnel_configs_dir.exists():
content.append("## Cloudflare Tunnel Configurations\n")
# Use a distinct loop variable: reusing config_file here would clobber
# the review output path and write the review over the last tunnel config
for tunnel_file in sorted(tunnel_configs_dir.glob("proxmox-site-*.yaml")):
content.append(f"### {tunnel_file.name}\n")
content.append("```yaml\n")
with open(tunnel_file) as f:
content.append(f.read())
content.append("```\n\n")
with open(config_file, 'w') as f:
f.write(''.join(content))
log_success(f"Configuration review saved to {config_file}")
return str(config_file)
def generate_deployment_plan(output_dir: Path, config: dict) -> str:
"""Generate deployment plan."""
log("Generating deployment plan...")
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
plan_file = output_dir / f"deployment-plan-{timestamp}.md"
content = []
content.append("# Proxmox Deployment Plan\n")
content.append(f"Generated: {datetime.now().isoformat()}\n")
content.append("## Current Status\n")
content.append(f"- **Instance 1**: {config['proxmox_1']['api_url']}\n")
content.append(f"- **Instance 2**: {config['proxmox_2']['api_url']}\n")
content.append("\n## Deployment Phases\n")
content.append("### Phase 1: Connection and Validation\n")
content.append("1. Verify connectivity to both instances\n")
content.append("2. Review cluster status and node health\n")
content.append("3. Review storage and network configuration\n")
content.append("\n### Phase 2: Configuration Alignment\n")
content.append("1. Map instances to sites\n")
content.append("2. Set up authentication (API tokens)\n")
content.append("3. Configure Cloudflare tunnels\n")
content.append("\n### Phase 3: Crossplane Provider Deployment\n")
content.append("1. Complete API client implementation\n")
content.append("2. Build and deploy provider\n")
content.append("3. Configure ProviderConfig\n")
content.append("\n### Phase 4: Infrastructure Deployment\n")
content.append("1. Deploy test VMs\n")
content.append("2. Set up monitoring\n")
content.append("3. Configure backups\n")
content.append("\n### Phase 5: Production Readiness\n")
content.append("1. Security hardening\n")
content.append("2. Documentation\n")
content.append("3. Testing and validation\n")
with open(plan_file, 'w') as f:
f.write(''.join(content))
log_success(f"Deployment plan saved to {plan_file}")
return str(plan_file)
def generate_task_list(output_dir: Path, config: dict) -> str:
"""Generate detailed task list."""
log("Generating task list...")
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
task_file = output_dir / f"task-list-{timestamp}.md"
content = []
content.append("# Proxmox Deployment Task List\n")
content.append(f"Generated: {datetime.now().isoformat()}\n")
content.append("## Immediate Tasks (Priority: High)\n")
content.append("### Connection and Authentication\n")
content.append("- [ ] **TASK-001**: Verify connectivity to Instance 1\n")
content.append(f" - URL: {config['proxmox_1']['api_url']}\n")
content.append("- [ ] **TASK-002**: Verify connectivity to Instance 2\n")
content.append(f" - URL: {config['proxmox_2']['api_url']}\n")
content.append("- [ ] **TASK-003**: Test authentication to Instance 1\n")
content.append("- [ ] **TASK-004**: Test authentication to Instance 2\n")
content.append("\n### Configuration Review\n")
content.append("- [ ] **TASK-005**: Review provider-config.yaml\n")
content.append("- [ ] **TASK-006**: Review Cloudflare tunnel configs\n")
content.append("- [ ] **TASK-007**: Map instances to sites\n")
content.append("\n## Short-term Tasks (Priority: Medium)\n")
content.append("### Crossplane Provider\n")
content.append("- [ ] **TASK-008**: Complete Proxmox API client implementation\n")
content.append("- [ ] **TASK-009**: Build and test provider\n")
content.append("- [ ] **TASK-010**: Deploy provider to Kubernetes\n")
content.append("- [ ] **TASK-011**: Create ProviderConfig resource\n")
content.append("\n### Infrastructure Setup\n")
content.append("- [ ] **TASK-012**: Deploy Prometheus exporters\n")
content.append("- [ ] **TASK-013**: Configure Cloudflare tunnels\n")
content.append("- [ ] **TASK-014**: Set up monitoring dashboards\n")
content.append("\n## Long-term Tasks (Priority: Low)\n")
content.append("- [ ] **TASK-015**: Deploy test VMs\n")
content.append("- [ ] **TASK-016**: End-to-end testing\n")
content.append("- [ ] **TASK-017**: Performance testing\n")
content.append("- [ ] **TASK-018**: Create runbooks\n")
content.append("- [ ] **TASK-019**: Set up backups\n")
content.append("- [ ] **TASK-020**: Security audit\n")
with open(task_file, 'w') as f:
f.write(''.join(content))
log_success(f"Task list saved to {task_file}")
return str(task_file)
def main():
"""Main execution."""
log("Starting Proxmox Review and Deployment Planning...")
log("=" * 50)
project_root = Path(__file__).parent.parent
output_dir = project_root / "docs" / "proxmox-review"
output_dir.mkdir(parents=True, exist_ok=True)
config = load_environment()
log("\n=== Phase 1: Connecting to Proxmox Instances ===")
# Connect to Instance 1
client1 = ProxmoxClient(
config['proxmox_1']['api_url'],
config['proxmox_1']['user'],
config['proxmox_1']['password'],
config['proxmox_1']['token'],
config['proxmox_1']['verify_ssl']
)
info1 = connect_and_review(client1, 1, output_dir)
log("")
# Connect to Instance 2
client2 = ProxmoxClient(
config['proxmox_2']['api_url'],
config['proxmox_2']['user'],
config['proxmox_2']['password'],
config['proxmox_2']['token'],
config['proxmox_2']['verify_ssl']
)
info2 = connect_and_review(client2, 2, output_dir)
log("\n=== Phase 2: Reviewing Configurations ===")
review_configurations(project_root, output_dir)
log("\n=== Phase 3: Generating Deployment Plan ===")
generate_deployment_plan(output_dir, config)
log("\n=== Phase 4: Generating Task List ===")
generate_task_list(output_dir, config)
log("\n" + "=" * 50)
log_success("Review and planning completed!")
log(f"\nOutput directory: {output_dir}")
if __name__ == '__main__':
main()


@@ -0,0 +1,649 @@
#!/bin/bash
set -euo pipefail
# Proxmox Review and Deployment Planning Script
# This script connects to both Proxmox instances, reviews configurations,
# checks status, and generates a deployment plan with task list
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
OUTPUT_DIR="${PROJECT_ROOT}/docs/proxmox-review"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
log() {
echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*"
}
log_success() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] ✅${NC} $*"
}
log_warning() {
echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] ⚠️${NC} $*"
}
log_error() {
echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ❌${NC} $*"
}
error() {
log_error "$*"
exit 1
}
# Load environment variables
load_env() {
if [ -f "${PROJECT_ROOT}/.env" ]; then
source "${PROJECT_ROOT}/.env"
log "Loaded environment variables from .env"
else
log_warning ".env file not found, using defaults from ENV_EXAMPLES.md"
fi
# Set defaults if not provided
PROXMOX_1_API_URL="${PROXMOX_1_API_URL:-https://192.168.11.10:8006}"
PROXMOX_1_USER="${PROXMOX_1_USER:-root}"
PROXMOX_1_PASS="${PROXMOX_1_PASS:-}"
PROXMOX_1_API_TOKEN="${PROXMOX_1_API_TOKEN:-}"
PROXMOX_1_INSECURE_SKIP_TLS_VERIFY="${PROXMOX_1_INSECURE_SKIP_TLS_VERIFY:-false}"
PROXMOX_2_API_URL="${PROXMOX_2_API_URL:-https://192.168.11.11:8006}"
PROXMOX_2_USER="${PROXMOX_2_USER:-root}"
PROXMOX_2_PASS="${PROXMOX_2_PASS:-}"
PROXMOX_2_API_TOKEN="${PROXMOX_2_API_TOKEN:-}"
PROXMOX_2_INSECURE_SKIP_TLS_VERIFY="${PROXMOX_2_INSECURE_SKIP_TLS_VERIFY:-false}"
}
# Check prerequisites
check_prerequisites() {
log "Checking prerequisites..."
if ! command -v curl &> /dev/null; then
error "curl is required but not installed"
fi
if ! command -v jq &> /dev/null; then
log_warning "jq is not installed. JSON parsing will be limited."
JQ_AVAILABLE=false
else
JQ_AVAILABLE=true
fi
if [ -z "${PROXMOX_1_PASS}" ] && [ -z "${PROXMOX_1_API_TOKEN}" ]; then
log_warning "PROXMOX_1_PASS or PROXMOX_1_API_TOKEN not set"
fi
if [ -z "${PROXMOX_2_PASS}" ] && [ -z "${PROXMOX_2_API_TOKEN}" ]; then
log_warning "PROXMOX_2_PASS or PROXMOX_2_API_TOKEN not set"
fi
log_success "Prerequisites check completed"
}
# Create output directory
create_output_dir() {
mkdir -p "${OUTPUT_DIR}"
log "Output directory: ${OUTPUT_DIR}"
}
# Authenticate to Proxmox API
proxmox_auth() {
local api_url=$1
local username=$2
local password=$3
local token=$4
local insecure=$5
local auth_url="${api_url}/api2/json/access/ticket"
local curl_opts=()
if [ "${insecure}" = "true" ]; then
curl_opts+=("-k")
fi
if [ -n "${token}" ]; then
# Token authentication
echo "${token}"
return 0
fi
# Password authentication
local response
response=$(curl -s "${curl_opts[@]}" -X POST \
-d "username=${username}&password=${password}" \
"${auth_url}" 2>/dev/null || echo "")
if [ -z "${response}" ]; then
echo ""
return 1
fi
if [ "${JQ_AVAILABLE}" = "true" ]; then
echo "${response}" | jq -r '.data.ticket // empty'
else
echo "${response}" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4
fi
}
# Get Proxmox API ticket and CSRF token
get_proxmox_ticket() {
local api_url=$1
local username=$2
local password=$3
local token=$4
local insecure=$5
local auth_url="${api_url}/api2/json/access/ticket"
local curl_opts=()
if [ "${insecure}" = "true" ]; then
curl_opts+=("-k")
fi
if [ -n "${token}" ]; then
# For token auth, we need to split token into user:token
echo "TOKEN:${token}"
return 0
fi
local response
response=$(curl -s "${curl_opts[@]}" -X POST \
-d "username=${username}&password=${password}" \
"${auth_url}" 2>/dev/null || echo "")
if [ -z "${response}" ]; then
echo ""
return 1
fi
if [ "${JQ_AVAILABLE}" = "true" ]; then
local ticket=$(echo "${response}" | jq -r '.data.ticket // empty')
local csrf=$(echo "${response}" | jq -r '.data.CSRFPreventionToken // empty')
else
local ticket=$(echo "${response}" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf=$(echo "${response}" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
fi
# Join with '|': both the ticket and the CSRF token contain ':' internally,
# so a colon-delimited pair cannot be split apart reliably
echo "${ticket}|${csrf}"
}
# Call Proxmox API
proxmox_api_call() {
local api_url=$1
local endpoint=$2
local ticket=$3
local csrf=$4
local insecure=$5
local curl_opts=(-s -f)
local full_url="${api_url}/api2/json${endpoint}"
if [ "${insecure}" = "true" ]; then
curl_opts+=("-k")
fi
if [[ "${ticket}" == TOKEN:* ]]; then
# API token authentication (Proxmox expects PVEAPIToken=USER@REALM!TOKENID=UUID)
local token="${ticket#TOKEN:}"
curl_opts+=(-H "Authorization: PVEAPIToken=${token}")
else
# Ticket authentication
curl_opts+=(-b "PVEAuthCookie=${ticket}")
if [ -n "${csrf}" ]; then
curl_opts+=(-H "CSRFPreventionToken: ${csrf}")
fi
fi
curl "${curl_opts[@]}" "${full_url}" 2>/dev/null || echo ""
}
# Connect to Proxmox instance
connect_proxmox() {
local instance_num=$1
local api_url=$2
local username=$3
local password=$4
local token=$5
local insecure=$6
log "Connecting to Proxmox Instance ${instance_num} (${api_url})..."
local auth_result
auth_result=$(get_proxmox_ticket "${api_url}" "${username}" "${password}" "${token}" "${insecure}")
if [ -z "${auth_result}" ]; then
log_error "Failed to authenticate to Proxmox Instance ${instance_num}"
return 1
fi
local ticket csrf
if [[ "${auth_result}" == TOKEN:* ]]; then
ticket="${auth_result}"
csrf=""
else
ticket="${auth_result%%|*}"
csrf="${auth_result##*|}"
fi
log_success "Authenticated to Proxmox Instance ${instance_num}"
# Get cluster status
log " Fetching cluster status..."
local cluster_status
cluster_status=$(proxmox_api_call "${api_url}" "/cluster/status" "${ticket}" "${csrf}" "${insecure}")
# Get nodes
log " Fetching nodes..."
local nodes
nodes=$(proxmox_api_call "${api_url}" "/nodes" "${ticket}" "${csrf}" "${insecure}")
# Get version
log " Fetching version..."
local version
version=$(proxmox_api_call "${api_url}" "/version" "${ticket}" "${csrf}" "${insecure}")
# Get storage
log " Fetching storage..."
local storage
storage=$(proxmox_api_call "${api_url}" "/storage" "${ticket}" "${csrf}" "${insecure}")
# Save results
local output_file="${OUTPUT_DIR}/proxmox-${instance_num}-status-${TIMESTAMP}.json"
{
echo "{"
echo " \"instance\": ${instance_num},"
echo " \"api_url\": \"${api_url}\","
echo " \"timestamp\": \"$(date -Iseconds)\","
echo " \"cluster_status\": ${cluster_status:-null},"
echo " \"nodes\": ${nodes:-null},"
echo " \"version\": ${version:-null},"
echo " \"storage\": ${storage:-null}"
echo "}"
} > "${output_file}"
log_success "Status saved to ${output_file}"
# Display summary
if [ "${JQ_AVAILABLE}" = "true" ]; then
log " Cluster Summary:"
if [ -n "${version}" ]; then
local pve_version=$(echo "${version}" | jq -r '.data.version // "unknown"')
log " Version: ${pve_version}"
fi
if [ -n "${nodes}" ]; then
local node_count=$(echo "${nodes}" | jq '.data | length')
log " Nodes: ${node_count}"
echo "${nodes}" | jq -r '.data[]? | " - \(.node) (status: \(.status // "unknown"))"' || true
fi
fi
echo "${ticket}|${csrf}"
}
# Review configurations
review_configurations() {
log "Reviewing configurations..."
local config_file="${OUTPUT_DIR}/configuration-review-${TIMESTAMP}.md"
{
echo "# Proxmox Configuration Review"
echo ""
echo "Generated: $(date -Iseconds)"
echo ""
echo "## Environment Configuration"
echo ""
echo "### Proxmox Instance 1"
echo "- API URL: ${PROXMOX_1_API_URL}"
echo "- User: ${PROXMOX_1_USER}"
echo "- Password: $([ -n "${PROXMOX_1_PASS}" ] && echo "***SET***" || echo "NOT SET")"
echo "- API Token: $([ -n "${PROXMOX_1_API_TOKEN}" ] && echo "***SET***" || echo "NOT SET")"
echo "- Insecure Skip TLS: ${PROXMOX_1_INSECURE_SKIP_TLS_VERIFY}"
echo ""
echo "### Proxmox Instance 2"
echo "- API URL: ${PROXMOX_2_API_URL}"
echo "- User: ${PROXMOX_2_USER}"
echo "- Password: $([ -n "${PROXMOX_2_PASS}" ] && echo "***SET***" || echo "NOT SET")"
echo "- API Token: $([ -n "${PROXMOX_2_API_TOKEN}" ] && echo "***SET***" || echo "NOT SET")"
echo "- Insecure Skip TLS: ${PROXMOX_2_INSECURE_SKIP_TLS_VERIFY}"
echo ""
echo "## Crossplane Provider Configuration"
echo ""
echo "### Provider Config"
if [ -f "${PROJECT_ROOT}/crossplane-provider-proxmox/examples/provider-config.yaml" ]; then
echo "\`\`\`yaml"
cat "${PROJECT_ROOT}/crossplane-provider-proxmox/examples/provider-config.yaml"
echo "\`\`\`"
else
echo "Provider config file not found"
fi
echo ""
echo "## Cloudflare Tunnel Configurations"
echo ""
for site_config in "${PROJECT_ROOT}/cloudflare/tunnel-configs/proxmox-site-"*.yaml; do
if [ -f "${site_config}" ]; then
echo "### $(basename "${site_config}")"
echo "\`\`\`yaml"
head -20 "${site_config}"
echo "\`\`\`"
echo ""
fi
done
} > "${config_file}"
log_success "Configuration review saved to ${config_file}"
}
# Generate deployment plan
generate_deployment_plan() {
log "Generating deployment plan..."
local plan_file="${OUTPUT_DIR}/deployment-plan-${TIMESTAMP}.md"
{
echo "# Proxmox Deployment Plan"
echo ""
echo "Generated: $(date -Iseconds)"
echo ""
echo "## Current Status"
echo ""
echo "### Proxmox Instances"
echo "- **Instance 1**: ${PROXMOX_1_API_URL}"
echo "- **Instance 2**: ${PROXMOX_2_API_URL}"
echo ""
echo "### Configuration Sites"
echo "- **us-east-1**: https://pve1.sankofa.nexus:8006 (node: pve1)"
echo "- **eu-west-1**: https://pve4.sankofa.nexus:8006 (node: pve4)"
echo "- **apac-1**: https://pve7.sankofa.nexus:8006 (node: pve7)"
echo ""
echo "## Deployment Phases"
echo ""
echo "### Phase 1: Connection and Validation"
echo ""
echo "1. **Verify Connectivity**"
echo " - [ ] Test connection to Instance 1"
echo " - [ ] Test connection to Instance 2"
echo " - [ ] Verify API authentication"
echo " - [ ] Check network connectivity"
echo ""
echo "2. **Status Review**"
echo " - [ ] Review cluster status for both instances"
echo " - [ ] Check node health and availability"
echo " - [ ] Review storage configuration"
echo " - [ ] Check network configuration"
echo " - [ ] Review existing VMs and resources"
echo ""
echo "### Phase 2: Configuration Alignment"
echo ""
echo "1. **Site Mapping**"
echo " - [ ] Map Instance 1 to appropriate site (us-east-1?)"
echo " - [ ] Map Instance 2 to appropriate site (eu-west-1?)"
echo " - [ ] Verify DNS/hostname configuration"
echo " - [ ] Update provider-config.yaml with actual endpoints"
echo ""
echo "2. **Authentication Setup**"
echo " - [ ] Create API tokens for Instance 1"
echo " - [ ] Create API tokens for Instance 2"
echo " - [ ] Update credentials in Kubernetes secrets"
echo " - [ ] Test token authentication"
echo ""
echo "3. **Cloudflare Tunnel Configuration**"
echo " - [ ] Review tunnel configs for all sites"
echo " - [ ] Update hostnames in tunnel configs"
echo " - [ ] Verify tunnel credentials"
echo " - [ ] Test tunnel connectivity"
echo ""
echo "### Phase 3: Crossplane Provider Deployment"
echo ""
echo "1. **Provider Installation**"
echo " - [ ] Build Crossplane provider"
echo " - [ ] Deploy CRDs"
echo " - [ ] Deploy provider controller"
echo " - [ ] Verify provider health"
echo ""
echo "2. **Provider Configuration**"
echo " - [ ] Create ProviderConfig resource"
echo " - [ ] Configure credentials secret"
echo " - [ ] Test provider connectivity to both instances"
echo " - [ ] Verify site configuration"
echo ""
echo "### Phase 4: Infrastructure Deployment"
echo ""
echo "1. **Initial VM Deployment**"
echo " - [ ] Deploy test VM on Instance 1"
echo " - [ ] Deploy test VM on Instance 2"
echo " - [ ] Verify VM creation via Crossplane"
echo " - [ ] Test VM lifecycle operations"
echo ""
echo "2. **Monitoring Setup**"
echo " - [ ] Deploy Prometheus exporters"
echo " - [ ] Configure Grafana dashboards"
echo " - [ ] Set up alerts"
echo " - [ ] Verify metrics collection"
echo ""
echo "3. **Backup and Recovery**"
echo " - [ ] Configure backup schedules"
echo " - [ ] Test backup procedures"
echo " - [ ] Test recovery procedures"
echo ""
echo "### Phase 5: Production Readiness"
echo ""
echo "1. **Security Hardening**"
echo " - [ ] Review and update firewall rules"
echo " - [ ] Enable TLS certificate validation"
echo " - [ ] Rotate API tokens"
echo " - [ ] Review access controls"
echo ""
echo "2. **Documentation**"
echo " - [ ] Document deployment procedures"
echo " - [ ] Create runbooks"
echo " - [ ] Update architecture diagrams"
echo ""
echo "3. **Testing and Validation**"
echo " - [ ] End-to-end testing"
echo " - [ ] Load testing"
echo " - [ ] Disaster recovery testing"
echo " - [ ] Performance validation"
echo ""
} > "${plan_file}"
log_success "Deployment plan saved to ${plan_file}"
}
# Generate task list
generate_task_list() {
log "Generating detailed task list..."
local task_file="${OUTPUT_DIR}/task-list-${TIMESTAMP}.md"
{
echo "# Proxmox Deployment Task List"
echo ""
echo "Generated: $(date -Iseconds)"
echo ""
echo "## Immediate Tasks (Priority: High)"
echo ""
echo "### Connection and Authentication"
echo ""
echo "- [ ] **TASK-001**: Verify network connectivity to ${PROXMOX_1_API_URL}"
echo " - Command: \`curl -k ${PROXMOX_1_API_URL}/api2/json/version\`"
echo " - Expected: JSON response with Proxmox version"
echo ""
echo "- [ ] **TASK-002**: Verify network connectivity to ${PROXMOX_2_API_URL}"
echo " - Command: \`curl -k ${PROXMOX_2_API_URL}/api2/json/version\`"
echo " - Expected: JSON response with Proxmox version"
echo ""
echo "- [ ] **TASK-003**: Test authentication to Instance 1"
echo " - Verify credentials or create API token"
echo " - Test API access"
echo ""
echo "- [ ] **TASK-004**: Test authentication to Instance 2"
echo " - Verify credentials or create API token"
echo " - Test API access"
echo ""
echo "### Configuration Review"
echo ""
echo "- [ ] **TASK-005**: Review current provider-config.yaml"
echo " - File: \`crossplane-provider-proxmox/examples/provider-config.yaml\`"
echo " - Verify endpoints match actual Proxmox instances"
echo " - Update if necessary"
echo ""
echo "- [ ] **TASK-006**: Review Cloudflare tunnel configurations"
echo " - Files: \`cloudflare/tunnel-configs/proxmox-site-*.yaml\`"
echo " - Verify hostnames and endpoints"
echo " - Update domain names if needed"
echo ""
echo "- [ ] **TASK-007**: Map Proxmox instances to sites"
echo " - Determine which instance corresponds to which site"
echo " - Update documentation"
echo ""
echo "## Short-term Tasks (Priority: Medium)"
echo ""
echo "### Crossplane Provider"
echo ""
echo "- [ ] **TASK-008**: Complete Proxmox API client implementation"
echo " - File: \`crossplane-provider-proxmox/pkg/proxmox/client.go\`"
echo " - Implement actual API calls (currently TODOs)"
echo " - Add proper HTTP client with authentication"
echo ""
echo "- [ ] **TASK-009**: Build and test Crossplane provider"
echo " - Run: \`cd crossplane-provider-proxmox && make build\`"
echo " - Test provider locally"
echo ""
echo "- [ ] **TASK-010**: Deploy Crossplane provider to Kubernetes"
echo " - Apply CRDs: \`kubectl apply -f config/crd/bases/\`"
echo " - Deploy provider: \`kubectl apply -f config/provider.yaml\`"
echo ""
echo "- [ ] **TASK-011**: Create ProviderConfig resource"
echo " - Update \`examples/provider-config.yaml\` with actual values"
echo " - Create credentials secret"
echo " - Apply ProviderConfig"
echo ""
echo "### Infrastructure Setup"
echo ""
echo "- [ ] **TASK-012**: Deploy Prometheus exporters to Proxmox nodes"
echo " - Use script: \`scripts/setup-proxmox-agents.sh\`"
echo " - Configure metrics collection"
echo ""
echo "- [ ] **TASK-013**: Configure Cloudflare tunnels"
echo " - Deploy tunnel configs to Proxmox nodes"
echo " - Verify tunnel connectivity"
echo " - Test access via Cloudflare"
echo ""
echo "- [ ] **TASK-014**: Set up monitoring dashboards"
echo " - Import Grafana dashboards"
echo " - Configure alerts"
echo ""
echo "## Long-term Tasks (Priority: Low)"
echo ""
echo "### Testing and Validation"
echo ""
echo "- [ ] **TASK-015**: Deploy test VMs via Crossplane"
echo " - Create test VM on Instance 1"
echo " - Create test VM on Instance 2"
echo " - Verify VM lifecycle operations"
echo ""
echo "- [ ] **TASK-016**: End-to-end testing"
echo " - Test VM creation from portal"
echo " - Test VM management operations"
echo " - Test multi-site deployments"
echo ""
echo "- [ ] **TASK-017**: Performance testing"
echo " - Load test API endpoints"
echo " - Test concurrent VM operations"
echo " - Measure response times"
echo ""
echo "### Documentation and Operations"
echo ""
echo "- [ ] **TASK-018**: Create operational runbooks"
echo " - VM provisioning procedures"
echo " - Troubleshooting guides"
echo " - Disaster recovery procedures"
echo ""
echo "- [ ] **TASK-019**: Set up backup procedures"
echo " - Configure automated backups"
echo " - Test backup and restore"
echo ""
echo "- [ ] **TASK-020**: Security audit"
echo " - Review access controls"
echo " - Enable TLS validation"
echo " - Rotate credentials"
echo ""
} > "${task_file}"
log_success "Task list saved to ${task_file}"
}
# Main execution
main() {
log "Starting Proxmox Review and Deployment Planning..."
log "=================================================="
load_env
check_prerequisites
create_output_dir
log ""
log "=== Phase 1: Connecting to Proxmox Instances ==="
local instance1_auth=""
local instance2_auth=""
# Connect to Instance 1
if instance1_auth=$(connect_proxmox 1 \
"${PROXMOX_1_API_URL}" \
"${PROXMOX_1_USER}" \
"${PROXMOX_1_PASS}" \
"${PROXMOX_1_API_TOKEN}" \
"${PROXMOX_1_INSECURE_SKIP_TLS_VERIFY}"); then
log_success "Successfully connected to Proxmox Instance 1"
else
log_error "Failed to connect to Proxmox Instance 1"
fi
log ""
# Connect to Instance 2
if instance2_auth=$(connect_proxmox 2 \
"${PROXMOX_2_API_URL}" \
"${PROXMOX_2_USER}" \
"${PROXMOX_2_PASS}" \
"${PROXMOX_2_API_TOKEN}" \
"${PROXMOX_2_INSECURE_SKIP_TLS_VERIFY}"); then
log_success "Successfully connected to Proxmox Instance 2"
else
log_error "Failed to connect to Proxmox Instance 2"
fi
log ""
log "=== Phase 2: Reviewing Configurations ==="
review_configurations
log ""
log "=== Phase 3: Generating Deployment Plan ==="
generate_deployment_plan
log ""
log "=== Phase 4: Generating Task List ==="
generate_task_list
log ""
log "=================================================="
log_success "Review and planning completed!"
log ""
log "Output files:"
log " - Configuration Review: ${OUTPUT_DIR}/configuration-review-${TIMESTAMP}.md"
log " - Deployment Plan: ${OUTPUT_DIR}/deployment-plan-${TIMESTAMP}.md"
log " - Task List: ${OUTPUT_DIR}/task-list-${TIMESTAMP}.md"
log " - Status JSONs: ${OUTPUT_DIR}/proxmox-*-status-${TIMESTAMP}.json"
log ""
}
main "$@"
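A side note on the script above: `main()` assumes `load_env` has populated `PROXMOX_1_API_URL`, `PROXMOX_1_USER`, and friends, and under `set -u` a missing variable aborts with an opaque error. A guard along these lines fails fast with a readable message (a sketch; `require_env` is not a function in these scripts):

```shell
#!/usr/bin/env bash
# require_env: verify that every named environment variable is non-empty.
# Prints each missing name and returns non-zero if any are unset.
require_env() {
    local missing=0 var
    for var in "$@"; do
        if [ -z "${!var:-}" ]; then
            echo "Missing required variable: ${var}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Example: check the Instance 1 settings before connecting
PROXMOX_1_API_URL="https://192.168.11.10:8006"
PROXMOX_1_USER="root@pam"
require_env PROXMOX_1_API_URL PROXMOX_1_USER && echo "environment ok"
```

Calling it at the top of `main()` would turn a mid-run `unbound variable` crash into an explicit list of what to set.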

scripts/quick-deploy.sh Executable file

@@ -0,0 +1,139 @@
#!/bin/bash
# quick-deploy.sh
# Quick deployment script that runs all deployment steps in sequence
set -euo pipefail
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
log() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
step() {
echo ""
echo "═══════════════════════════════════════════════════════════════"
info "Step $1: $2"
echo "═══════════════════════════════════════════════════════════════"
echo ""
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Proxmox Quick Deployment ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
info "This script will guide you through the complete deployment process"
echo ""
# Step 1: Test Connectivity
step "1" "Test Proxmox Connectivity"
read -p "Test connectivity to Proxmox instances? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
"${SCRIPT_DIR}/test-proxmox-connectivity.sh"
fi
# Step 2: Setup DNS
step "2" "Configure DNS Records"
read -p "Setup DNS records via Cloudflare? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
if [ -z "${CLOUDFLARE_ZONE_ID:-}" ] || [ -z "${CLOUDFLARE_API_TOKEN:-}" ]; then
warn "CLOUDFLARE_ZONE_ID and CLOUDFLARE_API_TOKEN must be set"
read -p "Set them now? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
read -p "Cloudflare Zone ID: " ZONE_ID
read -sp "Cloudflare API Token: " API_TOKEN
echo
export CLOUDFLARE_ZONE_ID="$ZONE_ID"
export CLOUDFLARE_API_TOKEN="$API_TOKEN"
fi
fi
"${SCRIPT_DIR}/setup-dns-records.sh"
fi
# Step 3: Build Provider
step "3" "Build Crossplane Provider"
read -p "Build provider? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
"${SCRIPT_DIR}/deploy-crossplane-provider.sh"
fi
# Step 4: Create Secret
step "4" "Create Proxmox Credentials Secret"
read -p "Create Kubernetes secret? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
"${SCRIPT_DIR}/create-proxmox-secret.sh"
fi
# Step 5: Apply ProviderConfig
step "5" "Apply ProviderConfig"
read -p "Apply ProviderConfig? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
kubectl apply -f "${SCRIPT_DIR}/../crossplane-provider-proxmox/examples/provider-config.yaml"
log "ProviderConfig applied"
fi
# Step 6: Verify Deployment
step "6" "Verify Provider Deployment"
"${SCRIPT_DIR}/verify-provider-deployment.sh"
# Step 7: Deploy Test VMs
step "7" "Deploy Test VMs"
read -p "Deploy test VMs? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
"${SCRIPT_DIR}/deploy-test-vms.sh"
fi
# Step 8: Setup Monitoring
step "8" "Setup Monitoring"
read -p "Setup monitoring? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
"${SCRIPT_DIR}/setup-monitoring.sh"
fi
echo ""
log "Quick deployment complete!"
echo ""
info "Summary:"
info " • Connectivity: Tested"
info " • DNS: Configured"
info " • Provider: Deployed"
info " • Credentials: Created"
info " • Test VMs: Deployed"
info " • Monitoring: Configured"
echo ""
info "Next: Review logs and verify all components are working"
}
main "$@"
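The prompt-and-check pattern (`read -p … -n 1 -r` followed by `[[ $REPLY =~ ^[Yy]$ ]]`) is repeated for every step above; it could be factored into a helper. A sketch (the name `confirm` is ours, not part of the script):

```shell
#!/usr/bin/env bash
# confirm: ask a yes/no question; succeed only on an explicit y/Y.
confirm() {
    local reply
    read -p "$1 (y/N): " -n 1 -r reply
    echo
    [[ ${reply} =~ ^[Yy]$ ]]
}

# Usage (reads one character from stdin):
if printf 'y' | confirm "Deploy test VMs?"; then
    echo "confirmed"   # prints a newline (from the helper) then "confirmed"
fi
```

Each step in `main()` then reduces to `if confirm "Deploy test VMs?"; then …; fi`.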

scripts/remove-test-vms.sh Executable file

@@ -0,0 +1,212 @@
#!/bin/bash
# remove-test-vms.sh
# Remove test VMs 100-115 from Proxmox
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
# Load environment
if [ -f "${PROJECT_ROOT}/.env" ]; then
set -a
source "${PROJECT_ROOT}/.env"
set +a
fi
PROXMOX_PASS="${PROXMOX_ROOT_PASS:-}"
if [ -z "${PROXMOX_PASS}" ]; then
echo "Error: PROXMOX_ROOT_PASS must be set (do not hard-code credentials)" >&2
exit 1
fi
PROXMOX_1_URL="https://192.168.11.10:8006"
PROXMOX_2_URL="https://192.168.11.11:8006"
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
BLUE='\033[0;34m'
NC='\033[0m'
log() {
echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*"
}
log_success() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] ✅${NC} $*"
}
log_warning() {
echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] ⚠️${NC} $*"
}
log_error() {
echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ❌${NC} $*"
}
# Get ticket
get_ticket() {
local api_url=$1
local response
response=$(curl -k -s -X POST \
--data-urlencode "username=root@pam" \
--data-urlencode "password=${PROXMOX_PASS}" \
"${api_url}/api2/json/access/ticket" 2>/dev/null)
if echo "${response}" | grep -q "authentication failure"; then
echo ""
return 1
fi
if command -v jq &> /dev/null; then
echo "${response}" | jq -r '.data.ticket // empty' 2>/dev/null
else
echo "${response}" | grep -o '"ticket":"[^"]*' | head -1 | cut -d'"' -f4
fi
}
# Get CSRF
get_csrf() {
local api_url=$1
local response
response=$(curl -k -s -X POST \
--data-urlencode "username=root@pam" \
--data-urlencode "password=${PROXMOX_PASS}" \
"${api_url}/api2/json/access/ticket" 2>/dev/null)
if echo "${response}" | grep -q "authentication failure"; then
echo ""
return 1
fi
if command -v jq &> /dev/null; then
echo "${response}" | jq -r '.data.CSRFPreventionToken // empty' 2>/dev/null
else
echo "${response}" | grep -o '"CSRFPreventionToken":"[^"]*' | head -1 | cut -d'"' -f4
fi
}
# Get VM info
get_vm_info() {
local api_url=$1
local node=$2
local vmid=$3
local ticket=$4
curl -k -s -b "PVEAuthCookie=${ticket}" \
"${api_url}/api2/json/nodes/${node}/qemu/${vmid}/config" 2>/dev/null | \
jq -r '.data.name // empty' 2>/dev/null
}
# Delete VM
delete_vm() {
local api_url=$1
local node=$2
local vmid=$3
local ticket=$4
local csrf=$5
# Stop VM first if running
curl -k -s -X POST \
-H "CSRFPreventionToken: ${csrf}" \
-b "PVEAuthCookie=${ticket}" \
"${api_url}/api2/json/nodes/${node}/qemu/${vmid}/status/stop" > /dev/null 2>&1
sleep 2
# Delete VM
local response
response=$(curl -k -s -X DELETE \
-H "CSRFPreventionToken: ${csrf}" \
-b "PVEAuthCookie=${ticket}" \
"${api_url}/api2/json/nodes/${node}/qemu/${vmid}" 2>/dev/null)
# A successful delete returns a task UPID in "data"; errors return null
if echo "${response}" | grep -q '"data":"UPID'; then
return 0
fi
return 1
}
# Find test VMs
find_test_vms() {
local api_url=$1
local node=$2
local ticket=$3
curl -k -s -b "PVEAuthCookie=${ticket}" \
"${api_url}/api2/json/nodes/${node}/qemu" 2>/dev/null | \
jq -r '.data[] | select(.vmid >= 100 and .vmid <= 115) | "\(.vmid) - \(.name)"' 2>/dev/null
}
main() {
log "=========================================="
log "Removing Test VMs (100-115)"
log "=========================================="
log ""
# Site 1
log "Site 1 (ml110-01):"
local ticket1 csrf1
ticket1=$(get_ticket "${PROXMOX_1_URL}")
csrf1=$(get_csrf "${PROXMOX_1_URL}")
if [ -z "${ticket1}" ] || [ -z "${csrf1}" ]; then
log_error "Failed to authenticate to Site 1"
else
log_success "Authenticated to Site 1"
local test_vms
test_vms=$(find_test_vms "${PROXMOX_1_URL}" "ml110-01" "${ticket1}")
if [ -z "${test_vms}" ]; then
log " No test VMs found (100-115)"
else
echo "${test_vms}" | while read -r vmid _ vmname; do
log " Found test VM: ${vmid} - ${vmname}"
log " Deleting..."
if delete_vm "${PROXMOX_1_URL}" "ml110-01" "${vmid}" "${ticket1}" "${csrf1}"; then
log_success " VM ${vmid} deleted"
else
log_error " Failed to delete VM ${vmid}"
fi
sleep 1
done
fi
fi
log ""
# Site 2
log "Site 2 (r630-01):"
local ticket2 csrf2
ticket2=$(get_ticket "${PROXMOX_2_URL}")
csrf2=$(get_csrf "${PROXMOX_2_URL}")
if [ -z "${ticket2}" ] || [ -z "${csrf2}" ]; then
log_error "Failed to authenticate to Site 2"
else
log_success "Authenticated to Site 2"
local test_vms
test_vms=$(find_test_vms "${PROXMOX_2_URL}" "r630-01" "${ticket2}")
if [ -z "${test_vms}" ]; then
log " No test VMs found (100-115)"
else
echo "${test_vms}" | while read -r vmid _ vmname; do
log " Found test VM: ${vmid} - ${vmname}"
log " Deleting..."
if delete_vm "${PROXMOX_2_URL}" "r630-01" "${vmid}" "${ticket2}" "${csrf2}"; then
log_success " VM ${vmid} deleted"
else
log_error " Failed to delete VM ${vmid}"
fi
sleep 1
done
fi
fi
log ""
log "=========================================="
log_success "Test VM cleanup complete!"
log ""
}
main "$@"
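The jq-less fallback above (`grep -o '"ticket":"[^"]*' | head -1 | cut -d'"' -f4`) can also be written with bash parameter expansion, avoiding the extra processes. A sketch (the function name `extract_json_field` is illustrative):

```shell
#!/usr/bin/env bash
# extract_json_field: pull a string value out of a JSON blob without jq.
# Only handles simple, unescaped string values, which is all the Proxmox
# ticket response needs.
extract_json_field() {
    local json=$1 field=$2
    local rest="${json#*\"${field}\":\"}"   # drop everything through `"field":"`
    if [ "${rest}" = "${json}" ]; then      # pattern not found
        return 1
    fi
    printf '%s\n' "${rest%%\"*}"            # keep up to the closing quote
}

sample='{"data":{"ticket":"PVE:root@pam:ABC123","CSRFPreventionToken":"4F:xyz"}}'
extract_json_field "${sample}" ticket                 # prints PVE:root@pam:ABC123
extract_json_field "${sample}" CSRFPreventionToken    # prints 4F:xyz
```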

scripts/resolve-blockers.sh Executable file

@@ -0,0 +1,288 @@
#!/bin/bash
# resolve-blockers.sh
# Automated script to resolve all remaining blockers
set -euo pipefail
# Load environment variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
set -a
source <(grep -v '^#' "${SCRIPT_DIR}/../.env" | grep -v '^$' | sed 's/^/export /')
set +a
fi
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
PASSED=0
FAILED=0
SKIPPED=0
log() {
echo -e "${GREEN}[✓]${NC} $1"
PASSED=$((PASSED + 1))
}
error() {
echo -e "${RED}[✗]${NC} $1"
FAILED=$((FAILED + 1))
}
warn() {
echo -e "${YELLOW}[!]${NC} $1"
SKIPPED=$((SKIPPED + 1))
}
info() {
echo -e "${BLUE}[i]${NC} $1"
}
# Blocker 1: Kubernetes
setup_kubernetes() {
info "=== Blocker 1: Kubernetes Cluster Setup ==="
echo ""
# Check kubectl
if command -v kubectl &> /dev/null; then
log "kubectl is installed"
else
warn "kubectl not installed - install manually: https://kubernetes.io/docs/tasks/tools/"
return 1
fi
# Check for existing cluster
if kubectl cluster-info &> /dev/null; then
log "Kubernetes cluster is accessible"
kubectl get nodes 2>/dev/null && log "Cluster nodes are ready" || warn "Cluster accessible but nodes not ready"
return 0
fi
# Check Docker for kind/minikube
if ! docker info &> /dev/null; then
warn "Docker is not running - required for kind/minikube"
warn "Start Docker or use existing Kubernetes cluster"
return 1
fi
# Try kind
if command -v kind &> /dev/null; then
info "kind is installed, creating cluster..."
if kind create cluster --name sankofa 2>/dev/null; then
log "kind cluster 'sankofa' created"
kubectl config use-context kind-sankofa
return 0
else
warn "Failed to create kind cluster (may already exist)"
if kind get clusters | grep -q sankofa; then
log "Cluster 'sankofa' already exists"
kubectl config use-context kind-sankofa
return 0
fi
fi
else
warn "kind not installed - install manually or use existing cluster"
fi
# Try minikube
if command -v minikube &> /dev/null; then
info "minikube is installed, starting cluster..."
if minikube start --driver=docker 2>/dev/null; then
log "minikube cluster started"
return 0
else
warn "Failed to start minikube (may already be running)"
if minikube status &> /dev/null; then
log "minikube cluster is running"
return 0
fi
fi
fi
warn "No Kubernetes cluster available - manual setup required"
return 1
}
install_crossplane() {
info "Installing Crossplane..."
if ! kubectl cluster-info &> /dev/null; then
warn "No Kubernetes cluster - skipping Crossplane installation"
return 1
fi
# Check if Crossplane is already installed
if kubectl get namespace crossplane-system &> /dev/null; then
if kubectl get pods -n crossplane-system &> /dev/null; then
log "Crossplane is already installed"
return 0
fi
fi
# Check for helm
if ! command -v helm &> /dev/null; then
warn "helm not installed - install manually: https://helm.sh/docs/intro/install/"
return 1
fi
# Install Crossplane
if helm repo list | grep -q crossplane-stable; then
log "Crossplane Helm repo already added"
else
helm repo add crossplane-stable https://charts.crossplane.io/stable
helm repo update
log "Crossplane Helm repo added"
fi
if helm list -n crossplane-system | grep -q crossplane; then
log "Crossplane is already installed via Helm"
else
if helm install crossplane crossplane-stable/crossplane \
--namespace crossplane-system \
--create-namespace \
--wait 2>/dev/null; then
log "Crossplane installed successfully"
else
warn "Failed to install Crossplane - check logs"
return 1
fi
fi
# Verify
sleep 5
if kubectl get pods -n crossplane-system &> /dev/null; then
log "Crossplane pods are running"
kubectl get pods -n crossplane-system
else
warn "Crossplane pods not ready yet"
fi
}
# Blocker 2: SSH
setup_ssh() {
info "=== Blocker 2: SSH Access Setup ==="
echo ""
SSH_KEY="${SSH_KEY:-$HOME/.ssh/sankofa_proxmox}"
# Generate key if not exists
if [ ! -f "$SSH_KEY" ]; then
info "Generating SSH key..."
if ssh-keygen -t ed25519 -C "sankofa-proxmox" -f "$SSH_KEY" -N "" -q; then
log "SSH key generated: $SSH_KEY"
else
error "Failed to generate SSH key"
return 1
fi
else
log "SSH key already exists: $SSH_KEY"
fi
# Test ML110-01
info "Testing SSH to ML110-01..."
if ssh -i "$SSH_KEY" -o ConnectTimeout=5 -o StrictHostKeyChecking=no root@192.168.11.10 'echo "SSH working"' &> /dev/null; then
log "SSH to ML110-01 works"
else
warn "SSH to ML110-01 failed - manual key copy required"
info "Run: ssh-copy-id -i $SSH_KEY.pub root@192.168.11.10"
fi
# Test R630-01
info "Testing SSH to R630-01..."
if ssh -i "$SSH_KEY" -o ConnectTimeout=5 -o StrictHostKeyChecking=no root@192.168.11.11 'echo "SSH working"' &> /dev/null; then
log "SSH to R630-01 works"
else
warn "SSH to R630-01 failed - manual key copy required"
info "Run: ssh-copy-id -i $SSH_KEY.pub root@192.168.11.11"
fi
}
# Blocker 3: Images
verify_images() {
info "=== Blocker 3: Image Verification ==="
echo ""
SSH_KEY="${SSH_KEY:-$HOME/.ssh/sankofa_proxmox}"
# Check ML110-01
info "Checking images on ML110-01..."
if ssh -i "$SSH_KEY" -o ConnectTimeout=5 -o StrictHostKeyChecking=no root@192.168.11.10 'pveam list local 2>/dev/null | grep -i ubuntu' &> /dev/null; then
local images=$(ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no root@192.168.11.10 'pveam list local 2>/dev/null | grep -i ubuntu' 2>/dev/null || echo "")
if [ -n "$images" ]; then
log "Images found on ML110-01:"
echo "$images" | head -3 | sed 's/^/ /'
else
warn "No Ubuntu images found on ML110-01"
fi
else
warn "Cannot check images on ML110-01 (SSH not configured)"
fi
# Check R630-01
info "Checking images on R630-01..."
if ssh -i "$SSH_KEY" -o ConnectTimeout=5 -o StrictHostKeyChecking=no root@192.168.11.11 'pveam list local 2>/dev/null | grep -i ubuntu' &> /dev/null; then
local images=$(ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no root@192.168.11.11 'pveam list local 2>/dev/null | grep -i ubuntu' 2>/dev/null || echo "")
if [ -n "$images" ]; then
log "Images found on R630-01:"
echo "$images" | head -3 | sed 's/^/ /'
else
warn "No Ubuntu images found on R630-01"
fi
else
warn "Cannot check images on R630-01 (SSH not configured)"
fi
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Resolving All Remaining Blockers ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
echo "Priority Order:"
echo " 1. SSH Access (needed for image verification)"
echo " 2. Image Verification (needed before VM deployment)"
echo " 3. Kubernetes Cluster (needed for provider deployment)"
echo ""
# Blocker 2: SSH (PRIORITY 1 - Do this first)
setup_ssh
echo ""
# Blocker 3: Images (PRIORITY 2 - Depends on SSH)
verify_images
echo ""
# Blocker 1: Kubernetes (PRIORITY 3 - Can be done in parallel)
if setup_kubernetes; then
install_crossplane
fi
echo ""
# Summary
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Summary ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
echo -e "${GREEN}Passed:${NC} ${PASSED}"
echo -e "${YELLOW}Skipped/Warnings:${NC} ${SKIPPED}"
echo -e "${RED}Failed:${NC} ${FAILED}"
echo ""
if [ $FAILED -eq 0 ]; then
log "All automated steps completed!"
if [ $SKIPPED -gt 0 ]; then
warn "Some steps require manual intervention (see warnings above)"
fi
else
error "Some steps failed - manual intervention required"
fi
echo ""
}
main "$@"
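One bash pitfall worth noting for scripts like this one that combine `set -euo pipefail` with result counters: `((x++))` evaluates to the pre-increment value, so the first increment from 0 yields exit status 1 and terminates the script. Plain arithmetic assignment avoids this:

```shell
#!/usr/bin/env bash
set -euo pipefail

n=0
# ((n++)) here would terminate the script: the post-increment expression
# evaluates to 0, which arithmetic commands report as failure under `set -e`.
n=$((n + 1))   # an assignment always has exit status 0
echo "${n}"    # prints 1
```

`((n++)) || true` works too, but the assignment form is harder to get wrong.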

scripts/retry-failed-vms.sh Executable file

@@ -0,0 +1,102 @@
#!/bin/bash
# Retry deletion of failed VMs
set -e
PROXMOX_ENDPOINT="${PROXMOX_ENDPOINT:-https://192.168.11.10:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-ml110-01}"
PROXMOX_USER="${PROXMOX_USER:-}"
PROXMOX_PASS="${PROXMOX_PASS:-}"
FAILED_VMS=(141 142 143 144 145)
if [ -z "$PROXMOX_USER" ] || [ -z "$PROXMOX_PASS" ]; then
echo "Error: PROXMOX_USER and PROXMOX_PASS must be set"
exit 1
fi
echo "Retrying deletion of failed VMs: ${FAILED_VMS[*]}"
echo ""
# Get authentication (a single request returns both the ticket and CSRF token)
AUTH_RESPONSE=$(curl -s -k --data-urlencode "username=${PROXMOX_USER}" \
--data-urlencode "password=${PROXMOX_PASS}" \
"${PROXMOX_ENDPOINT}/api2/json/access/ticket")
TICKET=$(echo "$AUTH_RESPONSE" | jq -r '.data.ticket // empty')
if [ -z "$TICKET" ]; then
echo "Error: Failed to authenticate"
exit 1
fi
CSRF_TOKEN=$(echo "$AUTH_RESPONSE" | jq -r '.data.CSRFPreventionToken // empty')
for VMID in "${FAILED_VMS[@]}"; do
echo "Processing VM $VMID..."
# Multiple unlock attempts
for i in 1 2 3 4 5; do
curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/unlock" > /dev/null 2>&1
sleep 2
done
# Stop if running
curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X POST \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/stop" > /dev/null 2>&1
sleep 3
# Delete with purge
DELETE_RESULT=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF_TOKEN}" \
-X DELETE \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}?purge=1&skiplock=1" 2>&1)
TASK_UPID=$(echo "$DELETE_RESULT" | jq -r '.data // empty' 2>/dev/null)
if [ -n "$TASK_UPID" ] && [ "$TASK_UPID" != "null" ]; then
echo " Delete task started: $TASK_UPID"
echo " Waiting for completion..."
for i in $(seq 1 45); do
sleep 1
TASK_STATUS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/tasks/${TASK_UPID}/status" 2>/dev/null | \
jq -r '.data.status // "unknown"')
if [ "$TASK_STATUS" = "stopped" ]; then
EXIT_STATUS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/tasks/${TASK_UPID}/status" 2>/dev/null | \
jq -r '.data.exitstatus // "unknown"')
if [ "$EXIT_STATUS" = "OK" ] || [ "$EXIT_STATUS" = "0" ]; then
sleep 3
VM_STILL_EXISTS=$(curl -s -k -b "PVEAuthCookie=${TICKET}" \
"${PROXMOX_ENDPOINT}/api2/json/nodes/${PROXMOX_NODE}/qemu/${VMID}/status/current" 2>/dev/null | \
jq -r '.data // empty')
if [ -z "$VM_STILL_EXISTS" ] || [ "$VM_STILL_EXISTS" = "null" ]; then
echo " ✅ VM $VMID deleted successfully"
else
echo " ⚠️ VM $VMID task completed but VM still exists"
fi
else
echo " ⚠️ Task completed with status: $EXIT_STATUS"
fi
break
fi
done
else
echo " ❌ Failed to start delete task"
echo " Response: $DELETE_RESULT"
fi
echo ""
done
echo "Retry complete!"
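The fixed-count poll above is a recurring pattern in these scripts; it can be generalized into a reusable helper. A sketch (`wait_for` is not an existing function here):

```shell
#!/usr/bin/env bash
# wait_for: run a command once per second until it succeeds or the
# timeout (in seconds) expires. Returns 0 on success, 1 on timeout.
wait_for() {
    local timeout=$1; shift
    local elapsed=0
    until "$@"; do
        elapsed=$((elapsed + 1))
        if [ "${elapsed}" -ge "${timeout}" ]; then
            return 1
        fi
        sleep 1
    done
}

# Example: poll a task-status check (here stubbed with `true`)
wait_for 45 true && echo "task finished"
```

The task loop then becomes `wait_for 45 check_task_stopped "$TASK_UPID"` with the status query moved into a small predicate function.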

scripts/rotate-credentials.sh Executable file

@@ -0,0 +1,197 @@
#!/bin/bash
#
# Credential Rotation Script
#
# Per DoD/MilSpec requirements (NIST SP 800-53: SC-12, IA-5)
# This script assists with rotating credentials across the system
#
# Usage:
# ./scripts/rotate-credentials.sh [credential-type]
#
# Credential types:
# - jwt: JWT signing secret
# - db: Database password
# - keycloak: Keycloak client secret
# - proxmox: Proxmox API tokens
# - all: Rotate all credentials
#
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Logging functions
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Generate a secure random secret
generate_secret() {
local length=${1:-64}
# base64 output can include '/', '+', and '=' which would break the sed
# replacements below; strip them and over-generate to compensate
openssl rand -base64 "$((length * 2))" | tr -d '\n+/=' | head -c "$length"
}
# Rotate JWT secret
rotate_jwt_secret() {
log_info "Rotating JWT secret..."
local new_secret
new_secret=$(generate_secret 64)
# Update in environment or secret management system
if [ -f .env ]; then
if grep -q "^JWT_SECRET=" .env; then
sed -i.bak "s/^JWT_SECRET=.*/JWT_SECRET=${new_secret}/" .env
log_info "Updated JWT_SECRET in .env (backup created: .env.bak)"
else
echo "JWT_SECRET=${new_secret}" >> .env
log_info "Added JWT_SECRET to .env"
fi
else
log_warn ".env file not found. Please set JWT_SECRET=${new_secret} manually"
fi
# If using Kubernetes secrets
if command -v kubectl &> /dev/null; then
log_info "Updating Kubernetes secret..."
kubectl create secret generic jwt-secret \
--from-literal=secret="${new_secret}" \
--dry-run=client -o yaml | kubectl apply -f -
fi
log_warn "IMPORTANT: Restart all services using JWT_SECRET after rotation"
log_warn "All existing JWT tokens will be invalidated"
}
# Rotate database password
rotate_db_password() {
log_info "Rotating database password..."
local new_password
new_password=$(generate_secret 32)
# Update in environment
if [ -f .env ]; then
if grep -q "^DB_PASSWORD=" .env; then
sed -i.bak "s/^DB_PASSWORD=.*/DB_PASSWORD=${new_password}/" .env
log_info "Updated DB_PASSWORD in .env (backup created: .env.bak)"
else
echo "DB_PASSWORD=${new_password}" >> .env
log_info "Added DB_PASSWORD to .env"
fi
else
log_warn ".env file not found. Please set DB_PASSWORD=${new_password} manually"
fi
# Update database password
log_warn "IMPORTANT: You must update the database password manually:"
log_warn " ALTER USER postgres WITH PASSWORD '${new_password}';"
# If using Kubernetes secrets
if command -v kubectl &> /dev/null; then
log_info "Updating Kubernetes secret..."
kubectl create secret generic db-credentials \
--from-literal=password="${new_password}" \
--dry-run=client -o yaml | kubectl apply -f -
fi
}
# Rotate Keycloak client secret
rotate_keycloak_secret() {
log_info "Rotating Keycloak client secret..."
local new_secret
new_secret=$(generate_secret 32)
if [ -f .env ]; then
if grep -q "^KEYCLOAK_CLIENT_SECRET=" .env; then
sed -i.bak "s/^KEYCLOAK_CLIENT_SECRET=.*/KEYCLOAK_CLIENT_SECRET=${new_secret}/" .env
log_info "Updated KEYCLOAK_CLIENT_SECRET in .env (backup created: .env.bak)"
else
echo "KEYCLOAK_CLIENT_SECRET=${new_secret}" >> .env
log_info "Added KEYCLOAK_CLIENT_SECRET to .env"
fi
else
log_warn ".env file not found. Please set KEYCLOAK_CLIENT_SECRET=${new_secret} manually"
fi
log_warn "IMPORTANT: Update Keycloak client secret in Keycloak admin console"
log_warn "All existing Keycloak sessions will be invalidated"
}
# Rotate Proxmox API tokens
rotate_proxmox_tokens() {
log_info "Rotating Proxmox API tokens..."
log_warn "Proxmox API tokens must be rotated manually:"
log_warn "1. Log into Proxmox web interface"
log_warn "2. Go to Datacenter -> Permissions -> API Tokens"
log_warn "3. Revoke old tokens and create new ones"
log_warn "4. Update tokens in Kubernetes secrets or configuration"
# If using Kubernetes secrets
if command -v kubectl &> /dev/null; then
log_info "To update Kubernetes secret after creating new token:"
log_info " kubectl create secret generic proxmox-credentials \\"
log_info " --from-literal=credentials.json='{\"username\":\"root@pam\",\"token\":\"NEW_TOKEN\"}' \\"
log_info " --dry-run=client -o yaml | kubectl apply -f -"
fi
}
# Main function
main() {
local credential_type=${1:-all}
log_info "Starting credential rotation (DoD/MilSpec compliance)"
log_info "Credential type: ${credential_type}"
case "${credential_type}" in
jwt)
rotate_jwt_secret
;;
db|database)
rotate_db_password
;;
keycloak)
rotate_keycloak_secret
;;
proxmox)
rotate_proxmox_tokens
;;
all)
log_info "Rotating all credentials..."
rotate_jwt_secret
echo ""
rotate_db_password
echo ""
rotate_keycloak_secret
echo ""
rotate_proxmox_tokens
;;
*)
log_error "Unknown credential type: ${credential_type}"
echo "Usage: $0 [jwt|db|keycloak|proxmox|all]"
exit 1
;;
esac
log_info "Credential rotation complete"
log_warn "Remember to:"
log_warn " 1. Restart all affected services"
log_warn " 2. Verify services are working correctly"
log_warn " 3. Update any documentation with new rotation date"
log_warn " 4. Archive old credentials securely (if required by policy)"
}
# Run main function
main "$@"
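A general caveat with sed-based secret updates like the ones above: the default `s/…/…/` form breaks if the replacement text contains a `/` (raw base64 output can). Using a different delimiter sidesteps this:

```shell
#!/usr/bin/env bash
# Demonstration: a secret containing '/' breaks the default s/// form,
# but works fine with '|' as the sed delimiter.
secret='abc/def+ghi'
printf 'JWT_SECRET=old\n' | sed "s|^JWT_SECRET=.*|JWT_SECRET=${secret}|"
# prints: JWT_SECRET=abc/def+ghi
```

Stripping `/` (and `&`, which sed treats specially in replacements) from generated secrets is an equally valid alternative.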

scripts/run-on-proxmox.sh Normal file

@@ -0,0 +1,28 @@
#!/bin/bash
# Commands to run on Proxmox node
# Copy everything between the "---" lines and paste into SSH session
cat << 'PROXMOX_COMMANDS'
---
VMID=100
echo "=== Automated Cleanup for VM $VMID ==="
pkill -9 -f "task.*$VMID" 2>/dev/null && echo "✅ Killed task processes" || echo " No task processes"
pkill -9 -f "qm.*$VMID" 2>/dev/null && echo "✅ Killed qm processes" || echo " No qm processes"
sleep 2
rm -f /var/lock/qemu-server/lock-$VMID.conf && echo "✅ Removed lock file" || echo "⚠️ Lock removal failed"
echo ""
echo "=== Verification ==="
ps aux | grep -E "task.*$VMID|qm.*$VMID" | grep -v grep || echo "✅ No processes remaining"
ls -la /var/lock/qemu-server/lock-$VMID.conf 2>&1 | grep -q "No such file" && echo "✅ Lock file removed" || echo "⚠️ Lock still exists"
echo ""
echo "=== Unlocking ==="
qm unlock $VMID
echo ""
echo "=== Final Status ==="
qm status $VMID
---
PROXMOX_COMMANDS
echo ""
echo "Copy everything between the '---' lines above and paste into your Proxmox SSH session"
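Instead of copy-pasting by hand, the block between the `---` markers can be extracted programmatically and piped over SSH. A sketch (the `ssh` line at the end is illustrative and requires key-based access):

```shell
#!/usr/bin/env bash
# extract_block: print only the lines between the first pair of '---' markers.
extract_block() {
    sed -n '/^---$/,/^---$/p' | sed '1d;$d'
}

# Example with an inline document:
printf 'header\n---\nVMID=100\nqm unlock $VMID\n---\nfooter\n' | extract_block
# prints:
#   VMID=100
#   qm unlock $VMID

# To run the extracted commands remotely (illustrative):
#   ./scripts/run-on-proxmox.sh | extract_block | ssh root@ml110-01 'bash -s'
```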


@@ -0,0 +1,80 @@
#!/bin/bash
# Helper script to provide commands for running verification on Proxmox node
# This script outputs the commands to copy to Proxmox SSH session
VMID=100
echo "=========================================="
echo "Copy and paste these commands into your"
echo "Proxmox SSH session (root@ml110-01):"
echo "=========================================="
echo ""
echo "--- START COPY ---"
echo ""
echo "VMID=100"
echo ""
echo "# Step 1: Check VM status"
echo "qm status \$VMID"
echo ""
echo "# Step 2: Verify boot order"
echo "BOOT=\$(qm config \$VMID | grep '^boot:' || echo '')"
echo "if [ -z \"\$BOOT\" ]; then"
echo " echo 'Fixing boot order...'"
echo " qm set \$VMID --boot order=scsi0"
echo "else"
echo " echo \"Boot order: \$BOOT\""
echo "fi"
echo ""
echo "# Step 3: Verify disk"
echo "SCSI0=\$(qm config \$VMID | grep '^scsi0:' || echo '')"
echo "if [ -z \"\$SCSI0\" ]; then"
echo " echo 'ERROR: Disk not configured!'"
echo " exit 1"
echo "else"
echo " echo \"Disk: \$SCSI0\""
echo "fi"
echo "lvs | grep vm-\$VMID-disk"
echo ""
echo "# Step 4: Verify cloud-init"
echo "IDE2=\$(qm config \$VMID | grep '^ide2:' || echo '')"
echo "if [ -z \"\$IDE2\" ]; then"
echo " echo 'Fixing cloud-init...'"
echo " qm set \$VMID --ide2 local-lvm:cloudinit"
echo " qm set \$VMID --ciuser admin"
echo " qm set \$VMID --ipconfig0 ip=dhcp"
echo "else"
echo " echo \"Cloud-init: \$IDE2\""
echo "fi"
echo ""
echo "# Step 5: Verify network"
echo "NET0=\$(qm config \$VMID | grep '^net0:' || echo '')"
echo "if [ -z \"\$NET0\" ]; then"
echo " echo 'Fixing network...'"
echo " qm set \$VMID --net0 virtio,bridge=vmbr0"
echo "else"
echo " echo \"Network: \$NET0\""
echo "fi"
echo ""
echo "# Step 6: Verify guest agent (already fixed)"
echo "qm config \$VMID | grep '^agent:'"
echo ""
echo "# Step 7: Final config summary"
echo "echo '=== Final Configuration ==='"
echo "qm config \$VMID | grep -E '^agent:|^boot:|^scsi0:|^ide2:|^net0:|^ciuser:'"
echo ""
echo "# Step 8: Start VM"
echo "STATUS=\$(qm status \$VMID | awk '{print \$2}')"
echo "if [ \"\$STATUS\" != \"running\" ]; then"
echo " echo 'Starting VM...'"
echo " qm start \$VMID"
echo " sleep 5"
echo " qm status \$VMID"
echo "else"
echo " echo 'VM is already running'"
echo "fi"
echo ""
echo "--- END COPY ---"
echo ""
echo "After running, monitor from Kubernetes:"
echo " kubectl get proxmoxvm basic-vm-001 -w"

View File

@@ -0,0 +1,30 @@
#!/bin/bash
# Helper script to run VM 100 verification on Proxmox node
# This script provides the commands to copy to Proxmox SSH session
echo "=========================================="
echo "VM 100 Verification Script"
echo "=========================================="
echo ""
echo "Option 1: Copy script to Proxmox and run"
echo "----------------------------------------"
echo "From your local machine:"
echo " scp scripts/complete-vm-100-verification.sh root@PROXMOX_HOST:/tmp/"
echo " ssh root@PROXMOX_HOST 'bash /tmp/complete-vm-100-verification.sh'"
echo ""
echo "Option 2: Run via heredoc (copy entire block)"
echo "----------------------------------------"
echo "Copy and paste this into your Proxmox SSH session:"
echo ""
echo "bash << 'SCRIPT_EOF'"
cat "$(dirname "$0")/complete-vm-100-verification.sh"
echo "SCRIPT_EOF"
echo ""
echo "Option 3: Direct commands (quick check)"
echo "----------------------------------------"
echo "VMID=100"
echo "qm status \$VMID"
echo "qm config \$VMID | grep -E 'agent:|boot:|scsi0:|ide2:|net0:'"
echo "qm start \$VMID"
echo "watch -n 2 'qm status \$VMID'"

219
scripts/setup-dev-environment.sh Executable file
View File

@@ -0,0 +1,219 @@
#!/bin/bash
# setup-dev-environment.sh
# Sets up development environment for Crossplane provider development
set -euo pipefail
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
check_command() {
local cmd=$1
local install_cmd=$2
if command -v "$cmd" &> /dev/null; then
local version=$($cmd --version 2>/dev/null | head -1 || echo "installed")
log "$cmd is installed: $version"
return 0
else
warn "$cmd is not installed"
if [ -n "$install_cmd" ]; then
info "Install with: $install_cmd"
fi
return 1
fi
}
install_go() {
log "Installing Go..."
local go_version="1.21.5"
local arch=$(uname -m)
case "$arch" in
x86_64) arch="amd64" ;;
aarch64) arch="arm64" ;;
*) error "Unsupported architecture: $arch" ;;
esac
local go_tar="go${go_version}.linux-${arch}.tar.gz"
local go_url="https://go.dev/dl/${go_tar}"
cd /tmp
wget -q "$go_url"
sudo rm -rf /usr/local/go
sudo tar -C /usr/local -xzf "$go_tar"
rm "$go_tar"
# Add to PATH
if ! grep -q "/usr/local/go/bin" ~/.bashrc 2>/dev/null; then
echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc
export PATH=$PATH:/usr/local/go/bin
fi
log "✓ Go installed successfully"
}
install_kubectl() {
log "Installing kubectl..."
    # dl.k8s.io is the current official download endpoint (the old
    # storage.googleapis.com/kubernetes-release bucket is deprecated)
    local kubectl_version=$(curl -L -s https://dl.k8s.io/release/stable.txt)
    curl -LO "https://dl.k8s.io/release/${kubectl_version}/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/
log "✓ kubectl installed successfully"
}
install_kind() {
log "Installing kind (Kubernetes in Docker)..."
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.20.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
log "✓ kind installed successfully"
}
install_tools() {
log "Installing development tools..."
local tools=(
"jq:apt-get install -y jq"
"yq:snap install yq"
"yamllint:pip3 install yamllint"
"docker:apt-get install -y docker.io"
)
for tool_info in "${tools[@]}"; do
IFS=':' read -r tool install_cmd <<< "$tool_info"
if ! check_command "$tool" "$install_cmd"; then
if [ "$tool" = "yq" ]; then
# Alternative yq installation (needs sudo to write /usr/local/bin)
sudo wget -qO /usr/local/bin/yq https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64
sudo chmod +x /usr/local/bin/yq
log "✓ yq installed"
elif [ "$tool" = "yamllint" ]; then
pip3 install yamllint
log "✓ yamllint installed"
fi
fi
done
}
setup_git_hooks() {
log "Setting up git hooks..."
if [ -d ".git" ]; then
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/bash
# Pre-commit hook to validate configuration files
echo "Running pre-commit validation..."
if [ -f "./scripts/validate-configs.sh" ]; then
./scripts/validate-configs.sh
if [ $? -ne 0 ]; then
echo "❌ Configuration validation failed"
exit 1
fi
fi
echo "✓ Pre-commit checks passed"
EOF
chmod +x .git/hooks/pre-commit
log "✓ Git pre-commit hook installed"
else
warn "Not a git repository, skipping git hooks"
fi
}
create_kind_cluster() {
log "Creating kind cluster for local testing..."
if command -v kind &> /dev/null; then
if kind get clusters | grep -q "proxmox-test"; then
warn "kind cluster 'proxmox-test' already exists"
else
kind create cluster --name proxmox-test --config - <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
extraPortMappings:
- containerPort: 30000
hostPort: 30000
protocol: TCP
EOF
log "✓ kind cluster 'proxmox-test' created"
fi
else
warn "kind not installed, skipping cluster creation"
fi
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Development Environment Setup ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
log "Checking development tools..."
echo ""
local missing_tools=0
check_command "go" "See: https://go.dev/dl/" || { install_go; missing_tools=$((missing_tools + 1)); }
check_command "kubectl" "See: https://kubernetes.io/docs/tasks/tools/" || { install_kubectl; }
check_command "kind" "See: https://kind.sigs.k8s.io/" || { install_kind; }
check_command "make" "apt-get install -y build-essential" || missing_tools=$((missing_tools + 1))
check_command "docker" "apt-get install -y docker.io" || missing_tools=$((missing_tools + 1))
echo ""
install_tools
echo ""
setup_git_hooks
echo ""
read -p "Create kind cluster for local testing? (y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
create_kind_cluster
fi
echo ""
log "Development environment setup complete!"
echo ""
info "Next steps:"
info "1. cd crossplane-provider-proxmox"
info "2. make build"
info "3. make test"
info "4. Start developing!"
}
main "$@"
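The `install_tools` function above relies on `IFS=':' read` splitting each `tool:install-command` entry at the first colon only, so install commands may themselves contain spaces. A minimal sketch of that parsing:

```shell
#!/bin/bash
# Sketch of the "tool:install-command" convention used by install_tools.
# With two target variables, read splits at the first colon and leaves
# the remainder (spaces included) in the last variable.
entry="yq:snap install yq"
IFS=':' read -r tool install_cmd <<< "$entry"
echo "tool=$tool install=$install_cmd"
```

This prints `tool=yq install=snap install yq`.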

284
scripts/setup-dns-records.sh Executable file
View File

@@ -0,0 +1,284 @@
#!/bin/bash
# setup-dns-records.sh
# Creates DNS records for Proxmox instances using Cloudflare API
set -euo pipefail
# Load environment variables from .env if it exists
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
source "${SCRIPT_DIR}/../.env"
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Configuration
DOMAIN="${DOMAIN:-sankofa.nexus}"
ZONE_ID="${CLOUDFLARE_ZONE_ID:-}"
# Support both API Token and Global API Key + Email
API_TOKEN="${CLOUDFLARE_API_TOKEN:-}"
API_KEY="${CLOUDFLARE_API_KEY:-}"
API_EMAIL="${CLOUDFLARE_EMAIL:-}"
# Instance configurations
declare -A INSTANCES=(
["ml110-01"]="192.168.11.10"
["r630-01"]="192.168.11.11"
)
log() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
check_requirements() {
# Check if we have either API Token or Global API Key + Email
if [ -z "$API_TOKEN" ] && [ -z "$API_KEY" ]; then
error "Either CLOUDFLARE_API_TOKEN or CLOUDFLARE_API_KEY must be set"
fi
if [ -z "$API_TOKEN" ] && [ -z "$API_EMAIL" ]; then
error "If using CLOUDFLARE_API_KEY, CLOUDFLARE_EMAIL must also be set"
fi
if ! command -v curl &> /dev/null; then
error "curl is required but not installed"
fi
if ! command -v jq &> /dev/null; then
error "jq is required but not installed"
fi
# Try to get zone ID if not provided
if [ -z "$ZONE_ID" ]; then
get_zone_id
fi
}
get_zone_id() {
if [ -z "$ZONE_ID" ]; then
log "Getting zone ID for ${DOMAIN}..."
if [ -n "$API_TOKEN" ]; then
# Use API Token
ZONE_ID=$(curl -s -X GET \
-H "Authorization: Bearer ${API_TOKEN}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/zones?name=${DOMAIN}" | \
jq -r '.result[0].id')
elif [ -n "$API_KEY" ] && [ -n "$API_EMAIL" ]; then
# Use Global API Key + Email
ZONE_ID=$(curl -s -X GET \
-H "X-Auth-Email: ${API_EMAIL}" \
-H "X-Auth-Key: ${API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/zones?name=${DOMAIN}" | \
jq -r '.result[0].id')
else
error "Cannot get Zone ID: No authentication method available"
fi
if [ "$ZONE_ID" == "null" ] || [ -z "$ZONE_ID" ]; then
error "Failed to get zone ID for ${DOMAIN}"
fi
log "Zone ID: ${ZONE_ID}"
export CLOUDFLARE_ZONE_ID="$ZONE_ID"
fi
}
create_a_record() {
local name=$1
local ip=$2
local comment=$3
log "Creating A record: ${name}.${DOMAIN} -> ${ip}"
local response
if [ -n "$API_TOKEN" ]; then
response=$(curl -s -X POST \
-H "Authorization: Bearer ${API_TOKEN}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/dns_records" \
-d "{
\"type\": \"A\",
\"name\": \"${name}\",
\"content\": \"${ip}\",
\"ttl\": 300,
\"comment\": \"${comment}\",
\"proxied\": false
}")
else
response=$(curl -s -X POST \
-H "X-Auth-Email: ${API_EMAIL}" \
-H "X-Auth-Key: ${API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/dns_records" \
-d "{
\"type\": \"A\",
\"name\": \"${name}\",
\"content\": \"${ip}\",
\"ttl\": 300,
\"comment\": \"${comment}\",
\"proxied\": false
}")
fi
local success=$(echo "$response" | jq -r '.success')
local record_id=$(echo "$response" | jq -r '.result.id // empty')
if [ "$success" == "true" ] && [ -n "$record_id" ]; then
log "✓ A record created: ${name}.${DOMAIN} (ID: ${record_id})"
return 0
else
local errors=$(echo "$response" | jq -r '.errors[].message // "Unknown error"' | head -1)
warn "Failed to create A record: ${errors}"
return 1
fi
}
create_cname_record() {
local name=$1
local target=$2
local comment=$3
log "Creating CNAME record: ${name}.${DOMAIN} -> ${target}"
local response
if [ -n "$API_TOKEN" ]; then
response=$(curl -s -X POST \
-H "Authorization: Bearer ${API_TOKEN}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/dns_records" \
-d "{
\"type\": \"CNAME\",
\"name\": \"${name}\",
\"content\": \"${target}\",
\"ttl\": 300,
\"comment\": \"${comment}\",
\"proxied\": false
}")
else
response=$(curl -s -X POST \
-H "X-Auth-Email: ${API_EMAIL}" \
-H "X-Auth-Key: ${API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/dns_records" \
-d "{
\"type\": \"CNAME\",
\"name\": \"${name}\",
\"content\": \"${target}\",
\"ttl\": 300,
\"comment\": \"${comment}\",
\"proxied\": false
}")
fi
local success=$(echo "$response" | jq -r '.success')
local record_id=$(echo "$response" | jq -r '.result.id // empty')
if [ "$success" == "true" ] && [ -n "$record_id" ]; then
log "✓ CNAME record created: ${name}.${DOMAIN} (ID: ${record_id})"
return 0
else
local errors=$(echo "$response" | jq -r '.errors[].message // "Unknown error"' | head -1)
warn "Failed to create CNAME record: ${errors}"
return 1
fi
}
check_record_exists() {
local name=$1
local type=$2
local response
if [ -n "$API_TOKEN" ]; then
response=$(curl -s -X GET \
-H "Authorization: Bearer ${API_TOKEN}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/dns_records?name=${name}.${DOMAIN}&type=${type}")
else
response=$(curl -s -X GET \
-H "X-Auth-Email: ${API_EMAIL}" \
-H "X-Auth-Key: ${API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/dns_records?name=${name}.${DOMAIN}&type=${type}")
fi
local count=$(echo "$response" | jq -r '.result | length')
if [ "$count" -gt 0 ]; then
return 0 # Record exists
else
return 1 # Record does not exist
fi
}
setup_instance_dns() {
local instance_name=$1
local ip=$2
local fqdn="${instance_name}.${DOMAIN}"
# Create A record
if check_record_exists "$instance_name" "A"; then
warn "A record for ${fqdn} already exists, skipping..."
else
create_a_record "$instance_name" "$ip" "Proxmox Instance - ${instance_name}"
fi
# Create API CNAME
if check_record_exists "${instance_name}-api" "CNAME"; then
warn "CNAME record for ${instance_name}-api.${DOMAIN} already exists, skipping..."
else
create_cname_record "${instance_name}-api" "$fqdn" "Proxmox ${instance_name} API endpoint"
fi
# Create metrics CNAME
if check_record_exists "${instance_name}-metrics" "CNAME"; then
warn "CNAME record for ${instance_name}-metrics.${DOMAIN} already exists, skipping..."
else
create_cname_record "${instance_name}-metrics" "$fqdn" "Proxmox ${instance_name} metrics endpoint"
fi
}
main() {
log "Setting up DNS records for Proxmox instances"
log "Domain: ${DOMAIN}"
check_requirements
get_zone_id
log ""
log "Creating DNS records for ${#INSTANCES[@]} instances..."
log ""
for instance_name in "${!INSTANCES[@]}"; do
setup_instance_dns "$instance_name" "${INSTANCES[$instance_name]}"
echo ""
done
log "DNS setup complete!"
log ""
log "Created records:"
for instance_name in "${!INSTANCES[@]}"; do
echo "  ${instance_name}.${DOMAIN} -> ${INSTANCES[$instance_name]}"
echo "  ${instance_name}-api.${DOMAIN} -> ${instance_name}.${DOMAIN}"
echo "  ${instance_name}-metrics.${DOMAIN} -> ${instance_name}.${DOMAIN}"
done
}
main "$@"
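`create_a_record` and `create_cname_record` above duplicate the Token-vs-Global-Key branch around every curl call. One way to sketch factoring that selection into a single helper (the credential values here are placeholders, not real secrets, and `cf_auth_headers` is a hypothetical name):

```shell
#!/bin/bash
# Sketch: emit the Cloudflare auth headers once, so each record function
# can build a single curl invocation instead of two near-identical ones.
cf_auth_headers() {
    if [ -n "${API_TOKEN:-}" ]; then
        printf '%s\n' "Authorization: Bearer ${API_TOKEN}"
    elif [ -n "${API_KEY:-}" ] && [ -n "${API_EMAIL:-}" ]; then
        printf '%s\n' "X-Auth-Email: ${API_EMAIL}" "X-Auth-Key: ${API_KEY}"
    else
        return 1   # no credentials available
    fi
}
API_TOKEN="example-token"
headers="$(cf_auth_headers)"
echo "$headers"
```

Each header line could then be passed to curl via repeated `-H` flags, collapsing the duplicated request blocks.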

221
scripts/setup-monitoring.sh Executable file
View File

@@ -0,0 +1,221 @@
#!/bin/bash
# setup-monitoring.sh
# Sets up Prometheus scraping and Grafana dashboards for Proxmox
set -euo pipefail
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Configuration
PROMETHEUS_NAMESPACE="${PROMETHEUS_NAMESPACE:-monitoring}"
GRAFANA_NAMESPACE="${GRAFANA_NAMESPACE:-monitoring}"
DASHBOARD_DIR="${DASHBOARD_DIR:-./infrastructure/monitoring/dashboards}"
log() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
check_prerequisites() {
log "Checking prerequisites..."
if ! command -v kubectl &> /dev/null; then
error "kubectl is required but not installed"
fi
if ! kubectl cluster-info &> /dev/null; then
error "Cannot connect to Kubernetes cluster"
fi
log "✓ Prerequisites check passed"
}
create_prometheus_service_monitor() {
log "Creating Prometheus ServiceMonitor for Proxmox exporters..."
kubectl apply -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: proxmox-exporters
namespace: ${PROMETHEUS_NAMESPACE}
labels:
app: proxmox
spec:
selector:
matchLabels:
app: proxmox-exporter
endpoints:
- port: metrics
interval: 30s
path: /metrics
scheme: http
EOF
log "✓ ServiceMonitor created"
}
create_prometheus_scrape_config() {
log "Creating Prometheus scrape configuration..."
# This would be added to Prometheus ConfigMap
info "Add the following to your Prometheus configuration:"
cat <<EOF
- job_name: 'proxmox'
scrape_interval: 30s
static_configs:
- targets:
- 'ml110-01-metrics.sankofa.nexus:9221'
- 'r630-01-metrics.sankofa.nexus:9221'
labels:
cluster: 'proxmox'
EOF
}
import_grafana_dashboards() {
log "Importing Grafana dashboards..."
if [ ! -d "$DASHBOARD_DIR" ]; then
warn "Dashboard directory not found: ${DASHBOARD_DIR}"
return 0
fi
local dashboards=(
"proxmox-cluster.json"
"proxmox-vms.json"
"proxmox-node.json"
)
for dashboard in "${dashboards[@]}"; do
local dashboard_file="${DASHBOARD_DIR}/${dashboard}"
if [ -f "$dashboard_file" ]; then
info "Dashboard file found: ${dashboard}"
info "Import via Grafana UI or API:"
info " kubectl port-forward -n ${GRAFANA_NAMESPACE} svc/grafana 3000:80"
info " Then import: http://localhost:3000/dashboard/import"
else
warn "Dashboard file not found: ${dashboard_file}"
fi
done
}
create_grafana_datasource() {
log "Creating Grafana datasource configuration..."
info "Prometheus datasource should be configured in Grafana:"
info " URL: http://prometheus.${PROMETHEUS_NAMESPACE}.svc.cluster.local:9090"
info " Access: Server (default)"
info ""
info "Configure via Grafana UI or API"
}
create_alerts() {
log "Creating Prometheus alert rules..."
kubectl apply -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: proxmox-alerts
namespace: ${PROMETHEUS_NAMESPACE}
labels:
app: proxmox
spec:
groups:
- name: proxmox
interval: 30s
rules:
- alert: ProxmoxNodeDown
expr: up{job="proxmox"} == 0
for: 5m
labels:
severity: critical
annotations:
summary: "Proxmox node is down"
description: "Proxmox node {{ \$labels.instance }} has been down for more than 5 minutes"
- alert: ProxmoxHighCPU
expr: pve_node_cpu_usage > 90
for: 10m
labels:
severity: warning
annotations:
summary: "Proxmox node CPU usage is high"
description: "Node {{ \$labels.node }} CPU usage is {{ \$value }}%"
- alert: ProxmoxHighMemory
expr: pve_node_memory_usage > 90
for: 10m
labels:
severity: warning
annotations:
summary: "Proxmox node memory usage is high"
description: "Node {{ \$labels.node }} memory usage is {{ \$value }}%"
- alert: ProxmoxStorageFull
expr: pve_storage_usage > 90
for: 5m
labels:
severity: critical
annotations:
summary: "Proxmox storage is nearly full"
description: "Storage {{ \$labels.storage }} on node {{ \$labels.node }} is {{ \$value }}% full"
EOF
log "✓ Alert rules created"
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ Proxmox Monitoring Setup ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
check_prerequisites
echo ""
create_prometheus_service_monitor
echo ""
create_prometheus_scrape_config
echo ""
create_alerts
echo ""
import_grafana_dashboards
echo ""
create_grafana_datasource
echo ""
log "Monitoring setup complete!"
echo ""
info "Next steps:"
info "1. Verify Prometheus is scraping: kubectl port-forward -n ${PROMETHEUS_NAMESPACE} svc/prometheus 9090:9090"
info "2. Import Grafana dashboards via UI"
info "3. Configure alert notifications"
info "4. Verify metrics are being collected"
}
main "$@"

View File

@@ -0,0 +1,140 @@
#!/bin/bash
# setup-ssh-with-password.sh
# Sets up SSH access using password from .env file
set -euo pipefail
# Load environment variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "${SCRIPT_DIR}/../.env" ]; then
set -a
source <(grep -v '^#' "${SCRIPT_DIR}/../.env" | grep -v '^$' | sed 's/^/export /')
set +a
fi
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
SSH_KEY="${SSH_KEY:-$HOME/.ssh/sankofa_proxmox}"
NODE1_IP="192.168.11.10"
NODE2_IP="192.168.11.11"
PROXMOX_PASSWORD="${PROXMOX_ROOT_PASS:-${PROXMOX_PASSWORD:-}}"
log() {
echo -e "${GREEN}[✓]${NC} $1"
}
error() {
echo -e "${RED}[✗]${NC} $1" >&2
}
warn() {
echo -e "${YELLOW}[!]${NC} $1"
}
info() {
echo -e "${BLUE}[i]${NC} $1"
}
check_password() {
if [ -z "$PROXMOX_PASSWORD" ]; then
warn "PROXMOX_ROOT_PASS or PROXMOX_PASSWORD not set in .env file"
info "Add to .env: PROXMOX_ROOT_PASS=your-root-password"
return 1
fi
return 0
}
copy_key_with_password() {
local node_ip=$1
local node_name=$2
info "Copying SSH key to ${node_name} using password..."
if [ -z "$PROXMOX_PASSWORD" ]; then
error "Password not available - cannot copy key automatically"
return 1
fi
# Use sshpass if available, or expect, or manual
if command -v sshpass &> /dev/null; then
if sshpass -p "$PROXMOX_PASSWORD" ssh-copy-id -i "$SSH_KEY.pub" -o StrictHostKeyChecking=no root@"${node_ip}" 2>/dev/null; then
log "SSH key copied to ${node_name} using sshpass"
return 0
else
error "Failed to copy key to ${node_name}"
return 1
fi
else
warn "sshpass not installed - cannot automate password-based key copy"
info "Install sshpass: sudo apt-get install sshpass"
info "Or copy manually: ssh-copy-id -i $SSH_KEY.pub root@${node_ip}"
return 1
fi
}
main() {
echo ""
echo "╔══════════════════════════════════════════════════════════════╗"
echo "║ SSH Setup with Password from .env ║"
echo "╚══════════════════════════════════════════════════════════════╝"
echo ""
# Check for password
if ! check_password; then
echo ""
info "To use this script, add to .env file:"
echo " PROXMOX_PASSWORD=your-root-password-here"
echo ""
info "Alternatively, use manual SSH key copy:"
echo " ssh-copy-id -i $SSH_KEY.pub root@192.168.11.10"
echo " ssh-copy-id -i $SSH_KEY.pub root@192.168.11.11"
echo ""
return 1
fi
# Check for SSH key
if [ ! -f "$SSH_KEY" ]; then
info "Generating SSH key..."
ssh-keygen -t ed25519 -C "sankofa-proxmox" -f "$SSH_KEY" -N "" -q
log "SSH key generated: $SSH_KEY"
else
log "SSH key exists: $SSH_KEY"
fi
# Check for sshpass
if ! command -v sshpass &> /dev/null; then
warn "sshpass not installed"
info "Install with: sudo apt-get install sshpass"
info "Or use manual key copy (will prompt for password)"
echo ""
fi
# Copy keys
copy_key_with_password "$NODE1_IP" "ML110-01"
copy_key_with_password "$NODE2_IP" "R630-01"
# Test connections
echo ""
info "Testing SSH connections..."
if ssh -i "$SSH_KEY" -o ConnectTimeout=5 -o StrictHostKeyChecking=no root@"${NODE1_IP}" 'hostname' &> /dev/null; then
log "SSH to ML110-01 works!"
else
warn "SSH to ML110-01 failed"
fi
if ssh -i "$SSH_KEY" -o ConnectTimeout=5 -o StrictHostKeyChecking=no root@"${NODE2_IP}" 'hostname' &> /dev/null; then
log "SSH to R630-01 works!"
else
warn "SSH to R630-01 failed"
fi
echo ""
}
main "$@"

220
scripts/smoke-tests.sh Executable file
View File

@@ -0,0 +1,220 @@
#!/bin/bash
# Smoke Tests for Sankofa Phoenix
# Run critical user flows to verify system health
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Configuration
API_URL="${API_URL:-https://api.sankofa.nexus}"
PORTAL_URL="${PORTAL_URL:-https://portal.sankofa.nexus}"
KEYCLOAK_URL="${KEYCLOAK_URL:-https://keycloak.sankofa.nexus}"
# Test results
PASSED=0
FAILED=0
SKIPPED=0
# Helper functions
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
test_pass() {
log_info "$1"
PASSED=$((PASSED + 1))   # assignment form: ((PASSED++)) would trip `set -e` at 0
}
test_fail() {
log_error "$1"
FAILED=$((FAILED + 1))
}
test_skip() {
log_warn "$1 (skipped)"
SKIPPED=$((SKIPPED + 1))
}
# Test functions
test_api_health() {
log_info "Testing API health endpoint..."
if curl -sf "${API_URL}/health" > /dev/null; then
test_pass "API health check"
else
test_fail "API health check"
return 1
fi
}
test_api_graphql() {
log_info "Testing GraphQL endpoint..."
RESPONSE=$(curl -sf -X POST "${API_URL}/graphql" \
-H "Content-Type: application/json" \
-d '{"query": "{ __typename }"}' || echo "ERROR")
if [[ "$RESPONSE" == *"__typename"* ]] || [[ "$RESPONSE" == *"data"* ]]; then
test_pass "GraphQL endpoint"
else
test_fail "GraphQL endpoint"
return 1
fi
}
test_portal_health() {
log_info "Testing Portal health endpoint..."
if curl -sf "${PORTAL_URL}/api/health" > /dev/null; then
test_pass "Portal health check"
else
test_fail "Portal health check"
return 1
fi
}
test_keycloak_health() {
log_info "Testing Keycloak health endpoint..."
if curl -sf "${KEYCLOAK_URL}/health" > /dev/null; then
test_pass "Keycloak health check"
else
test_fail "Keycloak health check"
return 1
fi
}
test_database_connectivity() {
log_info "Testing database connectivity..."
# This requires kubectl access
if command -v kubectl &> /dev/null; then
# Resolve DATABASE_URL inside the pod, not on the local machine
if kubectl exec -n api deployment/api -- \
sh -c 'psql "$DATABASE_URL" -c "SELECT 1"' > /dev/null 2>&1; then
test_pass "Database connectivity"
else
test_fail "Database connectivity"
return 1
fi
else
test_skip "Database connectivity (kubectl not available)"
fi
}
test_authentication() {
log_info "Testing authentication flow..."
# Test Keycloak OIDC discovery
if curl -sf "${KEYCLOAK_URL}/.well-known/openid-configuration" > /dev/null; then
test_pass "Keycloak OIDC discovery"
else
test_fail "Keycloak OIDC discovery"
return 1
fi
}
test_rate_limiting() {
log_info "Testing rate limiting..."
# Make multiple rapid requests
local count=0
for i in {1..10}; do
if curl -sf "${API_URL}/health" > /dev/null; then
count=$((count + 1))
fi
done
if [ $count -gt 0 ]; then
test_pass "Rate limiting (health endpoint accessible)"
else
test_fail "Rate limiting"
return 1
fi
}
test_cors_headers() {
log_info "Testing CORS headers..."
RESPONSE=$(curl -sf -X OPTIONS "${API_URL}/graphql" \
-H "Origin: https://portal.sankofa.nexus" \
-H "Access-Control-Request-Method: POST" \
-v 2>&1 || echo "ERROR")
if [[ "$RESPONSE" == *"access-control-allow-origin"* ]]; then
test_pass "CORS headers"
else
test_skip "CORS headers (may not be configured)"
fi
}
test_security_headers() {
log_info "Testing security headers..."
RESPONSE=$(curl -sf -I "${API_URL}/health" || echo "ERROR")
local has_csp=false
local has_hsts=false
if [[ "$RESPONSE" == *"content-security-policy"* ]] || [[ "$RESPONSE" == *"Content-Security-Policy"* ]]; then
has_csp=true
fi
if [[ "$RESPONSE" == *"strict-transport-security"* ]] || [[ "$RESPONSE" == *"Strict-Transport-Security"* ]]; then
has_hsts=true
fi
if [ "$has_csp" = true ] || [ "$has_hsts" = true ]; then
test_pass "Security headers"
else
test_skip "Security headers (may not be configured)"
fi
}
# Main execution
main() {
echo "=========================================="
echo "Sankofa Phoenix Smoke Tests"
echo "=========================================="
echo ""
echo "API URL: ${API_URL}"
echo "Portal URL: ${PORTAL_URL}"
echo "Keycloak URL: ${KEYCLOAK_URL}"
echo ""
# Run tests; `|| true` keeps `set -e` from aborting before the summary prints
test_api_health || true
test_api_graphql || true
test_portal_health || true
test_keycloak_health || true
test_database_connectivity || true
test_authentication || true
test_rate_limiting || true
test_cors_headers || true
test_security_headers || true
# Summary
echo ""
echo "=========================================="
echo "Test Summary"
echo "=========================================="
echo -e "Passed:  ${GREEN}${PASSED}${NC}"
echo -e "Failed:  ${RED}${FAILED}${NC}"
echo -e "Skipped: ${YELLOW}${SKIPPED}${NC}"
echo ""
if [ $FAILED -eq 0 ]; then
log_info "All critical tests passed!"
exit 0
else
log_error "Some tests failed. Please investigate."
exit 1
fi
}
# Run main function
main "$@"

20
scripts/ssh-proxmox-site1.sh Executable file
View File

@@ -0,0 +1,20 @@
#!/bin/bash
# SSH to Proxmox Site 1 (ml110-01)
set -e
PROXMOX_PASSWORD="${PROXMOX_PASSWORD:-}"
IP="192.168.11.10"
NAME="ml110-01"
echo "Connecting to $NAME at $IP..."
# Use sshpass for a non-interactive login when a password is provided via the environment
if [ -n "$PROXMOX_PASSWORD" ] && command -v sshpass &> /dev/null; then
sshpass -p "$PROXMOX_PASSWORD" ssh -o StrictHostKeyChecking=no root@$IP
else
echo "PROXMOX_PASSWORD not set or sshpass not installed; you'll be prompted for the password."
ssh -o StrictHostKeyChecking=no root@$IP
fi

20
scripts/ssh-proxmox-site2.sh Executable file
View File

@@ -0,0 +1,20 @@
#!/bin/bash
# SSH to Proxmox Site 2 (r630-01)
set -e
PROXMOX_PASSWORD="${PROXMOX_PASSWORD:-}"
IP="192.168.11.11"
NAME="r630-01"
echo "Connecting to $NAME at $IP..."
# Use sshpass for a non-interactive login when a password is provided via the environment
if [ -n "$PROXMOX_PASSWORD" ] && command -v sshpass &> /dev/null; then
sshpass -p "$PROXMOX_PASSWORD" ssh -o StrictHostKeyChecking=no root@$IP
else
echo "PROXMOX_PASSWORD not set or sshpass not installed; you'll be prompted for the password."
ssh -o StrictHostKeyChecking=no root@$IP
fi

59
scripts/ssh-proxmox-sites.sh Executable file
View File

@@ -0,0 +1,59 @@
#!/bin/bash
# SSH to Proxmox Sites
set -e
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
echo -e "${BLUE}=== Proxmox Site SSH Connections ===${NC}"
echo ""
echo "Site 1 (ml110-01): 192.168.11.10"
echo "Site 2 (r630-01): 192.168.11.11"
echo ""
# Function to connect to a site
connect_site() {
local IP=$1
local NAME=$2
local SITE=$3
echo -e "${YELLOW}Connecting to $NAME ($SITE) at $IP...${NC}"
echo ""
ssh -o StrictHostKeyChecking=no root@$IP
}
# Menu
echo "Select which Proxmox server to connect to:"
echo " 1) Site 1 - ml110-01 (192.168.11.10)"
echo " 2) Site 2 - r630-01 (192.168.11.11)"
echo " 3) Both (test connectivity)"
echo ""
read -p "Enter choice [1-3]: " choice
case $choice in
1)
connect_site "192.168.11.10" "ml110-01" "Site 1"
;;
2)
connect_site "192.168.11.11" "r630-01" "Site 2"
;;
3)
echo -e "${YELLOW}Testing connectivity to both sites...${NC}"
echo ""
echo "Site 1 (192.168.11.10):"
ssh -o ConnectTimeout=3 -o StrictHostKeyChecking=no root@192.168.11.10 "hostname && pveversion" 2>&1 | head -5 || echo " Not reachable or requires authentication"
echo ""
echo "Site 2 (192.168.11.11):"
ssh -o ConnectTimeout=3 -o StrictHostKeyChecking=no root@192.168.11.11 "hostname && pveversion" 2>&1 | head -5 || echo " Not reachable or requires authentication"
;;
*)
echo "Invalid choice"
exit 1
;;
esac

View File

@@ -0,0 +1,62 @@
#!/bin/bash
# SSH to Proxmox with password (using sshpass if available)
set -e
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
RED='\033[0;31m'
NC='\033[0m' # No Color
PROXMOX_PASSWORD="${PROXMOX_PASSWORD:-}"
if [ -z "$PROXMOX_PASSWORD" ]; then
echo -e "${RED}PROXMOX_PASSWORD is not set. Export it (or add it to .env) before running.${NC}"
exit 1
fi
# Check if sshpass is installed
if ! command -v sshpass &> /dev/null; then
echo -e "${YELLOW}sshpass not installed. Installing...${NC}"
if command -v apt-get &> /dev/null; then
sudo apt-get update && sudo apt-get install -y sshpass
elif command -v yum &> /dev/null; then
sudo yum install -y sshpass
else
echo -e "${RED}Could not install sshpass. Please install manually or use regular SSH.${NC}"
echo ""
echo "To connect manually:"
echo " ssh root@192.168.11.10 # Site 1"
echo " ssh root@192.168.11.11 # Site 2"
echo ""
exit 1
fi
fi
echo -e "${BLUE}=== Proxmox SSH Connection ===${NC}"
echo ""
echo "Select which Proxmox server to connect to:"
echo " 1) Site 1 - ml110-01 (192.168.11.10)"
echo " 2) Site 2 - r630-01 (192.168.11.11)"
echo ""
read -p "Enter choice [1-2]: " choice
case $choice in
1)
IP="192.168.11.10"
NAME="ml110-01"
;;
2)
IP="192.168.11.11"
NAME="r630-01"
;;
*)
echo "Invalid choice"
exit 1
;;
esac
echo -e "${YELLOW}Connecting to $NAME at $IP...${NC}"
echo ""
# Use sshpass to provide password
sshpass -p "$PROXMOX_PASSWORD" ssh -o StrictHostKeyChecking=no root@$IP

85
scripts/ssh-proxmox.sh Executable file
View File

@@ -0,0 +1,85 @@
#!/bin/bash
# SSH to Proxmox Server
set -e
# Default values
USERNAME="root"
PORT="22"
IP_ADDRESS=""
# Colors
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Function to display usage
usage() {
echo "Usage: $0 <IP_ADDRESS> [OPTIONS]"
echo ""
echo "Options:"
echo " -u, --user USERNAME SSH username (default: root)"
echo " -p, --port PORT SSH port (default: 22)"
echo " -h, --help Show this help message"
echo ""
echo "Examples:"
echo " $0 192.168.11.10"
echo " $0 192.168.11.10 -u root -p 22"
echo " $0 10.1.0.10"
echo ""
echo "Known Proxmox IPs from config:"
echo " • Site 1: 192.168.11.10, 10.1.0.10"
echo " • Site 2: 192.168.11.11, 10.2.0.10"
exit 1
}
# Parse arguments
if [ $# -eq 0 ]; then
usage
fi
IP_ADDRESS=$1
shift
while [[ $# -gt 0 ]]; do
case $1 in
-u|--user)
USERNAME="$2"
shift 2
;;
-p|--port)
PORT="$2"
shift 2
;;
-h|--help)
usage
;;
*)
echo -e "${RED}Unknown option: $1${NC}"
usage
;;
esac
done
# Validate IP address
if [ -z "$IP_ADDRESS" ]; then
echo -e "${RED}Error: IP address is required${NC}"
usage
fi
# Test connectivity
echo -e "${YELLOW}Testing connectivity to $IP_ADDRESS...${NC}"
if ping -c 1 -W 2 "$IP_ADDRESS" &>/dev/null; then
echo -e "${GREEN}✓ Host is reachable${NC}"
else
echo -e "${YELLOW}⚠ Host may not be reachable (ping failed, but continuing...)${NC}"
fi
# SSH connection
echo -e "${YELLOW}Connecting to Proxmox at $IP_ADDRESS...${NC}"
echo -e "${GREEN}SSH Command: ssh -p $PORT $USERNAME@$IP_ADDRESS${NC}"
echo ""
ssh -p "$PORT" "$USERNAME@$IP_ADDRESS"
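The ping pre-flight above can give a false negative when ICMP is filtered but SSH is open. A minimal alternative sketch (assuming bash, since it relies on the bash-only `/dev/tcp` feature) probes the SSH port directly; the function name `check_ssh_port` is illustrative, not part of the script:

```shell
# Probe a TCP port directly instead of pinging; firewalls often drop ICMP
# while still allowing SSH. Prints "open" or "closed".
check_ssh_port() {
  local ip=$1 port=${2:-22}
  # /dev/tcp/<host>/<port> is a bash redirection target, not a real file;
  # the subshell fails fast if the connection is refused or times out.
  if timeout 2 bash -c ">/dev/tcp/${ip}/${port}" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

check_ssh_port 127.0.0.1 1   # port 1 is almost never listening locally
```

This could replace or supplement the `ping -c 1 -W 2` check without changing the script's behavior when the host is reachable.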

scripts/start-smom-vms.sh (new executable file, 168 lines)
#!/bin/bash
# start-smom-vms.sh
# Start all SMOM-DBIS-138 VMs via Proxmox API
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
BLUE='\033[0;34m'
NC='\033[0m'
log() {
echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*"
}
log_success() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] ✅${NC} $*"
}
log_warning() {
echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] ⚠️${NC} $*"
}
log_error() {
echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ❌${NC} $*"
}
# Get VM details
get_vm_details() {
local vm_name=$1
local vm_id
local node
local state
vm_id=$(kubectl get proxmoxvm "${vm_name}" -n default -o jsonpath='{.status.vmId}' 2>/dev/null || echo "")
node=$(kubectl get proxmoxvm "${vm_name}" -n default -o jsonpath='{.spec.forProvider.node}' 2>/dev/null || echo "")
state=$(kubectl get proxmoxvm "${vm_name}" -n default -o jsonpath='{.status.state}' 2>/dev/null || echo "")
if [ -z "${vm_id}" ] || [ "${vm_id}" = "0" ]; then
echo ""
return 1
fi
echo "${vm_id}|${node}|${state}"
}
# Start VM via Proxmox API (requires direct Proxmox access)
start_vm_direct() {
local vm_id=$1
local node=$2
local proxmox_host=$3
log "Starting VM ${vm_id} on ${node} via ${proxmox_host}..."
# This would require Proxmox API credentials
# For now, we'll provide instructions
log_warning "Direct Proxmox API access required"
log "To start VM manually:"
log " ssh root@${proxmox_host} 'qm start ${vm_id}'"
log " Or use Proxmox web UI: https://${proxmox_host}:8006"
log ""
}
main() {
log "=========================================="
log "SMOM-DBIS-138 VM Startup Guide"
log "=========================================="
log ""
log "VMs are created but in 'stopped' state."
log "They need to be started to receive IP addresses."
log ""
# Check VM status
local vms=(
"nginx-proxy-vm"
"cloudflare-tunnel-vm"
"smom-validator-01"
"smom-validator-02"
"smom-validator-03"
"smom-validator-04"
"smom-sentry-01"
"smom-sentry-02"
"smom-sentry-03"
"smom-sentry-04"
"smom-rpc-node-01"
"smom-rpc-node-02"
"smom-rpc-node-03"
"smom-rpc-node-04"
"smom-services"
"smom-blockscout"
"smom-monitoring"
"smom-management"
)
log "VM Status:"
log "----------"
local site1_vms=()
local site2_vms=()
for vm in "${vms[@]}"; do
local details
details=$(get_vm_details "${vm}" 2>/dev/null || echo "")
if [ -n "${details}" ]; then
IFS='|' read -r vm_id node state <<< "${details}"
if [ "${node}" = "ml110-01" ]; then
site1_vms+=("${vm_id}")
log " ${vm}: VMID=${vm_id}, Node=${node}, State=${state}"
elif [ "${node}" = "r630-01" ]; then
site2_vms+=("${vm_id}")
log " ${vm}: VMID=${vm_id}, Node=${node}, State=${state}"
fi
else
log_warning " ${vm}: Not found or no VMID"
fi
done
log ""
log "=========================================="
log "Start VMs via Proxmox"
log "=========================================="
log ""
log "Option 1: Via Proxmox Web UI"
log " 1. Open: https://192.168.11.10:8006 (Site 1)"
log " 2. Navigate to each VM"
log " 3. Click 'Start'"
log ""
log "Option 2: Via SSH/Command Line"
log ""
log "Site 1 (ml110-01) - 192.168.11.10:"
for vm_id in "${site1_vms[@]}"; do
log " ssh root@192.168.11.10 'qm start ${vm_id}'"
done
log ""
log "Site 2 (r630-01) - 192.168.11.11:"
for vm_id in "${site2_vms[@]}"; do
log " ssh root@192.168.11.11 'qm start ${vm_id}'"
done
log ""
log "Option 3: Wait for Auto-Start"
log " The Crossplane controller may start VMs automatically."
log " Monitor with: kubectl get proxmoxvm -A -w"
log ""
log "=========================================="
log "After Starting VMs"
log "=========================================="
log ""
log "1. Wait for VMs to boot (2-5 minutes)"
log "2. Check IP addresses:"
log " ./scripts/get-smom-vm-ips.sh"
log ""
log "3. Verify deployment:"
log " ./scripts/verify-deployment.sh"
log ""
log "4. Proceed with configuration:"
log " See: docs/smom-dbis-138-next-steps.md"
log ""
}
main "$@"
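The per-VM `ssh root@<host> 'qm start <id>'` lines printed above can be collapsed into one SSH session per site. A sketch, using the Site 1 host IP from the script and an illustrative subset of VMIDs:

```shell
# Build a single "qm start" command string per site so only one SSH
# connection is needed, instead of one per VM.
site1_host="192.168.11.10"
vmids=(118 132 133)   # illustrative subset of Site 1 VMIDs
cmd=""
for id in "${vmids[@]}"; do
  cmd+="qm start ${id}; "
done
# ${cmd% } strips the trailing space before printing the combined command.
echo "ssh root@${site1_host} '${cmd% }'"
```

Running the printed command starts all listed VMs in one connection; `qm start` on an already-running VM exits non-zero but is otherwise harmless.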

scripts/start-vms-simple.sh (new executable file, 131 lines)
#!/bin/bash
# start-vms-simple.sh
# Start VMs using Proxmox API with simple curl commands
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
# Load environment
if [ -f "${PROJECT_ROOT}/.env" ]; then
set -a
source "${PROJECT_ROOT}/.env"
set +a
fi
# Require the password from the environment or .env; never hardcode credentials.
PROXMOX_PASS="${PROXMOX_ROOT_PASS:?Set PROXMOX_ROOT_PASS in the environment or .env}"
PROXMOX_1_URL="https://192.168.11.10:8006"
PROXMOX_2_URL="https://192.168.11.11:8006"
# Colors
GREEN='\033[0;32m'
BLUE='\033[0;34m'
RED='\033[0;31m'
NC='\033[0m'
log() {
echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*"
}
log_success() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] ✅${NC} $*"
}
log_error() {
echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ❌${NC} $*"
}
# Start VM using API
start_vm_api() {
local api_url=$1
local node=$2
local vmid=$3
local password=$4
# Get ticket
local ticket_response
ticket_response=$(curl -k -s -X POST \
-d "username=root@pam&password=${password}" \
"${api_url}/api2/json/access/ticket" 2>/dev/null)
if echo "${ticket_response}" | grep -q "authentication failure"; then
return 1
fi
local ticket csrf
if command -v jq &> /dev/null; then
ticket=$(echo "${ticket_response}" | jq -r '.data.ticket // empty' 2>/dev/null)
csrf=$(echo "${ticket_response}" | jq -r '.data.CSRFPreventionToken // empty' 2>/dev/null)
else
ticket=$(echo "${ticket_response}" | grep -o '"ticket":"[^"]*' | head -1 | cut -d'"' -f4)
csrf=$(echo "${ticket_response}" | grep -o '"CSRFPreventionToken":"[^"]*' | head -1 | cut -d'"' -f4)
fi
if [ -z "${ticket}" ] || [ -z "${csrf}" ]; then
return 1
fi
# Start VM
local start_response
start_response=$(curl -k -s -X POST \
-H "CSRFPreventionToken: ${csrf}" \
-b "PVEAuthCookie=${ticket}" \
"${api_url}/api2/json/nodes/${node}/qemu/${vmid}/status/start" 2>/dev/null)
# On success the start endpoint returns a task UPID in "data";
# "data":null indicates the request failed.
if echo "${start_response}" | grep -q '"data":"UPID'; then
return 0
fi
return 1
}
main() {
log "Starting SMOM-DBIS-138 VMs..."
log ""
# Site 1 VMs
log "Site 1 (ml110-01):"
local site1_vms=("118:nginx-proxy-vm" "132:smom-validator-01" "133:smom-validator-02" "127:smom-sentry-01" "128:smom-sentry-02" "123:smom-rpc-node-01" "124:smom-rpc-node-02" "121:smom-management")
for vm_entry in "${site1_vms[@]}"; do
IFS=':' read -r vmid vmname <<< "${vm_entry}"
log " Starting ${vmname} (${vmid})..."
if start_vm_api "${PROXMOX_1_URL}" "ml110-01" "${vmid}" "${PROXMOX_PASS}"; then
log_success " ${vmname} started"
else
log_error " Failed to start ${vmname}"
fi
sleep 1
done
log ""
# Site 2 VMs
log "Site 2 (r630-01):"
local site2_vms=("119:cloudflare-tunnel-vm" "134:smom-validator-03" "122:smom-validator-04" "129:smom-sentry-03" "130:smom-sentry-04" "125:smom-rpc-node-03" "126:smom-rpc-node-04" "131:smom-services" "120:smom-blockscout" "122:smom-monitoring")
for vm_entry in "${site2_vms[@]}"; do
IFS=':' read -r vmid vmname <<< "${vm_entry}"
log " Starting ${vmname} (${vmid})..."
if start_vm_api "${PROXMOX_2_URL}" "r630-01" "${vmid}" "${PROXMOX_PASS}"; then
log_success " ${vmname} started"
else
log_error " Failed to start ${vmname}"
fi
sleep 1
done
log ""
log_success "VM startup initiated!"
log ""
log "Waiting 30 seconds for VMs to initialize..."
sleep 30
log "Checking VM status..."
kubectl get proxmoxvm -A -o custom-columns=NAME:.metadata.name,VMID:.status.vmId,STATE:.status.state --sort-by=.metadata.name | head -20
log ""
}
main "$@"
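The grep/cut fallback in `start_vm_api` (used when `jq` is absent) can be exercised offline. The response below only mimics the shape of Proxmox's `/access/ticket` reply; the token values are fabricated:

```shell
# Fake /access/ticket response with the same JSON shape Proxmox returns.
resp='{"data":{"ticket":"PVE:root@pam:ABCD","CSRFPreventionToken":"1234:XYZ"}}'

# Same extraction as the script: grab up to the closing quote, then take the
# 4th "-delimited field, which is the token value itself.
ticket=$(echo "${resp}" | grep -o '"ticket":"[^"]*' | head -1 | cut -d'"' -f4)
csrf=$(echo "${resp}" | grep -o '"CSRFPreventionToken":"[^"]*' | head -1 | cut -d'"' -f4)

echo "ticket=${ticket} csrf=${csrf}"
# → ticket=PVE:root@pam:ABCD csrf=1234:XYZ
```

This confirms the fallback tolerates colons inside token values, which Proxmox tickets always contain.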
