Migration Guide: Hard-coded IPs → Guest Agent Discovery

Date: 2025-11-27
Purpose: Guide for updating remaining scripts to use guest-agent IP discovery

Quick Reference

Before

VMS=(
    "100 cloudflare-tunnel 192.168.1.60"
    "101 k3s-master 192.168.1.188"
)

read -r vmid name ip <<< "$vm_spec"
ssh "${VM_USER}@${ip}" ...

After

source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"

VMS=(
    "100 cloudflare-tunnel"
    "101 k3s-master"
)

read -r vmid name <<< "$vm_spec"
ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
[[ -z "$ip" ]] && continue
ssh "${VM_USER}@${ip}" ...

Step-by-Step Migration

Step 1: Add Helper Library

At the top of your script (after loading .env):

# Import helper library
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
    source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
else
    log_error "Helper library not found. Run this script on Proxmox host or via SSH."
    exit 1
fi
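
This guide does not show the contents of proxmox_vm_helpers.sh. As a rough illustration of how a guest-agent lookup can work, the sketch below assumes the lookup wraps `qm guest cmd <vmid> network-get-interfaces` and filters the JSON with jq. The function names `first_ipv4_from_qga_json` and `sketch_get_vm_ip` are hypothetical, and the real helpers may behave differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; NOT the actual contents of proxmox_vm_helpers.sh.
# Assumes it runs on the Proxmox host, where `qm` and `jq` are available.

# Read guest-agent JSON on stdin and print the first non-loopback IPv4 address.
# Accepts either a bare interface array or an object wrapping it in "result".
first_ipv4_from_qga_json() {
    jq -r '
        (if type == "object" then .result else . end)
        | [ .[]
            | select(.name != "lo")
            | .["ip-addresses"][]?
            | select(.["ip-address-type"] == "ipv4")
            | .["ip-address"]
            | select(startswith("127.") | not) ]
        | first // empty
    '
}

# Print the IP of VM $1, or print nothing and return 1 if none is reported.
sketch_get_vm_ip() {
    local vmid="$1" json ip
    json="$(qm guest cmd "$vmid" network-get-interfaces 2>/dev/null)" || return 1
    ip="$(first_ipv4_from_qga_json <<< "$json")" || return 1
    [[ -n "$ip" ]] || return 1
    printf '%s\n' "$ip"
}
```

Because the function returns non-zero when no address is found, callers running under `set -e` need the `|| true` guard used throughout this guide.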

Step 2: Update VM Array

Remove IPs, keep only VMID and NAME:

# Before
VMS=(
    "100 cloudflare-tunnel 192.168.1.60"
)

# After
VMS=(
    "100 cloudflare-tunnel"
)

Step 3: Update Loop Logic

# Before
for vm_spec in "${VMS[@]}"; do
    read -r vmid name ip <<< "$vm_spec"
    ssh "${VM_USER}@${ip}" ...
done

# After
for vm_spec in "${VMS[@]}"; do
    read -r vmid name <<< "$vm_spec"
    
    # Ensure guest agent is enabled
    ensure_guest_agent_enabled "$vmid" || true
    
    # Get IP from guest agent
    ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
    if [[ -z "$ip" ]]; then
        log_warn "Skipping VM $vmid ($name): no IP from guest agent"
        continue
    fi
    
    ssh "${VM_USER}@${ip}" ...
done
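
What `ensure_guest_agent_enabled` does internally is not documented here. One plausible shape, assuming it inspects the `agent:` line of `qm config` and calls `qm set --agent enabled=1` when needed, is sketched below. The names `agent_enabled_in_config` and `sketch_ensure_guest_agent_enabled` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an "ensure enabled" check; the real
# ensure_guest_agent_enabled in proxmox_vm_helpers.sh may differ.

# Return 0 if the `agent:` line of the VM config shows the agent enabled.
# Handles forms such as "agent: 1" and "agent: enabled=1,fstrim_cloned_disks=1".
agent_enabled_in_config() {
    local vmid="$1" line
    local re='^agent:[[:space:]]*(1|enabled=1)([,[:space:]]|$)'
    line="$(qm config "$vmid" 2>/dev/null | grep -E '^agent:')" || return 1
    [[ "$line" =~ $re ]]
}

# Enable the agent option only if it is not already set.
sketch_ensure_guest_agent_enabled() {
    local vmid="$1"
    agent_enabled_in_config "$vmid" && return 0
    qm set "$vmid" --agent enabled=1 >/dev/null
}
```

Enabling the option in the VM config is only half of the setup: the qemu-guest-agent package must also be running inside the guest, and a newly enabled agent option typically takes effect only after the VM has been fully stopped and started.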

Step 4: For Bootstrap Scripts (QGA Installation)

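Bootstrap scripts install the guest agent, so discovery may not work yet when they first run. Give them fallback IPs: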

# Fallback IPs for bootstrap
declare -A FALLBACK_IPS=(
  ["100"]="192.168.1.60"
  ["101"]="192.168.1.188"
)

for vm_spec in "${VMS[@]}"; do
    read -r vmid name <<< "$vm_spec"
    
    # Try guest agent first, fallback to hardcoded
    ip="$(get_vm_ip_or_fallback "$vmid" "$name" "${FALLBACK_IPS[$vmid]:-}" || true)"
    [[ -z "$ip" ]] && continue
    
    # Install QGA using discovered/fallback IP
    ssh "${VM_USER}@${ip}" "sudo apt-get install -y qemu-guest-agent"
done
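
The decision behind a fallback lookup can be sketched as follows. This is an illustration only: `discover_ip` is a placeholder standing in for a guest-agent query, and `sketch_ip_or_fallback` is a hypothetical name, not the library's `get_vm_ip_or_fallback`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the "discover, else fall back" decision.
# discover_ip is a placeholder for a guest-agent lookup; it is an assumption,
# not a function from proxmox_vm_helpers.sh.
discover_ip() {
    return 1   # placeholder: pretend the guest agent is not answering
}

# Print the discovered IP if available, otherwise the fallback.
# Returns 1 (printing nothing on stdout) when neither is available.
sketch_ip_or_fallback() {
    local vmid="$1" name="$2" fallback="${3:-}" ip
    if ip="$(discover_ip "$vmid")" && [[ -n "$ip" ]]; then
        printf '%s\n' "$ip"
    elif [[ -n "$fallback" ]]; then
        echo "WARN: $name ($vmid): guest agent gave no IP, using fallback $fallback" >&2
        printf '%s\n' "$fallback"
    else
        echo "WARN: $name ($vmid): no IP from guest agent and no fallback" >&2
        return 1
    fi
}
```

Sending warnings to stderr keeps stdout clean, so the caller can capture the address with `ip="$(...)"` as in the loop above.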

Scripts Already Updated

scripts/deploy/configure-vm-services.sh
scripts/deploy/add-ssh-keys-to-vms.sh
scripts/deploy/verify-cloud-init.sh
scripts/infrastructure/install-qemu-guest-agent.sh
scripts/fix/fix-vm-ssh-via-console.sh
scripts/ops/ssh-test-all.sh (example)

Scripts Needing Update

📋 High Priority:

  • scripts/troubleshooting/diagnose-vm-issues.sh
  • scripts/troubleshooting/test-all-access-paths.sh
  • scripts/deploy/deploy-vms-via-api.sh (IPs are still needed at VM creation; use discovery afterwards)

📋 Medium Priority:

  • scripts/vm-management/**/*.sh (many scripts)
  • scripts/infrastructure/**/*.sh (various)

📋 Low Priority:

  • Documentation scripts
  • One-time setup scripts
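
To find scripts that still embed addresses, a simple search can help. The sketch below is one way to do it; the function name `find_hardcoded_ips` is hypothetical and the default directory `scripts` is an assumption based on the paths listed above.

```shell
#!/usr/bin/env bash
# List lines in shell scripts under a directory that contain IPv4-looking
# literals. The pattern is deliberately loose (it also matches four-part
# version strings), so treat the output as candidates to review.
find_hardcoded_ips() {
    local dir="${1:-scripts}"
    find "$dir" -type f -name '*.sh' \
        -exec grep -HnE '([0-9]{1,3}\.){3}[0-9]{1,3}' {} +
}
```

Running `find_hardcoded_ips scripts/troubleshooting` would print `file:line:text` for each candidate. Fallback tables in bootstrap scripts will show up as well, which is expected.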

Testing

After updating a script:

  1. Ensure jq is installed on the Proxmox host:

    ssh root@192.168.1.206 "apt update && apt install -y jq"
    
  2. Ensure the QEMU Guest Agent is installed in the VMs:

    ./scripts/infrastructure/install-qemu-guest-agent.sh
    
  3. Test the script:

    ./scripts/your-updated-script.sh
    
  4. Verify IP discovery:

    • The script discovers IPs automatically
    • No hard-coded IPs appear in the output
    • A VM whose guest agent is unavailable is skipped with a warning instead of aborting the run
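
As an extra sanity check during testing, the discovered value can be validated before it is passed to ssh. The sketch below assumes the helpers print a plain IPv4 address; `is_ipv4` is a hypothetical name and not part of the helper library.

```shell
#!/usr/bin/env bash
# Return 0 if $1 is a dotted-quad IPv4 address with every octet in 0..255.
is_ipv4() {
    local ip="$1" octet
    local re='^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$'
    [[ "$ip" =~ $re ]] || return 1
    for octet in "${BASH_REMATCH[@]:1}"; do
        # 10# forces base 10 so an octet like "08" is not parsed as octal.
        (( 10#$octet <= 255 )) || return 1
    done
}
```

In a loop this could be used as `is_ipv4 "$ip" || continue`, which also guards against an error message being captured in place of an address.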

Common Patterns

Pattern 1: Simple SSH Loop

for vm_spec in "${VMS[@]}"; do
    read -r vmid name <<< "$vm_spec"
    ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
    [[ -z "$ip" ]] && continue
    ssh "${VM_USER}@${ip}" "command"
done

Pattern 2: Collect IPs First

declare -A VM_IPS
for vm_spec in "${VMS[@]}"; do
    read -r vmid name <<< "$vm_spec"
    ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
    [[ -n "$ip" ]] && VM_IPS["$vmid"]="$ip"
done

# Use collected IPs
if [[ -n "${VM_IPS[100]:-}" ]]; then
    do_something "${VM_IPS[100]}"
fi
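
Once the IPs are collected, a short summary makes it easy to see what was discovered. The sketch below is one way to print it; `print_vm_ips` is a hypothetical helper and it assumes bash 4.3 or newer for the nameref.

```shell
#!/usr/bin/env bash
# Print "vmid<TAB>ip" for every entry of an associative array, sorted by VMID.
# $1 is the NAME of the array (passed by reference via a nameref).
print_vm_ips() {
    local -n ips_ref="$1"
    local vmid
    for vmid in $(printf '%s\n' "${!ips_ref[@]}" | sort -n); do
        printf '%s\t%s\n' "$vmid" "${ips_ref[$vmid]}"
    done
}
```

Called as `print_vm_ips VM_IPS` after the collection loop, it lists only the VMs for which an IP was found.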

Pattern 3: Bootstrap with Fallback

declare -A FALLBACK_IPS=(
  ["100"]="192.168.1.60"
)

for vm_spec in "${VMS[@]}"; do
    read -r vmid name <<< "$vm_spec"
    ip="$(get_vm_ip_or_fallback "$vmid" "$name" "${FALLBACK_IPS[$vmid]:-}" || true)"
    [[ -z "$ip" ]] && continue
    # Use IP for bootstrap
done

Benefits After Migration

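  1. No IP addresses to maintain in scripts
  2. Works with DHCP and other dynamically assigned IPs
  3. Single source of truth (guest agent)
  4. Adding a VM only requires its VMID and name
  5. VMs without a discoverable IP are skipped with a warning instead of breaking the run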

Next: Update remaining scripts following this pattern. Start with high-priority scripts.