QEMU Guest Agent: Complete Setup and Verification Procedure

Last Updated: 2025-12-11
Status: Complete and Verified


Overview

This document describes how to configure and verify the QEMU Guest Agent on every VM in the Sankofa Phoenix infrastructure. The guest agent is critical for:

  • Graceful VM shutdown/restart
  • VM lock prevention
  • Guest OS command execution
  • IP address detection
  • Resource monitoring

Architecture

Two-Level Configuration

  1. Proxmox Level (agent: 1 in VM config)

    • Configured by Crossplane provider automatically
    • Enables guest agent communication channel
  2. Guest OS Level (package + service)

    • qemu-guest-agent package installed
    • qemu-guest-agent service running
    • Configured via cloud-init in all templates

Automatic Configuration

Crossplane Provider (Automatic)

The Crossplane provider automatically sets agent: 1 during:

  • VM Creation (pkg/proxmox/client.go:317)
  • VM Cloning (pkg/proxmox/client.go:242)
  • VM Updates (pkg/proxmox/client.go:671)

No manual intervention required - this is handled by the provider.

Cloud-Init Templates (Automatic)

All VM templates include enhanced guest agent configuration:

  1. Package Installation: qemu-guest-agent in packages list
  2. Service Enablement: systemctl enable qemu-guest-agent
  3. Service Start: systemctl start qemu-guest-agent
  4. Verification: Automatic retry logic with status checks
  5. Error Handling: Automatic installation if package missing

Templates Updated:

  • examples/production/basic-vm.yaml
  • examples/production/medium-vm.yaml
  • examples/production/large-vm.yaml
  • crossplane-provider-proxmox/examples/vm-example.yaml
  • gitops/infrastructure/claims/vm-claim-example.yaml
  • All 29 production VM templates (via enhancement script)

Verification Procedures

1. Check Proxmox Configuration

On Proxmox Node:

# Check if guest agent is enabled in VM config
qm config <VMID> | grep agent

# Expected output:
# agent: 1

If not enabled (the new setting takes effect only after the VM has been fully stopped and started again):

qm set <VMID> --agent 1
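
A plain grep for "agent" matches any line that contains the word, and the option can also appear as a property string such as agent: enabled=1,fstrim_cloned_disks=1. The following is a minimal sketch of a stricter check; the function name agent_enabled is hypothetical and not part of the repository scripts.

```shell
# agent_enabled: read "qm config" output on stdin and succeed (exit 0)
# only if the agent option is switched on.
# Accepts "agent: 1", "agent: 1,fstrim_cloned_disks=1" and "agent: enabled=1,...".
agent_enabled() {
    awk '
        /^agent:/ {
            value = $0
            sub(/^agent:[ \t]*/, "", value)      # strip the key, keep the property string
            n = split(value, opts, ",")          # one array entry per sub-option
            for (i = 1; i <= n; i++)
                if (opts[i] == "1" || opts[i] == "enabled=1") found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Possible usage on a Proxmox node: qm config <VMID> | agent_enabled && echo "agent enabled"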

2. Check Guest OS Package

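On Proxmox Node (requires working guest agent):

# Check if package is installed (dpkg runs inside the guest; a pipe after
# "qm guest exec" would run on the Proxmox node instead)
qm guest exec <VMID> -- dpkg -l qemu-guest-agent

# Expected: JSON with "exitcode" : 0 and an "out-data" field containing a line like:
# ii  qemu-guest-agent  <version>  amd64  Guest communication agent for QEMU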

If not installed (via console/SSH):

apt-get update
apt-get install -y qemu-guest-agent
systemctl enable qemu-guest-agent
systemctl start qemu-guest-agent

3. Check Guest OS Service

On Proxmox Node:

# Check service status
qm guest exec <VMID> -- systemctl status qemu-guest-agent

# Expected output (wrapped in the JSON "out-data" field):
# ● qemu-guest-agent.service - QEMU Guest Agent
#    Loaded: loaded (...)
#    Active: active (running) since ...

If not running (via console/SSH, because qm guest exec itself needs a running agent):

systemctl enable qemu-guest-agent
systemctl start qemu-guest-agent
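
The qm guest exec checks in steps 2 and 3 print a JSON document rather than the raw command output, so the guest command's result has to be read from that JSON. The sketch below extracts the exit code with sed only. It assumes the field is named "exitcode", as current Proxmox VE releases print it; the function name guest_exec_exitcode is hypothetical.

```shell
# guest_exec_exitcode: read the JSON printed by "qm guest exec" on stdin
# and print the numeric "exitcode" field of the guest command.
guest_exec_exitcode() {
    sed -n 's/.*"exitcode"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

Possible usage: qm guest exec <VMID> -- dpkg -s qemu-guest-agent | guest_exec_exitcode prints 0 when the package is installed. If jq is available on the node, jq .exitcode is a sturdier alternative to the sed pattern.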

4. Comprehensive Check Script

Use the automated check script:

# On Proxmox node
/usr/local/bin/complete-vm-100-guest-agent-check.sh

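# Or for any VM (assumes the script reads VMID from its environment;
# the variable must be set on the same line or exported to reach the script):
VMID=100 /usr/local/bin/complete-vm-100-guest-agent-check.sh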

Script checks:

  • VM exists and is running
  • Proxmox guest agent config (agent: 1)
  • Package installation
  • Service status
  • Provides clear error messages
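
The checks listed above can be sketched as a single shell function. This is an illustration under stated assumptions, not the installed script: the name check_guest_agent and its messages are hypothetical, and it treats a successful agent ping as evidence that the package is installed and the service is running.

```shell
# check_guest_agent VMID: hypothetical sketch, not the installed script.
# Prints one line per check and returns non-zero at the first failure.
check_guest_agent() {
    vmid=$1

    # 1. VM exists and is running ("qm status" prints e.g. "status: running")
    status=$(qm status "$vmid" 2>/dev/null) || { echo "FAIL: VM $vmid not found"; return 1; }
    case $status in
        *running*) echo "OK: VM $vmid is running" ;;
        *) echo "FAIL: VM $vmid is not running ($status)"; return 1 ;;
    esac

    # 2. Proxmox-level option
    if qm config "$vmid" | grep -Eq '^agent: *(1|enabled=1)'; then
        echo "OK: agent enabled in Proxmox config"
    else
        echo "FAIL: agent not enabled; run: qm set $vmid --agent 1"
        return 1
    fi

    # 3. Guest level: the agent can only answer if package and service are in place
    if qm guest cmd "$vmid" ping >/dev/null 2>&1; then
        echo "OK: guest agent responds"
    else
        echo "FAIL: guest agent not responding; check package and service inside the guest"
        return 1
    fi
}
```

Possible usage on a Proxmox node: check_guest_agent 100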

Troubleshooting

Issue: "No QEMU guest agent configured"

Symptoms:

  • qm guest exec commands fail
  • Proxmox shows "No Guest Agent" in UI

Causes:

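  1. Guest agent not enabled in Proxmox config (the direct cause of this exact message)
  2. Package not installed in guest OS (usually reported as "QEMU guest agent is not running")
  3. Service not running in guest OS (reported the same way as cause 2)
  4. VM not fully stopped and started since agent: 1 was set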

Solutions:

  1. Enable in Proxmox:

    qm set <VMID> --agent 1
    
  2. Install in Guest OS:

    # Via console or SSH
    apt-get update
    apt-get install -y qemu-guest-agent
    systemctl enable qemu-guest-agent
    systemctl start qemu-guest-agent
    
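  3. Restart VM (a changed agent setting needs a full stop and start; a reboot from inside the guest is not enough):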

    qm shutdown <VMID>  # Graceful (requires working agent)
    # OR
    qm stop <VMID>      # Force stop
    qm start <VMID>
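
Which of the solutions above applies can often be read from the error text itself. The sketch below maps the messages that Proxmox VE prints for agent calls to the level that most likely needs fixing. The message strings are quoted as current releases emit them and may differ between versions; the function name and its labels are hypothetical.

```shell
# classify_agent_error MESSAGE: map a Proxmox error message to the
# configuration level that most likely needs attention.
classify_agent_error() {
    case $1 in
        *"No QEMU guest agent configured"*) echo "proxmox-config" ;;   # agent: 1 missing
        *"QEMU guest agent is not running"*) echo "guest-os" ;;        # package or service
        *"is not running"*) echo "vm-stopped" ;;                       # e.g. "VM 100 is not running"
        *) echo "unknown" ;;
    esac
}
```

Possible usage: classify_agent_error "$(qm guest cmd <VMID> ping 2>&1)"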
    

Issue: VM Lock Issues

Symptoms:

  • qm commands fail with lock errors
  • VM appears stuck

Solution:

# Check for locks
ls -la /var/lock/qemu-server/lock-<VMID>.conf

# Remove lock (if safe)
qm unlock <VMID>

# Force stop if needed
qm stop <VMID> --skiplock
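
Before removing a lock it helps to know which operation set it, since a lock such as backup or migrate may belong to a task that is still running. A minimal sketch that reads the lock property from the VM config follows; the function name vm_lock_reason is hypothetical.

```shell
# vm_lock_reason: read "qm config" output on stdin and print the value of
# the "lock:" property (for example "backup"); prints nothing if unlocked.
vm_lock_reason() {
    sed -n 's/^lock:[[:space:]]*//p'
}
```

Possible usage: qm config <VMID> | vm_lock_reason. If this prints a value, confirm that no matching task is still active before running qm unlock.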

Issue: Guest Agent Not Starting

Symptoms:

  • Package installed but service not running
  • Service fails to start

Diagnosis:

# Check service logs
journalctl -u qemu-guest-agent -n 50

# Check service status
systemctl status qemu-guest-agent -l

Common Causes:

  • Missing dependencies
  • Permission issues
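  • VM not stopped and started since agent: 1 was set, so the virtio-serial device (/dev/virtio-ports/org.qemu.guest_agent.0) is missing in the guest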

Solution:

# Reinstall package
apt-get remove --purge qemu-guest-agent
apt-get install -y qemu-guest-agent

# Restart service
systemctl restart qemu-guest-agent

# If still failing, restart VM

Best Practices

1. Always Include Guest Agent in Templates

Required cloud-init configuration:

packages:
  - qemu-guest-agent

runcmd:
  - systemctl enable qemu-guest-agent
  - systemctl start qemu-guest-agent
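  - |
    # Verification with retry. runcmd runs under /bin/sh, so avoid the
    # Bash-only {1..30} range; use break, not exit, so later runcmd
    # entries still run.
    for i in $(seq 1 30); do
      if systemctl is-active --quiet qemu-guest-agent; then
        echo "✅ Guest agent running"
        break
      fi
      sleep 1
    done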
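
cloud-init runs runcmd entries with /bin/sh, which is dash on Debian and Ubuntu, so retry logic in templates is safest when it sticks to POSIX syntax. The following sketch shows the same retry idea as a reusable function that also reports failure; the name wait_for_service and its messages are hypothetical.

```shell
# wait_for_service NAME [ATTEMPTS]: poll once per second until the systemd
# unit is active. Uses only POSIX sh features, so it also works under dash.
wait_for_service() {
    name=$1
    attempts=${2:-30}          # default: 30 attempts, roughly 30 seconds
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if systemctl is-active --quiet "$name"; then
            echo "$name is active"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "$name not active after $attempts attempts" >&2
    return 1
}
```

Possible usage inside a runcmd entry: wait_for_service qemu-guest-agent 30 || echo "guest agent failed to start"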

2. Verify After VM Creation

Always verify guest agent after creating a VM:

# Wait for cloud-init to complete (usually 1-2 minutes)
sleep 120

# Check status
qm guest exec <VMID> -- systemctl status qemu-guest-agent
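
A fixed sleep either waits too long or not long enough. As an alternative, the agent can be polled from the Proxmox node until it answers. The sketch below uses qm guest cmd <VMID> ping for this; the function name wait_for_agent, the 5-second interval, and the 300-second default timeout are assumptions for illustration.

```shell
# wait_for_agent VMID [TIMEOUT_SECONDS]: poll the guest agent until it
# answers a ping or the timeout expires. Elapsed time counts only the
# sleep intervals, so the real wait can be slightly longer.
wait_for_agent() {
    vmid=$1
    timeout=${2:-300}
    interval=5
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if qm guest cmd "$vmid" ping >/dev/null 2>&1; then
            echo "Guest agent on VM $vmid answered after ${elapsed}s"
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "Guest agent on VM $vmid did not answer within ${timeout}s" >&2
    return 1
}
```

Possible usage: wait_for_agent <VMID> 300 && qm guest exec <VMID> -- systemctl status qemu-guest-agent. A successful ping only shows that the agent is up; other cloud-init steps may still be running at that point.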

3. Monitor Guest Agent Status

Regular monitoring:

# Check all VMs (stopped VMs are reported as not running)
for vmid in $(qm list | tail -n +2 | awk '{print $1}'); do
  echo "VM $vmid:"
  qm config "$vmid" | grep '^agent:' || echo "  ⚠️  Agent not configured"
  # A ping succeeds only if the agent inside the guest answers
  if qm guest cmd "$vmid" ping >/dev/null 2>&1; then
    echo "  ✅ Running"
  else
    echo "  ❌ Not running"
  fi
done

4. Document Exceptions

If a VM cannot have guest agent (rare), document why:

  • Legacy OS without support
  • Special security requirements
  • Known limitations

Scripts and Tools

Available Scripts

  1. scripts/complete-vm-100-guest-agent-check.sh

    • Comprehensive check for VM 100
    • Installed on both Proxmox nodes
    • Location: /usr/local/bin/complete-vm-100-guest-agent-check.sh
  2. scripts/copy-script-to-proxmox-nodes.sh

    • Copies scripts to Proxmox nodes
    • Uses SSH with password from .env
  3. scripts/enhance-guest-agent-verification.py

    • Enhanced all 29 VM templates
    • Adds robust verification logic

Usage

Copy script to Proxmox nodes:

bash scripts/copy-script-to-proxmox-nodes.sh

Run check on Proxmox node:

ssh root@<proxmox-node>
/usr/local/bin/complete-vm-100-guest-agent-check.sh

Verification Checklist

For New VMs

  • VM created with Crossplane provider (automatic agent: 1)
  • Cloud-init template includes qemu-guest-agent package
  • Cloud-init includes service enable/start commands
  • Wait for cloud-init to complete (1-2 minutes)
  • Verify package installed: qm guest exec <VMID> -- dpkg -l qemu-guest-agent
  • Verify service running: qm guest exec <VMID> -- systemctl status qemu-guest-agent
  • Test graceful shutdown: qm shutdown <VMID>

For Existing VMs

  • Check Proxmox config: qm config <VMID> | grep agent
  • Enable if missing: qm set <VMID> --agent 1
  • Check package: qm guest exec <VMID> -- dpkg -l qemu-guest-agent
  • Install if missing (via console or SSH, since qm guest exec needs a running agent): apt-get install -y qemu-guest-agent
  • Check service: qm guest exec <VMID> -- systemctl status qemu-guest-agent
  • Start if stopped (via console or SSH): systemctl start qemu-guest-agent
  • Restart VM if needed: qm shutdown <VMID> && qm start <VMID>, or qm stop <VMID> && qm start <VMID>

Summary

Automatic Configuration:

  • Crossplane provider sets agent: 1 automatically
  • All templates include guest agent in cloud-init

Verification:

  • Use check scripts on Proxmox nodes
  • Verify both Proxmox config and guest OS service

Troubleshooting:

  • Enable in Proxmox: qm set <VMID> --agent 1
  • Install in guest: apt-get install -y qemu-guest-agent
  • Start service: systemctl start qemu-guest-agent
  • Restart VM if needed

Best Practices:

  • Always include in templates
  • Verify after creation
  • Monitor regularly
  • Document exceptions

Related Documents:

  • docs/GUEST_AGENT_CONFIGURATION_ANALYSIS.md
  • docs/VM_100_GUEST_AGENT_FIXED.md
  • docs/GUEST_AGENT_VERIFICATION_ENHANCEMENT_COMPLETE.md
  • docs/SCRIPT_COPIED_TO_PROXMOX_NODES.md