smom-dbis-138/docs/deployment/VM_DEPLOYMENT_TROUBLESHOOTING.md
defiQUG 1fb7266469 Add Oracle Aggregator and CCIP Integration
2025-12-12 14:57:48 -08:00


VM Deployment Troubleshooting Guide

Common Issues and Solutions

VM Not Accessible

Symptoms:

  • Cannot SSH into VM
  • Ping fails
  • Connection timeout

Solutions:

  1. Check VM status:

    az vm show --resource-group $RESOURCE_GROUP --name $VM_NAME --show-details
    
  2. Check Network Security Group rules:

    az network nsg rule list --resource-group $RESOURCE_GROUP --nsg-name $NSG_NAME
    
  3. Restart VM:

    az vm restart --resource-group $RESOURCE_GROUP --name $VM_NAME
    
  4. Check public IP:

    az vm show --resource-group $RESOURCE_GROUP --name $VM_NAME --show-details --query "publicIps" -o tsv
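
The checks above can be chained into a single triage step. The following is a minimal sketch, not the project's own tooling: `vm_power_state`, `port_open`, and `triage_vm` are hypothetical helper names, and it assumes `bash`, `timeout` (coreutils), and a logged-in Azure CLI on the workstation.

```shell
#!/usr/bin/env bash
# Sketch: report the VM power state, then probe the SSH port.
# All function names here are illustrative, not part of the repository.

vm_power_state() {
  # $1 = resource group, $2 = VM name.
  # powerState is only populated when --show-details is passed.
  az vm show --resource-group "$1" --name "$2" --show-details \
    --query "powerState" -o tsv
}

port_open() {
  # Returns 0 if a TCP connection to host:port succeeds within 5 seconds.
  local host="$1" port="$2"
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

triage_vm() {
  # $1 = resource group, $2 = VM name, $3 = VM IP address.
  local state
  state="$(vm_power_state "$1" "$2")"
  echo "Power state: ${state:-unknown}"
  if [ "$state" != "VM running" ]; then
    echo "VM is not running; start it before debugging the network."
    return 1
  fi
  if port_open "$3" 22; then
    echo "Port 22 reachable on $3"
  else
    echo "Port 22 unreachable on $3; check NSG rules and the public IP."
    return 2
  fi
}
```

A typical call would be `triage_vm "$RESOURCE_GROUP" "$VM_NAME" "$VM_IP"`. A return code of 1 points at the VM itself, while 2 points at the network path.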
    

Besu Container Not Starting

Symptoms:

  • Container exits immediately
  • Container status shows "Exited"
  • No logs available

Solutions:

  1. Check container logs:

    ssh besuadmin@$VM_IP "docker logs besu-validator-0"
    
  2. Check Docker service:

    ssh besuadmin@$VM_IP "systemctl status docker"
    
  3. Check systemd service:

    ssh besuadmin@$VM_IP "systemctl status besu.service"
    
  4. Check configuration file:

    ssh besuadmin@$VM_IP "cat /opt/besu/config/besu-config.toml"
    
  5. Check disk space:

    ssh besuadmin@$VM_IP "df -h"
    

Genesis File Not Found

Symptoms:

  • Besu fails to start
  • Error: "Genesis file not found"

Solutions:

  1. Check if genesis file exists:

    ssh besuadmin@$VM_IP "ls -la /opt/besu/config/genesis.json"
    
  2. Download genesis file manually:

    ssh besuadmin@$VM_IP "wget -O /opt/besu/config/genesis.json '$GENESIS_FILE_URL'"
    
  3. Copy genesis file from local:

    scp config/genesis.json besuadmin@$VM_IP:/opt/besu/config/genesis.json
    

Validator Keys Not Found

Symptoms:

  • Validator node fails to start
  • Error: "Validator key not found"

Solutions:

  1. Check keys directory:

    ssh besuadmin@$VM_IP "ls -la /opt/besu/keys/"
    
  2. Download keys from Key Vault:

    az keyvault secret show --vault-name $KEY_VAULT_NAME --name "validator-key-0" --query value -o tsv | ssh besuadmin@$VM_IP "cat > /opt/besu/keys/validator-key.txt"
    
  3. Set correct permissions:

    ssh besuadmin@$VM_IP "chmod 600 /opt/besu/keys/*"
    

Network Connectivity Issues

Symptoms:

  • Nodes cannot peer
  • P2P connection fails
  • RPC endpoint not accessible

Solutions:

  1. Check P2P port:

    telnet $SENTRY_IP 30303
    
  2. Check RPC port:

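    curl -s -X POST -H "Content-Type: application/json" --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' http://$RPC_IP:8545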
    
  3. Check firewall rules:

    ssh besuadmin@$VM_IP "sudo ufw status"
    
  4. Check NSG rules:

    az network nsg rule list --resource-group $RESOURCE_GROUP --nsg-name $NSG_NAME
    

High Resource Usage

Symptoms:

  • VM is slow
  • High CPU usage
  • High memory usage

Solutions:

  1. Check resource usage:

    ssh besuadmin@$VM_IP "top -bn1 | head -20"
    ssh besuadmin@$VM_IP "docker stats --no-stream"
    
  2. Check Besu JVM settings:

    ssh besuadmin@$VM_IP "grep BESU_OPTS /opt/besu/docker-compose.yml"
    
  3. Scale up VM:

    az vm resize --resource-group $RESOURCE_GROUP --name $VM_NAME --size Standard_D8s_v3
    

Disk Space Issues

Symptoms:

  • Besu fails to write
  • "No space left on device" error

Solutions:

  1. Check disk usage:

    ssh besuadmin@$VM_IP "df -h"
    
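  2. Clean up unused Docker data and old logs (docker system prune also removes stopped containers, so save any container logs you still need first):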

    ssh besuadmin@$VM_IP "docker system prune -f"
    ssh besuadmin@$VM_IP "find /opt/besu/logs -name '*.log' -mtime +7 -delete"
    
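  3. Resize disk (depending on the disk type, the VM may need to be deallocated first; afterward, grow the partition and filesystem inside the VM):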

    az disk update --resource-group $RESOURCE_GROUP --name $DISK_NAME --size-gb 512
    

Cloud-init Issues

Symptoms:

  • VM not configured properly
  • Docker not installed
  • Services not started

Solutions:

  1. Check cloud-init logs:

    ssh besuadmin@$VM_IP "sudo cat /var/log/cloud-init-output.log"
    
  2. Re-run cloud-init:

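    ssh besuadmin@$VM_IP "sudo cloud-init clean"
    ssh besuadmin@$VM_IP "sudo cloud-init init"
    ssh besuadmin@$VM_IP "sudo cloud-init modules --mode=config"
    ssh besuadmin@$VM_IP "sudo cloud-init modules --mode=final"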
    
  3. Manually run setup script:

    ssh besuadmin@$VM_IP "sudo /opt/besu/setup.sh"
    

Key Vault Access Issues

Symptoms:

  • Cannot download keys from Key Vault
  • "Access denied" error

Solutions:

  1. Check Managed Identity:

    az vm identity show --resource-group $RESOURCE_GROUP --name $VM_NAME
    
  2. Check Key Vault access policy:

    az keyvault show --name $KEY_VAULT_NAME --query "properties.accessPolicies"
    
  3. Add access policy:

    PRINCIPAL_ID=$(az vm identity show --resource-group $RESOURCE_GROUP --name $VM_NAME --query "principalId" -o tsv)
    az keyvault set-policy --name $KEY_VAULT_NAME --object-id $PRINCIPAL_ID --secret-permissions get list
    

Diagnostic Commands

Check VM Status

az vm list --resource-group $RESOURCE_GROUP --show-details

Check Container Status

ssh besuadmin@$VM_IP "docker ps -a"

Check Service Status

ssh besuadmin@$VM_IP "systemctl status besu.service"

Check Logs

# Besu logs
ssh besuadmin@$VM_IP "docker logs besu-validator-0"

# System logs
ssh besuadmin@$VM_IP "journalctl -u besu.service -n 100"

# Cloud-init logs
ssh besuadmin@$VM_IP "sudo cat /var/log/cloud-init-output.log"
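
The status and log commands above can be gathered in one SSH session and saved to a file, which is convenient when attaching evidence to an issue. This is a minimal sketch: `collect_diagnostics` and the output file name are assumptions, and it presumes that `besuadmin` can run `docker`, `journalctl`, and `sudo` without a password prompt.

```shell
#!/usr/bin/env bash
# Sketch: collect container, service, and log output from a VM into one file.

collect_diagnostics() {
  # $1 = VM IP, $2 = output directory (defaults to the current directory).
  local vm_ip="$1" out_dir="${2:-.}" out_file
  out_file="${out_dir}/besu-diag-${vm_ip}-$(date +%Y%m%dT%H%M%S).log"
  # The quoted heredoc is sent unmodified to a remote bash process.
  ssh "besuadmin@${vm_ip}" 'bash -s' > "$out_file" 2>&1 <<'EOF'
echo "== docker ps -a =="
docker ps -a
echo "== besu.service =="
systemctl status besu.service --no-pager
echo "== journal, last 100 lines =="
journalctl -u besu.service -n 100 --no-pager
echo "== container logs, last 100 lines =="
docker logs --tail 100 besu-validator-0
echo "== cloud-init output, last 50 lines =="
sudo tail -n 50 /var/log/cloud-init-output.log
EOF
  echo "$out_file"
}
```

Usage: `collect_diagnostics "$VM_IP" /tmp` prints the path of the log file it wrote.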

Check Network

# Check connectivity
ping $VM_IP

# Check ports
nmap -p 30303,8545,8546,9545 $VM_IP

# Check reverse DNS for the VM IP
nslookup $VM_IP

Check Resources

# CPU and memory
ssh besuadmin@$VM_IP "top -bn1 | head -20"

# Disk usage
ssh besuadmin@$VM_IP "df -h"

# Network usage (iftop is interactive, so allocate a TTY and run it as root)
ssh -t besuadmin@$VM_IP "sudo iftop"

Getting Help

If you encounter issues not covered here:

  1. Check the main troubleshooting guide
  2. Review VM deployment documentation
  3. Check Besu logs for detailed error messages
  4. Review Azure VM logs in Azure Portal
  5. Check Network Security Group rules
  6. Verify Key Vault access policies

Prevention

To prevent common issues:

  1. Regular Monitoring: Use monitoring scripts to catch issues early
  2. Backup: Regularly back up VM data
  3. Updates: Keep VMs and Docker images updated
  4. Resource Planning: Monitor resource usage and scale as needed
  5. Security: Regularly review and update NSG rules and Key Vault policies