AI / Agents 57xx Deployment Plan

Last Updated: 2026-02-26
Status: Active
VMID band: 5700–5999 (see VMID_ALLOCATION_FINAL.md)


Overview

This plan aligns with the repo's canonical allocation:

  • 3000–3003 = RPC/monitor-adjacent (ml110 / ccip-monitor-1..4); keep as-is.
  • 5400–5599 = CCIP DON (Commit/Execute/RMN + admin/monitor/relay).
  • 5700–5999 = AI / Agents / Dev (model serving, MCP, agent runtimes).

Target setup: Ubuntu/Debian VMs, Docker-based services. If your OS differs, only the package manager and service commands change; the structure stays the same. The 5701 MCP hub implementation (read-only + risk scoring) lives in the ai-mcp-pmm-controller submodule; see Appendix F.


1) QEMU Guest Agent on VMID 3000–3003 (inside the VMs)

Use when you want Proxmox to see guest info (IP, agent ping) for 3000–3003.

Debian/Ubuntu (inside each VM):

sudo apt update
sudo apt install -y qemu-guest-agent
sudo systemctl enable --now qemu-guest-agent

Verify from Proxmox host (QEMU VMs only; LXC uses pct exec, not qm agent):

# For QEMU/KVM VMs:
qm agent 3000 ping
qm agent 3001 ping
qm agent 3002 ping
qm agent 3003 ping

Note: If 3000–3003 are LXC containers, use pct exec <vmid> -- systemctl status qemu-guest-agent (or skip the agent; LXC doesn't use the QEMU guest agent the same way).


2) 57xx VM roles

| VMID | Role | Purpose |
|------|------|---------|
| 5701 | MCP Hub | DODO PMM tool server + policy guardrails + signer boundary (keys stay here, not on inference VM) |
| 5702 | Inference | HF model server (e.g. TGI or llama.cpp server) |
| 5703 | Agent Worker | Orchestration + decisioning; talks to 5702 (inference) and 5701 (MCP) |
| 5704 | Memory/State (optional) | Postgres / Redis / vector DB for agent state and tool-call logs |

Ports & network assumptions

Choose ports per VM; keep all services internal-only (no public exposure). Example:

| VMID | Service | Port (example) | Allowed callers |
|------|---------|----------------|-----------------|
| 5701 | MCP Hub | tcp/3000 | 5702, 5703 only |
| 5702 | Inference | tcp/8000 | 5703 only |
| 5704 | Postgres | tcp/5432 | 5701, 5703 (internal-only) |
| 5704 | Redis | tcp/6379 | 5701, 5703 (internal-only) |

Firewall/ACL: allowlist only the VMIDs above; no ingress from outside the 57xx band unless explicitly required.


3) VM 5701 — MCP Hub (skeleton)

Principle: Signing keys and policy live here; inference VM (5702) stays keyless.

Expose MCP tools (examples):

  • dodo.get_pool_state
  • dodo.quote_add_liquidity
  • dodo.add_liquidity
  • dodo.remove_liquidity
  • dodo.risk_check (slippage, max notional, cooldown, allowlist)

Policy defaults (safe):

  • Max trade/liquidity move per action
  • Max daily notional
  • Pool allowlist only
  • Circuit breaker on oracle deviation / pool illiquidity

Hardening checklist (5701)

Before enabling execution or signing on the MCP hub, confirm:

  • Pool allowlist — only allowlisted pools are exposed to tools
  • Max slippage — cap enforced per quote/add/remove liquidity
  • Max notional per tx — hard limit per single transaction
  • Max notional per day — daily aggregate cap
  • Cooldown window — minimum time between liquidity moves (e.g. per pool or global)
  • Circuit breaker triggers — e.g. oracle deviation threshold, pool illiquidity, or manual pause
  • Keys only on 5701 — signing keys never on inference VM (5702) or agent worker (5703)

4) VM 5702 — HF inference

Pick one stack:

  • TGI (text-generation-inference) — batteries-included, good for GPU.
  • llama.cpp server — CPU-friendly, GGUF, efficient.

5) VM 5703 — Agent worker

  • Calls inference locally (5702).
  • Calls MCP server (5701) for DODO PMM tools and policy.
  • Logs actions and tool calls to 5704 (or to an existing state VM if you prefer).

6) DODO PMM “agent” — read-only vs execution

| Mode | Description |
|------|-------------|
| Read-only first (recommended) | MCP server exposes state + quotes only; agent produces proposed actions. No signing or on-chain tx from the agent. |
| Execution enabled | MCP includes signing + on-chain tx submission behind policy caps (rate limits, slippage caps, allowlisted pools). |

Recommendation: start read-only; enable execution once monitoring and policy are stable.


Design decisions (before DODO implementation)

These govern how the MCP hub implements tools and policy. Risk scoring and constraints should exist in read-only mode so the agent learns under the same rules it will later execute with.

1) Policy architecture (enforced before any write tool)

The MCP hub must enforce, in order:

  • Pool allowlist — contract addresses (hardcoded or DB-driven); only allowlisted pools are exposed.
  • Max slippage — e.g. 0.5–1%.
  • Max single-tx notional
  • Max daily notional
  • Cooldown window per pool (minimum time between liquidity moves).
  • Oracle deviation threshold — circuit breaker when oracle vs mid price exceeds limit.
  • Gas cap
  • Circuit breaker state — persisted in Redis or Postgres; can pause all write tools.

These checks run before any write tool; in read-only mode, risk scoring uses the same constraints for proposed actions and drift analysis.
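As a minimal sketch of this ordered gate (names, limits, and the action-dict shape are illustrative assumptions, not the submodule's actual API; real limits live in config/allowlist.json):

```python
from dataclasses import dataclass

@dataclass
class Policy:
    # Illustrative defaults; real values come from config/allowlist.json.
    allowlist: set
    max_slippage_bps: int = 100            # 1%
    max_tx_notional: float = 10_000.0
    max_daily_notional: float = 50_000.0
    cooldown_s: int = 600
    max_oracle_deviation_bps: int = 200
    max_gas_gwei: float = 50.0
    breaker_tripped: bool = False

def check_action(p: Policy, action: dict, daily_spent: float,
                 last_action_ts: float, now: float) -> tuple[bool, str]:
    """Apply the checks in the documented order; first failure wins."""
    if action["pool"] not in p.allowlist:
        return False, "pool not allowlisted"
    if action["slippage_bps"] > p.max_slippage_bps:
        return False, "slippage above cap"
    if action["notional"] > p.max_tx_notional:
        return False, "single-tx notional above cap"
    if daily_spent + action["notional"] > p.max_daily_notional:
        return False, "daily notional above cap"
    if now - last_action_ts < p.cooldown_s:
        return False, "cooldown active"
    dev_bps = abs(action["mid_price"] - action["oracle_price"]) / action["oracle_price"] * 10_000
    if dev_bps > p.max_oracle_deviation_bps:
        return False, "oracle deviation above threshold"
    if action["gas_price_gwei"] > p.max_gas_gwei:
        return False, "gas above cap"
    if p.breaker_tripped:
        return False, "circuit breaker tripped"
    return True, "ok"
```

Because the same function scores proposed actions in read-only mode, the agent's decision drift is measured against the exact constraints that will later gate execution.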

2) Data model for dodo.get_pool_state

The agent must never see raw contract structs. MCP normalizes pool state into a single schema, e.g.:

{
  "pool_address": "...",
  "chain": "arbitrum",
  "base_token": "...",
  "quote_token": "...",
  "mid_price": 1.2345,
  "oracle_price": 1.2288,
  "inventory_ratio": 0.62,
  "liquidity_base": 1250000,
  "liquidity_quote": 980000,
  "k": 0.8,
  "fee_rate": 0.003,
  "timestamp": 1700000000
}

All MCP tool responses (state, quotes, risk) use this or an extended version; no raw ABI structs are exposed to the agent.
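A small validator for this normalized payload can catch schema drift early; the field list is taken from the example above, and the expected types are inferred from it (an assumption, adjust to taste):

```python
# Required fields and expected Python types for the normalized pool-state
# payload (types inferred from the documented example JSON).
POOL_STATE_FIELDS = {
    "pool_address": str,
    "chain": str,
    "base_token": str,
    "quote_token": str,
    "mid_price": float,
    "oracle_price": float,
    "inventory_ratio": float,
    "liquidity_base": (int, float),
    "liquidity_quote": (int, float),
    "k": float,
    "fee_rate": float,
    "timestamp": int,
}

def validate_pool_state(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload conforms."""
    problems = []
    for field, expected in POOL_STATE_FIELDS.items():
        if field not in payload:
            problems.append(f"missing: {field}")
        elif not isinstance(payload[field], expected):
            problems.append(f"bad type: {field}")
    return problems
```

Running this in the MCP hub before returning any tool response keeps the "no raw ABI structs" contract honest.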

3) Activation sequence

  • Phase 1: Docker Compose + read-only first — build risk scoring, simulate proposed actions, record agent decision drift, evaluate behavior with zero capital exposure.
  • Phase 2: Set ALLOW_WRITE=true, enable signed tx submission, keep caps very low at first.

When ready to implement: fix the choices as Docker Compose + read-only first + chain = <Arbitrum|BSC|Ethereum>, then build the concrete MCP pieces (web3/ethers wiring, DODO ABI pattern, dodo.get_pool_state, dodo.risk_check, policy guardrails, structured payload schema, simulation stub).


Repo search commands (VMID / buffer usage)

From repo root, to find any remaining references to the old buffer name or AI/Agents start:

# Fast search
grep -RIn --exclude-dir=.git "VMID_BUFFER_START" .

# Variants
grep -RIn --exclude-dir=.git "BUFFER_START\|VMID_AI_AGENTS_START" .

If scripts still use VMID_BUFFER_START, keep the alias in config and migrate to VMID_AI_AGENTS_START over time.


Appendix A — Common prerequisites (all 57xx VMs)

Assumes Ubuntu/Debian, Docker Engine + Compose plugin. Internal network ACLs enforce 57xx-only reachability per the Ports table above. In Appendices C–E, 5701/5702/5704 are used as hostnames in URLs; replace with your VM hostnames or IPs if your network does not resolve numeric hostnames.

sudo apt update
sudo apt install -y ca-certificates curl gnupg ufw

# Docker
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker $USER
# Log out/in once, or:
newgrp docker

# Standard directories
sudo mkdir -p /opt/ai/{mcp,inference,agent,state}/{config,data,logs}
sudo chown -R $USER:$USER /opt/ai

Health check helpers:

docker ps
docker logs --tail=100 <container>
curl -fsS http://127.0.0.1:3000/health || true

Appendix B — VM 5704 (Memory/State): Postgres + Redis (+ optional Qdrant)

Create: /opt/ai/state/docker-compose.yml

services:
  postgres:
    image: postgres:16
    container_name: ai-state-postgres
    environment:
      POSTGRES_DB: ai
      POSTGRES_USER: ai
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - /opt/ai/state/data/postgres:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ai -d ai"]
      interval: 10s
      timeout: 5s
      retries: 10

  redis:
    image: redis:7
    container_name: ai-state-redis
    command: ["redis-server", "--appendonly", "yes", "--save", "60", "1"]
    volumes:
      - /opt/ai/state/data/redis:/data
    ports:
      - "6379:6379"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "PING"]
      interval: 10s
      timeout: 3s
      retries: 10

  # Optional vector DB (uncomment if needed)
  # qdrant:
  #   image: qdrant/qdrant:latest
  #   container_name: ai-state-qdrant
  #   volumes:
  #     - /opt/ai/state/data/qdrant:/qdrant/storage
  #   ports:
  #     - "6333:6333"
  #   restart: unless-stopped

Create: /opt/ai/state/.env

POSTGRES_PASSWORD=change_me_strong

Run:

cd /opt/ai/state
docker compose up -d
docker compose ps

Appendix C — VM 5701 (MCP Hub): MCP server + policy stub

Provides: /health endpoint; place to implement MCP tools (dodo.get_pool_state, dodo.risk_check, etc.); policy gate stub (deny-by-default for write actions until enabled).

1) MCP server skeleton

/opt/ai/mcp/config/server.py

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import os

app = FastAPI(title="AI MCP Hub", version="1.0")

ALLOW_WRITE = os.getenv("ALLOW_WRITE", "false").lower() == "true"

class ToolCall(BaseModel):
    tool: str
    params: dict = {}

@app.get("/health")
def health():
    return {"ok": True}

@app.post("/mcp/call")
def mcp_call(call: ToolCall):
    # Hard gate: deny all write tools unless explicitly enabled
    write_tools_prefixes = ("dodo.add_", "dodo.remove_", "dodo.execute_", "dodo.rebalance_")
    if call.tool.startswith(write_tools_prefixes) and not ALLOW_WRITE:
        raise HTTPException(status_code=403, detail="Write tools disabled (read-only mode).")

    # TODO: implement tool routing:
    # - dodo.get_pool_state
    # - dodo.quote_add_liquidity
    # - dodo.risk_check
    # - etc.
    return {"tool": call.tool, "result": {"stub": True, "params": call.params}}

2) Compose for MCP Hub

/opt/ai/mcp/docker-compose.yml

services:
  mcp:
    image: python:3.11-slim
    container_name: ai-mcp-prod
    working_dir: /app
    volumes:
      - /opt/ai/mcp/config:/app
      - /opt/ai/mcp/logs:/logs
    environment:
      ALLOW_WRITE: "false"
      # Add RPC URLs and chain config when wiring DODO:
      # CHAIN: "arbitrum"
      # RPC_URL: "http://..."
    command: >
      sh -lc "pip install --no-cache-dir fastapi uvicorn pydantic &&
              uvicorn server:app --host 0.0.0.0 --port 3000"
    ports:
      - "3000:3000"
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://127.0.0.1:3000/health').read()\""]
      interval: 10s
      timeout: 5s
      retries: 10

Run:

cd /opt/ai/mcp
docker compose up -d
curl -fsS http://127.0.0.1:3000/health

When you enable execution: set ALLOW_WRITE=true only after policy checks, allowlist, and limits are implemented (see Hardening checklist).


Appendix D — VM 5702 (Inference): TGI or llama.cpp (CPU-friendly)

Option 1: llama.cpp server (CPU default)

/opt/ai/inference/docker-compose.yml

services:
  llama:
    image: ghcr.io/ggerganov/llama.cpp:server
    container_name: ai-inf-prod
    volumes:
      - /opt/ai/inference/data/models:/models
    command: >
      -m /models/model.gguf
      --host 0.0.0.0 --port 8000
      --n-gpu-layers 0
      --ctx-size 4096
    ports:
      - "8000:8000"
    restart: unless-stopped

Put your GGUF at: /opt/ai/inference/data/models/model.gguf

Run:

cd /opt/ai/inference
docker compose up -d

Option 2: TGI

Use if you have GPU and want HF-native serving; omit if CPU-only.


Appendix E — VM 5703 (Agent Worker): calls inference + MCP + logs to state

/opt/ai/agent/config/agent.py

import os, time, requests
from datetime import datetime

MCP_URL = os.getenv("MCP_URL", "http://5701:3000/mcp/call")
INF_URL = os.getenv("INF_URL", "http://5702:8000")  # llama.cpp default
MODE = os.getenv("MODE", "read-only")

def call_mcp(tool, params=None):
    r = requests.post(MCP_URL, json={"tool": tool, "params": params or {}}, timeout=30)
    r.raise_for_status()
    return r.json()

def main():
    print(f"[{datetime.utcnow().isoformat()}] agent starting; MODE={MODE}")
    while True:
        try:
            state = call_mcp("dodo.get_pool_state", {"pool": "POOL_ADDRESS_HERE"})
            risk = call_mcp("dodo.risk_check", {"state": state})
            print("state:", state)
            print("risk:", risk)
        except Exception as e:
            print("error:", e)
        time.sleep(30)

if __name__ == "__main__":
    main()

/opt/ai/agent/docker-compose.yml

services:
  agent:
    image: python:3.11-slim
    container_name: ai-agent-prod
    working_dir: /app
    volumes:
      - /opt/ai/agent/config:/app
      - /opt/ai/agent/logs:/logs
    environment:
      MCP_URL: "http://5701:3000/mcp/call"
      INF_URL: "http://5702:8000"
      MODE: "read-only"
      # PG_DSN: "postgresql://ai:...@5704:5432/ai"
      # REDIS_URL: "redis://5704:6379/0"
    command: >
      sh -lc "pip install --no-cache-dir requests &&
              python agent.py"
    restart: unless-stopped

Run:

cd /opt/ai/agent
docker compose up -d
docker logs -f ai-agent-prod

Appendix F — 5701 real MCP implementation (read-only + risk scoring)

The canonical implementation lives in the ai-mcp-pmm-controller submodule (repo root):

  • config/allowlist.json — Single source of truth: chain, pool addresses, base/quote tokens, profile name, limits (slippage bps, notional caps, cooldown, oracle deviation, gas cap).
  • config/pool_profiles.json — Maps profile names (e.g. dodo_pmm_v2_like) to contract method names so the same server code works across DODO variants.
  • config/abis/erc20.json — Minimal ERC20 ABI for symbol/decimals/balance.
  • config/server.py — FastAPI MCP hub: dodo.get_pool_state, dodo.risk_check, dodo.simulate_action; write-tool gate (both ALLOW_WRITE and EXECUTION_ARMED); normalized pool state (no raw structs).
  • docker-compose.yml — Runs server with web3, env: ALLOW_WRITE, EXECUTION_ARMED, CHAIN, RPC_URL, paths to allowlist/profiles/ERC20 ABI; optional --env-file .env; non-root user: "1000:1000".

Deploy on VM 5701: Clone proxmox with --recurse-submodules (or copy ai-mcp-pmm-controller/ to the VM). Edit config/allowlist.json (replace 0xPOOL_ADDRESS_HERE, base/quote tokens), set RPC_URL in compose or .env, then from the submodule root: mkdir -p logs && docker compose up -d. For /opt/ai/mcp layout, copy config/ to /opt/ai/mcp/config and use the submodule's compose (mount config as /app).

Chain-agnostic: Switch chain by changing CHAIN and RPC_URL and updating the allowlist; adjust pool_profiles.json if a pool uses different method names. To fill inventory_ratio / liquidity_base / liquidity_quote, add the correct DODO pool ABI/profile for your target pool (see Design decisions above).

Execution gates: Write tools require both ALLOW_WRITE=true and EXECUTION_ARMED=true (defaults false). Prevents accidental enable with a single flip.
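The two-flag gate can be sketched as follows (a simplification of the submodule's actual check; the prefix list comes from the Appendix C stub):

```python
# Write-tool prefixes, as in the Appendix C server skeleton.
WRITE_TOOL_PREFIXES = ("dodo.add_", "dodo.remove_", "dodo.execute_", "dodo.rebalance_")

def is_write_tool(tool: str) -> bool:
    return tool.startswith(WRITE_TOOL_PREFIXES)

def write_allowed(env: dict) -> bool:
    """Both flags must be explicitly 'true'; anything else keeps writes off.

    Requiring two independent flags means a single accidental flip (e.g. a
    stray ALLOW_WRITE=true in .env) cannot arm execution by itself.
    """
    return (env.get("ALLOW_WRITE", "false").lower() == "true"
            and env.get("EXECUTION_ARMED", "false").lower() == "true")
```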

Production hygiene (submodule): .gitignore excludes .env, config/allowlist.local.json, *.log, logs/. Use .env for RPC_URL/CHAIN/gates (dont bake secrets in git). Container runs as user: "1000:1000"; ensure mounted dirs are writable by that UID.

Push + pin (no surprises)

  1. Submodule: From workstation, cd ai-mcp-pmm-controller && git push origin main.
  2. Parent: git add .gitmodules ai-mcp-pmm-controller docs/... && git commit -m "..." && git push. Sanity check: parent commit should show the submodule as a gitlink (SHA pointer), not a large file add.
  3. VM 5701: git clone --recurse-submodules <PROXMOX_REPO_URL> /opt/proxmox, then deploy from ai-mcp-pmm-controller/ per submodule README.

Interface discovery and liquidity fields

  • dodo.identify_pool_interface: Implemented. Read-only tool; does not require the pool to be in the allowlist. Probes candidate getters (getMidPrice, getOraclePrice, getBaseReserve, BASE_BALANCE, getPMMState, etc.) and returns detected_profile (e.g. dodo_pmm_v2_like or unknown), functions_found, and notes. Use it with any pool address to choose the right ABI/profile before adding to allowlist.
  • To complete liquidity fields (no trial-and-error): Provide one Arbitrum pool contract address (or DODO type hint: DPP/DSP/DVM/PMM V2). Then: minimal ABI for reserves/state, pool_profiles.json additions, and server.py diff for liquidity_base, liquidity_quote, inventory_ratio.
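A toy sketch of how detected_profile could fall out of functions_found (the real probe list and the getters required per profile live in the submodule; the signature sets below are assumptions for illustration):

```python
# Hypothetical mapping from required getters to a profile name; the
# submodule's actual probe list may differ.
PROFILE_SIGNATURES = {
    "dodo_pmm_v2_like": {"getMidPrice", "getPMMState", "getBaseReserve"},
}

def detect_profile(functions_found: set) -> str:
    """Return the first profile whose required getters are all present."""
    for profile, required in PROFILE_SIGNATURES.items():
        if required <= functions_found:
            return profile
    return "unknown"
```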

Validate MCP hub and run interface discovery (before you have a pool)

On VM 5701:

cd /opt/proxmox/ai-mcp-pmm-controller
docker compose --env-file .env up -d
curl -fsS http://127.0.0.1:3000/health

Run interface discovery (from 5701 or from 5703 calling MCP) once you have any candidate pool address:

curl -sS http://127.0.0.1:3000/mcp/call \
  -H 'content-type: application/json' \
  -d '{"tool":"dodo.identify_pool_interface","params":{"pool":"0xPOOL"}}' | jq
Response fields:

  • functions_found → which getters exist on that contract
  • notes → which reserve/state methods are missing
  • detected_profile → whether dodo_pmm_v2_like fits or you need a new profile

Inventory ratio convention

Standardized so it's comparable across pool types (unless the pool exposes a canonical ratio):

  • base_value = base_reserve * mid_price
  • quote_value = quote_reserve (in quote units)
  • inventory_ratio = base_value / (base_value + quote_value)

Used consistently in dodo.get_pool_state and for policy thresholds.
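The convention in code is pure arithmetic, directly from the bullets above:

```python
def inventory_ratio(base_reserve: float, quote_reserve: float, mid_price: float) -> float:
    """Share of total pool value held in the base token, valued at mid price.

    0.5 means perfectly balanced inventory; 0 means all quote, 1 means all base.
    """
    base_value = base_reserve * mid_price       # base side, in quote units
    quote_value = quote_reserve                 # already in quote units
    return base_value / (base_value + quote_value)
```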

Optional Redis state (circuit breaker + cooldown)

When REDIS_URL is set, use this key schema; if unset, degrade to stateless mode.

| Key | Value (example) | Purpose |
|-----|-----------------|---------|
| cb:<chain>:<pool> | { "tripped": true, "reason": "...", "ts": 170... } | Circuit breaker state |
| cooldown:<chain>:<pool> | Unix timestamp of next allowed action time | Cooldown window |

Wire into dodo.risk_check and (later) write tools. Implementation: optional Redis client; if REDIS_URL missing, skip reads/writes and keep behavior stateless.
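A hedged sketch of that degrade path: same key schema, with an in-memory dict standing in when no Redis client is supplied (a real deployment would pass e.g. a redis-py client built from REDIS_URL; class and method names here are illustrative):

```python
import json
import time

class BreakerStore:
    """Circuit-breaker + cooldown state; dict-backed when Redis is absent."""

    def __init__(self, redis_client=None):
        self.redis = redis_client   # e.g. redis.Redis.from_url(REDIS_URL)
        self.mem = {}               # stateless-mode fallback

    def _get(self, key):
        if self.redis:
            raw = self.redis.get(key)
            return json.loads(raw) if raw else None
        return self.mem.get(key)

    def _set(self, key, value):
        if self.redis:
            self.redis.set(key, json.dumps(value))
        else:
            self.mem[key] = value

    def trip(self, chain: str, pool: str, reason: str) -> None:
        self._set(f"cb:{chain}:{pool}",
                  {"tripped": True, "reason": reason, "ts": int(time.time())})

    def tripped(self, chain: str, pool: str) -> bool:
        state = self._get(f"cb:{chain}:{pool}")
        return bool(state and state.get("tripped"))

    def start_cooldown(self, chain: str, pool: str, seconds: int) -> None:
        self._set(f"cooldown:{chain}:{pool}", time.time() + seconds)

    def in_cooldown(self, chain: str, pool: str) -> bool:
        until = self._get(f"cooldown:{chain}:{pool}")
        return until is not None and time.time() < until
```

dodo.risk_check can then consult tripped() and in_cooldown() regardless of whether REDIS_URL is set; without Redis, state simply does not survive a restart.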


Network ACL notes (allowlist by VMID)

At minimum:

  • 5703 → 5702:8000 (agent → inference)
  • 5703 → 5701:3000 (agent → MCP)
  • 5701, 5703 → 5704:5432, 6379 (MCP/agent → state)
  • Deny everything else by default.
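On the individual VMs, this posture could look like the following with ufw (installed in Appendix A). The 10.57.0.x addresses are placeholders for your 57xx VMs; Proxmox-level firewall rules are an equivalent alternative.

```shell
# Hypothetical addresses: 10.57.0.1=5701, 10.57.0.2=5702, 10.57.0.3=5703, 10.57.0.4=5704
sudo ufw default deny incoming
sudo ufw default allow outgoing

# On 5702 (inference): only the agent worker (5703) may reach tcp/8000
sudo ufw allow from 10.57.0.3 to any port 8000 proto tcp

# On 5704 (state): only MCP (5701) and agent (5703) may reach Postgres/Redis
sudo ufw allow from 10.57.0.1 to any port 5432,6379 proto tcp
sudo ufw allow from 10.57.0.3 to any port 5432,6379 proto tcp

sudo ufw enable
```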

Healthcheck commands (paste-ready)

MCP:

curl -fsS http://5701:3000/health

Inference:

curl -fsS http://5702:8000/health || true

(llama.cpp server may not expose /health; use a request to its root if needed.)

State:

pg_isready -h 5704 -U ai -d ai
redis-cli -h 5704 ping

Owner: Architecture
Review cadence: Quarterly or upon new VMID band creation
Change control: PR required; update Version + Last Updated