# AI / Agents 57xx Deployment Plan

**Last Updated:** 2026-02-26  
**Status:** Active  
**VMID band:** 5700–5999 (see [VMID_ALLOCATION_FINAL.md](VMID_ALLOCATION_FINAL.md))

---

## Overview

This plan aligns with the repo’s canonical allocation:

- **3000–3003** = RPC/monitor-adjacent (ml110 / ccip-monitor-1..4); keep as-is.
- **5400–5599** = CCIP DON (Commit/Execute/RMN + admin/monitor/relay).
- **5700–5999** = AI / Agents / Dev (model serving, MCP, agent runtimes).

Target setup: **Ubuntu/Debian VMs, Docker-based services.** If your OS differs, only the package manager and service commands change; the structure stays the same. The **5701 MCP hub** implementation (read-only + risk scoring) lives in the **ai-mcp-pmm-controller** submodule; see Appendix F.

---

## 1) QEMU Guest Agent on VMID 3000–3003 (inside the VMs)

Use when you want Proxmox to see guest info (IP, agent ping) for 3000–3003.

**Debian/Ubuntu (inside each VM):**

```bash
sudo apt update
sudo apt install -y qemu-guest-agent
sudo systemctl enable --now qemu-guest-agent
```

**Verify from Proxmox host (QEMU VMs only; LXC uses `pct exec`, not `qm agent`):**

```bash
# For QEMU/KVM VMs:
qm agent 3000 ping
qm agent 3001 ping
qm agent 3002 ping
qm agent 3003 ping
```

*Note:* If 3000–3003 are **LXC** containers, skip this step: the QEMU guest agent applies only to QEMU/KVM VMs, and Proxmox reads container state directly. Use `pct exec <vmid> -- <command>` to run commands inside a container.

---

## 2) Stand up AI stack in the 57xx range (recommended layout)

| VMID | Role | Purpose |
|------|------|---------|
| **5701** | MCP Hub | DODO PMM tool server + policy guardrails + signer boundary (keys stay here, not on inference VM) |
| **5702** | Inference | HF model server (e.g. TGI or llama.cpp server) |
| **5703** | Agent Worker | Orchestration + decisioning; talks to 5702 (inference) and 5701 (MCP) |
| **5704** | Memory/State (optional) | Postgres / Redis / vector DB for agent state and tool-call logs |

### Ports & network assumptions

Choose ports per VM; keep all services **internal-only** (no public exposure). Example:

| VMID | Service | Port (example) | Allowed callers |
|------|---------|----------------|-----------------|
| 5701 | MCP Hub | `tcp/3000` | 5703 only |
| 5702 | Inference | `tcp/8000` | 5703 only |
| 5704 | Postgres | `tcp/5432` | 5701, 5703 (internal-only) |
| 5704 | Redis | `tcp/6379` | 5701, 5703 (internal-only) |

Firewall/ACL: allowlist only the VMIDs above; no ingress from outside the 57xx band unless explicitly required.

---

## 3) VM 5701 — MCP Hub (skeleton)

**Principle:** Signing keys and policy live here; inference VM (5702) stays keyless.

**Expose MCP tools (examples):**

- `dodo.get_pool_state`
- `dodo.quote_add_liquidity`
- `dodo.add_liquidity`
- `dodo.remove_liquidity`
- `dodo.risk_check` (slippage, max notional, cooldown, allowlist)

**Policy defaults (safe):**

- Max trade/liquidity move per action
- Max daily notional
- Pool allowlist only
- Circuit breaker on oracle deviation / pool illiquidity

### Hardening checklist (5701)

Before enabling execution or signing on the MCP hub, confirm:

- [ ] **Pool allowlist** — only allowlisted pools are exposed to tools
- [ ] **Max slippage** — cap enforced per quote/add/remove liquidity
- [ ] **Max notional per tx** — hard limit per single transaction
- [ ] **Max notional per day** — daily aggregate cap
- [ ] **Cooldown window** — minimum time between liquidity moves (e.g. per pool or global)
- [ ] **Circuit breaker triggers** — e.g. oracle deviation threshold, pool illiquidity, or manual pause
- [ ] **Keys only on 5701** — signing keys never on inference VM (5702) or agent worker (5703)

---

## 4) VM 5702 — HF inference

Pick one stack:

- **TGI (text-generation-inference)** — batteries-included, good for GPU.
- **llama.cpp server** — CPU-friendly, GGUF, efficient.

---

## 5) VM 5703 — Agent worker

- Calls inference on 5702 over the internal network.
- Calls MCP server (5701) for DODO PMM tools and policy.
- Logs actions and tool calls to 5704 (or to an existing state VM if you prefer).

---

## 6) DODO PMM “agent” — read-only vs execution

| Mode | Description |
|------|-------------|
| **Read-only first (recommended)** | MCP server exposes state + quotes only; agent produces *proposed* actions. No signing or on-chain tx from the agent. |
| **Execution enabled** | MCP includes signing + on-chain tx submission behind policy caps (rate limits, slippage caps, allowlisted pools). |

Recommendation: start **read-only**; enable execution once monitoring and policy are stable.

---

## Design decisions (before DODO implementation)

These govern how the MCP hub implements tools and policy. Risk scoring and constraints should exist in **read-only** mode so the agent learns under the same rules it will later execute with.

### 1) Policy architecture (enforced before any write tool)

The MCP hub must enforce, in order:

- **Pool allowlist** — contract addresses (hardcoded or DB-driven); only allowlisted pools are exposed.
- **Max slippage** — e.g. 0.5–1%.
- **Max single-tx notional**
- **Max daily notional**
- **Cooldown window** per pool (minimum time between liquidity moves).
- **Oracle deviation threshold** — circuit breaker when oracle vs mid price exceeds limit.
- **Gas cap**
- **Circuit breaker state** — persisted in Redis or Postgres; can pause all write tools.

These checks run **before** any write tool; in read-only mode, risk scoring uses the same constraints for proposed actions and drift analysis.
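
The ordering above can be sketched as a chain of checks that reports the first violation. This is a minimal illustration, not the submodule's implementation: the `Limits` and `Action` classes, their field names, and the example thresholds are all assumptions chosen for readability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limits:
    # Field names and default values are illustrative assumptions.
    allowlist: frozenset
    max_slippage_bps: int = 50            # 0.5%
    max_tx_notional: float = 1_000.0
    max_daily_notional: float = 5_000.0
    cooldown_s: int = 600
    max_oracle_dev_bps: int = 100
    max_gas_gwei: float = 5.0


@dataclass(frozen=True)
class Action:
    pool: str
    notional: float
    slippage_bps: int
    gas_gwei: float
    mid_price: float
    oracle_price: float
    spent_today: float = 0.0              # notional already moved today
    seconds_since_last: float = float("inf")


def first_violation(a: Action, lim: Limits, breaker_tripped: bool = False) -> Optional[str]:
    """Return the name of the first failed check, or None if every check passes."""
    oracle_dev_bps = abs(a.mid_price - a.oracle_price) / a.oracle_price * 10_000
    checks = [
        ("pool_allowlist", a.pool in lim.allowlist),
        ("max_slippage", a.slippage_bps <= lim.max_slippage_bps),
        ("max_tx_notional", a.notional <= lim.max_tx_notional),
        ("max_daily_notional", a.spent_today + a.notional <= lim.max_daily_notional),
        ("cooldown", a.seconds_since_last >= lim.cooldown_s),
        ("oracle_deviation", oracle_dev_bps <= lim.max_oracle_dev_bps),
        ("gas_cap", a.gas_gwei <= lim.max_gas_gwei),
        ("circuit_breaker", not breaker_tripped),
    ]
    for name, ok in checks:
        if not ok:
            return name
    return None
```

In read-only mode the same function can score proposed actions: a non-`None` result is logged as the reason a proposal would have been rejected.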

### 2) Data model for `dodo.get_pool_state`

The agent must **never** see raw contract structs. MCP normalizes pool state into a single schema, e.g.:

```json
{
  "pool_address": "...",
  "chain": "arbitrum",
  "base_token": "...",
  "quote_token": "...",
  "mid_price": 1.2345,
  "oracle_price": 1.2288,
  "inventory_ratio": 0.62,
  "liquidity_base": 1250000,
  "liquidity_quote": 980000,
  "k": 0.8,
  "fee_rate": 0.003,
  "timestamp": 1700000000
}
```

All MCP tool responses (state, quotes, risk) use this or an extended version; no raw ABI structs are exposed to the agent.
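
One way to hold this schema in code is a frozen dataclass that rejects payloads with missing fields. The sketch below is an assumption about how the hub might validate its own output, not the submodule's actual model; the class name `PoolState` and the `from_payload` helper are hypothetical. Extra keys are ignored so that extended versions of the schema still load.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PoolState:
    pool_address: str
    chain: str
    base_token: str
    quote_token: str
    mid_price: float
    oracle_price: float
    inventory_ratio: float
    liquidity_base: float
    liquidity_quote: float
    k: float
    fee_rate: float
    timestamp: int

    @classmethod
    def from_payload(cls, payload: dict) -> "PoolState":
        """Build a PoolState from a dict; fail loudly if a required field is absent."""
        expected = [f.name for f in fields(cls)]
        missing = [name for name in expected if name not in payload]
        if missing:
            raise ValueError(f"missing fields: {missing}")
        # Keep only the known fields; extended payloads may carry more.
        return cls(**{name: payload[name] for name in expected})
```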

### 3) Activation sequence

- **Phase 1:** Docker Compose + read-only first — build risk scoring, simulate proposed actions, record agent decision drift, evaluate behavior with zero capital exposure.
- **Phase 2:** Set `ALLOW_WRITE=true`, enable signed tx submission, keep caps very low at first.


---

## Repo search commands (VMID / buffer usage)

From repo root, to find any remaining references to the old buffer name or AI/Agents start:

```bash
# Fast search
grep -RIn --exclude-dir=.git "VMID_BUFFER_START" .

# Variants
grep -RIn --exclude-dir=.git "BUFFER_START\|VMID_AI_AGENTS_START" .
```

If scripts still use `VMID_BUFFER_START`, keep the alias in config and migrate to `VMID_AI_AGENTS_START` over time.
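
If any tooling reads these variables from Python, the alias can be honoured with a small fallback. This is a sketch under assumptions: the function name is hypothetical, the variables are assumed to hold integer VMIDs, and the default of 5700 is taken from the band stated at the top of this document.

```python
import os
from typing import Mapping, Optional


def resolve_ai_agents_start(env: Optional[Mapping[str, str]] = None, default: int = 5700) -> int:
    """Prefer VMID_AI_AGENTS_START; fall back to the legacy VMID_BUFFER_START alias."""
    env = os.environ if env is None else env
    for name in ("VMID_AI_AGENTS_START", "VMID_BUFFER_START"):
        value = env.get(name)
        if value:
            return int(value)
    return default
```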

---

## Appendix A — Common prerequisites (all 57xx VMs)

Assumes Ubuntu/Debian, Docker Engine + Compose plugin. Internal network ACLs enforce 57xx-only reachability per the Ports table above. In Appendices C–E, `5701`/`5702`/`5704` are used as hostnames in URLs; replace with your VM hostnames or IPs if your network does not resolve numeric hostnames.

```bash
sudo apt update
sudo apt install -y ca-certificates curl gnupg ufw

# Docker
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker $USER
# Log out/in once, or:
newgrp docker

# Standard directories
sudo mkdir -p /opt/ai/{mcp,inference,agent,state}/{config,data,logs}
sudo chown -R $USER:$USER /opt/ai
```

Health check helpers:

```bash
docker ps
docker logs --tail=100 <container>
curl -fsS http://127.0.0.1:3000/health || true
```

---

## Appendix B — VM 5704 (Memory/State): Postgres + Redis (+ optional Qdrant)

Create: `/opt/ai/state/docker-compose.yml`

```yaml
services:
  postgres:
    image: postgres:16
    container_name: ai-state-postgres
    environment:
      POSTGRES_DB: ai
      POSTGRES_USER: ai
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - /opt/ai/state/data/postgres:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ai -d ai"]
      interval: 10s
      timeout: 5s
      retries: 10

  redis:
    image: redis:7
    container_name: ai-state-redis
    command: ["redis-server", "--appendonly", "yes", "--save", "60", "1"]
    volumes:
      - /opt/ai/state/data/redis:/data
    ports:
      - "6379:6379"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "PING"]
      interval: 10s
      timeout: 3s
      retries: 10

  # Optional vector DB (uncomment if needed)
  # qdrant:
  #   image: qdrant/qdrant:latest
  #   container_name: ai-state-qdrant
  #   volumes:
  #     - /opt/ai/state/data/qdrant:/qdrant/storage
  #   ports:
  #     - "6333:6333"
  #   restart: unless-stopped
```

Create: `/opt/ai/state/.env`

```bash
POSTGRES_PASSWORD=change_me_strong
```

Run:

```bash
cd /opt/ai/state
docker compose up -d
docker compose ps
```

---

## Appendix C — VM 5701 (MCP Hub): MCP server + policy stub

Provides: `/health` endpoint; place to implement MCP tools (`dodo.get_pool_state`, `dodo.risk_check`, etc.); **policy gate** stub (deny-by-default for write actions until enabled).

### 1) MCP server skeleton

`/opt/ai/mcp/config/server.py`

```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import os

app = FastAPI(title="AI MCP Hub", version="1.0")

ALLOW_WRITE = os.getenv("ALLOW_WRITE", "false").lower() == "true"

class ToolCall(BaseModel):
    tool: str
    params: dict = {}

@app.get("/health")
def health():
    return {"ok": True}

@app.post("/mcp/call")
def mcp_call(call: ToolCall):
    # Hard gate: deny all write tools unless explicitly enabled
    write_tools_prefixes = ("dodo.add_", "dodo.remove_", "dodo.execute_", "dodo.rebalance_")
    if call.tool.startswith(write_tools_prefixes) and not ALLOW_WRITE:
        raise HTTPException(status_code=403, detail="Write tools disabled (read-only mode).")

    # TODO: implement tool routing:
    # - dodo.get_pool_state
    # - dodo.quote_add_liquidity
    # - dodo.risk_check
    # - etc.
    return {"tool": call.tool, "result": {"stub": True, "params": call.params}}
```

### 2) Compose for MCP Hub

`/opt/ai/mcp/docker-compose.yml`

```yaml
services:
  mcp:
    image: python:3.11-slim
    container_name: ai-mcp-prod
    working_dir: /app
    volumes:
      - /opt/ai/mcp/config:/app
      - /opt/ai/mcp/logs:/logs
    environment:
      ALLOW_WRITE: "false"
      # Add RPC URLs and chain config when wiring DODO:
      # CHAIN: "arbitrum"
      # RPC_URL: "http://..."
    command: >
      sh -lc "pip install --no-cache-dir fastapi uvicorn pydantic &&
              uvicorn server:app --host 0.0.0.0 --port 3000"
    ports:
      - "3000:3000"
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://127.0.0.1:3000/health').read()\""]
      interval: 10s
      timeout: 5s
      retries: 10
```

Run:

```bash
cd /opt/ai/mcp
docker compose up -d
curl -fsS http://127.0.0.1:3000/health
```

**When you enable execution:** set `ALLOW_WRITE=true` **only after** policy checks, allowlist, and limits are implemented (see Hardening checklist).

---

## Appendix D — VM 5702 (Inference): TGI or llama.cpp (CPU-friendly)

### Option 1: llama.cpp server (CPU default)

`/opt/ai/inference/docker-compose.yml`

```yaml
services:
  llama:
    image: ghcr.io/ggerganov/llama.cpp:server
    container_name: ai-inf-prod
    volumes:
      - /opt/ai/inference/data/models:/models
    command: >
      -m /models/model.gguf
      --host 0.0.0.0 --port 8000
      --n-gpu-layers 0
      --ctx-size 4096
    ports:
      - "8000:8000"
    restart: unless-stopped
```

Put your GGUF at: `/opt/ai/inference/data/models/model.gguf`

Run:

```bash
cd /opt/ai/inference
docker compose up -d
```

### Option 2: TGI

Use if you have GPU and want HF-native serving; omit if CPU-only.

---

## Appendix E — VM 5703 (Agent Worker): calls inference + MCP + logs to state

`/opt/ai/agent/config/agent.py`

```python
import os
import time
from datetime import datetime, timezone

import requests

MCP_URL = os.getenv("MCP_URL", "http://5701:3000/mcp/call")
# Reserved for inference calls (unused in this skeleton); port 8000 matches Appendix D.
INF_URL = os.getenv("INF_URL", "http://5702:8000")
MODE = os.getenv("MODE", "read-only")

def call_mcp(tool, params=None):
    """POST a tool call to the MCP hub and return the decoded JSON response."""
    r = requests.post(MCP_URL, json={"tool": tool, "params": params or {}}, timeout=30)
    r.raise_for_status()
    return r.json()

def main():
    print(f"[{datetime.now(timezone.utc).isoformat()}] agent starting; MODE={MODE}")
    while True:
        try:
            state = call_mcp("dodo.get_pool_state", {"pool": "POOL_ADDRESS_HERE"})
            risk = call_mcp("dodo.risk_check", {"state": state})
            print("state:", state)
            print("risk:", risk)
        except Exception as e:
            print("error:", e)
        time.sleep(30)

if __name__ == "__main__":
    main()
```

`/opt/ai/agent/docker-compose.yml`

```yaml
services:
  agent:
    image: python:3.11-slim
    container_name: ai-agent-prod
    working_dir: /app
    volumes:
      - /opt/ai/agent/config:/app
      - /opt/ai/agent/logs:/logs
    environment:
      MCP_URL: "http://5701:3000/mcp/call"
      INF_URL: "http://5702:8000"
      MODE: "read-only"
      # PG_DSN: "postgresql://ai:...@5704:5432/ai"
      # REDIS_URL: "redis://5704:6379/0"
    # -u keeps stdout unbuffered so `docker logs -f` shows output promptly.
    command: >
      sh -lc "pip install --no-cache-dir requests &&
              python -u agent.py"
    restart: unless-stopped
```

Run:

```bash
cd /opt/ai/agent
docker compose up -d
docker logs -f ai-agent-prod
```

---

## Appendix F — 5701 real MCP implementation (read-only + risk scoring)

The **canonical implementation** lives in the **ai-mcp-pmm-controller** submodule (repo root):

- **config/allowlist.json** — Single source of truth: chain, pool addresses, base/quote tokens, profile name, limits (slippage bps, notional caps, cooldown, oracle deviation, gas cap).
- **config/pool_profiles.json** — Maps profile names (e.g. `dodo_pmm_v2_like`) to contract method names so the same server code works across DODO variants.
- **config/abis/erc20.json** — Minimal ERC20 ABI for symbol/decimals/balance.
- **config/server.py** — FastAPI MCP hub: `dodo.get_pool_state`, `dodo.risk_check`, `dodo.simulate_action`; write-tool gate (**both** `ALLOW_WRITE` and `EXECUTION_ARMED`); normalized pool state (no raw structs).
- **docker-compose.yml** — Runs server with `web3`, env: `ALLOW_WRITE`, `EXECUTION_ARMED`, `CHAIN`, `RPC_URL`, paths to allowlist/profiles/ERC20 ABI; optional `--env-file .env`; non-root `user: "1000:1000"`.

**Deploy on VM 5701:** Clone the proxmox repo with `--recurse-submodules` (or copy `ai-mcp-pmm-controller/` to the VM). Edit `config/allowlist.json` (replace `0xPOOL_ADDRESS_HERE` and the base/quote tokens) and set `RPC_URL` in compose or `.env`. Then, from the submodule root, run `mkdir -p logs && docker compose up -d`. For the `/opt/ai/mcp` layout, copy `config/` to `/opt/ai/mcp/config` and use the submodule’s compose file (mount `config` as `/app`).

**Chain-agnostic:** Switch chain by changing `CHAIN` and `RPC_URL` and updating the allowlist; adjust `pool_profiles.json` if a pool uses different method names. To fill `inventory_ratio` / `liquidity_base` / `liquidity_quote`, add the correct DODO pool ABI/profile for your target pool (see Design decisions above).

**Execution gates:** Write tools require **both** `ALLOW_WRITE=true` and `EXECUTION_ARMED=true` (both default to false), so a single flag flip cannot enable execution by accident.
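
The two-flag gate can be expressed as a pure function. This is a minimal sketch rather than the submodule's code: the function names are hypothetical, and the write-tool prefixes are the ones used in the Appendix C skeleton.

```python
import os
from typing import Mapping, Optional

# Prefixes copied from the Appendix C skeleton.
WRITE_PREFIXES = ("dodo.add_", "dodo.remove_", "dodo.execute_", "dodo.rebalance_")


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Treat only the literal string 'true' (any case) as enabled."""
    return env.get(name, "false").strip().lower() == "true"


def write_allowed(tool: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Read tools always pass; write tools need both gates set to 'true'."""
    env = os.environ if env is None else env
    if not tool.startswith(WRITE_PREFIXES):
        return True
    return _flag(env, "ALLOW_WRITE") and _flag(env, "EXECUTION_ARMED")
```

Passing the environment as a mapping keeps the gate testable without touching the process environment.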

**Production hygiene (submodule):** `.gitignore` excludes `.env`, `config/allowlist.local.json`, `*.log`, `logs/`. Use `.env` for RPC_URL/CHAIN/gates (don’t bake secrets in git). Container runs as `user: "1000:1000"`; ensure mounted dirs are writable by that UID.

### Push + pin (no surprises)

1. **Submodule:** From workstation, `cd ai-mcp-pmm-controller && git push origin main`.
2. **Parent:** `git add .gitmodules ai-mcp-pmm-controller docs/... && git commit -m "..." && git push`. Sanity check: parent commit should show the submodule as a **gitlink** (SHA pointer), not a large file add.
3. **VM 5701:** `git clone --recurse-submodules <PROXMOX_REPO_URL> /opt/proxmox`, then deploy from `ai-mcp-pmm-controller/` per submodule README.

### Interface discovery and liquidity fields

- **`dodo.identify_pool_interface`:** Implemented. Read-only tool; does not require the pool to be in the allowlist. Probes candidate getters (getMidPrice, getOraclePrice, getBaseReserve, _BASE_BALANCE_, getPMMState, etc.) and returns `detected_profile` (e.g. `dodo_pmm_v2_like` or `unknown`), `functions_found`, and `notes`. Use it with any pool address to choose the right ABI/profile before adding to allowlist.
- **To complete the liquidity fields without trial and error:** Start from **one Arbitrum pool contract address** (or a DODO type hint: DPP/DSP/DVM/PMM V2). From that, derive the minimal ABI for reserves/state, the `pool_profiles.json` additions, and the `server.py` changes that populate `liquidity_base`, `liquidity_quote`, and `inventory_ratio`.

### Validate MCP hub and run interface discovery (before you have an allowlisted pool)

**On VM 5701:**

```bash
cd /opt/proxmox/ai-mcp-pmm-controller
docker compose --env-file .env up -d
curl -fsS http://127.0.0.1:3000/health
```

**Run interface discovery** (from 5701 or from 5703 calling MCP) once you have any candidate pool address:

```bash
curl -sS http://127.0.0.1:3000/mcp/call \
  -H 'content-type: application/json' \
  -d '{"tool":"dodo.identify_pool_interface","params":{"pool":"0xPOOL"}}' | jq
```

- `functions_found` → which getters exist on that contract  
- `notes` → which reserve/state methods are missing  
- `detected_profile` → whether `dodo_pmm_v2_like` fits or you need a new profile  

### Inventory ratio convention

Standardized so it’s comparable across pool types (unless the pool exposes a canonical ratio):

- `base_value = base_reserve * mid_price`
- `quote_value = quote_reserve` (in quote units)
- **`inventory_ratio = base_value / (base_value + quote_value)`**

Used consistently in `dodo.get_pool_state` and for policy thresholds.
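
As a worked example of the convention, the function below computes the ratio directly from the three inputs. The function name and the error raised for an empty pool are assumptions; the arithmetic follows the definitions above.

```python
def inventory_ratio(base_reserve: float, quote_reserve: float, mid_price: float) -> float:
    """Share of pool value held in the base token, measured in quote units."""
    base_value = base_reserve * mid_price
    quote_value = quote_reserve
    total = base_value + quote_value
    if total <= 0:
        # Assumption: an empty pool has no meaningful ratio.
        raise ValueError("pool has no value; ratio is undefined")
    return base_value / total


# 100 base tokens at a mid price of 2.0 are worth 200 quote units,
# matching the 200 quote tokens held, so the pool is balanced.
print(inventory_ratio(100.0, 200.0, 2.0))  # 0.5
```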

### Optional Redis state (circuit breaker + cooldown)

When `REDIS_URL` is set, use this key schema; if unset, degrade to stateless mode.

| Key | Value (example) | Purpose |
|-----|-----------------|---------|
| `cb:<chain>:<pool>` | `{ "tripped": true, "reason": "...", "ts": 170... }` | Circuit breaker state |
| `cooldown:<chain>:<pool>` | Unix timestamp of next allowed action time | Cooldown window |

Wire into `dodo.risk_check` and (later) write tools. Implementation: optional Redis client; if `REDIS_URL` missing, skip reads/writes and keep behavior stateless.
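
A sketch of the key schema with the stateless fallback follows. The helper names are hypothetical, and `client` is assumed to be any object with a `get(key)` method that returns a string, bytes, or `None` (for example a redis-py client built with `redis.Redis.from_url(REDIS_URL)`); pass `None` when `REDIS_URL` is unset.

```python
import json
import time
from typing import Any, Optional


def cb_key(chain: str, pool: str) -> str:
    return f"cb:{chain}:{pool}"


def cooldown_key(chain: str, pool: str) -> str:
    return f"cooldown:{chain}:{pool}"


def breaker_tripped(client: Optional[Any], chain: str, pool: str) -> bool:
    """True if a tripped breaker is stored; always False in stateless mode."""
    if client is None:
        return False
    raw = client.get(cb_key(chain, pool))
    if not raw:
        return False
    return bool(json.loads(raw).get("tripped", False))


def cooldown_remaining(client: Optional[Any], chain: str, pool: str,
                       now: Optional[float] = None) -> float:
    """Seconds until the next allowed action; 0.0 if none is stored or stateless."""
    if client is None:
        return 0.0
    raw = client.get(cooldown_key(chain, pool))
    if not raw:
        return 0.0
    now = time.time() if now is None else now
    return max(0.0, float(raw) - now)
```

Note that stateless mode fails open: with no Redis client, neither the breaker nor the cooldown can block an action, so the other policy limits remain the only guard.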

---

## Network ACL notes (allowlist by VMID)

At minimum:

- **5703 → 5702:8000** (agent → inference)
- **5703 → 5701:3000** (agent → MCP)
- **5701, 5703 → 5704:5432, 6379** (MCP/agent → state)
- Deny everything else by default.
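
The rules above can be kept as data and checked with a default-deny lookup, which is handy for reviewing a firewall change before applying it. This is an illustrative sketch; the names `ALLOWED_FLOWS` and `is_allowed` are hypothetical, and actual enforcement still belongs in the Proxmox firewall or host ACLs.

```python
# (source VMID, destination VMID, TCP port), taken from the list above.
ALLOWED_FLOWS = {
    (5703, 5702, 8000),  # agent -> inference
    (5703, 5701, 3000),  # agent -> MCP
    (5701, 5704, 5432),  # MCP -> Postgres
    (5701, 5704, 6379),  # MCP -> Redis
    (5703, 5704, 5432),  # agent -> Postgres
    (5703, 5704, 6379),  # agent -> Redis
}


def is_allowed(src: int, dst: int, port: int) -> bool:
    """Default deny: only flows listed in ALLOWED_FLOWS pass."""
    return (src, dst, port) in ALLOWED_FLOWS
```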

---

## Healthcheck commands (paste-ready)

**MCP:**

```bash
curl -fsS http://5701:3000/health
```

**Inference:**

```bash
curl -fsS http://5702:8000/health || true
```

(Recent llama.cpp server builds expose `/health`; on an older build that lacks it, send a request to the root path instead.)

**State:**

```bash
pg_isready -h 5704 -U ai -d ai
redis-cli -h 5704 ping
```

---

**Owner:** Architecture  
**Review cadence:** Quarterly or upon new VMID band creation  
**Change control:** PR required; update Last Updated (and Version, if one is tracked)