Vault Operations Guide
Last Updated: 2026-02-01
Document Version: 1.0
Status: Active Documentation
Created: 2026-01-19
Purpose: Day-to-day operations guide for Vault cluster
Quick Reference
Cluster Information
- Cluster Nodes: 3 (vault-phoenix-1, vault-phoenix-2, vault-phoenix-3)
- API Endpoints: http://192.168.11.200:8200 (container 8640), http://192.168.11.215:8200 (container 8641), http://192.168.11.202:8200 (container 8642)
- Storage: Raft (integrated)
- Seal Type: Shamir (5 keys, threshold 3)
Daily Operations
Health Checks
Run health check script:
./scripts/vault-health-check.sh
With cluster status:
VAULT_TOKEN=<root-token> ./scripts/vault-health-check.sh
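For a quick check without the script, each node's /v1/sys/health endpoint can be polled directly; the status-code mapping below (200 active, 429 standby, 501 uninitialized, 503 sealed) follows Vault's documented health API. A minimal sketch:

```shell
# Quick health sweep across all three nodes (no token required).
NODES="192.168.11.200 192.168.11.215 192.168.11.202"
for ip in $NODES; do
  code=$(curl -s -o /dev/null --connect-timeout 1 --max-time 2 \
    -w '%{http_code}' "http://${ip}:8200/v1/sys/health")
  case "$code" in
    200) state="active" ;;
    429) state="standby" ;;
    501) state="uninitialized" ;;
    503) state="sealed" ;;
    *)   state="unreachable (HTTP ${code})" ;;
  esac
  echo "${ip}: ${state}"
done
```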
Check Cluster Status
ssh root@192.168.11.11 "pct exec 8640 -- bash -c 'export VAULT_ADDR=http://127.0.0.1:8200 && export VAULT_TOKEN=<token> && vault operator raft list-peers'"
Check Node Status
# Node 1
ssh root@192.168.11.11 "pct exec 8640 -- vault status"
# Node 2
ssh root@192.168.11.12 "pct exec 8641 -- vault status"
# Node 3
ssh root@192.168.11.11 "pct exec 8642 -- vault status"
Backup Operations
Manual Backup
VAULT_TOKEN=<root-token> ./scripts/vault-backup.sh
Automated Backups
Add to crontab:
# Daily backup at 2 AM
0 2 * * * cd /home/intlc/projects/proxmox && VAULT_TOKEN=<token> ./scripts/vault-backup.sh
Restore from Backup
# On Vault node
export VAULT_ADDR=http://127.0.0.1:8200
export VAULT_TOKEN=<root-token>
vault operator raft snapshot restore /path/to/backup.snapshot
Unsealing Operations
Unseal a Node
# On the node
export VAULT_ADDR=http://127.0.0.1:8200
vault operator unseal <key-1>
vault operator unseal <key-2>
vault operator unseal <key-3>
Unseal All Nodes
# Node 1
ssh root@192.168.11.11 "pct exec 8640 -- bash -c 'export VAULT_ADDR=http://127.0.0.1:8200 && vault operator unseal <key-1> && vault operator unseal <key-2> && vault operator unseal <key-3>'"
# Node 2
ssh root@192.168.11.12 "pct exec 8641 -- bash -c 'export VAULT_ADDR=http://127.0.0.1:8200 && vault operator unseal <key-1> && vault operator unseal <key-2> && vault operator unseal <key-3>'"
# Node 3
ssh root@192.168.11.11 "pct exec 8642 -- bash -c 'export VAULT_ADDR=http://127.0.0.1:8200 && vault operator unseal <key-1> && vault operator unseal <key-2> && vault operator unseal <key-3>'"
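The three per-node commands above can be wrapped in a helper. A hypothetical sketch (host/CTID pairs taken from the commands above), not an official script:

```shell
# Unseal every node with the same three key shares.
# Usage: unseal_all <key-1> <key-2> <key-3>
unseal_all() {
  for pair in 192.168.11.11:8640 192.168.11.12:8641 192.168.11.11:8642; do
    host="${pair%%:*}"
    ctid="${pair##*:}"
    for key in "$1" "$2" "$3"; do
      ssh "root@${host}" \
        "pct exec ${ctid} -- bash -c 'export VAULT_ADDR=http://127.0.0.1:8200 && vault operator unseal ${key}'"
    done
  done
}
```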
Secret Management
Create/Update Secret
vault kv put secret/phoenix/database/postgres \
  username=phoenix \
  password=new_password \
  host=db.example.com \
  port=5432 \
  database=phoenix
Read Secret
vault kv get secret/phoenix/database/postgres
List Secrets
vault kv list secret/phoenix/
Delete Secret
vault kv delete secret/phoenix/old-secret
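If the secret/ mount is KV version 2 (an assumption; confirm with `vault secrets list -detailed`), deletes are soft and older versions stay recoverable until destroyed. Hypothetical examples, wrapped in a function:

```shell
# KV v2 version operations (paths reuse the examples above).
kv_version_ops() {
  vault kv get -version=1 secret/phoenix/database/postgres  # read an older version
  vault kv undelete -versions=1 secret/phoenix/old-secret   # recover a soft delete
  vault kv destroy -versions=1 secret/phoenix/old-secret    # permanently remove a version
}
```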
Policy Management
List Policies
vault policy list
Read Policy
vault policy read phoenix-api-policy
Update Policy
vault policy write phoenix-api-policy - <<EOF
# Updated policy content
path "secret/data/phoenix/api/*" {
  capabilities = ["read"]
}
EOF
AppRole Management
List AppRoles
vault list auth/approle/role
Get Role ID
vault read auth/approle/role/phoenix-api/role-id
Generate Secret ID
vault write -f auth/approle/role/phoenix-api/secret-id
Rotate Secret ID
# Generate new secret ID
NEW_SECRET_ID=$(vault write -field=secret_id -f auth/approle/role/phoenix-api/secret-id)
# Update service configuration with new secret ID
# Then delete old secret IDs if needed
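The "delete old secret IDs" step can be done through the secret-id accessor endpoints. A hypothetical helper (`prune_old_secret_ids` and its arguments are illustrative; assumes `jq` is installed and the token can manage AppRoles):

```shell
# Destroy every secret ID accessor for a role except the one to keep.
# Usage: prune_old_secret_ids phoenix-api "$NEW_ACCESSOR"
prune_old_secret_ids() {
  role="$1"
  keep="$2"   # accessor of the secret ID to retain
  vault list -format=json "auth/approle/role/${role}/secret-id" \
    | jq -r '.[]' \
    | while read -r accessor; do
        [ "$accessor" = "$keep" ] && continue
        vault write "auth/approle/role/${role}/secret-id-accessor/destroy" \
          secret_id_accessor="$accessor"
      done
}
```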
Monitoring
Enable Audit Logging
vault audit enable file file_path=/var/log/vault/audit.log
View Logs
# Service logs
ssh root@192.168.11.11 "pct exec 8640 -- journalctl -u vault -f"
# Audit logs
ssh root@192.168.11.11 "pct exec 8640 -- tail -f /var/log/vault/audit.log"
Metrics (if enabled)
curl http://192.168.11.200:8200/v1/sys/metrics?format=prometheus
Troubleshooting
Node Not Joining Cluster
- Check network connectivity:
ping 10.160.0.40
ping 10.160.0.41
ping 10.160.0.42
- Check Vault logs:
ssh root@192.168.11.11 "pct exec 8640 -- journalctl -u vault -n 50"
- Verify configuration:
ssh root@192.168.11.11 "pct exec 8640 -- cat /etc/vault.d/vault.hcl"
Service Won't Start
- Check service status:
ssh root@192.168.11.11 "pct exec 8640 -- systemctl status vault"
- Check configuration:
ssh root@192.168.11.11 "pct exec 8640 -- vault server -config=/etc/vault.d/vault.hcl -verify-only"
- Check logs:
ssh root@192.168.11.11 "pct exec 8640 -- journalctl -u vault -n 100"
Cluster Split-Brain
If the cluster loses quorum:
- Identify nodes with latest data
- Remove failed nodes from cluster:
vault operator raft remove-peer <node-id>
- Rejoin nodes:
# Nodes will auto-rejoin via retry_join configuration
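If quorum cannot be restored by removing peers, Vault's documented manual Raft recovery is to place a peers.json file in each surviving node's raft directory and restart Vault. A sketch; the raft directory path and the node IDs/cluster addresses below are assumptions to verify against vault.hcl (the addresses mirror the 10.160.0.x hosts pinged above, on Vault's default cluster port 8201):

```shell
# Stage a recovery peers.json; copy it into place on each surviving node.
RAFT_DIR="${RAFT_DIR:-/opt/vault/data/raft}"   # assumption: match vault.hcl storage path
cat > /tmp/peers.json <<'EOF'
[
  { "id": "vault-phoenix-1", "address": "10.160.0.40:8201", "non_voter": false },
  { "id": "vault-phoenix-2", "address": "10.160.0.41:8201", "non_voter": false },
  { "id": "vault-phoenix-3", "address": "10.160.0.42:8201", "non_voter": false }
]
EOF
# cp /tmp/peers.json "$RAFT_DIR/peers.json" && systemctl restart vault
```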
Maintenance
Restart Node
# Stop node
ssh root@192.168.11.11 "pct stop 8640"
# Start node
ssh root@192.168.11.11 "pct start 8640"
# Unseal after restart
ssh root@192.168.11.11 "pct exec 8640 -- bash -c 'export VAULT_ADDR=http://127.0.0.1:8200 && vault operator unseal <key-1> && vault operator unseal <key-2> && vault operator unseal <key-3>'"
Update Vault
- Backup cluster
- Update on one node at a time
- Restart node
- Unseal node
- Verify cluster health
- Repeat for other nodes
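The steps above, sketched for a single node (hypothetical helper; assumes Vault was installed from the HashiCorp apt repository):

```shell
# Upgrade one container's Vault package and restart the service.
# Usage: upgrade_node 192.168.11.12 8641   # upgrade standbys first, leader last
upgrade_node() {
  host="$1"
  ctid="$2"
  ssh "root@${host}" \
    "pct exec ${ctid} -- bash -c 'apt-get update && apt-get install -y --only-upgrade vault && systemctl restart vault'"
  # The node comes back sealed: unseal it, then confirm with 'vault status'
  # before moving on to the next node.
}
```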
Scale Cluster
To add a node:
- Create new container
- Install Vault
- Configure with same cluster settings
- Start Vault
- Node will auto-join via retry_join
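If the new node does not have retry_join configured yet, it can also be joined manually with `vault operator raft join`, run on the new node and pointed at any existing member. A sketch:

```shell
# Manually join a freshly installed node to the existing Raft cluster.
join_cluster() {
  export VAULT_ADDR=http://127.0.0.1:8200
  vault operator raft join http://192.168.11.200:8200
  # Then unseal the new node; it replicates data once it joins.
}
```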
Emergency Procedures
Complete Cluster Failure
- Restore from latest backup
- Initialize new cluster if needed
- Restore Raft snapshot
- Unseal all nodes
Lost Unseal Keys
If unseal keys are lost:
- Recovery keys apply only to auto-unseal configurations; this cluster uses Shamir (5 shares, threshold 3)
- If at least 3 shares survive, unseal the cluster and generate new shares with vault operator rekey
- If fewer than 3 shares remain, the cluster must be reinitialized (all data will be lost)
Data Corruption
- Stop affected node
- Restore from backup
- Restart node
- Verify data integrity
Related Documentation