Storage Recommendations by Fill Rate and Growth
Last updated: 2026-02-28
Based on current usage, history in logs/storage-growth/history.csv, and physical drive layout across ml110, r630-01, and r630-02.
Completed (2026-02-28): Storage growth cron verified; prune (VMID 5000 + r630-01 CTs) run; ml110 sdb added to VG pve and data thin pool extended to ~1.7 TB (ml110 data now ~11% used). Phase 1 migration (r630-01 data → thin1): 8 CTs migrated (10233, 10120, 10100, 10101, 10235, 10236, 7804, 8640); r630-01 data 65.8% (was 72%), thin1 50.6%.
1. Thresholds and monitoring
| Level | Use % | Action |
|---|---|---|
| Healthy | < 75% | Continue normal collection; review quarterly. |
| Watch | 75–84% | Weekly review; plan prune or migration. |
| WARN | 85–94% | Prune and/or migrate within 1–2 weeks; do not add new large CTs. |
| CRIT | ≥ 95% | Immediate action; LVM thin pools can fail or go read-only. |
Current scripts: `check-disk-all-vmids.sh` uses WARN at 85% and CRIT at 95% for container root usage. The same thresholds apply to host storage (pvesm / LVM) as well.
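As a minimal sketch of the threshold logic (the real checks live in the scripts named here, which may use slightly different cutoffs), the table maps to shell logic like:

```shell
# Classify a pool's use% into the levels from the table above.
classify_usage() {
  pct=$1
  if   [ "$pct" -ge 95 ]; then echo "CRIT"
  elif [ "$pct" -ge 85 ]; then echo "WARN"
  elif [ "$pct" -ge 75 ]; then echo "Watch"
  else echo "Healthy"
  fi
}

classify_usage 66   # prints Healthy
classify_usage 88   # prints WARN
```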
2. Observed fill behavior (from history)
| Host | Storage | Trend (recent) | Implied rate / note |
|---|---|---|---|
| ml110 | data | ~28.7% → ~25% (Feb 15 → 27) | Slight decrease (prune/dedup). Plenty of free space. |
| r630-01 | data | 88% → 100% → 72% → 65.8% (Phase 1 migration) | After Phase 1 (8 CTs data→thin1). Main growth host (validators, RPCs, many CTs). |
| r630-02 | thin1-r630-02 | ~26.5% stable | Low growth. |
| r630-02 | thin2 | ~4.8% → ~9% after 5000 migration | Now holds Blockscout (5000); monitor. |
| r630-02 | thin5 | Was 84.6% → 0% after migration | Empty; available for future moves. |
Conclusion: The pool that fills fastest and needs the most attention is r630-01 data (65.8% after the Phase 1 migration; many CTs, Besu/DB growth). ml110 data is stable and has headroom. r630-02 is manageable if you avoid concentrating more large CTs on a single thin pool.
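To turn the history into a number, a small helper can report the percentage-point change for one pool. This is a sketch only: the column order assumed below (timestamp,host,storage,use_pct) is an assumption, so adjust the awk field numbers to whatever `collect-storage-growth-data.sh` actually writes to history.csv.

```shell
# Hypothetical: percentage-point change between first and last sample
# for one host/pool pair in a storage-growth CSV.
rate_for_pool() {
  host=$1; pool=$2; file=$3
  awk -F, -v h="$host" -v p="$pool" '
    $2 == h && $3 == p { if (n == 0) first = $4; last = $4; n++ }
    END { if (n > 1) printf "%+.1f\n", last - first }
  ' "$file"
}
```

Divide the result by the number of days between the first and last sample to get a per-day rate.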
3. Recommendations by host and pool
ml110

- data / local-lvm (~11% after the 2026-02-28 sdb extension; was ~25%)
  - Rate: Low/slow.
  - Recommendations:
    - Keep running `collect-storage-growth-data.sh --append` (e.g. cron every 6h).
    - Prune logs in CTs periodically (e.g. with `fix-storage-r630-01-and-thin5.sh`-style logic adapted for ml110, or a dedicated prune script).
    - No urgency; review again when approaching 70%.
- sdb (931G): added to VG `pve` on 2026-02-28; `data` thin pool now ~1.7 TB
  - Original recommendation (kept for reference): use sdb before adding new disks elsewhere.
    - Option A (chosen): Add sdb to VG `pve` and extend the `data` thin pool (or create a second thin pool). Frees pressure on sda and roughly doubles effective data capacity.
    - Option B: Create a separate VG + thin pool on sdb for new or migrated CTs.
  - Document the chosen layout and any new Proxmox storage names in `storage.cfg` and in `PHYSICAL_DRIVES_AND_CONFIG.md`.
r630-01

- data / local-lvm (~66% after Phase 1; was ~72%)
  - Rate: Highest risk; this pool has the most CTs and Besu/DB growth.
  - Recommendations:
    - Short term:
      - Run log/journal prune on all r630-01 CTs regularly (e.g. `fix-storage-r630-01-and-thin5.sh` Phase 2, or a cron job).
      - Keep the storage growth collection running (e.g. every 6h) and review weekly while > 70%.
    - Before 85%:
      - Move one or more large CTs to thin1 on r630-01 (thin1 ~51% used, still has space) if VMIDs allow, or plan migration to r630-02 thin pools.
      - Identify the biggest CTs with `check-disk-all-vmids.sh` and `lvs` on r630-01 (data pool).
    - Before 90%:
      - Decide on expansion (e.g. add disks to the RAID10 array and extend md0/LVM) or permanent migration of several CTs to r630-02.
  - Do not let this pool sit above 85% for long; it has already hit 100% once.
- thin1 (~51% after Phase 1; was ~43%)
  - Rate: Moderate.
  - Recommendations: Use as spillover for data-pool migrations when possible. Monitor monthly; act if > 75%.
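To find migration candidates, the per-CT view from `check-disk-all-vmids.sh` can be cross-checked against the thin volumes themselves. A sketch (run as root on r630-01; `data_percent` and `pool_lv` are standard lvs report fields, but verify the VG/pool naming there first):

```shell
# List thin volumes backed by the "data" pool, fullest first.
lvs --noheadings -o lv_name,data_percent,lv_size -S 'pool_lv=data' \
  | sort -k2 -rn
```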
r630-02

- thin1-r630-02 (~26%)
  - Rate: Low.
  - Recommendation: Monitor; no change needed unless you add many CTs here.
- thin2 (~9% after the 5000 migration)
  - Rate: May grow with Blockscout (5000) and other CTs.
  - Recommendations:
    - Run the VMID 5000 prune periodically: `vmid5000-free-disk-and-logs.sh`.
    - If thin2 approaches 75%, consider moving one CT to thin5 (now empty) or thin6.
- thin3, thin4, thin6 (roughly 11–22%)
  - Rate: Low to moderate.
  - Recommendation: Include in the weekly pvesm/lvs review; no special action unless one pool trends > 75%.
- thin5 (0% after migration)
  - Recommendation: Keep as reserve for migrations from thin2 or other pools when they approach WARN.
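If a CT does have to move off thin2, Proxmox can do it directly. A hypothetical example for CT 5000 with target thin5 (`pct move-volume` is the current spelling; older Proxmox VE releases use `pct move_volume`, and you should schedule the downtime per your policy):

```shell
# Move the rootfs of CT 5000 to thin5, deleting the source volume
# afterwards. The CT must be stopped for the move.
pct shutdown 5000
pct move-volume 5000 rootfs thin5 --delete 1
pct start 5000
```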
4. Operational schedule (by fill rate)
| When | Action |
|---|---|
| Always | Cron: collect-storage-growth-data.sh --append every 6h; weekly: prune-storage-snapshots.sh (e.g. Sun 08:00). |
| Weekly | Review pvesm status and lvs (or run audit-proxmox-rpc-storage.sh); check any pool > 70%. |
| 75% ≤ use < 85% | Plan and run prune; plan migration for largest CTs on that pool; consider using ml110 sdb (if not yet in use). |
| 85% ≤ use < 95% | Execute prune and migration within 1–2 weeks; do not add new large VMs/CTs to that pool. |
| ≥ 95% | Immediate prune + migration; consider emergency migration to ml110 (sdb capacity now added) or r630-02. |
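The "Always" row corresponds to crontab entries along these lines (paths are illustrative assumptions; `schedule-storage-growth-cron.sh --install` is the supported way to set them up):

```cron
# Every 6 hours: record storage fill. Sundays 08:00: prune snapshots.
0 */6 * * *  /path/to/scripts/monitoring/collect-storage-growth-data.sh --append
0 8 * * 0    /path/to/scripts/maintenance/prune-storage-snapshots.sh
```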
5. Scripts to support these recommendations
| Script | Purpose |
|---|---|
| `scripts/monitoring/collect-storage-growth-data.sh --append` | Record fill over time (for rate). |
| `scripts/maintenance/schedule-storage-growth-cron.sh --install` | Install the 6h collect + weekly prune crons. |
| `scripts/audit-proxmox-rpc-storage.sh` | Current pvesm + RPC rootfs mapping. |
| `scripts/maintenance/check-disk-all-vmids.sh` | Per-CT disk usage (find big consumers). |
| `scripts/maintenance/fix-storage-r630-01-and-thin5.sh` | Prune 5000 + r630-01 CT logs; optional migrate of 5000. |
| `scripts/maintenance/migrate-ct-r630-01-data-to-thin1.sh <VMID>` | Migrate one CT from r630-01 data → thin1 (same host). |
| `scripts/maintenance/vmid5000-free-disk-and-logs.sh` | Prune Blockscout (5000) only. |
6. Adding ml110 sdb to increase capacity (completed 2026-02-28; steps kept for reference)
- On ml110: `vgextend pve /dev/sdb` (if sdb is already a PV) or `pvcreate /dev/sdb && vgextend pve /dev/sdb`.
- Extend the data thin pool: `lvextend -L +900G /dev/pve/data` (or use `lvextend -l +100%FREE` and adjust as needed).
- Re-run `pvesm status` and update the documentation.
- No CT migration is required; existing LVs on data can use the new space.
(If sdb is a raw disk with no PV, partition it or use the full disk as a PV per your policy; then add it to pve and extend the data LV as above.)
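The steps above as one guarded sequence. This is a sketch assuming the ml110 layout (VG `pve`, thin pool `data`) and an empty sdb; it is destructive, so verify the disk before running:

```shell
set -euo pipefail
lsblk /dev/sdb                       # confirm this really is the empty 931G disk
pvcreate /dev/sdb                    # skip if sdb is already a PV
vgextend pve /dev/sdb
lvextend -l +100%FREE /dev/pve/data  # give all new space to the data thin pool
pvesm status                         # confirm Proxmox sees the new capacity
```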
7. Summary table by risk
| Host | Pool | Current (approx) | Risk | Priority recommendation |
|---|---|---|---|---|
| ml110 | data | ~11% (post-extension) | Low | Done: sdb added; pool ~1.7 TB. Monitor as before. |
| ml110 | sdb | In use (extended data) | — | Done: sdb added to pve, data thin pool extended (~1.7 TB total). |
| r630-01 | data | ~66% (post Phase 1) | High | Prune weekly; plan migrations before 85%; consider thin1 spillover. |
| r630-01 | thin1 | ~51% (post Phase 1) | Medium | Use for migrations from data; monitor monthly. |
| r630-02 | thin1-r630-02 | ~26% | Low | Monitor. |
| r630-02 | thin2 | ~9% | Low | Prune 5000 periodically; watch growth. |
| r630-02 | thin5 | 0% | Low | Keep as reserve for migrations. |
| r630-02 | thin3, thin4, thin6 | ~11–22% | Low | Include in weekly review. |
These recommendations are based on the rate of filling observed in history and current configurations; adjust thresholds or schedule if your growth pattern changes.