Proxmox Storage Review - Complete Summary and Recommendations

Date: January 6, 2026
Review Scope: All Proxmox nodes and storage configurations
Status: Complete


Executive Summary

This document reviews storage on the three reachable Proxmox nodes (ml110, r630-01, and r630-02) and gives recommendations for optimization, capacity planning, and performance. Nodes r630-03 and r630-04 could not be reached and are not covered.

Key Findings

  • Total Containers: 51 containers across 3 accessible nodes
  • Critical Issues: 1 storage pool at 97.78% capacity (r630-02 thin1-r630-02)
  • Storage Distribution: Uneven - ml110 has 37 containers, others underutilized
  • Available Storage: ~2.5TB total available across the 3 accessible nodes
  • Unreachable Nodes: r630-03 and r630-04 (require investigation)

Current Storage Status by Node

ml110 (192.168.11.10) - Management Node

Status: Operational
Containers: 37
CPU: 6 cores @ 1.60GHz (older, slower)
Memory: 125GB (55GB used, 69GB available - 44% usage)

Storage Details

| Storage Name | Type | Status | Total | Used | Available | Usage % |
|---|---|---|---|---|---|---|
| local | dir | Active | 94GB | 7.5GB | 85.5GB | 8.02% |
| local-lvm | lvmthin | Active | 813GB | 227GB | 586GB | 27.92% |
| thin1-thin6 | lvmthin | Disabled | - | - | - | N/A |

Volume Group: pve - 930.51GB total, 16GB free ⚠️

Thin Pool: data - 794.30GB (27.92% used, 1.13% metadata)

Physical Disks:

  • sda: 931.5GB
  • sdb: 931.5GB

Issues Identified

  1. ⚠️ Low Volume Group Free Space: Only 16GB free in VG (1.7%)

    • Impact: Almost no room to grow the data thin pool or its metadata; new VMs/containers must fit within the pool's existing free space (586GB)
    • Recommendation: Expand VG or migrate VMs to other nodes
  2. ⚠️ Multiple Disabled Storage Pools: thin1-thin6 are disabled

    • Impact: Storage pools configured but not usable
    • Recommendation: Clean up unused storage definitions or enable if needed
  3. ⚠️ Overloaded Node: 37 containers on slower CPU

    • Impact: Performance degradation
    • Recommendation: Migrate containers to r630-01/r630-02
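
The 1.7% figure is the VG's free space divided by its size (16GB of 930.51GB). A minimal sketch for flagging volume groups that fall below the 10% free-space threshold used later in this review; it assumes input in the form produced by `vgs --noheadings --units g --nosuffix -o vg_name,vg_size,vg_free`, and the function name `vg_low_space` is made up for this example.

```shell
#!/bin/sh
# vg_low_space: read "name size free" lines (sizes in GB, no unit suffix)
# on stdin and print every VG whose free space is below the threshold.
# Hypothetical helper, not part of Proxmox or LVM.
vg_low_space() {
    threshold="${1:-10}"   # minimum acceptable free space, in percent
    awk -v t="$threshold" '
        NF >= 3 && $2 > 0 {
            pct = 100 * $3 / $2
            if (pct < t) printf "%s %.1f%% free\n", $1, pct
        }'
}

# On a node (assumed invocation):
#   vgs --noheadings --units g --nosuffix -o vg_name,vg_size,vg_free | vg_low_space 10
```

Fed ml110's figures (`pve 930.51 16.00`), the function reports the VG as low on space; r630-01's `pve` VG (57GB free of 465.77GB) stays above the threshold and is not listed.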

Recommendations

CRITICAL:

  1. Expand Volume Group - Add physical volumes or migrate VMs
  2. Monitor Storage Closely - Only 16GB free space remaining

HIGH PRIORITY:

  1. Migrate Containers - Move 15-20 containers to r630-01/r630-02
  2. Clean Up Storage Config - Remove or enable disabled storage pools

RECOMMENDED:

  1. Storage Monitoring - Set alerts at 80% usage
  2. Backup Strategy - Implement regular backups before migration

r630-01 (192.168.11.11) - Production Node

Status: Operational
Containers: 3
CPU: 32 cores @ 2.40GHz (excellent)
Memory: 503GB (7.5GB used, 496GB available - 1.5% usage)

Storage Details

| Storage Name | Type | Status | Total | Used | Available | Usage % |
|---|---|---|---|---|---|---|
| local | dir | Active | 536GB | 0.1GB | 536GB | 0.02% |
| local-lvm | lvmthin | Active | 200GB | 5.8GB | 194GB | 2.92% |
| thin1 | lvmthin | Active | 208GB | 0GB | 208GB | 0.00% |
| thin2-thin6 | lvmthin | Disabled | - | - | - | N/A |

Volume Group: pve - 465.77GB total, 57GB free

Thin Pools:

  • data: 200GB (2.92% used, 11.42% metadata)
  • thin1: 208GB (0.00% used, 10.43% metadata)

Physical Disks:

  • sda, sdb: 558.9GB each (boot drives)
  • sdc-sdh: 232.9GB each (6 data drives)

Issues Identified

  1. ⚠️ Disabled Storage Pools: thin2-thin6 are disabled

    • Impact: Additional storage not available
    • Recommendation: Enable if needed or remove from config
  2. Excellent Capacity: 57GB free in VG, about 402GB free across the data and thin1 pools (194GB + 208GB)

    • Status: Ready for VM deployment

Recommendations

HIGH PRIORITY:

  1. Enable Additional Storage - Enable thin2-thin6 if needed (or remove from config)
  2. Migrate VMs from ml110 - This node is ready for 15-20 containers

RECOMMENDED:

  1. Storage Optimization - Consider using thin1 for new deployments
  2. Performance Tuning - Optimize for high-performance workloads

r630-02 (192.168.11.12) - Production Node

Status: Operational
Containers: 11
CPU: 56 cores @ 2.00GHz (excellent - best CPU)
Memory: 251GB (16GB used, 235GB available - 6.4% usage)

Storage Details

| Storage Name | Type | Status | Total | Used | Available | Usage % |
|---|---|---|---|---|---|---|
| local | dir | Active | 220GB | 4.0GB | 216GB | 1.81% |
| local-lvm | lvmthin | Disabled | - | - | - | N/A |
| thin1 | lvmthin | ⚠️ Inactive | - | - | - | 0.00% |
| thin1-r630-02 | lvmthin | 🔴 CRITICAL | 226GB | 221GB | 5.0GB | 97.78% |
| thin2 | lvmthin | Active | 226GB | 0GB | 226GB | 0.00% |
| thin3 | lvmthin | Active | 226GB | 0GB | 226GB | 0.00% |
| thin4 | lvmthin | Active | 226GB | 28.7GB | 197GB | 12.69% |
| thin5 | lvmthin | Active | 226GB | 0GB | 226GB | 0.00% |
| thin6 | lvmthin | Active | 226GB | 0GB | 226GB | 0.00% |

Volume Groups:

  • thin1-thin6: Each 230.87GB with 0.12GB free

Thin Pools:

  • thin1-r630-02: 226.13GB (97.78% used, 3.84% metadata) 🔴 CRITICAL
  • thin4: 226.13GB (12.69% used, 1.15% metadata)
  • thin2, thin3, thin5, thin6: All empty (0.00% used)

Physical Disks:

  • sda-sdh: 232.9GB each (8 data drives)

Issues Identified

  1. 🔴 CRITICAL: Storage Nearly Full - thin1-r630-02 at 97.78% capacity

    • Impact: Cannot create new VMs/containers on this storage
    • Action Required: IMMEDIATE - Migrate VMs or expand storage
    • Available: Only 5GB free
  2. ⚠️ Inactive Storage: thin1 is inactive

    • Impact: Storage pool not usable
    • Recommendation: Activate or remove from config
  3. ⚠️ Disabled Storage: local-lvm is disabled

    • Impact: Standard storage name not available
    • Recommendation: Enable if volume group exists
  4. Excellent Capacity Available: thin2, thin3, thin5, thin6 are empty (904GB total)

Recommendations

CRITICAL (IMMEDIATE ACTION REQUIRED):

  1. Migrate VMs from thin1-r630-02 - Move VMs to thin2, thin3, thin5, or thin6
  2. Expand thin1-r630-02 - If migration not possible, expand the pool
  3. Monitor Closely - Set alerts for this storage pool

HIGH PRIORITY:

  1. Activate thin1 - Enable if needed or remove from config
  2. Enable local-lvm - If volume group exists, enable for standard naming
  3. Balance Storage Usage - Distribute VMs across thin2-thin6

RECOMMENDED:

  1. Storage Monitoring - Set up automated alerts
  2. Migration Plan - Document VM migration procedures

r630-03 (192.168.11.13) - Unknown Status

Status: Not Reachable
Action Required: Investigate connectivity issues

Recommendations

  1. Check Network Connectivity - Verify network connection
  2. Check Power Status - Verify node is powered on
  3. Check SSH Access - Verify SSH service is running
  4. Review Storage - Once accessible, perform full storage review

r630-04 (192.168.11.14) - Unknown Status

Status: Not Reachable
Action Required: Investigate connectivity issues

Recommendations

  1. Check Network Connectivity - Verify network connection
  2. Check Power Status - Verify node is powered on
  3. Check SSH Access - Verify SSH service is running
  4. Review Storage - Once accessible, perform full storage review
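
The first three checks for r630-03 and r630-04 can be run from any reachable machine. A minimal sketch, assuming ICMP is not filtered on the network and key-based root SSH is set up as it is for the other nodes; `check_node` is a hypothetical helper name. Power status cannot be checked this way and still needs console or out-of-band access.

```shell
#!/bin/sh
# check_node: classify a host as unreachable, reachable without SSH, or ok,
# using one ICMP probe and one non-interactive SSH login attempt.
# Hypothetical helper written for this review.
check_node() {
    host="$1"
    if ! ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
        echo "$host: unreachable (check power, cabling, switch port)"
        return 1
    fi
    # BatchMode prevents a password prompt from hanging the check
    if ! ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true >/dev/null 2>&1; then
        echo "$host: ping ok, ssh failed (check sshd and firewall)"
        return 2
    fi
    echo "$host: ok"
}

# Example:
#   for ip in 192.168.11.13 192.168.11.14; do check_node "$ip"; done
```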

Critical Issues Summary

🔴 CRITICAL - Immediate Action Required

  1. r630-02 thin1-r630-02 Storage at 97.78% Capacity

    • Impact: Cannot create new VMs/containers
    • Action: Migrate VMs to other storage pools (thin2-thin6 available)
    • Timeline: IMMEDIATE
  2. ml110 Volume Group Low on Space (16GB free)

    • Impact: Limited capacity for new VMs
    • Action: Migrate VMs to r630-01/r630-02 or expand storage
    • Timeline: Within 1 week

⚠️ HIGH PRIORITY

  1. Uneven Workload Distribution

    • ml110: 37 containers (overloaded)
    • r630-01: 3 containers (underutilized)
    • r630-02: 11 containers (underutilized)
    • Action: Migrate 15-20 containers from ml110 to r630-01/r630-02
  2. Disabled/Inactive Storage Pools

    • Multiple storage pools disabled across nodes
    • Action: Enable if needed or clean up storage.cfg
  3. Unreachable Nodes

    • r630-03 and r630-04 not accessible
    • Action: Investigate and restore connectivity

Storage Capacity Analysis

Total Storage Capacity

| Node | Total Storage | Used Storage | Available Storage | Usage % |
|---|---|---|---|---|
| ml110 | 907GB | 234.5GB | 671.5GB | 25.9% |
| r630-01 | 744GB | 5.9GB | 738GB | 0.8% |
| r630-02 | 1,358GB | 253.7GB | 1,104GB | 18.7% |
| Total | 3,009GB | 494GB | 2,515GB | 16.4% |
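
Each node row is a sum over that node's active pools. A minimal sketch for re-deriving a row from the per-pool figures, so the table can be refreshed when pools are enabled or resized; the input layout (node, pool, total, used, available, all in GB) and the function name `node_totals` are assumptions made for this example.

```shell
#!/bin/sh
# node_totals: read "node pool total_gb used_gb avail_gb" lines on stdin and
# print one summary line per node with summed sizes and the usage percentage.
node_totals() {
    awk '
        NF == 5 { total[$1] += $3; used[$1] += $4; avail[$1] += $5 }
        END {
            for (n in total)
                printf "%s %.1fGB total, %.1fGB used, %.1fGB available, %.1f%%\n",
                    n, total[n], used[n], avail[n], 100 * used[n] / total[n]
        }'
}

# ml110's two active pools, taken from its Storage Details table:
printf '%s\n' 'ml110 local 94 7.5 85.5' 'ml110 local-lvm 813 227 586' | node_totals
```

For ml110 this reproduces the 907GB, 234.5GB, 671.5GB, and 25.9% shown in the table.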

Storage Distribution

  • ml110: 27.92% of local-lvm used (good, but VG low on space)
  • r630-01: 2.92% of local-lvm used (excellent - ready for deployment)
  • r630-02: 97.78% of thin1-r630-02 used (CRITICAL), but other pools empty

Capacity Planning

Current Capacity: ~2.5TB available
Projected Growth: Not yet quantified; this review has no historical usage data to base a trend on
Recommendation: Plan for expansion when total usage reaches 70%
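
With 3,009GB total, the 70% mark sits at roughly 2,106GB used, which leaves about 1,612GB of growth beyond the current 494GB. A small sketch of that arithmetic, so it can be rerun as the totals change; `headroom_gb` is a hypothetical helper name.

```shell
#!/bin/sh
# headroom_gb: given total and used capacity in GB and a usage threshold in
# percent, print how many more GB can be consumed before the threshold is hit.
headroom_gb() {
    awk -v total="$1" -v used="$2" -v pct="$3" \
        'BEGIN { printf "%.0f\n", total * pct / 100 - used }'
}

# Figures from the capacity table: 3,009GB total, 494GB used, 70% threshold.
headroom_gb 3009 494 70
```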


Detailed Recommendations

1. Immediate Actions (This Week)

r630-02 Storage Crisis

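# 1. List volumes on thin1-r630-02
ssh root@192.168.11.12 "pvesm list thin1-r630-02"

# 2. Move each volume to thin2 (or thin3, thin5, thin6).
# pct migrate moves a container between nodes; to change storage on the
# same node, use move-volume (spelled move_volume on older releases).
# The container typically has to be stopped first.
pct move-volume <VMID> rootfs thin2 --delete 1
# For a QEMU VM: qm move-disk <VMID> <DISK> thin2 --delete 1

# 3. Verify pool usage afterwards
pvesm status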
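
When choosing where to move volumes from thin1-r630-02, the pool with the most free space is a natural first target. A minimal sketch that picks it from `pvesm status` output; it assumes the usual column order (Name, Type, Status, Total, Used, Available, %), and the function name `roomiest_pool` is made up for this example.

```shell
#!/bin/sh
# roomiest_pool: read `pvesm status` output on stdin and print the active
# lvmthin storage with the most available space (first one wins on a tie).
roomiest_pool() {
    awk 'NR > 1 && $2 == "lvmthin" && $3 == "active" && $6 + 0 > best {
             best = $6 + 0
             name = $1
         }
         END { if (name != "") print name }'
}

# On r630-02 (assumed invocation):
#   ssh root@192.168.11.12 "pvesm status" | roomiest_pool
```

Spreading volumes over several pools, rather than sending everything to the single emptiest one, keeps the balance goal in the high-priority list in view.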

ml110 Storage Expansion

Option A: Migrate VMs (Recommended)

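# Migrate containers to r630-01
# (--target-storage is available on recent Proxmox VE releases;
#  --restart is needed when the container is running)
pct migrate <VMID> r630-01 --target-storage thin1 --restart

# Migrate containers to r630-02
pct migrate <VMID> r630-02 --target-storage thin2 --restart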

Option B: Expand Volume Group

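# Add physical volume (if disks available)
# Confirm /dev/sdX is unused first (e.g. with lsblk); pvcreate initializes it for LVM
pvcreate /dev/sdX
vgextend pve /dev/sdX
# Grow the data thin pool into the new free space
lvextend -l +100%FREE pve/data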

2. Storage Optimization (Next 2 Weeks)

Enable Disabled Storage Pools

ml110:

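# Review and clean up disabled storage
ssh root@192.168.11.10
pvesm status
# Remove an unused definition:    pvesm remove <STORAGE_ID>
# Or re-enable one still needed:  pvesm set <STORAGE_ID> --disable 0
# If the nodes share a cluster, storage.cfg is cluster-wide; restrict a
# definition to the nodes that have it instead of removing it:
#   pvesm set <STORAGE_ID> --nodes <NODE_LIST>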

r630-01:

# Enable thin2-thin6 if volume groups exist
ssh root@192.168.11.11
# Check if VGs exist
vgs
# Enable storage pools if VGs exist
for i in thin2 thin3 thin4 thin5 thin6; do
    pvesm set "$i" --disable 0 2>/dev/null || echo "$i not available"
done

r630-02:

# Activate thin1 if needed
ssh root@192.168.11.12
# "Inactive" usually means the backing VG or thin pool is missing, so check first
vgs
lvs
pvesm set thin1 --disable 0
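
Before enabling or removing anything, it helps to list exactly which storages a node reports as disabled or inactive. A minimal sketch that filters `pvesm status` output by its status column; the column order (Name, Type, Status, ...) is assumed, and `unusable_storages` is a hypothetical helper name.

```shell
#!/bin/sh
# unusable_storages: read `pvesm status` output on stdin and print
# "name status" for every storage whose status is not "active".
unusable_storages() {
    awk 'NR > 1 && $3 != "active" { print $1, $3 }'
}

# On a node (assumed invocation):
#   pvesm status | unusable_storages
```

A storage shown as disabled can be switched back on with `pvesm set`, while one shown as inactive usually points at a missing volume group or thin pool that has to be fixed first.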

Balance Workload Distribution

Migration Plan:

  • ml110 → r630-01: Migrate 10-12 medium workload containers
  • ml110 → r630-02: Migrate 10-12 heavy workload containers
  • r630-02 thin1-r630-02 → thin2-thin6: Migrate VMs to balance storage

Target Distribution:

  • ml110: 15-17 containers (management/lightweight)
  • r630-01: 15-17 containers (medium workload)
  • r630-02: 15-17 containers (heavy workload)
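
An even split of the 51 containers is 17 per node, the top of the 15-17 target range. A small sketch that computes how many containers each node would gain or shed to reach that split, using the current counts from this review; `rebalance_moves` is a hypothetical helper name, and an even split ignores the workload classes above.

```shell
#!/bin/sh
# rebalance_moves: read "node count" lines on stdin and print how many
# containers each node must gain (+) or shed (-) to reach an even split.
# The target is rounded down when the total does not divide evenly.
rebalance_moves() {
    awk '{ name[NR] = $1; count[NR] = $2; total += $2 }
         END {
             target = int(total / NR)
             for (i = 1; i <= NR; i++) {
                 d = target - count[i]
                 printf "%s %s%d (target %d)\n", name[i], (d > 0 ? "+" : ""), d, target
             }
         }'
}

printf '%s\n' 'ml110 37' 'r630-01 3' 'r630-02 11' | rebalance_moves
```

With the current counts this gives ml110 -20, r630-01 +14, and r630-02 +6, which can be used to adjust the per-node migration counts in the plan.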

3. Long-term Improvements (Next Month)

Storage Monitoring

Set Up Automated Alerts:

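# Create monitoring script
cat > /usr/local/bin/storage-alert.sh << 'EOF'
#!/bin/bash
# Check storage usage on each node and print an alert for pools above 80%
for node in ml110 r630-01 r630-02; do
    # Skip the header row; "$NF + 0" turns a value like "97.78%" into a number
    ssh "root@$node" "pvesm status" |
        awk -v node="$node" 'NR > 1 && $NF + 0 > 80 {print "ALERT: " $1 " on " node " at " $NF}'
done
EOF
chmod +x /usr/local/bin/storage-alert.sh

# Add to crontab (check every hour); this line goes in the crontab, not the shell.
# cron mails any output if MAILTO is set.
0 * * * * /usr/local/bin/storage-alert.sh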

Backup Strategy

  1. Implement Regular Backups

    • Daily backups for critical VMs
    • Weekly full backups
    • Monthly archive backups
  2. Backup Storage

    • Use separate storage for backups
    • Consider NFS for shared backup storage
    • Implement backup rotation (keep 30 days)
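
A rough sketch of what a daily backup job with 30-day rotation could look like using `vzdump`. The storage name `backup-nfs` and the guest ID `100` are placeholders, since no backup storage is configured yet, and mapping "keep 30 days" to `keep-daily=30` is an assumption. The helper only prints the command so it can be reviewed before being scheduled.

```shell
#!/bin/sh
# backup_cmd: print (not run) a vzdump command line for one guest.
# BACKUP_STORAGE is a placeholder name; replace it once backup storage exists.
BACKUP_STORAGE="${BACKUP_STORAGE:-backup-nfs}"

backup_cmd() {
    vmid="$1"
    echo "vzdump $vmid --storage $BACKUP_STORAGE --mode snapshot --compress zstd --prune-backups keep-daily=30"
}

# Hypothetical guest ID, for illustration only
backup_cmd 100
```

Weekly and monthly retention can be layered on by extending the prune setting with `keep-weekly` and `keep-monthly` values.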

Performance Optimization

  1. Storage Performance Tuning

    • Use LVM thin for all VM disks
    • Monitor I/O performance
    • Optimize thin pool metadata size
  2. Network Storage Consideration

    • Evaluate NFS for shared storage
    • Consider Ceph for high availability
    • Plan for shared storage migration

Storage Type Recommendations

By Use Case

| Use Case | Recommended Storage | Current Status | Action |
|---|---|---|---|
| VM/Container Disks | LVM Thin (lvmthin) | Used | Continue using |
| ISO Images | Directory (dir) | Used | Continue using |
| Container Templates | Directory (dir) | Used | Continue using |
| Backups | Directory or NFS | ⚠️ Not configured | Implement |
| High-Performance VMs | LVM Thin or ZFS | LVM Thin | Consider ZFS for future |

Storage Performance Best Practices

  1. Use LVM Thin for VM Disks - Currently implemented
  2. Monitor Thin Pool Metadata - ⚠️ Set up monitoring
  3. Balance Storage Across Nodes - ⚠️ Needs improvement
  4. Implement Backup Storage - Not implemented

Security Recommendations

  1. Storage Access Control

    • Review /etc/pve/storage.cfg node restrictions
    • Ensure proper node assignments
    • Verify storage permissions
  2. Backup Security

    • Encrypt backups containing sensitive data
    • Store backups off-site
    • Test backup restoration regularly

Monitoring Recommendations

Storage Monitoring Metrics

  1. Storage Usage - Alert at 80%
  2. Thin Pool Metadata - Alert at 80%
  3. Volume Group Free Space - Alert at 10%
  4. Storage I/O Performance - Monitor latency

Automated Alerts

Set up alerts for:

  • Storage usage >80%
  • Thin pool metadata >80%
  • Volume group free space <10%
  • Storage errors or failures
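
Thin pool data and metadata usage are reported by `lvs` rather than `pvesm status`. A minimal sketch that applies the 80% threshold to both; it assumes input from `lvs --noheadings -o vg_name,lv_name,data_percent,metadata_percent`, where only thin pools have all four fields filled in, and `thinpool_alerts` is a hypothetical helper name.

```shell
#!/bin/sh
# thinpool_alerts: read "vg lv data% metadata%" lines on stdin and print
# every thin pool whose data or metadata usage exceeds the threshold.
# Lines with fewer than four fields (non-pool volumes) are ignored.
thinpool_alerts() {
    threshold="${1:-80}"
    awk -v t="$threshold" '
        NF == 4 {
            if ($3 + 0 > t) printf "ALERT: %s/%s data at %s%%\n", $1, $2, $3
            if ($4 + 0 > t) printf "ALERT: %s/%s metadata at %s%%\n", $1, $2, $4
        }'
}

# On a node (assumed invocation):
#   lvs --noheadings -o vg_name,lv_name,data_percent,metadata_percent | thinpool_alerts 80
```

Metadata exhaustion can make a thin pool unusable even when data space remains, which is why it gets its own check.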

Migration Recommendations

Workload Distribution Strategy

Current State:

  • ml110: 37 containers (overloaded, slower CPU)
  • r630-01: 3 containers (underutilized, excellent CPU)
  • r630-02: 11 containers (underutilized, best CPU)

Target State:

  • ml110: 15-17 containers (management/lightweight)
  • r630-01: 15-17 containers (medium workload)
  • r630-02: 15-17 containers (heavy workload)

Benefits:

  • Better performance (ml110 CPU is slower)
  • Better resource utilization
  • Improved redundancy
  • Better storage distribution

Migration Priority

  1. CRITICAL: Migrate VMs from r630-02 thin1-r630-02 (97.78% full)
  2. HIGH: Migrate 15-20 containers from ml110 to r630-01/r630-02
  3. MEDIUM: Balance storage usage across all thin pools on r630-02

Action Plan Summary

Week 1 (Critical)

  • Migrate VMs from r630-02 thin1-r630-02 to thin2-thin6
  • Set up storage monitoring alerts
  • Investigate r630-03 and r630-04 connectivity

Week 2-3 (High Priority)

  • Migrate 15-20 containers from ml110 to r630-01/r630-02
  • Enable/clean up disabled storage pools
  • Balance storage usage across nodes
  • Implement backup strategy
  • Set up comprehensive storage monitoring
  • Optimize storage performance
  • Document storage procedures

Conclusion

This comprehensive storage review identifies:

Current Status: Storage well configured with LVM thin pools
⚠️ Critical Issues: 1 storage pool at 97.78% capacity
Capacity Available: ~2.5TB total available storage
⚠️ Distribution: Uneven workload distribution

Immediate Actions Required:

  1. Migrate VMs from r630-02 thin1-r630-02 (CRITICAL)
  2. Migrate containers from ml110 to balance workload
  3. Set up storage monitoring and alerts

Long-term Goals:

  1. Implement backup strategy
  2. Optimize storage performance
  3. Plan for storage expansion
  4. Consider shared storage for HA

Report Generated: January 6, 2026
Next Review: February 6, 2026 (Monthly)