Ceph Installation Guide for Proxmox

Last Updated: 2024-12-19
Infrastructure: 2-node Proxmox cluster (ML110-01, R630-01)

Overview

Ceph is a distributed storage system that provides object, block, and file storage. This guide covers installing Ceph on the Proxmox infrastructure to provide distributed storage for VMs.

Architecture

Cluster Configuration

Nodes:

  • ML110-01 (192.168.11.10): Ceph Monitor, OSD, Manager
  • R630-01 (192.168.11.11): Ceph Monitor, OSD, Manager

Network: 192.168.11.0/24

Ceph Components

  1. Monitors (MON): Track cluster state (minimum 1, recommended 3+)
  2. Managers (MGR): Provide monitoring and management interfaces
  3. OSDs (Object Storage Daemons): Store data on disks
  4. MDS (Metadata Servers): For CephFS (optional)

Storage Configuration

For a 2-node setup:

  • Reduced redundancy (size=2, min_size=1): two copies of each object, and I/O continues with a single surviving copy
  • Two monitors cannot form a majority if either fails, so this layout is suitable for development/testing only
  • For production, add a third node (restoring MON quorum and allowing size=3) or use external storage
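The same replication settings can also be applied per pool at runtime rather than via defaults; the pool name `rbd` below is an example, not a pool this guide has created yet:

```shell
# Set replication on an existing pool (example pool name: rbd)
ceph osd pool set rbd size 2
ceph osd pool set rbd min_size 1
```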

Prerequisites

Hardware Requirements

Per Node:

  • CPU: 4+ cores recommended
  • RAM: 4GB+ per OSD daemon, plus headroom for MON/MGR services
  • Storage: Dedicated disks/partitions for OSDs
  • Network: 1Gbps+ (10Gbps recommended)

Software Requirements

  • Proxmox VE 9.1+
  • SSH access to all nodes
  • Root or sudo access
  • Network connectivity between nodes

Installation Steps

Step 1: Prepare Nodes

# On both nodes, update system
apt update && apt upgrade -y

# Install prerequisites
apt install -y chrony python3-pip

Step 2: Configure Hostnames and Network

# On ML110-01
hostnamectl set-hostname ml110-01
echo "192.168.11.10 ml110-01 ml110-01.sankofa.nexus" >> /etc/hosts
echo "192.168.11.11 r630-01 r630-01.sankofa.nexus" >> /etc/hosts

# On R630-01
hostnamectl set-hostname r630-01
echo "192.168.11.10 ml110-01 ml110-01.sankofa.nexus" >> /etc/hosts
echo "192.168.11.11 r630-01 r630-01.sankofa.nexus" >> /etc/hosts

Step 3: Install Ceph

# Add Ceph repository (apt-key is deprecated; use a keyring file instead)
wget -q -O- 'https://download.ceph.com/keys/release.asc' | gpg --dearmor -o /usr/share/keyrings/ceph.gpg
echo "deb [signed-by=/usr/share/keyrings/ceph.gpg] https://download.ceph.com/debian-quincy/ bullseye main" > /etc/apt/sources.list.d/ceph.list
# NOTE: "bullseye" must match the Debian release your Proxmox version is based on

# Update and install
apt update
apt install -y ceph ceph-common ceph-mds

# ceph-deploy (used in the steps below) is installed from PyPI on the deployment node
pip3 install ceph-deploy

Step 4: Create Ceph User

# On both nodes, create ceph user
useradd -d /home/ceph -m -s /bin/bash ceph
echo "ceph ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/ceph
chmod 0440 /etc/sudoers.d/ceph

Step 5: Configure SSH Key Access

# On ML110-01 (deployment node)
su - ceph
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
ssh-copy-id ceph@ml110-01
ssh-copy-id ceph@r630-01

Step 6: Initialize Ceph Cluster

# On ML110-01 (deployment node)
cd ~
mkdir ceph-cluster
cd ceph-cluster

# Create cluster configuration
ceph-deploy new ml110-01 r630-01

# Edit ceph.conf to add network settings and reduce redundancy for the 2-node cluster
# (the generated ceph.conf already opens a [global] section, so no extra header is needed)
cat >> ceph.conf << EOF
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 128
osd pool default pgp num = 128
public network = 192.168.11.0/24
cluster network = 192.168.11.0/24
EOF

# Install Ceph on all nodes
ceph-deploy install ml110-01 r630-01

# Create initial monitor
ceph-deploy mon create-initial

# Deploy admin key
ceph-deploy admin ml110-01 r630-01
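The `pg num` value of 128 in the configuration above follows the usual rule of thumb: (number of OSDs × 100) / replica size, rounded up to a power of two. For this cluster:

```shell
# Rule of thumb: total PGs ≈ (OSDs * 100) / pool size, rounded up to a power of two
osds=2
size=2
raw=$(( osds * 100 / size ))  # 100
pg=1
while [ "$pg" -lt "$raw" ]; do pg=$(( pg * 2 )); done
echo "$pg"  # 128
```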

Step 7: Add OSDs

# List available disks
ceph-deploy disk list ml110-01
ceph-deploy disk list r630-01

# Prepare disks (replace /dev/sdX with actual disk)
ceph-deploy disk zap ml110-01 /dev/sdb
ceph-deploy disk zap r630-01 /dev/sdb

# Create OSDs
ceph-deploy osd create --data /dev/sdb ml110-01
ceph-deploy osd create --data /dev/sdb r630-01

Step 8: Deploy Manager

# Deploy manager daemon
ceph-deploy mgr create ml110-01 r630-01

Step 9: Verify Cluster

# Check cluster status
ceph -s

# Check OSD status
ceph osd tree

# Check health
ceph health
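For scripting health checks, `ceph -s` can emit JSON. Since parsing it assumes a running cluster, a captured sample is piped in below so the extraction can be shown standalone:

```shell
# On the cluster this would be: ceph -s --format json | python3 -c '...'
# A sample document stands in for live output here
sample='{"health":{"status":"HEALTH_OK"},"osdmap":{"num_osds":2,"num_up_osds":2}}'
status=$(echo "$sample" | python3 -c 'import json,sys; print(json.load(sys.stdin)["health"]["status"])')
echo "$status"
```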

Proxmox Integration

Step 1: Create Ceph Storage Pool in Proxmox

# On Proxmox nodes, create Ceph storage
pvesm add cephfs ceph-storage --monhost 192.168.11.10,192.168.11.11 --username admin --fsname cephfs
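The `pvesm` command above assumes a CephFS filesystem already exists. On this cluster that requires at least one MDS and the filesystem itself; a sketch (pool names and PG counts are illustrative):

```shell
# Deploy a metadata server (from the deployment node)
ceph-deploy mds create ml110-01

# Create data and metadata pools, then the filesystem
ceph osd pool create cephfs_data 64
ceph osd pool create cephfs_metadata 32
ceph fs new cephfs cephfs_metadata cephfs_data
```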

Step 2: Create RBD Pool for Block Storage

# Create RBD pool
ceph osd pool create rbd 128 128

# Initialize pool for RBD
rbd pool init rbd

# Create storage in Proxmox
pvesm add rbd rbd-storage --pool rbd --monhost 192.168.11.10,192.168.11.11 --username admin

Step 3: Configure Proxmox Storage

  1. Via Web UI:

    • Datacenter → Storage → Add
    • Select "RBD" or "CephFS"
    • Configure connection details
  2. Via CLI:

    # RBD storage
    pvesm add rbd ceph-rbd --pool rbd --monhost 192.168.11.10,192.168.11.11 --username admin --content images,rootdir
    
    # CephFS storage
    pvesm add cephfs ceph-fs --monhost 192.168.11.10,192.168.11.11 --username admin --fsname cephfs --content iso,backup
    

Configuration Files

ceph.conf

[global]
fsid = <cluster-fsid>
mon initial members = ml110-01, r630-01
mon host = 192.168.11.10, 192.168.11.11
public network = 192.168.11.0/24
cluster network = 192.168.11.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 128
osd pool default pgp num = 128

Monitoring

Ceph Dashboard

# Enable dashboard module
ceph mgr module enable dashboard

# Generate a self-signed certificate (required before the dashboard serves HTTPS)
ceph dashboard create-self-signed-cert

# Create dashboard user (recent releases read the password from a file)
echo '<password>' > /tmp/dash-pass
ceph dashboard ac-user-create admin -i /tmp/dash-pass administrator
rm /tmp/dash-pass

# Access dashboard
# https://ml110-01.sankofa.nexus:8443

Prometheus Integration

# Enable prometheus module
ceph mgr module enable prometheus

# Metrics endpoint
# http://ml110-01.sankofa.nexus:9283/metrics

Maintenance

Adding OSDs

ceph-deploy disk zap <node> /dev/sdX
ceph-deploy osd create --data /dev/sdX <node>

Removing OSDs

ceph osd out <osd-id>
systemctl stop ceph-osd@<osd-id>  # on the node hosting the OSD
ceph osd crush remove osd.<osd-id>
ceph auth del osd.<osd-id>
ceph osd rm <osd-id>
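On Luminous and later releases, the CRUSH, auth, and rm steps can be collapsed into a single `purge` command (stop the daemon on its host first):

```shell
# Removes the CRUSH entry, auth key, and OSD record in one step
ceph osd purge <osd-id> --yes-i-really-mean-it
```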

Cluster Health

# Check status
ceph -s

# Check detailed health
ceph health detail

# Check OSD status
ceph osd tree

Troubleshooting

Common Issues

  1. Clock Skew: Ensure NTP is configured

    systemctl enable --now chrony   # the Debian unit is "chrony", not "chronyd"
    chronyc tracking                # confirm the clock is synchronized
    
  2. Network Issues: Verify connectivity

    ping ml110-01
    ping r630-01
    
  3. OSD Issues: Check OSD status

    ceph osd tree
    systemctl status ceph-osd@<id>
    

Security

Firewall Rules

# Allow Ceph ports
ufw allow 3300/tcp       # Monitors (msgr2)
ufw allow 6789/tcp       # Monitors (msgr1)
ufw allow 6800:7300/tcp  # OSDs and MGRs
ufw allow 8443/tcp       # Dashboard

Authentication

  • Use cephx authentication (default)
  • Rotate keys regularly
  • Limit admin access