Sankofa/docs/meta/FILES_TO_PRUNE.md
defiQUG fe0365757a Update documentation structure and enhance .gitignore
- Added generated index files and report directories to .gitignore to prevent unnecessary tracking of transient files.
- Updated README links to reflect new documentation paths for better navigation.
- Improved documentation organization by ensuring all links point to the correct locations, enhancing user experience and accessibility.
2025-12-12 21:18:55 -08:00

Files to Prune

Generated: 2025-01-09
Status: Analysis Complete

This document identifies files that can be safely removed from the project to reduce clutter and improve maintainability.


Summary

  • Build Artifacts: .next/, node_modules/ (excluded by .gitignore, safe to delete locally)
  • Large Generated Files: 2 large JSON index files (6.2MB total)
  • Old Status Files: 14 status files that could be archived
  • Duplicate Files: 4 duplicate file groups
  • Archive Files: 50+ archived files (intentionally kept, but could be reviewed)
  • Webpack Cache: 4 old webpack cache files (.old)

1. Large Generated Index Files (6.2MB total)

These files are generated by scripts and can be regenerated at any time. Consider removing them from version control and regenerating as needed.

  • docs/MARKDOWN_REFERENCE.json (5.1 MB) - Machine-readable index
  • docs/MARKDOWN_INDEX.json (1.1 MB) - Intermediate index file

Rationale: These are generated files that can be recreated with scripts/generate-markdown-reference.py. They should not be in version control.

Action:

# Add to .gitignore
echo "docs/MARKDOWN_REFERENCE.json" >> .gitignore
echo "docs/MARKDOWN_INDEX.json" >> .gitignore

# Remove from git (keep locally)
git rm --cached docs/MARKDOWN_REFERENCE.json
git rm --cached docs/MARKDOWN_INDEX.json
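
Re-running the `echo ... >> .gitignore` commands above appends a duplicate entry each time. A minimal sketch of an idempotent variant is shown here; the helper name `ignore_once` is hypothetical and is not part of the repository's scripts.

```shell
# Hypothetical helper: append a pattern to an ignore file only if that exact
# line is not already present.
ignore_once() {
  pattern=$1
  file=${2:-.gitignore}
  # -x matches whole lines only, -F treats the pattern as a fixed string
  if ! grep -qxF -- "$pattern" "$file" 2>/dev/null; then
    printf '%s\n' "$pattern" >> "$file"
  fi
}

# Usage (paths taken from the list above):
# ignore_once "docs/MARKDOWN_REFERENCE.json"
# ignore_once "docs/MARKDOWN_INDEX.json"
```

The sketch assumes the ignore file, if it exists, already ends with a newline.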

2. Old Status Files (14 files)

These status files contain historical information and could be moved to an archive directory. Check whether other documents still link to them before moving anything.

In docs/proxmox/status/:

  • COMPLETE_STATUS.md
  • COMPLETE_STATUS_FINAL.md
  • COMPLETE_STATUS_REPORT.md
  • COMPLETE_SUMMARY.md
  • COMPLETION_SUMMARY.md
  • FINAL_STATUS.md
  • FINAL_STATUS_UPDATE.md
  • NEXT_STEPS_COMPLETED.md
  • TASK_COMPLETION_SUMMARY.md

In docs/status/implementation/:

  • ALL_TASKS_COMPLETE.md
  • IMPLEMENTATION_COMPLETE.md
  • NEXT_STEPS_COMPLETE.md
  • NEXT_STEPS_FINAL_STATUS.md

In docs/status/:

  • NEXT_STEPS_COMPLETION.md

Action: Review these files to see if they're still actively referenced. If not, move them to docs/archive/status/ or docs/proxmox/archive/.
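
One way to check whether a status file is still referenced is to search the repository for its file name. The sketch below assumes a `grep` that supports `--exclude-dir` (GNU and BSD grep do); the helper name `find_references` is hypothetical.

```shell
# Hypothetical helper: list files under a directory that mention a given
# file name, skipping node_modules and .git.
find_references() {
  name=$1
  root=${2:-.}
  grep -rlF --exclude-dir=node_modules --exclude-dir=.git -- "$name" "$root"
}

# Usage: an empty result suggests nothing links to the file any more.
# find_references "COMPLETE_STATUS_FINAL.md" docs
```

The match is a plain substring search, so a query for `FINAL_STATUS.md` also reports files that mention `NEXT_STEPS_FINAL_STATUS.md`. This document lists every candidate by name, so it will appear in the results as well.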


3. Duplicate Files (4 groups)

These files appear to have identical content. The infrastructure data copies under public/ can be removed; the DNS records files need review before any deletion.

  1. Infrastructure Data Files (3 duplicates):

    • public/docs/infrastructure/data/cost_estimates.json ← Remove (keep docs/infrastructure/data/cost_estimates.json)
    • public/docs/infrastructure/data/deployment_timeline.json ← Remove (keep docs/infrastructure/data/deployment_timeline.json)
    • public/docs/infrastructure/data/compliance_requirements.json ← Remove (keep docs/infrastructure/data/compliance_requirements.json)
  2. DNS Records Files:

    • cloudflare/dns/sankofa.nexus-records.yaml vs cloudflare/dns/d-bis.org-records.yaml
      • These appear to be duplicates but may serve different domains. Review before deletion.
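
Before removing a suspected duplicate, it is worth confirming that the two files really are byte-identical. A minimal sketch using `cmp` follows; the helper name `remove_if_identical` is hypothetical and is not taken from the cleanup scripts.

```shell
# Hypothetical helper: delete the copy only if it is byte-identical to the
# file being kept; otherwise leave both files in place.
remove_if_identical() {
  keep=$1
  copy=$2
  if cmp -s "$keep" "$copy"; then
    rm -- "$copy"
    echo "removed: $copy"
  else
    echo "differs or missing, kept: $copy" >&2
    return 1
  fi
}

# Usage (paths taken from the list above):
# remove_if_identical docs/infrastructure/data/cost_estimates.json \
#   public/docs/infrastructure/data/cost_estimates.json
```

Running the same check on the two DNS records files would show whether they are true duplicates or differ per domain.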

4. Archive Files (50+ files)

These files are intentionally archived but could be reviewed for consolidation.

Location: docs/archive/

Recommendation: Keep archived files but consider:

  • Consolidating multiple "COMPLETE" or "COMPLETION" files into single summaries
  • Creating a single "ARCHIVE_SUMMARY.md" that references all archived content
  • Compressing old archives into ZIP files

Action: Low priority - these serve historical reference purposes.


5. Build Artifacts (Already in .gitignore)

These are already excluded from git but may exist locally:

  • .next/ - Next.js build cache
  • node_modules/ - Dependencies (should never be committed)
  • dist/ - Build outputs
  • build/ - Build outputs

Action: Safe to delete locally; dependencies and build outputs are regenerated by the next install and build:

# Clean build artifacts (also removes coverage/ if present)
rm -rf .next node_modules dist build coverage
pnpm install  # Regenerate dependencies

6. Webpack Cache Files

Old webpack cache files that are no longer needed:

  • .next/cache/webpack/client-development/index.pack.gz.old
  • .next/cache/webpack/server-development/index.pack.gz.old
  • portal/.next/cache/webpack/client-development/index.pack.gz.old
  • portal/.next/cache/webpack/server-development/index.pack.gz.old

Action: Safe to delete - these are cache files:

find . -name "*.old" -path "*/.next/cache/*" -delete
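
Because `-delete` acts immediately, it can help to preview the matches first by swapping it for `-print`. The sketch below wraps that preview in a hypothetical helper named `list_stale_cache`; the `find` expression is otherwise the same as above.

```shell
# Hypothetical helper: print the stale cache files the command above would
# delete, without deleting anything.
list_stale_cache() {
  root=${1:-.}
  find "$root" -name "*.old" -path "*/.next/cache/*" -print
}

# Usage:
# list_stale_cache .
```

Files ending in `.old` outside a `.next/cache/` directory are not matched, so they are left alone by both the preview and the delete command.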

7. Large ZIP Archive

  • docs/6g_gpu_full_package.zip - Large binary file in docs directory

Action:

  • If needed for documentation, move to a separate downloads/assets directory
  • If not needed, delete
  • Consider hosting externally or in a release artifacts repository

8. Node Modules Lock Files (Already Handled)

Found several yarn.lock files in node_modules/:

  • These are from dependencies and are fine
  • Main project uses pnpm-lock.yaml (correct)

Action: No action needed - these are dependency lock files.


9. Backup Files (Already in .gitignore)

Found one backup file in node_modules:

  • api/node_modules/.pnpm/form-data@2.3.3/node_modules/form-data/README.md.bak

Action: This is in node_modules, so it's fine. No action needed.


Immediate Action Items

  1. Remove large JSON index files from git (6.2MB) - High priority
  2. Remove duplicate infrastructure data files (3 files) - Medium priority
  3. Review and archive old status files (14 files) - Medium priority
  4. Clean webpack cache files (4 .old files) - Low priority

Cleanup Scripts

Two cleanup scripts have been created to automate the pruning process:

1. Basic Cleanup Script (scripts/cleanup-prune-files.sh)

Removes duplicate files and cache artifacts.

Usage:

# Dry run (see what would be deleted)
./scripts/cleanup-prune-files.sh --dry-run

# Run cleanup (with optional backup)
./scripts/cleanup-prune-files.sh --backup
./scripts/cleanup-prune-files.sh --all

# Specific operations
./scripts/cleanup-prune-files.sh --duplicates  # Remove duplicates only
./scripts/cleanup-prune-files.sh --cache       # Remove cache files only

What it does:

  • Removes duplicate infrastructure data files from public/
  • Removes webpack cache .old files
  • Optional backup creation before deletion
  • Dry-run mode for safety
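
The script's internals are not shown in this document. As an illustration only, a dry-run mode is often built around a small wrapper that prints a command instead of executing it. The `run` function and the `DRY_RUN` variable below are hypothetical names, not taken from `cleanup-prune-files.sh`.

```shell
# Hypothetical dry-run wrapper; the real script may work differently.
DRY_RUN=${DRY_RUN:-0}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    # Report what would be executed, but do nothing
    echo "[dry-run] $*"
  else
    "$@"
  fi
}

# Usage: with DRY_RUN=1 the command is printed, with DRY_RUN=0 it is executed.
# run rm -f public/docs/infrastructure/data/cost_estimates.json
```

Routing every destructive command through one wrapper keeps the dry-run and real code paths identical apart from the final execution step.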

2. Archive Old Status Files (scripts/cleanup-archive-old-status.sh)

Moves old status files to archive directories.

Usage:

# Dry run (see what would be moved)
./scripts/cleanup-archive-old-status.sh --dry-run

# Actually move files
./scripts/cleanup-archive-old-status.sh

What it does:

  • Moves old status files from docs/proxmox/status/ to docs/proxmox/archive/
  • Moves old status files from docs/status/implementation/ to docs/archive/status/
  • Preserves file structure in archive
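
As a rough illustration of what preserving file structure can mean, the sketch below moves a file while keeping its path relative to the source directory. The helper name `archive_file` is hypothetical, and the actual script may be implemented differently.

```shell
# Hypothetical helper: move a file from src_root to dest_root, keeping its
# path relative to src_root.
archive_file() {
  src_root=$1
  dest_root=$2
  rel=$3
  # Create the matching subdirectory in the archive before moving
  mkdir -p "$dest_root/$(dirname "$rel")"
  mv -- "$src_root/$rel" "$dest_root/$rel"
}

# Usage:
# archive_file docs/proxmox/status docs/proxmox/archive FINAL_STATUS.md
```

Inside a git working tree, `git mv` could replace `mv` so that the move is staged in the same step.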

Files to Keep

DO NOT DELETE:

  • Archive files in docs/archive/ (historical reference)
  • Status files that are actively referenced
  • Documentation files (even if old)
  • Configuration files
  • Source code files

Recommendations Summary

| Category | Count | Priority | Action |
|---|---|---|---|
| Large JSON indexes | 2 | High | Remove from git, add to .gitignore |
| Duplicate files | 4 groups | Medium | Remove duplicates |
| Old status files | 14 | Medium | Review and archive |
| Webpack cache | 4 | Low | Delete |
| Large ZIP file | 1 | Low | Review and relocate/delete |
| Archive files | 50+ | None | Keep (historical) |

Last Updated: 2025-01-09