# Fairness Audit Orchestration Engine
## Overview
The Fairness Audit Orchestration Engine uses a **3-variable model** to size and schedule fairness audit processes. The engine works backwards from the selected outputs: it calculates the total process load, then checks whether that load is feasible within the requested timeline.
## The 3-Variable Model
### Variables
1. **I (Input)**: Input size/effort
   - Dataset size
   - Number of sensitive attributes
   - Date range complexity
   - Filter complexity
2. **O (Output)**: Total output effort
   - Sum of all selected outputs (reports, dashboards, exports, alerts)
   - Each output type has a weight
3. **T (Timeline)**: Runtime allocation
   - Execution mode (now, scheduled, continuous)
   - SLA/time limit
   - Deadline
## Backend Logic
### Formula
```
Total Process Load ≈ O + 2I ≈ 3.2I
```
Where:
- **O** = Sum of the weights of all selected outputs
- **2I** = Two input passes, covering ingestion, enrichment, and fairness evaluation
- **3.2I** = Target total load, which follows from the design target O ≈ 1.2 × I (1.2I + 2I = 3.2I)
### Calculation Flow
1. **Start with Outputs**
   - User selects desired outputs
   - Engine sums output weights → **O**
2. **Calculate Input Load**
   - Engine analyzes input specification
   - Calculates input complexity → **I**
3. **Calculate Total Load**
   - Total = O + 2I
   - Validates against target: ≈ 3.2I
4. **Estimate Time**
   - Uses processing rates to estimate runtime
   - Validates against timeline constraints
5. **Feasibility Check**
   - Compares estimated time vs. requested timeline
   - Checks output load vs. recommended (1.2 × I)
   - Provides warnings and suggestions
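
Steps 3 and 5 can be sketched in TypeScript as follows. This is a minimal illustration rather than the engine's actual code: the function names `totalLoad` and `checkOutputTarget` and the warning text are assumptions, while the two multipliers are the ones listed under Adjustable Constants.

```typescript
// Multipliers as listed under "Adjustable Constants".
const INPUT_PASS_MULTIPLIER = 2.0;
const OUTPUT_TARGET_MULTIPLIER = 1.2;

// Step 3: total load = O + 2I.
function totalLoad(outputLoad: number, inputLoad: number): number {
  return outputLoad + INPUT_PASS_MULTIPLIER * inputLoad;
}

// Step 5 (in part): compare O with the recommended output load, 1.2 × I.
// Hypothetical helper: returns a warning string, or null when O is within the target.
function checkOutputTarget(outputLoad: number, inputLoad: number): string | null {
  const recommended = OUTPUT_TARGET_MULTIPLIER * inputLoad;
  return outputLoad > recommended
    ? `Output load ${outputLoad} exceeds the recommended ${recommended} units`
    : null;
}

console.log(totalLoad(2.5, 100));        // 202.5 (Scenario 1 inputs)
console.log(checkOutputTarget(13, 200)); // null (Scenario 3 inputs)
```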
## Output Types and Weights
| Output Type | Weight | Description |
|------------|--------|-------------|
| Fairness Audit PDF | 2.5 | Comprehensive fairness audit report |
| Metrics Export (SPD, TPR, FPR) | 1.0 | Statistical parity difference, rates |
| Flagged Cases CSV | 1.5 | Cases flagged for potential bias |
| Executive Summary Slides | 2.0 | Executive presentation slides |
| Detailed Report (JSON) | 1.2 | Machine-readable detailed analysis |
| Alert Configuration | 0.8 | Automated alert rules |
| Dashboard Export | 1.8 | Interactive dashboard |
| Compliance Report | 2.2 | Regulatory compliance documentation |
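
As a sketch of how **O** is derived from this table, the weights can be held in a lookup and summed over the selected outputs. The keys (`fairnessAuditPdf` and so on) and the function name `sumOutputWeights` are hypothetical identifiers chosen for illustration; only the weights come from the table.

```typescript
// Weights copied from the table above; the keys are illustrative identifiers.
const OUTPUT_WEIGHTS = {
  fairnessAuditPdf: 2.5,
  metricsExport: 1.0,
  flaggedCasesCsv: 1.5,
  executiveSummarySlides: 2.0,
  detailedReportJson: 1.2,
  alertConfiguration: 0.8,
  dashboardExport: 1.8,
  complianceReport: 2.2,
};

type OutputId = keyof typeof OUTPUT_WEIGHTS;

// O = sum of the weights of the selected outputs.
function sumOutputWeights(selected: readonly OutputId[]): number {
  return selected.reduce<number>((sum, id) => sum + OUTPUT_WEIGHTS[id], 0);
}

const allOutputs = Object.keys(OUTPUT_WEIGHTS) as OutputId[];
console.log(sumOutputWeights(["metricsExport", "flaggedCasesCsv"])); // 2.5
console.log(sumOutputWeights(allOutputs).toFixed(1));                // "13.0"
```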
## Input Load Calculation
```typescript
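// Input load in abstract units (parameter names are illustrative).
function inputLoad(sensitiveAttributes: number, dateRangeDays: number, filters: number): number {
  return 100                     // base
    + 20 * sensitiveAttributes   // 20 per sensitive attribute
    + 5 * dateRangeDays          // 5 per day in the date range
    + 10 * filters;              // 10 per filter
}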
```
Alternatively, a pre-calculated `estimatedSize` can be used in place of this formula when one is available.
## Processing Rates
- **Input Processing**: 15 units/second
- **Output Processing**: 8 units/second
- **Average Rate**: 11.5 units/second (mean of the input and output rates)
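
This section does not spell out how the two rates combine into a runtime. The figures in the Example Scenarios are consistent with dividing the total load by the average rate, so the sketch here assumes that approach; `estimateSeconds` is a hypothetical name.

```typescript
const INPUT_PROCESSING_RATE = 15; // units/second
const OUTPUT_PROCESSING_RATE = 8; // units/second

// Assumption: runtime = total load / mean of the two rates (11.5 units/second).
function estimateSeconds(totalLoad: number): number {
  const averageRate = (INPUT_PROCESSING_RATE + OUTPUT_PROCESSING_RATE) / 2;
  return totalLoad / averageRate;
}

console.log(Math.round(estimateSeconds(202.5))); // 18 (Scenario 1)
console.log(Math.round(estimateSeconds(1013)));  // 88 (Scenario 2)
```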
## User-Facing Messages
### Feasible Configuration
> "This fairness audit will process approximately X input units and generate Y output units, taking approximately Z to complete."
### Feasible with Warnings
> "This audit is feasible but has some considerations: [warnings]. Estimated time: Z."
### Not Feasible
> "This audit configuration may not be feasible within the requested timeline. [warnings]. Estimated time: Z."
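
One way to produce these messages is a small formatter keyed on the feasibility outcome. The `Feasibility` type, the `MessageInput` shape, and the `formatMessage` function are assumptions made for illustration; the wording follows the three templates above.

```typescript
type Feasibility = "feasible" | "feasibleWithWarnings" | "notFeasible";

interface MessageInput {
  feasibility: Feasibility;
  inputUnits: number;
  outputUnits: number;
  estimatedTime: string; // already formatted, e.g. "18 seconds"
  warnings: string[];
}

// Fills the X / Y / Z / [warnings] placeholders of the matching template.
function formatMessage(m: MessageInput): string {
  const warningText = m.warnings.join("; ");
  switch (m.feasibility) {
    case "feasible":
      return `This fairness audit will process approximately ${m.inputUnits} input units ` +
        `and generate ${m.outputUnits} output units, taking approximately ${m.estimatedTime} to complete.`;
    case "feasibleWithWarnings":
      return `This audit is feasible but has some considerations: ${warningText}. ` +
        `Estimated time: ${m.estimatedTime}.`;
    case "notFeasible":
      return `This audit configuration may not be feasible within the requested timeline. ` +
        `${warningText}. Estimated time: ${m.estimatedTime}.`;
  }
}
```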
## Example Scenarios
### Scenario 1: Small Dataset, Few Outputs
- **Input**: 100 units (small dataset, 2 attributes)
- **Outputs**: Metrics Export (1.0) + Flagged Cases CSV (1.5) = 2.5 units
- **Total Load**: 2.5 + (2 × 100) = 202.5 units
- **Estimated Time**: ~18 seconds
- **Result**: ✅ Feasible
### Scenario 2: Large Dataset, Many Outputs
- **Input**: 500 units (large dataset, 5 attributes, 30-day range)
- **Outputs**: All 8 outputs = 13.0 units
- **Total Load**: 13.0 + (2 × 500) = 1013.0 units
- **Estimated Time**: ~88 seconds
- **Result**: ⚠️ May need timeline adjustment
### Scenario 3: Output-Heavy Request
- **Input**: 200 units
- **Outputs**: All 8 outputs = 13.0 units
- **Target Output**: 200 × 1.2 = 240 units
- **Actual Output**: 13.0 units
- **Result**: ✅ Within target (O = 13.0 < 1.2 × I = 240)
## Implementation
### Backend Engine
- Location: `api/src/services/fairness-orchestration/engine.ts`
- Provides: `orchestrate()`, calculation functions, feasibility checks
### Frontend Component
- Location: `portal/src/components/fairness/FairnessOrchestrationWizard.tsx`
- 3-column layout: Output | Input | Timeline
- Real-time orchestration calculation
- Visual feedback on feasibility
### Client Library
- Location: `portal/src/lib/fairness-orchestration.ts`
- Shared types and calculation functions
- Can be used client-side or called via API
## API Endpoints (To Be Implemented)
```
POST /api/fairness/orchestrate
  Body: OrchestrationRequest
  Response: OrchestrationResult

GET /api/fairness/outputs
  Response: OutputType[]

POST /api/fairness/run
  Body: OrchestrationRequest
  Response: Job ID and status
```
## Configuration
### Adjustable Constants
```typescript
const INPUT_PASS_MULTIPLIER = 2.0;    // 2 × I for input passes
const TOTAL_LOAD_MULTIPLIER = 3.2;    // Target: O + 2I ≈ 3.2I
const OUTPUT_TARGET_MULTIPLIER = 1.2; // Design target: O ≈ 1.2 × I
const INPUT_PROCESSING_RATE = 15;     // units/second
const OUTPUT_PROCESSING_RATE = 8;     // units/second
```
### Tuning Recommendations
- **High-volume scenarios**: Increase processing rates
- **Complex outputs**: Adjust output weights
- **Strict SLAs**: Add buffer time (20% recommended)
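
The 20% buffer can be applied when comparing the runtime estimate with an SLA. This is a sketch that assumes the buffer multiplies the estimated runtime; `BUFFER_FACTOR` and `fitsWithinSla` are hypothetical names and are not among the constants listed above.

```typescript
const BUFFER_FACTOR = 1.2; // 20% buffer, as recommended for strict SLAs

// True when the buffered estimate still fits inside the SLA.
function fitsWithinSla(estimatedSeconds: number, slaSeconds: number): boolean {
  return estimatedSeconds * BUFFER_FACTOR <= slaSeconds;
}

console.log(fitsWithinSla(88, 120)); // true: 88 × 1.2 ≈ 105.6, which is under 120
console.log(fitsWithinSla(88, 100)); // false: 105.6 exceeds 100
```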
## Related Documentation
- [Orchestration Engine Design](./ORCHESTRATION_DESIGN.md)
- [Output Weight Guidelines](./OUTPUT_WEIGHTS.md)
- [User Guide](./USER_GUIDE.md)