# Ledger Correctness Boundaries - Implementation Summary
This document summarizes the implementation of ledger correctness boundaries that enforce the separation between authoritative ledger operations and external synchronization.
## Overview
DBIS Core maintains an **authoritative ledger** (issuance, settlement, balances) while also orchestrating **dual-ledger synchronization** with external SCB ledgers. This requires two different correctness regimes:
1. **Authoritative ledger correctness** (must be atomic, invariant-safe)
2. **External synchronization correctness** (must be idempotent, replayable, eventually consistent)
## Architecture Changes
### 1. Atomic Ledger Posting (Postgres as Ledger Engine)
**Problem**: Balance updates were happening in separate Prisma calls, risking race conditions and inconsistent state.
**Solution**: Created a `post_ledger_entry()` SQL function that:
- Enforces idempotency via unique constraint on `(ledger_id, reference_id)`
- Updates balances atomically within the same transaction as entry creation
- Uses deadlock-safe lock ordering
- Computes block hash with hash chaining
- Validates sufficient funds at DB level
**Location**: `db/migrations/005_post_ledger_entry.sql`
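
Two of the properties listed above, deadlock-safe lock ordering and hash chaining, can be illustrated outside the database. The TypeScript below is a minimal sketch and not the contents of `005_post_ledger_entry.sql`: the helper names `lockOrder` and `chainHash`, the choice of SHA-256, and the layout of the hashed payload are all assumptions made for illustration.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: return the two account IDs in a fixed (lexicographic)
// order. If every transaction locks rows in this order, concurrent transfers
// A -> B and B -> A cannot end up waiting on each other in a cycle.
function lockOrder(debitAccountId: string, creditAccountId: string): [string, string] {
  return debitAccountId < creditAccountId
    ? [debitAccountId, creditAccountId]
    : [creditAccountId, debitAccountId];
}

// Hypothetical helper: derive an entry's block hash from the previous hash
// plus the entry payload, so changing an earlier entry changes every later hash.
function chainHash(previousHash: string, payload: string): string {
  return createHash('sha256')
    .update(previousHash)
    .update('|')
    .update(payload)
    .digest('hex');
}

// Assumed genesis value and payload format, for demonstration only.
const genesis = '0'.repeat(64);
const h1 = chainHash(genesis, 'test-ref-1:account1:account2:100');
const h2 = chainHash(h1, 'test-ref-2:account2:account1:40');
```

Because `h2` is computed from `h1`, recomputing the chain after tampering with the first payload yields a different `h2`, which is what makes the chain verifiable.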
### 2. Dual-Ledger Outbox Pattern
**Problem**: The original implementation posted to the SCB ledger first, then to DBIS. If the SCB was unavailable, DBIS could not commit, which violated the "DBIS is authoritative" principle.
**Solution**: Implemented transactional outbox pattern:
- DBIS commits first (authoritative)
- Outbox event created in same transaction
- Async worker processes outbox jobs
- Idempotent retries with exponential backoff
- State machine enforces valid transitions
**Files**:
- `db/migrations/002_dual_ledger_outbox.sql` - Outbox table
- `db/migrations/003_outbox_state_machine.sql` - State machine constraints
- `src/workers/dual-ledger-outbox.worker.ts` - Worker service
- `src/workers/run-dual-ledger-outbox.ts` - Worker runner
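
The retry behaviour mentioned above can be sketched as a capped exponential backoff schedule. This is an illustration only: the function name `retryDelayMs` and the base and maximum delays are assumptions, since the worker's actual parameters are not stated in this document.

```typescript
// Hypothetical parameters; the real worker's values may differ.
const BASE_DELAY_MS = 1_000;
const MAX_DELAY_MS = 60_000;

// Delay before retry number `attempt` (1-based):
// BASE_DELAY_MS * 2^(attempt - 1), capped at MAX_DELAY_MS.
function retryDelayMs(attempt: number): number {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new RangeError('attempt must be a positive integer');
  }
  return Math.min(BASE_DELAY_MS * 2 ** (attempt - 1), MAX_DELAY_MS);
}

// First five delays under the assumed parameters: 1000, 2000, 4000, 8000, 16000.
const schedule = [1, 2, 3, 4, 5].map(retryDelayMs);
```

Backoff only spaces the retries out. Retrying is safe because each job carries the same reference ID on every attempt, so a repeated delivery can be recognised rather than applied twice.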
### 3. Guarded Access Module
**Problem**: Any code could directly mutate `ledger_entries` or `bank_accounts`, bypassing correctness guarantees.
**Solution**: Created `LedgerPostingModule`, which is the **only** allowed path for mutating the ledger:
- All mutations go through atomic SQL function
- Direct balance updates are banned
- Singleton pattern enforces single access point
**Location**: `src/core/ledger/ledger-posting.module.ts`
### 4. Refactored GSS Master Ledger Service
**Changes**:
- **DBIS-first**: Posts to DBIS ledger first (authoritative)
- **Transactional**: DBIS post + outbox creation + master record in single transaction
- **Non-blocking**: Returns immediately; SCB sync happens async
- **Explicit states**: `DBIS_COMMITTED` → `SETTLED` (when SCB sync completes)
**Location**: `src/core/settlement/gss/gss-master-ledger.service.ts`
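
The "single transaction" property can be illustrated with an in-memory model: the ledger entry and its outbox row are either both kept or both discarded. This is a sketch under stated assumptions, not the service's code. The `Store` shape, `inTransaction`, and `postWithOutbox` are hypothetical names, the master record is omitted for brevity, and rejecting a duplicate reference with an error is only one possible way to surface the idempotency constraint.

```typescript
interface LedgerEntry { ledgerId: string; referenceId: string; amount: string; }
interface OutboxRow { referenceId: string; status: 'QUEUED'; }
interface Store { entries: LedgerEntry[]; outbox: OutboxRow[]; }

// Run `work` against a copy of the store. The copy replaces the original
// only if `work` returns normally; a throw leaves `store` untouched.
function inTransaction(store: Store, work: (draft: Store) => void): Store {
  const draft: Store = { entries: [...store.entries], outbox: [...store.outbox] };
  work(draft);
  return draft;
}

// Post an entry and queue its SCB sync job as one unit of work.
function postWithOutbox(store: Store, entry: LedgerEntry): Store {
  return inTransaction(store, (draft) => {
    const duplicate = draft.entries.some(
      (e) => e.ledgerId === entry.ledgerId && e.referenceId === entry.referenceId,
    );
    if (duplicate) {
      throw new Error(`duplicate reference: ${entry.referenceId}`);
    }
    draft.entries.push(entry);
    draft.outbox.push({ referenceId: entry.referenceId, status: 'QUEUED' });
  });
}
```

In this model a caller never observes an entry without its outbox row, which is the guarantee the real implementation gets from committing both inside one database transaction.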
## Migration Files
All migrations are in `db/migrations/`:
1. **001_ledger_idempotency.sql** - Unique constraint on `(ledger_id, reference_id)`
2. **002_dual_ledger_outbox.sql** - Outbox table with indexes
3. **003_outbox_state_machine.sql** - Status transition enforcement
4. **004_balance_constraints.sql** - Balance integrity constraints
5. **005_post_ledger_entry.sql** - Atomic posting function
## State Machine
### Outbox States
```
QUEUED → SENT → ACKED → FINALIZED
  ↓       ↓       ↓
FAILED ← FAILED ← FAILED
(retry)
```
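
The diagram above can be expressed as a transition table. The sketch below is one reading of it, not the trigger defined in `003_outbox_state_machine.sql`: in particular, treating "(retry)" as a `FAILED` → `QUEUED` transition is an assumption, and `canTransition` is a hypothetical helper name.

```typescript
type OutboxStatus = 'QUEUED' | 'SENT' | 'ACKED' | 'FINALIZED' | 'FAILED';

// Allowed next states for each state, as read from the diagram.
// FAILED -> QUEUED models the "(retry)" edge and is an assumption.
const ALLOWED_TRANSITIONS: Record<OutboxStatus, readonly OutboxStatus[]> = {
  QUEUED: ['SENT', 'FAILED'],
  SENT: ['ACKED', 'FAILED'],
  ACKED: ['FINALIZED', 'FAILED'],
  FINALIZED: [], // terminal state
  FAILED: ['QUEUED'],
};

function canTransition(from: OutboxStatus, to: OutboxStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```

A check like this in the worker gives an early, readable error, while the database trigger remains the enforcement point that application code cannot bypass.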
### Master Ledger States
- `PENDING` - Initial state
- `DBIS_COMMITTED` - DBIS ledger posted, SCB sync queued
- `SETTLED` - Both ledgers synchronized
- `FAILED` - Posting failed
## Key Constraints
### Database Level
1. **Idempotency**: `UNIQUE (ledger_id, reference_id)` on `ledger_entries`
2. **Balance integrity**:
   - `available_balance >= 0`
   - `reserved_balance >= 0`
   - `available_balance <= balance`
   - `(available_balance + reserved_balance) <= balance`
3. **State transitions**: Trigger enforces valid outbox status transitions
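
The balance constraints above can be mirrored in application code, for example to scan existing rows before applying `004_balance_constraints.sql`. The sketch below is illustrative only: the `AccountBalances` shape, the use of integer minor units, and the function name `balanceViolations` are assumptions.

```typescript
// Amounts are integer minor units (for example cents) to avoid float rounding.
interface AccountBalances {
  balance: number;
  availableBalance: number;
  reservedBalance: number;
}

// Return the database-level constraints that this row would violate.
function balanceViolations(a: AccountBalances): string[] {
  const violations: string[] = [];
  if (a.availableBalance < 0) violations.push('available_balance >= 0');
  if (a.reservedBalance < 0) violations.push('reserved_balance >= 0');
  if (a.availableBalance > a.balance) violations.push('available_balance <= balance');
  if (a.availableBalance + a.reservedBalance > a.balance) {
    violations.push('(available_balance + reserved_balance) <= balance');
  }
  return violations;
}
```

When `reserved_balance >= 0` holds, the fourth constraint already implies the third, so a row that fails `available_balance <= balance` also fails the sum check.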
### Application Level
1. **Guarded access**: Only `LedgerPostingModule` can mutate ledger
2. **Atomic operations**: All posting via SQL function
3. **Transactional outbox**: Outbox creation in same transaction as posting
## Usage
### Posting to Master Ledger
```typescript
import { gssMasterLedgerService } from '@/core/settlement/gss/gss-master-ledger.service';
const result = await gssMasterLedgerService.postToMasterLedger({
  nodeId: 'SSN-1',
  sourceBankId: 'SCB-1',
  destinationBankId: 'SCB-2',
  amount: '1000.00',
  currencyCode: 'USD',
  assetType: 'fiat',
  sovereignSignature: '...',
}, 'my-reference-id');
// Returns immediately with DBIS hash
// SCB sync happens async via outbox worker
```
### Running Outbox Worker
```bash
# Run worker process
npm run worker:dual-ledger-outbox
# Or use process manager
pm2 start src/workers/run-dual-ledger-outbox.ts
```
## Testing
### Verify Migrations
```sql
-- Check idempotency constraint
SELECT constraint_name
FROM information_schema.table_constraints
WHERE table_name = 'ledger_entries'
  AND constraint_name LIKE '%reference%';
-- Check outbox table
SELECT COUNT(*) FROM dual_ledger_outbox;
-- Test posting function
SELECT * FROM post_ledger_entry(
  'Test'::TEXT,
  'account1'::TEXT,
  'account2'::TEXT,
  100::NUMERIC,
  'USD'::TEXT,
  'fiat'::TEXT,
  'Type_A'::TEXT,
  'test-ref-123'::TEXT,
  NULL::NUMERIC,
  NULL::JSONB
);
```
### Verify State Machine
```sql
-- Try invalid transition (should fail)
UPDATE dual_ledger_outbox
SET status = 'QUEUED'
WHERE status = 'FINALIZED';
-- ERROR: Invalid outbox transition: FINALIZED -> QUEUED
```
## Next Steps
1. **Apply migrations** in order (see `db/migrations/README.md`)
2. **Update Prisma schema** (already done - `dual_ledger_outbox` model added)
3. **Deploy worker** to process outbox jobs
4. **Implement SCB API client** in `DualLedgerOutboxWorker.callScbLedgerApi()`
5. **Add monitoring** for outbox queue depth and processing latency
6. **Add reconciliation** job to detect and fix sync failures
## Breaking Changes
### API Changes
- `postToMasterLedger()` now returns immediately with `dualCommit: false`
- `sovereignLedgerHash` is `null` initially (populated by worker)
- Status is `DBIS_COMMITTED` instead of `settled` initially
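
Callers that previously assumed a fully settled result need to handle the intermediate state. The sketch below shows one way to do that. The `PostResultSketch` type is hypothetical and covers only the fields named above; the real return type of `postToMasterLedger()` may contain more fields or use different names.

```typescript
type MasterLedgerStatus = 'PENDING' | 'DBIS_COMMITTED' | 'SETTLED' | 'FAILED';

// Hypothetical shape limited to the fields this section mentions.
interface PostResultSketch {
  status: MasterLedgerStatus;
  dualCommit: boolean;
  sovereignLedgerHash: string | null;
}

// Narrow the result to one whose SCB hash is known to be present.
function isFullySettled(
  r: PostResultSketch,
): r is PostResultSketch & { sovereignLedgerHash: string } {
  return r.status === 'SETTLED' && r.sovereignLedgerHash !== null;
}

// What a caller sees immediately after posting, per the changes above.
const initial: PostResultSketch = {
  status: 'DBIS_COMMITTED',
  dualCommit: false,
  sovereignLedgerHash: null,
};
```

Code that reads `sovereignLedgerHash` should first check a guard like `isFullySettled`, since the field stays `null` until the outbox worker completes the SCB sync.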
### Database Changes
- New constraint on `ledger_entries` (idempotency)
- New balance constraints (may fail if data is inconsistent)
- New `dual_ledger_outbox` table
### Code Changes
- Direct use of `ledgerService.postDoubleEntry()` for GSS should be replaced with `ledgerPostingModule.postEntry()`
- Direct balance updates via Prisma are now banned (use `ledgerPostingModule`)
## Rollback Plan
If needed, migrations can be rolled back:
```sql
-- Drop function (substitute the function's actual argument type list for ...)
DROP FUNCTION IF EXISTS post_ledger_entry(...);
-- Drop outbox table
DROP TABLE IF EXISTS dual_ledger_outbox CASCADE;
-- Remove constraints
ALTER TABLE ledger_entries
  DROP CONSTRAINT IF EXISTS ledger_entries_unique_ledger_reference;
ALTER TABLE bank_accounts
  DROP CONSTRAINT IF EXISTS bank_accounts_reserved_nonnegative,
  DROP CONSTRAINT IF EXISTS bank_accounts_available_nonnegative,
  DROP CONSTRAINT IF EXISTS bank_accounts_balance_consistency;
```
## References
- Architecture discussion: See user query about "hard mode" answer
- Transactional Outbox Pattern: https://microservices.io/patterns/data/transactional-outbox.html
- Prisma transaction docs: https://www.prisma.io/docs/concepts/components/prisma-client/transactions