Virtual Banker Architecture
Overview
The Virtual Banker is a multi-layered system that delivers a digital human banking experience. It combines a realistically rendered video avatar, real-time voice interaction, and a widget that host sites can embed.
System Architecture
┌─────────────────────────────────────────────────────────────┐
│                        Client Layer                         │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Embeddable Widget (React/TypeScript)                │   │
│  │  - Chat UI                                           │   │
│  │  - Voice Controls                                    │   │
│  │  - Avatar View                                       │   │
│  │  - WebRTC Client                                     │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                         Edge Layer                          │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ CDN          │  │ API Gateway  │  │ WebRTC       │       │
│  │ (Widget)     │  │ (Auth/Rate)  │  │ Gateway      │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                        Core Services                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ Session      │  │ Orchestrator │  │ LLM Gateway  │       │
│  │ Service      │  │              │  │              │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ RAG Service  │  │ Tool/Action  │  │ Safety/      │       │
│  │              │  │ Service      │  │ Compliance   │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                       Media Services                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ ASR Service  │  │ TTS Service  │  │ Avatar       │       │
│  │ (Streaming)  │  │ (Streaming)  │  │ Renderer     │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                         Data Layer                          │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ PostgreSQL   │  │ Redis        │  │ Vector DB    │       │
│  │ (State)      │  │ (Cache)      │  │ (pgvector)   │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────────┘
Data Flow
Voice Turn Flow
1. User speaks → Widget captures audio via the microphone
2. Audio stream → WebRTC gateway → ASR service
3. ASR → Transcribes speech to text (partial and final results)
4. Orchestrator → Sends the transcript to the LLM with conversation context
5. LLM → Generates the response, tool calls, and emotion tags
6. TTS → Converts the response text to an audio stream
7. Avatar → Generates visemes, expressions, and gestures
8. Widget → Plays audio, displays captions, animates the avatar
Text Turn Flow
1. User types → Widget sends a text message
2. Orchestrator → Processes the message, then continues from step 4 of the Voice Turn Flow
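The two flows differ only in how the transcript is obtained: a voice turn passes through ASR first, while a text turn joins at the orchestrator. The sketch below illustrates that shared path. Every type and function name in it (`TurnInput`, `Stages`, `runTurn`, and so on) is an assumption made for illustration, and the stages are synchronous stubs, whereas the real services stream.

```typescript
// All names here are illustrative assumptions, not the project's actual API.
// Real stages would stream asynchronously; stubs keep the sketch runnable.
type TurnInput =
  | { kind: "voice"; audio: Uint8Array }
  | { kind: "text"; text: string };

interface LlmReply {
  text: string;
  toolCalls: string[];
  emotion: string;
}

interface Stages {
  transcribe(audio: Uint8Array): string; // ASR (voice step 3)
  generate(transcript: string): LlmReply; // Orchestrator + LLM (steps 4-5)
  synthesize(text: string): Uint8Array; // TTS (step 6)
  animate(reply: LlmReply): string[]; // Avatar cues (step 7)
}

interface TurnResult {
  transcript: string;
  reply: LlmReply;
  audio: Uint8Array;
  avatarCues: string[];
  stagesRun: string[];
}

function runTurn(input: TurnInput, stages: Stages): TurnResult {
  const stagesRun: string[] = [];
  let transcript: string;
  if (input.kind === "voice") {
    // Only voice turns need ASR.
    transcript = stages.transcribe(input.audio);
    stagesRun.push("asr");
  } else {
    transcript = input.text;
  }
  // From here on, voice and text turns follow the same path.
  const reply = stages.generate(transcript);
  stagesRun.push("llm");
  const audio = stages.synthesize(reply.text);
  stagesRun.push("tts");
  const avatarCues = stages.animate(reply);
  stagesRun.push("avatar");
  return { transcript, reply, audio, avatarCues, stagesRun };
}

// Stub stages so the sketch runs without any backend.
const stubStages: Stages = {
  transcribe: () => "what is my balance",
  generate: (t) => ({ text: `You asked: ${t}`, toolCalls: [], emotion: "neutral" }),
  synthesize: (text) => new Uint8Array(text.length),
  animate: (reply) => [`emotion:${reply.emotion}`],
};

const voiceTurn = runTurn({ kind: "voice", audio: new Uint8Array(4) }, stubStages);
const textTurn = runTurn({ kind: "text", text: "hello" }, stubStages);
```

With these stubs, `voiceTurn.stagesRun` lists ASR before the LLM stage, while `textTurn.stagesRun` starts at the LLM stage.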
Components
Backend Services
Session Service
- Creates and manages sessions
- Issues ephemeral tokens
- Loads tenant configurations
- Tracks session state
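As a rough illustration of these responsibilities, the sketch below keeps sessions in memory and issues tokens that expire after a per-tenant lifetime. The `TenantConfig` and `Session` shapes, the `tokenTtlMs` setting, and the injected `makeToken` generator are all assumptions; a real service would use a cryptographically secure token source and a shared store.

```typescript
// Hypothetical shapes; field names and the TTL policy are assumptions.
interface TenantConfig {
  tenantId: string;
  tokenTtlMs: number;
}

interface Session {
  id: string;
  tenantId: string;
  token: string;
  expiresAt: number; // epoch milliseconds
  state: "active" | "ended";
}

class SessionService {
  private sessions = new Map<string, Session>();
  private tenants: Map<string, TenantConfig>;
  private now: () => number;
  private makeToken: () => string;
  private nextId = 1;

  constructor(
    tenants: Map<string, TenantConfig>,
    now: () => number,
    makeToken: () => string, // use a cryptographically secure generator in practice
  ) {
    this.tenants = tenants;
    this.now = now;
    this.makeToken = makeToken;
  }

  // Loads the tenant configuration and issues an ephemeral token.
  create(tenantId: string): Session {
    const tenant = this.tenants.get(tenantId);
    if (!tenant) throw new Error(`unknown tenant: ${tenantId}`);
    const session: Session = {
      id: `s${this.nextId++}`,
      tenantId,
      token: this.makeToken(),
      expiresAt: this.now() + tenant.tokenTtlMs,
      state: "active",
    };
    this.sessions.set(session.token, session);
    return session;
  }

  // Returns the session only while it is active and its token is unexpired.
  validate(token: string): Session | undefined {
    const session = this.sessions.get(token);
    if (!session || session.state !== "active") return undefined;
    if (this.now() >= session.expiresAt) return undefined;
    return session;
  }

  end(token: string): void {
    const session = this.sessions.get(token);
    if (session) session.state = "ended";
  }
}

let clock = 0; // fake time in milliseconds
let tokenCounter = 0;
const sessionService = new SessionService(
  new Map<string, TenantConfig>([["acme", { tenantId: "acme", tokenTtlMs: 1000 }]]),
  () => clock,
  () => `tok-${++tokenCounter}`, // deterministic for the demo only
);
const demoSession = sessionService.create("acme");
```

Injecting the clock and token generator keeps the expiry behaviour easy to test without waiting for real time to pass.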
Conversation Orchestrator
- Maintains conversation state machine
- Routes messages to appropriate services
- Handles barge-in (interruptions)
- Synchronizes audio/video
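One way to picture the state machine and barge-in handling is a pure transition function. The states, events, and action names below are assumptions chosen for illustration; the actual orchestrator may model the conversation differently.

```typescript
// States, events, and action names are assumptions chosen for illustration.
type ConvState = "idle" | "listening" | "thinking" | "speaking";
type TurnEvent =
  | "user_speech_start"
  | "final_transcript"
  | "response_ready"
  | "playback_done";

interface StepResult {
  state: ConvState;
  actions: string[];
}

function step(state: ConvState, event: TurnEvent): StepResult {
  switch (state) {
    case "idle":
      if (event === "user_speech_start") return { state: "listening", actions: ["start_asr"] };
      // A typed message arrives as a final transcript without any listening phase.
      if (event === "final_transcript") return { state: "thinking", actions: ["call_llm"] };
      break;
    case "listening":
      if (event === "final_transcript") return { state: "thinking", actions: ["call_llm"] };
      break;
    case "thinking":
      if (event === "response_ready") {
        return { state: "speaking", actions: ["start_tts", "start_avatar"] };
      }
      // Barge-in before the reply starts: drop the pending response.
      if (event === "user_speech_start") {
        return { state: "listening", actions: ["cancel_llm", "start_asr"] };
      }
      break;
    case "speaking":
      // Barge-in: the user interrupts playback.
      if (event === "user_speech_start") {
        return { state: "listening", actions: ["stop_tts", "stop_avatar", "start_asr"] };
      }
      if (event === "playback_done") return { state: "idle", actions: [] };
      break;
  }
  // Events that do not apply in the current state are ignored.
  return { state, actions: [] };
}

// Replay a voice turn that the user interrupts while the avatar is speaking.
const demoEvents: TurnEvent[] = [
  "user_speech_start",
  "final_transcript",
  "response_ready",
  "user_speech_start",
];
let current: ConvState = "idle";
const trace: ConvState[] = [];
for (const e of demoEvents) {
  current = step(current, e).state;
  trace.push(current);
}
```

Keeping the transition function free of side effects means the returned action names can be dispatched to the TTS, avatar, and ASR services by a separate layer.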
LLM Gateway
- Multi-tenant prompt templates
- Function/tool calling
- Output schema enforcement
- Model routing
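Output schema enforcement can be sketched as a validator that rejects any model output not matching the expected reply shape. The `BankerReply` structure follows the response, tool calls, and emotion tags described in the Voice Turn Flow, but its field names and the list of allowed emotion tags are assumptions.

```typescript
// The reply shape and the emotion labels are assumptions for illustration.
const EMOTIONS = ["neutral", "friendly", "concerned"] as const;
type Emotion = (typeof EMOTIONS)[number];

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface BankerReply {
  text: string;
  emotion: Emotion;
  toolCalls: ToolCall[];
}

type ParseResult =
  | { ok: true; reply: BankerReply }
  | { ok: false; error: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Parses raw model output and accepts it only if it matches BankerReply.
function parseReply(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
  if (!isRecord(data)) return { ok: false, error: "expected a JSON object" };
  const { text, emotion, toolCalls } = data;
  if (typeof text !== "string" || text.length === 0) {
    return { ok: false, error: "text must be a non-empty string" };
  }
  const knownEmotion = EMOTIONS.find((e) => e === emotion);
  if (knownEmotion === undefined) {
    return { ok: false, error: "emotion is not an allowed tag" };
  }
  if (!Array.isArray(toolCalls)) {
    return { ok: false, error: "toolCalls must be an array" };
  }
  const calls: ToolCall[] = [];
  for (const call of toolCalls as unknown[]) {
    if (!isRecord(call)) return { ok: false, error: "malformed tool call" };
    const toolName = call.name;
    const args = call.args;
    if (typeof toolName !== "string" || !isRecord(args)) {
      return { ok: false, error: "malformed tool call" };
    }
    calls.push({ name: toolName, args });
  }
  return { ok: true, reply: { text, emotion: knownEmotion, toolCalls: calls } };
}

const goodReply = parseReply(
  '{"text":"Your balance is available.","emotion":"friendly","toolCalls":[]}',
);
const badReply = parseReply('{"text":"Hi","emotion":"ecstatic","toolCalls":[]}');
```

Returning a result object instead of throwing lets the caller decide whether to retry the model call or fall back to a safe response.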
RAG Service
- Document ingestion and embedding
- Vector similarity search
- Reranking
- Citation formatting
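The retrieval and citation steps can be illustrated in memory. The sketch below ranks chunks by cosine similarity and formats numbered citations. The `Chunk` shape, the sample data, and the citation format are assumptions, and reranking is left out. In the architecture shown here, the similarity search itself would presumably run in the vector database rather than in application code.

```typescript
// Chunk shape, sample data, and citation format are assumptions.
// This in-memory search stands in for a query against the vector store.
interface Chunk {
  id: string;
  source: string;
  text: string;
  embedding: number[];
}

interface Hit {
  chunk: Chunk;
  score: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Returns the k chunks most similar to the query embedding, best first.
function searchChunks(query: number[], chunks: Chunk[], k: number): Hit[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Numbers each hit so the reply can reference its sources as [1], [2], ...
function formatCitations(hits: Hit[]): string {
  return hits.map((hit, i) => `[${i + 1}] ${hit.chunk.source}`).join("\n");
}

const sampleChunks: Chunk[] = [
  { id: "c1", source: "fees.pdf", text: "Wire transfer fees", embedding: [1, 0, 0] },
  { id: "c2", source: "cards.pdf", text: "Card replacement", embedding: [0, 1, 0] },
  { id: "c3", source: "fees-faq.pdf", text: "Fee waivers", embedding: [0.8, 0.6, 0] },
];
const hits = searchChunks([1, 0, 0], sampleChunks, 2);
const citations = formatCitations(hits);
```

For the sample query, `c1` ranks first and `c3` second, so `citations` lists `fees.pdf` followed by `fees-faq.pdf`.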
Tool/Action Service
- Tool registry and execution
- Banking service integrations
- Human-in-the-loop confirmations
- Audit logging
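A minimal sketch of how a registry, human-in-the-loop confirmation, and audit logging might fit together follows. The tool names (`get_balance`, `transfer_funds`), the `requiresConfirmation` flag, and the audit record shape are all hypothetical.

```typescript
// Tool names, the confirmation flag, and the audit record shape are assumptions.
interface ToolSpec {
  requiresConfirmation: boolean;
  run(args: Record<string, unknown>): unknown;
}

interface AuditEntry {
  tool: string;
  outcome: "executed" | "awaiting_confirmation" | "unknown_tool";
  at: number; // epoch milliseconds
}

type ToolOutcome =
  | { status: "ok"; result: unknown }
  | { status: "needs_confirmation" }
  | { status: "error"; message: string };

class ToolService {
  readonly auditLog: AuditEntry[] = [];
  private tools = new Map<string, ToolSpec>();
  private now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  register(toolName: string, spec: ToolSpec): void {
    this.tools.set(toolName, spec);
  }

  // Runs a tool, but holds sensitive tools until the user has confirmed.
  execute(toolName: string, args: Record<string, unknown>, userConfirmed: boolean): ToolOutcome {
    const spec = this.tools.get(toolName);
    if (!spec) {
      this.record(toolName, "unknown_tool");
      return { status: "error", message: `unknown tool: ${toolName}` };
    }
    if (spec.requiresConfirmation && !userConfirmed) {
      this.record(toolName, "awaiting_confirmation");
      return { status: "needs_confirmation" };
    }
    const result = spec.run(args);
    this.record(toolName, "executed");
    return { status: "ok", result };
  }

  // Every attempt is logged, including ones that did not run.
  private record(tool: string, outcome: AuditEntry["outcome"]): void {
    this.auditLog.push({ tool, outcome, at: this.now() });
  }
}

const toolService = new ToolService(() => 0);
toolService.register("get_balance", {
  requiresConfirmation: false,
  run: () => ({ balance: 120.5 }),
});
toolService.register("transfer_funds", {
  requiresConfirmation: true,
  run: (args) => ({ transferred: args.amount }),
});

const firstAttempt = toolService.execute("transfer_funds", { amount: 50 }, false);
const secondAttempt = toolService.execute("transfer_funds", { amount: 50 }, true);
```

The first call is held back with `needs_confirmation`; only the second, confirmed call runs the tool, and both attempts appear in `auditLog`.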
Frontend Widget
Components
- ChatPanel: Main chat interface
- VoiceControls: Push-to-talk, hands-free, volume
- AvatarView: Video stream display
- Captions: Real-time captions overlay
- Settings: User preferences
Hooks
- useSession: Session management
- useConversation: Message handling
- useWebRTC: WebRTC connection
Avatar System
Unreal Engine
- Digital human character
- Blendshapes for visemes/expressions
- Animation blueprints
- Pixel Streaming for video output
Render Service
- Controls Unreal instances
- Manages GPU resources
- Streams video via WebRTC
Security
- JWT/SSO authentication
- Ephemeral session tokens
- PII redaction
- Content filtering
- Rate limiting
- Audit trails
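Rate limiting is commonly implemented with a token bucket. The sketch below is one such illustration, not the gateway's actual algorithm, which the document does not specify; the capacity and refill rate are made-up values.

```typescript
// A token bucket is one common rate-limiting technique; whether the API
// gateway uses it is an assumption. Capacity and refill rate are illustrative.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private capacity: number;
  private refillPerSecond: number;
  private now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full so short bursts are allowed
    this.lastRefill = now();
  }

  // Returns true if the request may proceed, false if it should be rejected.
  tryAcquire(): boolean {
    const currentTime = this.now();
    const elapsedSeconds = (currentTime - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond,
    );
    this.lastRefill = currentTime;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

let fakeTime = 0; // milliseconds
const bucket = new TokenBucket(2, 1, () => fakeTime);
const burst = [bucket.tryAcquire(), bucket.tryAcquire(), bucket.tryAcquire()];
fakeTime = 1000; // one second later, one token has been refilled
const afterRefill = bucket.tryAcquire();
```

In a multi-tenant deployment, one bucket per tenant or per session token (for example in a `Map` keyed by that identifier) would keep one client from exhausting another's allowance.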
Accessibility
- WCAG 2.1 AA compliance
- Keyboard navigation
- Screen reader support
- Captions (always available)
- Reduced motion support
- ARIA labels
Scalability
- Stateless services behind a load balancer
- Redis for session caching
- PostgreSQL for persistent state
- GPU cluster for avatar rendering
- CDN for widget assets
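Keeping services stateless means any instance can serve any request by reading session state from the cache and falling back to the database on a miss. The sketch below shows that cache-aside pattern, with in-memory stand-ins for Redis and PostgreSQL. The `SessionRecord` shape, the TTL, and all names are assumptions.

```typescript
// In-memory stand-ins: SessionCache plays the role of Redis and the Map
// passed as `database` plays the role of PostgreSQL. All names are assumptions.
interface SessionRecord {
  id: string;
  tenantId: string;
  turnCount: number;
}

class SessionCache {
  private entries = new Map<string, { value: SessionRecord; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(id: string): SessionRecord | undefined {
    const entry = this.entries.get(id);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(id); // expired entries are dropped lazily
      return undefined;
    }
    return entry.value;
  }

  set(record: SessionRecord): void {
    this.entries.set(record.id, { value: record, expiresAt: this.now() + this.ttlMs });
  }
}

// Cache-aside read: try the cache, fall back to the database, then populate.
function loadSession(
  id: string,
  cache: SessionCache,
  database: Map<string, SessionRecord>,
  stats: { dbReads: number },
): SessionRecord | undefined {
  const cached = cache.get(id);
  if (cached) return cached;
  stats.dbReads += 1;
  const record = database.get(id);
  if (record) cache.set(record);
  return record;
}

let nowMs = 0;
const database = new Map<string, SessionRecord>([
  ["s1", { id: "s1", tenantId: "acme", turnCount: 3 }],
]);
const sessionCache = new SessionCache(5000, () => nowMs);
const stats = { dbReads: 0 };

loadSession("s1", sessionCache, database, stats); // miss: reads the database
loadSession("s1", sessionCache, database, stats); // hit: served from the cache
const readsWhileFresh = stats.dbReads;
nowMs = 5000; // the cache entry has now expired
loadSession("s1", sessionCache, database, stats);
const readsAfterExpiry = stats.dbReads;
```

Because the service instance holds no session state of its own, the load balancer is free to route each request to any instance.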