# Virtual Banker Architecture

## Overview

The Virtual Banker is a multi-layered system that provides a digital human banking experience with full video realism, real-time voice interaction, and embeddable widget capabilities.

## System Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                        Client Layer                         │
│  ┌──────────────────────────────────────────────────────┐   │
│  │         Embeddable Widget (React/TypeScript)         │   │
│  │  - Chat UI                                           │   │
│  │  - Voice Controls                                    │   │
│  │  - Avatar View                                       │   │
│  │  - WebRTC Client                                     │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                         Edge Layer                          │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐    │
│   │     CDN      │   │ API Gateway  │   │    WebRTC    │    │
│   │   (Widget)   │   │ (Auth/Rate)  │   │   Gateway    │    │
│   └──────────────┘   └──────────────┘   └──────────────┘    │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                        Core Services                        │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐    │
│   │   Session    │   │ Orchestrator │   │ LLM Gateway  │    │
│   │   Service    │   │              │   │              │    │
│   └──────────────┘   └──────────────┘   └──────────────┘    │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐    │
│   │ RAG Service  │   │ Tool/Action  │   │   Safety/    │    │
│   │              │   │   Service    │   │  Compliance  │    │
│   └──────────────┘   └──────────────┘   └──────────────┘    │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                       Media Services                        │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐    │
│   │ ASR Service  │   │ TTS Service  │   │    Avatar    │    │
│   │ (Streaming)  │   │ (Streaming)  │   │   Renderer   │    │
│   └──────────────┘   └──────────────┘   └──────────────┘    │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                         Data Layer                          │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐    │
│   │  PostgreSQL  │   │    Redis     │   │  Vector DB   │    │
│   │   (State)    │   │   (Cache)    │   │  (pgvector)  │    │
│   └──────────────┘   └──────────────┘   └──────────────┘    │
└─────────────────────────────────────────────────────────────┘
```

## Data Flow

### Voice Turn Flow

1. **User speaks** → Widget captures audio via microphone
2. **Audio stream** → WebRTC gateway → ASR service
3. **ASR** → Transcribes to text (partial + final)
4. **Orchestrator** → Sends transcript to LLM with context
5. **LLM** → Generates response + tool calls + emotion tags
6. **TTS** → Converts text to audio stream
7. **Avatar** → Generates visemes, expressions, gestures
8. **Widget** → Plays audio, displays captions, animates avatar

### Text Turn Flow

1. **User types** → Widget sends text message
2. **Orchestrator** → Processes message (same as step 4+ above)

## Components

### Backend Services

#### Session Service

- Creates and manages sessions
- Issues ephemeral tokens
- Loads tenant configurations
- Tracks session state

#### Conversation Orchestrator

- Maintains conversation state machine
- Routes messages to appropriate services
- Handles barge-in (interruptions)
- Synchronizes audio/video

#### LLM Gateway

- Multi-tenant prompt templates
- Function/tool calling
- Output schema enforcement
- Model routing

#### RAG Service

- Document ingestion and embedding
- Vector similarity search
- Reranking
- Citation formatting

#### Tool/Action Service

- Tool registry and execution
- Banking service integrations
- Human-in-the-loop confirmations
- Audit logging

### Frontend Widget

#### Components

- **ChatPanel**: Main chat interface
- **VoiceControls**: Push-to-talk, hands-free, volume
- **AvatarView**: Video stream display
- **Captions**: Real-time captions overlay
- **Settings**: User preferences

#### Hooks

- **useSession**: Session management
- **useConversation**: Message handling
- **useWebRTC**: WebRTC connection

### Avatar System

#### Unreal Engine

- Digital human character
- Blendshapes for visemes/expressions
- Animation blueprints
- PixelStreaming for video output

#### Render Service

- Controls Unreal instances
- Manages GPU resources
- Streams video via WebRTC

## Security

- JWT/SSO authentication
- Ephemeral session tokens
- PII redaction
- Content filtering
- Rate limiting
- Audit trails

## Accessibility

- WCAG 2.1 AA compliance
- Keyboard navigation
- Screen reader support
- Captions (always available)
- Reduced motion support
- ARIA labels

## Scalability

- Stateless services (behind load balancer)
- Redis for session caching
- PostgreSQL for persistent state
- GPU cluster for avatar rendering
- CDN for widget assets
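
## Example: Turn State Machine Sketch

The Conversation Orchestrator is described as maintaining a conversation state machine and handling barge-in. The sketch below shows one way such a machine might look for a single voice turn. It is an illustration under stated assumptions, not the project's implementation: the state names (`idle`, `listening`, `thinking`, `speaking`), the event names, and the `transition` and `replay` functions are all hypothetical and do not come from the codebase.

```typescript
// Hypothetical turn states; the real orchestrator may use different ones.
type TurnState = "idle" | "listening" | "thinking" | "speaking";

// Hypothetical events, loosely following the Voice Turn Flow steps.
type TurnEvent =
  | { kind: "userSpeechStarted" }                 // microphone audio detected
  | { kind: "finalTranscript"; text: string }     // ASR produced a final transcript
  | { kind: "responseReady" }                     // LLM/TTS output is ready to play
  | { kind: "playbackFinished" };                 // widget finished playing the reply

interface Transition {
  next: TurnState;
  // True when in-flight TTS/avatar output should be cancelled (barge-in).
  cancelOutput: boolean;
}

function transition(state: TurnState, event: TurnEvent): Transition {
  if (event.kind === "userSpeechStarted") {
    // Barge-in: the user talks while a reply is being prepared or played.
    const interrupting = state === "thinking" || state === "speaking";
    return { next: "listening", cancelOutput: interrupting };
  }
  if (event.kind === "finalTranscript" && state === "listening") {
    return { next: "thinking", cancelOutput: false };
  }
  if (event.kind === "responseReady" && state === "thinking") {
    return { next: "speaking", cancelOutput: false };
  }
  if (event.kind === "playbackFinished" && state === "speaking") {
    return { next: "idle", cancelOutput: false };
  }
  // Events that do not apply to the current state are ignored.
  return { next: state, cancelOutput: false };
}

// Replays a sequence of events from "idle" and records every state visited.
function replay(events: TurnEvent[]): { states: TurnState[]; cancellations: number } {
  let state: TurnState = "idle";
  const states: TurnState[] = [state];
  let cancellations = 0;
  for (const event of events) {
    const result = transition(state, event);
    if (result.cancelOutput) {
      cancellations += 1;
    }
    state = result.next;
    states.push(state);
  }
  return { states, cancellations };
}
```

Replaying `userSpeechStarted`, `finalTranscript`, `responseReady`, and then a second `userSpeechStarted` walks through `idle`, `listening`, `thinking`, `speaking`, and back to `listening`, with one cancellation recorded for the interruption. Keeping `transition` a pure function fits the stateless-service model listed under Scalability, since the current state could be stored externally (for example in Redis) between events.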