FusionAGI Interface Layer

Complete multi-modal interface system for admin control and user interaction.

Overview

FusionAGI now provides two comprehensive interface layers:

  1. Admin Control Panel - System management and configuration
  2. Multi-Modal User Interface - Full sensory user experience

flowchart TB
    subgraph admin [Admin Control Panel]
        Voice[Voice Library]
        Conv[Conversation Tuning]
        Agent[Agent Config]
        Monitor[System Monitoring]
        Gov[Governance / MAA]
    end

    subgraph core [FusionAGI Core]
        Orch[Orchestrator]
        Mem[Memory]
        Tools[Tools]
    end

    subgraph ui [Multi-Modal User Interface]
        Text[Text]
        VoiceUI[Voice]
        Visual[Visual]
        Haptic[Haptic]
        Session[Session Mgmt]
        Task[Task Integration]
    end

    admin --> Orch
    Orch --> Mem
    Orch --> Tools
    ui --> Orch
    Session --> Task

Admin Control Panel

Administrative interface for managing all aspects of FusionAGI.

Features

  • Voice Library Management: Add, configure, and organize TTS voice profiles
  • Conversation Tuning: Configure natural language styles and personalities
  • Agent Configuration: Manage agent settings, permissions, and behavior
  • System Monitoring: Real-time health metrics and performance tracking
  • Governance: Policy management and audit log access
  • Manufacturing Authority: MAA configuration and oversight

Usage

AdminControlPanel accepts optional voice_library and conversation_tuner arguments (both default to None); when omitted, internal defaults are created automatically.
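
For example, a minimal panel that relies on those defaults (using the core components created in the full example below):

admin = AdminControlPanel(orchestrator=orch, event_bus=bus, state_manager=state)
# voice_library and conversation_tuner are created internally when omitted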

from fusionagi import Orchestrator, EventBus, StateManager
from fusionagi.interfaces import AdminControlPanel, VoiceLibrary, ConversationTuner
from fusionagi.governance import PolicyEngine, AuditLog

# Initialize core components
bus = EventBus()
state = StateManager()
orch = Orchestrator(event_bus=bus, state_manager=state)

# Create admin panel
admin = AdminControlPanel(
    orchestrator=orch,
    event_bus=bus,
    state_manager=state,
    voice_library=VoiceLibrary(),
    conversation_tuner=ConversationTuner(),
)

# Add voice profiles
from fusionagi.interfaces.voice import VoiceProfile

voice = VoiceProfile(
    name="Professional Assistant",
    language="en-US",
    gender="neutral",
    style="professional",
    pitch=1.0,
    speed=1.0,
)
admin.add_voice_profile(voice)

# Configure conversation styles
from fusionagi.interfaces.conversation import ConversationStyle

style = ConversationStyle(
    formality="neutral",
    verbosity="balanced",
    empathy_level=0.8,
    technical_depth=0.6,
)
admin.register_conversation_style("technical_support", style)

# Monitor system
status = admin.get_system_status()
print(f"Status: {status.status}, Active tasks: {status.active_tasks}")

# Export configuration
config = admin.export_configuration()
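
Assuming export_configuration returns a JSON-serializable dict (its exact return type is not documented here), the snapshot can be persisted for backup or review:

import json

with open("admin_config.json", "w") as f:
    json.dump(config, f, indent=2)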

Multi-Modal User Interface

Unified interface supporting multiple sensory modalities simultaneously.

Supported Modalities

  • Text: Chat, commands, structured input
  • Voice: Speech-to-text, text-to-speech
  • Visual: Images, video, AR/VR (extensible)
  • Haptic: Touch feedback, vibration patterns (extensible)
  • Gesture: Motion control, hand tracking (extensible)
  • Biometric: Emotion detection, physiological signals (extensible)

Features

  • Seamless modality switching (see the sketch after the usage example below)
  • Simultaneous multi-modal I/O
  • Accessibility support
  • Context-aware modality selection
  • Real-time feedback across all active modalities

Usage

from fusionagi.interfaces import MultiModalUI, VoiceInterface, ConversationManager
from fusionagi.interfaces.base import ModalityType

# Initialize components
voice = VoiceInterface(stt_provider="whisper", tts_provider="elevenlabs")
conv_manager = ConversationManager()

# Create multi-modal UI (reusing the orchestrator `orch` from the admin example above)
ui = MultiModalUI(
    orchestrator=orch,
    conversation_manager=conv_manager,
    voice_interface=voice,
)

# Create user session with preferred modalities
session_id = ui.create_session(
    user_id="user123",
    preferred_modalities=[ModalityType.TEXT, ModalityType.VOICE],
    accessibility_settings={"screen_reader": True},
)

# Send multi-modal output
await ui.send_to_user(
    session_id,
    "Hello! How can I help you today?",
    modalities=[ModalityType.TEXT, ModalityType.VOICE],
)

# Receive user input (any active modality)
message = await ui.receive_from_user(session_id, timeout_seconds=30.0)

# Submit task with interactive feedback
task_id = await ui.submit_task_interactive(
    session_id,
    goal="Analyze sales data and create report",
)

# Conversational interaction
response = await ui.converse(session_id, "What's the status of my task?")
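
The feature list above mentions seamless modality switching; a minimal sketch, reusing enable_modality from the extensibility section and assuming a matching disable_modality exists:

# Drop voice output when the user mutes; text stays active
ui.disable_modality(session_id, ModalityType.VOICE)  # hypothetical counterpart to enable_modality

# Restore voice later and confirm across both channels
ui.enable_modality(session_id, ModalityType.VOICE)
await ui.send_to_user(
    session_id,
    "Voice responses are back on.",
    modalities=[ModalityType.TEXT, ModalityType.VOICE],
)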

Voice Interface

Speech interaction with configurable voice profiles.

Voice Library

from fusionagi.interfaces import VoiceLibrary, VoiceProfile

library = VoiceLibrary()

# Add multiple voices
voices = [
    VoiceProfile(
        name="Friendly Assistant",
        language="en-US",
        gender="female",
        style="friendly",
        pitch=1.1,
        speed=1.0,
    ),
    VoiceProfile(
        name="Technical Expert",
        language="en-US",
        gender="male",
        style="professional",
        pitch=0.9,
        speed=0.95,
    ),
    VoiceProfile(
        name="Multilingual Guide",
        language="es-ES",
        gender="neutral",
        style="calm",
    ),
]

for voice in voices:
    library.add_voice(voice)

# Set default
library.set_default_voice(voices[0].id)

# Filter voices
spanish_voices = library.list_voices(language="es-ES")
female_voices = library.list_voices(gender="female")

Speech-to-Text Providers

Supported STT providers (extensible):

  • Whisper: OpenAI Whisper (local or API)
  • Azure: Azure Cognitive Services
  • Google: Google Cloud Speech-to-Text
  • Deepgram: Deepgram API

Text-to-Speech Providers

Supported TTS providers (extensible):

  • System: OS-native TTS (pyttsx3)
  • ElevenLabs: ElevenLabs API
  • Azure: Azure Cognitive Services
  • Google: Google Cloud TTS
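
Both provider families are selected by name when constructing VoiceInterface, as in the usage example above; the lowercase identifiers here are assumed to match the provider names listed:

# Local-first setup: Whisper STT with OS-native TTS
local_voice = VoiceInterface(stt_provider="whisper", tts_provider="system")

# Cloud setup: Deepgram STT with ElevenLabs TTS
cloud_voice = VoiceInterface(stt_provider="deepgram", tts_provider="elevenlabs")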

Conversation Management

Natural language conversation with tunable styles.

Conversation Styles

from fusionagi.interfaces import ConversationTuner, ConversationStyle

tuner = ConversationTuner()

# Define conversation styles
styles = {
    "customer_support": ConversationStyle(
        formality="neutral",
        verbosity="balanced",
        empathy_level=0.9,
        proactivity=0.8,
        technical_depth=0.4,
    ),
    "technical_expert": ConversationStyle(
        formality="formal",
        verbosity="detailed",
        empathy_level=0.5,
        technical_depth=0.9,
        humor_level=0.1,
    ),
    "casual_friend": ConversationStyle(
        formality="casual",
        verbosity="balanced",
        empathy_level=0.8,
        humor_level=0.7,
        technical_depth=0.3,
    ),
}

for name, style in styles.items():
    tuner.register_style(name, style)

# Tune for specific context
tuned_style = tuner.tune_for_context(
    domain="technical",
    user_preferences={"verbosity": "concise"},
)
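
A tuned style can be registered under a new name and then referenced by sessions (register_style as shown above; the session API is covered next):

tuner.register_style("technical_concise", tuned_style)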

Conversation Sessions

from fusionagi.interfaces import ConversationManager, ConversationTurn

manager = ConversationManager(tuner=tuner)

# Create session
session_id = manager.create_session(
    user_id="user123",
    style_name="customer_support",
    language="en",
    domain="technical_support",
)

# Add conversation turns
manager.add_turn(ConversationTurn(
    session_id=session_id,
    speaker="user",
    content="My system is not responding",
    sentiment=-0.3,
))

manager.add_turn(ConversationTurn(
    session_id=session_id,
    speaker="agent",
    content="I understand that's frustrating. Let me help you troubleshoot.",
    sentiment=0.5,
))

# Get conversation history
history = manager.get_history(session_id, limit=10)

# Get context for LLM
context = manager.get_context_summary(session_id)
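
A minimal sketch of feeding that context into an LLM prompt, assuming get_context_summary returns plain text (the return type is not specified here):

prompt = (
    "You are a technical-support agent.\n"
    f"Conversation context:\n{context}\n"
    "Respond to the user's latest message."
)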

Extending with New Modalities

To add a new sensory modality:

  1. Create Interface Adapter:

import uuid

from fusionagi.interfaces.base import InterfaceAdapter, InterfaceCapabilities, InterfaceMessage, ModalityType

class HapticInterface(InterfaceAdapter):
    def __init__(self):
        super().__init__("haptic")
    
    def capabilities(self) -> InterfaceCapabilities:
        return InterfaceCapabilities(
            supported_modalities=[ModalityType.HAPTIC],
            supports_streaming=True,
            supports_interruption=True,
        )
    
    async def send(self, message: InterfaceMessage) -> None:
        # Send haptic feedback (vibration pattern, etc.)
        pattern = message.content
        await self._send_haptic_pattern(pattern)
    
    async def receive(self, timeout_seconds: float | None = None) -> InterfaceMessage | None:
        # Receive haptic input (touch, pressure, etc.)
        data = await self._read_haptic_sensor(timeout_seconds)
        if data is None:
            return None  # timed out with no input
        return InterfaceMessage(
            id=f"haptic_{uuid.uuid4().hex[:8]}",
            modality=ModalityType.HAPTIC,
            content=data,
        )

  2. Register with UI:

haptic = HapticInterface()
ui.register_interface(ModalityType.HAPTIC, haptic)

  3. Enable for Session:

ui.enable_modality(session_id, ModalityType.HAPTIC)
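
Once enabled, the new modality rides the same I/O path as the built-in ones; a sketch, with the haptic payload shape purely illustrative:

await ui.send_to_user(
    session_id,
    {"pattern": "double_pulse", "intensity": 0.6},  # hypothetical payload shape
    modalities=[ModalityType.HAPTIC],
)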

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     Admin Control Panel                     │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐         │
│  │Voice Library │ │ Conversation │ │ Agent Config │         │
│  │  Management  │ │    Tuning    │ │              │         │
│  └──────────────┘ └──────────────┘ └──────────────┘         │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐         │
│  │    System    │ │  Governance  │ │     MAA      │         │
│  │  Monitoring  │ │   & Audit    │ │   Control    │         │
│  └──────────────┘ └──────────────┘ └──────────────┘         │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                    FusionAGI Core System                    │
│            (Orchestrator, Agents, Memory, Tools)            │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                 Multi-Modal User Interface                  │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐        │
│  │   Text   │ │  Voice   │ │  Visual  │ │  Haptic  │        │
│  │Interface │ │Interface │ │Interface │ │Interface │        │
│  └──────────┘ └──────────┘ └──────────┘ └──────────┘        │
│  ┌──────────┐ ┌──────────┐                                  │
│  │ Gesture  │ │Biometric │                                  │
│  │Interface │ │Interface │                                  │
│  └──────────┘ └──────────┘                                  │
└─────────────────────────────────────────────────────────────┘

Integration with FusionAGI Core

The interface layer integrates seamlessly with FusionAGI's core components:

  • Orchestrator: Task submission and monitoring
  • Event Bus: Real-time updates and notifications (see the sketch after this list)
  • Agents: Direct agent interaction and configuration
  • Memory: Conversation history and user preferences
  • Governance: Policy enforcement and audit logging
  • MAA: Manufacturing authority oversight
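
A sketch of the Event Bus hookup, assuming EventBus exposes a subscribe(topic, handler) method (the subscription API is not shown elsewhere in this document):

def on_task_update(event: dict) -> None:
    # Forward orchestrator task events to connected interfaces
    print(f"Task {event['task_id']} -> {event['status']}")

bus.subscribe("task.updated", on_task_update)  # hypothetical topic name and signature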

Next Steps

  1. Implement STT/TTS Providers: Integrate with actual speech services
  2. Build Web UI: Create web-based admin panel and user interface
  3. Add Visual Modality: Support images, video, AR/VR
  4. Implement Haptic: Add haptic feedback support
  5. Gesture Recognition: Integrate motion tracking
  6. Biometric Sensors: Add emotion and physiological monitoring
  7. Mobile Apps: Native iOS/Android interfaces
  8. Accessibility: Enhanced screen reader and assistive technology support

License

MIT