Session Management
The LLMSessionManager class in main_logic/core.py is the central coordinator for each character's conversation sessions. Each character has its own manager instance.
Session lifecycle
new connection ──> start_session() ──> stream_data() ──> end_session()
│ │
│ hot-swap to
│ pre-warmed session
│
Creates OmniRealtimeClient
Starts TTS worker thread
Prepares next session (background)Key attributes
| Attribute | Type | Purpose |
|---|---|---|
websocket | WebSocket | Current client connection |
lanlan_name | str | Character identifier |
session | OmniRealtimeClient | Current LLM session |
is_active | bool | Whether session is running |
input_mode | str | "audio" or "text" |
voice_id | str | Character's TTS voice ID |
tts_request_queue | Queue | Outgoing TTS requests |
tts_response_queue | Queue | Incoming TTS audio |
agent_flags | dict | Agent capability flags |
hot_swap_audio_cache | list | Audio buffered during swap |
Hot-swap mechanism
The hot-swap system ensures zero-downtime session transitions:
Prepare: While the current session handles user input, a new
OmniRealtimeClientsession is created in the background with the latest character configuration.Cache: When
end_session()is called, any in-flight audio output is stored inhot_swap_audio_cache.Swap:
_perform_final_swap_sequence()atomically replaces the old session with the new one.Flush: Cached audio is sent to the client, providing a seamless transition.
This means the character can update its personality, voice, or model settings between conversation turns without the user experiencing any delay.
Audio processing
Audio flows through a resampling pipeline:
LLM output (24kHz PCM) ──> soxr resampler ──> 48kHz PCM ──> base64 ──> WebSocketThe resampler uses soxr (high-quality sample rate conversion) to convert from the LLM's native 24kHz output to the browser's expected 48kHz playback rate.
Agent integration
The session manager coordinates with the agent system through callbacks:
- Agent results arrive via ZeroMQ on the
MainServerAgentBridge - Results are dispatched to the relevant
LLMSessionManagerviapending_agent_callbacks trigger_agent_callbacks()injects agent results into the next LLM conversation turn- The LLM can then reference agent findings in its response to the user
Translation support
translate_if_needed() provides automatic translation when the user's language differs from the character's configured language. This uses the TranslationService which falls back through googletrans → translatepy → LLM-based translation.
