Key Data Flows:
Audio Flow:
- Microphone → Audio Manager → WebSocket Manager → OpenAI
- OpenAI → WebSocket Manager → Audio Manager → Speakers
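The hop between the Audio Manager and the WebSocket Manager can be sketched as below: raw audio bytes are base64-encoded into a JSON text message on the way out, and decoded again on the way in. This is a minimal sketch, not the project's actual code. The function names are hypothetical, and the event type `input_audio_buffer.append` and the `delta` field are assumptions about the message format that should be checked against the API version in use.

```python
import base64
import json


def audio_chunk_to_event(pcm_chunk: bytes) -> str:
    """Wrap raw audio bytes in a JSON text message for the WebSocket."""
    return json.dumps({
        # Assumed event name for appending microphone audio.
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm_chunk).decode("ascii"),
    })


def event_to_audio_chunk(message: str) -> bytes:
    """Extract raw audio bytes from an incoming audio event.

    Assumes the base64 payload is carried in a "delta" field.
    """
    event = json.loads(message)
    return base64.b64decode(event["delta"])
```

Encoding audio as base64 text keeps every message a plain JSON string, at the cost of roughly one third more bytes than the raw audio.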
Event Flow:
- OpenAI → WebSocket Manager → Event Handler → Appropriate Component
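One way to route each event to the "appropriate component" is a small dispatch table keyed on the event's `type` field. The sketch below is illustrative only: the `EventHandler` class, its `on` and `dispatch` methods, and the example event names are assumptions, not the project's actual interface.

```python
import json
from collections import defaultdict
from typing import Callable


class EventHandler:
    """Routes parsed WebSocket messages to handlers registered per event type."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def on(self, event_type: str, handler: Callable[[dict], None]) -> None:
        """Register a component callback for one event type."""
        self._handlers[event_type].append(handler)

    def dispatch(self, raw_message: str) -> int:
        """Parse a message, call the matching handlers, and return how many ran."""
        event = json.loads(raw_message)
        handlers = self._handlers.get(event.get("type", ""), [])
        for handler in handlers:
            handler(event)
        return len(handlers)
```

With this shape, the Audio Manager and the Function Call Handler would each register for the event types they care about, and unknown event types are ignored rather than raising.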
Function Call Flow:
- OpenAI → WebSocket Manager → Event Handler → Function Call Handler → Custom Tools
- Custom Tools → Function Call Handler → WebSocket Manager → OpenAI
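The round trip above can be sketched as a registry of named tools plus a handler that runs the requested tool and packages the result for the return path. The message shape used here (`name`, a JSON-encoded `arguments` string, and a `call_id`) is an assumption for illustration, as are the class and method names.

```python
import json
from typing import Any, Callable


class FunctionCallHandler:
    """Looks up a custom tool by name, runs it, and builds the reply payload."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        """Make a custom tool available under the given name."""
        self._tools[name] = func

    def handle(self, call: dict) -> dict:
        """Run the requested tool and return a payload to send back."""
        try:
            args = json.loads(call.get("arguments") or "{}")
            output = {"result": self._tools[call["name"]](**args)}
        except Exception as exc:
            # Report failures as data so the conversation can continue.
            output = {"error": f"{type(exc).__name__}: {exc}"}
        return {"call_id": call.get("call_id"), "output": json.dumps(output)}
```

Returning errors as output, instead of raising, lets the WebSocket Manager always send a reply for a given `call_id`.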
State Management:
- All components read from and write to the State Manager
- The State Manager is the single shared source of state, which keeps components consistent
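A shared State Manager of this kind could look roughly like the following, assuming components may run on different threads (for example, an audio callback thread alongside the WebSocket loop). The class name, method names, and state keys are hypothetical.

```python
import threading
from typing import Any


class StateManager:
    """Thread-safe key/value store shared by all components."""

    def __init__(self, **initial: Any) -> None:
        self._lock = threading.Lock()
        self._state: dict[str, Any] = dict(initial)

    def update(self, **changes: Any) -> None:
        """Apply several changes atomically so readers never see a partial update."""
        with self._lock:
            self._state.update(changes)

    def get(self, key: str, default: Any = None) -> Any:
        """Read a single value."""
        with self._lock:
            return self._state.get(key, default)

    def snapshot(self) -> dict[str, Any]:
        """Return a copy, so callers cannot mutate the shared state by accident."""
        with self._lock:
            return dict(self._state)
```

For example, `state.update(connected=True, speaking=False)` changes both keys under one lock acquisition, and `state.snapshot()` gives a component a consistent view to work from.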
graph TB
subgraph Client Application
Main[Main Application]
end
subgraph RealtimeAPI Core
WSM[WebSocket Manager]
EH[Event Handler]
AM[Audio Manager]
FCH[Function Call Handler]
State[State Manager]
end
subgraph External Services
OpenAI[OpenAI WebSocket API]
Audio[Audio I/O Hardware]
Tools[Custom Tools/Functions]
end
%% WebSocket Flow
WSM <-->|WebSocket Messages| OpenAI
%% Audio Flow
AM <-->|Audio Stream| Audio
AM -->|Audio Data| WSM
%% State Updates
State -.->|State Updates| WSM
State -.->|State Updates| AM
State -.->|State Updates| FCH
classDef core fill:#f9f,stroke:#333,stroke-width:2px
classDef external fill:#bbf,stroke:#333,stroke-width:2px
class WSM,EH,AM,FCH,State core
class OpenAI,Audio,Tools external
Basics of WebSockets in Python
How are WebSockets used for audio in Python?