mirror of
https://github.com/Frierenclaw/heiter.git
synced 2026-06-22 03:20:04 +00:00
Wise memory keeper. A high-throughput FastAPI inference server that manages LLM generation and context streaming.
| src | ||
| .gitignore | ||
| LICENSE | ||
| README.md | ||
🦋 Heiter (The Inference Server)
"The wise memory keeper. A repository of knowledge and character."
Heiter is the intelligence layer and the "brain" of the Frieren AI Ecosystem. It acts as a high-throughput, asynchronous inference server that manages language model context, character personality, and provides OpenAI-compatible streaming endpoints for the pipeline.
🔮 Responsibilities
- The Grimoire (LLM Layer): Exposes a fully OpenAI-compatible API (
/v1/chat/completions) to serve the underlying language model text generation seamlessly. - The Scribe (Sber STT): Integrates high-accuracy Speech-to-Text capabilities via sberr tts (local!) to instantly process user speech chunks.
- The Voice (Edge TTS): Provides lightweight, natural, and low-latency Text-to-Speech generation powered by Edge TTS, keeping the server independent from heavy paid API dependencies.
- The Overseer (Async & Workers): Built on native
asyncioand optimized with background workers to handle multiple concurrent sessions without blocking text/audio generation.
📐 Architecture Integration
🔮 Core Specifications
| Component / Layer | Magic Spell (Tech Stack) | Responsibility & Integration |
|---|---|---|
| The Grimoire (LLM Layer) | FastAPI + OpenAI API | Exposes a fully OpenAI-compatible API (/v1/chat/completions) to serve the underlying language model text generation seamlessly. |
| The Scribe (STT) | Sber STT (LOCAL!) | Integrates high-accuracy local Speech-to-Text capabilities to instantly process incoming user speech chunks with minimal latency. |
| The Voice (TTS) | edge-tts | Provides lightweight, natural, and low-latency Text-to-Speech generation, keeping the server independent from heavy paid cloud APIs. |
| The Overseer (Async Core) | Python 3.14 + asyncio | Powered by native asyncio and optimized background workers to handle multiple concurrent sessions without blocking text/audio generation. |
🛠 Tech Stack & Spells
- Core Server: Python 3.14 (Driven by
asyncio&Uvicorn) - API Framework:
FastAPI - Speech Synthesis:
edge-tts - Speech Recognition: Sber STT Integration (LOCAL!)
- HTTP Client:
httpx
🚀 Quick Start
- Clone the grimoire:
git clone https://github.com/Frierenclaw/heiter.git
cd heiter
-
Prepare the ingredients: Install docker
-
Configure your secrets: Create a
.env -
Cast the spell:
docker compose up --build
Part of the Frieren AI Ecosystem.