Wise memory keeper. A high-throughput FastAPI inference server that manages LLM generation and context streaming.
Find a file
2026-06-14 02:14:07 +03:00
src fix double app init issue 2026-06-11 05:22:52 +03:00
.gitignore Заготовка для системы аунтефикации пользователей 2026-05-04 20:26:01 +03:00
LICENSE include license 2026-06-14 02:14:07 +03:00
README.md Readme fix 2026-06-03 21:51:00 +03:00

🦋 Heiter (The Inference Server)

"The wise memory keeper. A repository of knowledge and character."

Heiter is the intelligence layer and the "brain" of the Frieren AI Ecosystem. It acts as a high-throughput, asynchronous inference server that manages language model context, character personality, and provides OpenAI-compatible streaming endpoints for the pipeline.

🔮 Responsibilities

  • The Grimoire (LLM Layer): Exposes a fully OpenAI-compatible API (/v1/chat/completions) to serve the underlying language model text generation seamlessly.
  • The Scribe (Sber STT): Integrates high-accuracy Speech-to-Text capabilities via sberr tts (local!) to instantly process user speech chunks.
  • The Voice (Edge TTS): Provides lightweight, natural, and low-latency Text-to-Speech generation powered by Edge TTS, keeping the server independent from heavy paid API dependencies.
  • The Overseer (Async & Workers): Built on native asyncio and optimized with background workers to handle multiple concurrent sessions without blocking text/audio generation.

📐 Architecture Integration

🔮 Core Specifications

Component / Layer Magic Spell (Tech Stack) Responsibility & Integration
The Grimoire (LLM Layer) FastAPI + OpenAI API Exposes a fully OpenAI-compatible API (/v1/chat/completions) to serve the underlying language model text generation seamlessly.
The Scribe (STT) Sber STT (LOCAL!) Integrates high-accuracy local Speech-to-Text capabilities to instantly process incoming user speech chunks with minimal latency.
The Voice (TTS) edge-tts Provides lightweight, natural, and low-latency Text-to-Speech generation, keeping the server independent from heavy paid cloud APIs.
The Overseer (Async Core) Python 3.14 + asyncio Powered by native asyncio and optimized background workers to handle multiple concurrent sessions without blocking text/audio generation.

🛠 Tech Stack & Spells

  • Core Server: Python 3.14 (Driven by asyncio & Uvicorn)
  • API Framework: FastAPI
  • Speech Synthesis: edge-tts
  • Speech Recognition: Sber STT Integration (LOCAL!)
  • HTTP Client: httpx

🚀 Quick Start

  1. Clone the grimoire:
   git clone https://github.com/Frierenclaw/heiter.git
   cd heiter
  1. Prepare the ingredients: Install docker

  2. Configure your secrets: Create a .env

  3. Cast the spell:

   docker compose up --build

Part of the Frieren AI Ecosystem.