Voice AI

Sarathi Voice Agent

Production voice assistant for Northeast India in Assamese, Bodo, and Hindi. Local STT/TTS on a $24/month server.


About

Sarathi Voice Agent is a production voice assistant that helps citizens in Northeast India navigate government services in their native languages. It runs speech-to-text (STT) and text-to-speech (TTS) models locally, with no cloud speech APIs, because Assamese and Bodo, two of the languages it supports, aren't well served by major cloud providers. Cross-lingual RAG (retrieval-augmented generation) lets queries in regional languages match English knowledge base documents without a translation step.
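
To make the cross-lingual retrieval idea concrete, here is a minimal sketch of nearest-neighbor search in a shared embedding space. It is not the project's actual code. The vectors are hand-written stand-ins for the output of a multilingual encoder, and the document titles, `DOC_VECTORS`, `cosine`, and `retrieve` are hypothetical names chosen for illustration. The point is only that once a regional-language query and the English documents live in the same vector space, ranking needs no translation step.

```python
import math

# Hypothetical, hand-written vectors standing in for a multilingual encoder.
# A real encoder would map text in any supported language into one shared space.
DOC_VECTORS = {
    "How to apply for a ration card": [0.9, 0.1, 0.0],
    "Checking land record status": [0.1, 0.9, 0.1],
    "Scholarship eligibility rules": [0.0, 0.2, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector, doc_vectors, top_k=1):
    """Return the top_k document titles ranked by similarity to the query."""
    ranked = sorted(
        doc_vectors,
        key=lambda title: cosine(query_vector, doc_vectors[title]),
        reverse=True,
    )
    return ranked[:top_k]


# Assumption: this is what the encoder might return for an Assamese
# question about ration cards. The query text itself is never translated.
assamese_query_vector = [0.8, 0.2, 0.1]
best = retrieve(assamese_query_vector, DOC_VECTORS)
print(best)
```

In a real deployment the vectors would come from an embedding model and the ranking would be done by the vector store rather than a Python sort.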

Key Features

  • Local STT (IndicConformer, 0.5s) and TTS (VITS ONNX) -- no cloud speech APIs
  • Cross-lingual RAG: Assamese/Bodo/Hindi queries match English documents
  • Streaming audio via NDJSON -- text and audio play in parallel
  • Circuit breaker fallback: vector search fails over from Pinecone to ChromaDB
  • Runs on a $24/month DigitalOcean server (4 vCPU, 8GB RAM)
  • Live at regional-agent.sarathi.studio serving real citizens
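
The circuit breaker fallback in the list above can be sketched roughly as follows. This is an illustrative pattern, not the project's implementation: `CircuitBreaker`, `pinecone_search`, `chroma_search`, and the thresholds are all hypothetical, and the two search functions are plain stand-ins that do not call either library. The idea is that after repeated failures the primary store is skipped entirely for a cooldown period, so requests go straight to the local fallback instead of waiting on a failing service.

```python
import time


class CircuitBreaker:
    """Route calls to a primary function, falling back after repeated failures."""

    def __init__(self, primary, fallback, max_failures=3, reset_after=30.0,
                 clock=time.monotonic):
        self.primary = primary
        self.fallback = fallback
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to keep the circuit open
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def _is_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.reset_after:
            # Half-open: allow one trial call; a single failure reopens.
            self.opened_at = None
            self.failures = self.max_failures - 1
            return False
        return True

    def call(self, query):
        if self._is_open():
            return self.fallback(query)
        try:
            result = self.primary(query)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return self.fallback(query)
        self.failures = 0  # a success closes the circuit fully
        return result


calls = {"primary": 0}


def pinecone_search(query):
    """Hypothetical stand-in for the hosted store; always fails here."""
    calls["primary"] += 1
    raise ConnectionError("primary vector store unreachable")


def chroma_search(query):
    """Hypothetical stand-in for the local fallback store."""
    return [f"local result for {query!r}"]


breaker = CircuitBreaker(pinecone_search, chroma_search,
                         max_failures=3, reset_after=30.0)
results = [breaker.call("ration card") for _ in range(5)]
print(calls["primary"], results[-1])
```

Here the primary is tried three times, the circuit opens, and the remaining two calls skip it and are answered by the fallback alone.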

Impact

The agent is live in production and serving real citizens. It demonstrates that meaningful AI can run on commodity hardware for underserved populations.