Building intelligent products at the intersection of AI, design, and engineering.
Vision-aware 3D AI avatar that sees, hears, and responds in real time. GPT-4 Vision for scene understanding, MediaPipe + face-api.js for emotion & gesture detection, Whisper STT → Pinecone RAG → ElevenLabs TTS pipeline, HeyGen lip-sync. Demoed live at the Dubai AI Summit.
Enterprise voice consulting agent with sub-4s end-to-end latency. Whisper STT → Pinecone RAG → GPT-4o-mini → ElevenLabs TTS pipeline with barge-in support (interrupt mid-response). Live metrics including readiness score, fit score, and ROI projections for lead qualification.
Full enterprise AI SaaS for the French rental market (~40K LOC). 14-node LangGraph agent for intent classification & legal Q&A. Document AI with DeepSeek OCR + extraction + risk scoring. Legal RAG over 3K Légifrance law chunks. Multi-channel (WhatsApp + in-platform), Stripe billing, GDPR-compliant on Azure France Central.
Production government voice assistant for Northeast India supporting Assamese, Bodo & Hindi. Fully local speech pipeline: IndicConformer STT + VITS ONNX TTS running on a $24/month server. Cross-lingual RAG with Pinecone + ChromaDB, Gemini LLM, streaming NDJSON responses.
Real-time crisis intelligence platform built in 1 week during a live UAE emergency. Aggregates verified news from 10+ official UAE sources, fact-checks social media claims via the Claude API, tracks sentiment, and offers AI-powered Q&A with citations. Real-time updates via SSE.
Local-first, privacy-focused AI desktop app that runs 100% offline. Fully air-gapped LLM chat via Ollama, AES-256 encryption at rest, document analysis (PDF, TXT, MD, JSON), zero telemetry. Built for users who need AI without data leaving their machine.
Production WhatsApp luxury watch market scraper (live, deployed on DigitalOcean). Real-time message capture from trading groups, deduplication, contact enrichment, Google Sheets export. Circuit breaker pattern + rate limiting for resilience. Next.js dashboard + 7K LOC FastAPI backend.
AI resume tailoring tool (8.2K LOC). Upload resume PDF → paste job description → GPT-4 rewrites bullets with targeted keywords while preserving tone. Smart constraints: won’t fabricate experience, won’t add fake jobs, won’t modify dates. Keyword match report + cover letter generation + PDF export.
LangGraph · RAG Pipelines · Pinecone · pgvector · ChromaDB · OpenAI · Claude API · Gemini · Whisper · ElevenLabs · Ollama · ONNX · Embeddings · Prompt Engineering · Reinforcement Learning
React · Next.js · TypeScript · Python · FastAPI · Django · Node.js · Express · PostgreSQL · SQLite · Redis · Prisma · SQLAlchemy · REST APIs · WebSockets · GraphQL
GPT-4 Vision · MediaPipe · face-api.js · Moondream · EasyOCR · Presidio PII · Speech-to-Text · Text-to-Speech · HeyGen · Emotion Detection · On-device ML · Cross-lingual NLP
Docker · Google Cloud · Azure · AWS · Vercel · DigitalOcean · Railway · Celery · Nginx · PM2 · CI/CD · Git
Electron · React Native · Stripe · SendGrid · WhatsApp API · Google Sheets API · JWT · OAuth · GDPR · Tailwind · Vite · Figma
Building enterprise AI products at a Dubai-based agency. Developed SIMO — a vision-aware interactive AI avatar with voice, emotion detection, and gesture recognition showcased at the Dubai AI Summit. Building AllysAI Consulting Agent with sub-4s latency and real-time lead qualification.
Founded an AI agency serving 95+ clients. Built the Sarathi Voice Agent — a production voice assistant for government services in Assamese, Bodo, and Hindi running local speech models on a $24/month server. Also shipped SALAMA (crisis intelligence platform in 1 week), an election intelligence system for a real 2026 campaign, and a digital library serving 30+ schools.
Built real-time analytics dashboards (React, Next.js, D3.js) visualizing 190K+ user events through a Python ETL pipeline on Google Cloud (Pub/Sub → BigQuery). Developed microservices with FastAPI & Node.js on GKE. Implemented user-churn prediction with XGBoost that lifted 7-day retention by 12%.
Integrated a BERT-based sentiment API (Hugging Face Transformers) into a Node backend, reducing inference latency by 40%. Created a Python pipeline for social-metric embeddings (UMAP) powering a React heat map that surfaces trending tokens to 50K daily users. Built a custom analytics bot for a 10K-member developer Discord.
Static retrieval pipelines break when users ask multi-hop questions. Here's how adding planning agents transforms accuracy and user trust.
Users don't wait 12 seconds for a "smarter" answer. How I used streaming and caching to cut LLM response times by 70% without sacrificing quality.
Orchestrating 5+ LLM agents sounds cool until they hallucinate in a loop. A practical framework for guardrails, fallbacks, and graceful degradation.
FastAPI + LangChain + Next.js + Vercel. A repeatable blueprint for going from idea to deployed demo in 48 hours flat.
Not every problem needs a custom model. I break down cost, latency, and accuracy tradeoffs with real benchmarks from production systems.
Neither purely ML nor purely product. Why the most impactful builders in the next decade will be hybrids who speak both languages fluently.