The Day Someone Told Me to Specialize

About two years ago, I was in a conversation with a senior engineer I respected. I'd just walked him through a project where I'd built a custom vision pipeline, designed the user interface around its latency constraints, and shipped the whole thing to production. His response: "That's cool, but you need to pick a lane. Are you an ML engineer or a product engineer?"

I understood the logic. The industry loves clean titles. ML engineers train models. Product engineers ship features. Data engineers build pipelines. Everyone has a box. Stay in it.

But I kept running into the same problem: the best ML work I saw was going unused because nobody on the product side understood what the model could actually do. And the best product ideas were being watered down because nobody on the engineering side knew how to design around a model's real constraints — not the theoretical ones from a research paper, but the actual, messy, production-grade constraints.

So I refused to pick. And that refusal became a superpower.

What an AI Product Engineer Actually Does

Let me be clear about what I mean, because "AI Product Engineer" sounds like a LinkedIn buzzword, and I hate LinkedIn buzzwords.

I'm not talking about someone who calls an OpenAI API and puts a chat interface on it. That's a wrapper, not a product. I'm talking about someone who understands model capabilities deeply enough to design products around their constraints — and who understands users deeply enough to know which constraints matter and which don't.

Here's a concrete example. When I was building SIMO Avatar, the AI-generated avatar system, the vision model we used had a latency of about 3.2 seconds per generation at the quality level we needed. A pure ML engineer might look at that number and say "we need to optimize the model." A pure product engineer might say "users won't wait 3 seconds, kill the feature."

What I did instead: I designed the UX flow so that avatar generation happened asynchronously during onboarding. While the user was filling out their profile details — name, bio, preferences — the model was running in the background. By the time they hit "Continue," their avatar was ready. The 3.2-second latency was completely invisible.

That solution didn't require a faster model or a compromised product. It required someone who understood both the model's behavior and the user's journey well enough to see the gap where one could hide inside the other.
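That flow can be sketched in a few lines of asyncio. This is a minimal illustration of the pattern, not SIMO Avatar's actual code — the function names are invented, and the delays are scaled down from the real ~3.2-second generation so the example runs quickly:

```python
import asyncio

# Delays are scaled down from the real ~3.2 s generation for this example.
MODEL_LATENCY = 0.32
PROFILE_TIME = 0.50

async def generate_avatar(user_id: str) -> str:
    """Stand-in for the vision-model call."""
    await asyncio.sleep(MODEL_LATENCY)
    return f"avatar-{user_id}.png"

async def collect_profile_details(user_id: str) -> dict:
    # Simulates the user typing their name, bio, and preferences.
    await asyncio.sleep(PROFILE_TIME)
    return {"user_id": user_id, "name": "..."}

async def onboarding_flow(user_id: str) -> dict:
    # Kick off avatar generation immediately, in the background.
    avatar_task = asyncio.create_task(generate_avatar(user_id))

    # Meanwhile, the user fills out their profile.
    profile = await collect_profile_details(user_id)

    # By the time they hit "Continue" the task has already finished,
    # so this await returns instantly -- the latency is invisible.
    profile["avatar_url"] = await avatar_task
    return profile

profile = asyncio.run(onboarding_flow("u123"))
```

The whole trick is that the model's latency runs concurrently with time the user was going to spend anyway, so neither side pays for it.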

The SALAMA Lesson: When Users Need to See the Uncertainty

SALAMA taught me the opposite lesson. It's an AI-powered fact-checking system for media content — a tool where getting it wrong has real consequences. Misinformation doesn't just annoy users; it erodes trust in institutions.

The model produced confidence scores for its fact-check assessments. Early in development, we had a debate: should we show these scores to users, or just show a binary "True / False / Unverified" badge?

The ML engineer's instinct was to hide the scores. "Users don't understand probability. They'll misinterpret a 72% confidence as the model being wrong 28% of the time." And technically, that's a valid concern.

But I'd spent time watching journalists actually use fact-checking tools. They don't want a magic oracle that says "TRUE" with false certainty. They want to understand how much digging they still need to do. A claim rated at 95% confidence? Probably safe to reference. A claim at 62%? That's a flag to investigate further, not a verdict.

We shipped the confidence scores. We designed a visual system — a colored bar that went from red through amber to green — that communicated uncertainty intuitively without requiring users to parse numbers. We added a brief explanation of why the model was uncertain: "Limited source corroboration" or "Conflicting information from verified sources."

# SALAMA's confidence display logic
def format_confidence(score: float, factors: list[str]) -> FactCheckDisplay:
    return FactCheckDisplay(
        verdict=classify_verdict(score),
        confidence_bar=ConfidenceBar(
            value=score,
            color=interpolate_color(RED, AMBER, GREEN, score),
        ),
        explanation=generate_uncertainty_reason(factors),
        # This was the key product decision:
        show_raw_score=True,
        disclaimer="AI-assisted assessment. Editorial judgment recommended."
    )

That feature — showing uncertainty honestly — became SALAMA's most praised aspect in user testing. Journalists trusted the tool more because it was honest about what it didn't know. A pure ML engineer would have hidden the scores. A pure product person might not have understood why the scores existed in the first place. The right answer required both perspectives.

The Four Skills You Actually Need

I've been thinking about what makes this role work, and it boils down to four overlapping competencies. You don't need to be world-class at any of them. You need to be competent at all four.

1. Enough ML to Know What's Possible and What's BS

You need to read a model card and understand what it actually means. You need to know that "97% accuracy on the benchmark" doesn't mean 97% accuracy on your data. You need to understand why fine-tuning a 7B model on 200 examples won't give you GPT-4 performance. You need to smell BS when a vendor tells you their model "understands" your domain.

I'm not saying you need to derive backpropagation from scratch. I'm saying you need to have trained enough models, read enough papers, and failed enough times to have calibrated intuition. When someone proposes a feature that requires real-time video understanding at 60fps on a mobile device, you need the gut sense to say "not yet" — and know approximately when "yet" might be.

2. Enough Product Sense to Know What Users Actually Need

The graveyard of AI products is full of technically impressive demos that nobody uses. I've seen teams spend six months building a model that generates beautiful marketing copy, only to discover that marketers don't actually want generated copy — they want a first draft they can edit. Completely different UX. Completely different product.

With AllysAI, our accessibility tool, we learned this the hard way. Our first version analyzed web pages and produced a detailed accessibility report — a comprehensive, technically accurate document listing every WCAG violation. Users hated it. Not because it was wrong, but because it was overwhelming. A page with 47 accessibility issues? Where do you even start?

We rebuilt the output around prioritized, actionable suggestions. "Fix these 3 critical issues first. They affect 80% of your users with disabilities." Same model underneath. Radically different product. Usage went up 4x.
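The core of that rebuild was a ranking layer, not a new model. A minimal sketch of the idea — the field names and the severity/impact scale are hypothetical, not AllysAI's actual schema:

```python
from dataclasses import dataclass

@dataclass
class Issue:
    rule: str              # e.g. a WCAG success criterion like "1.4.3 Contrast"
    severity: int          # 1 (minor) to 3 (critical) -- an invented scale
    users_affected: float  # estimated fraction of affected users, invented

def prioritize(issues: list[Issue], top_n: int = 3) -> list[Issue]:
    # Rank by impact so the report leads with the few fixes that matter
    # most, instead of dumping all 47 violations at once.
    ranked = sorted(issues, key=lambda i: i.severity * i.users_affected,
                    reverse=True)
    return ranked[:top_n]
```

Everything the model found is still there underneath; the product just stops asking the user to triage it themselves.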

3. Enough Engineering to Ship It Fast

Ideas are worthless without execution. In the AI space especially, the window between "interesting idea" and "commoditized feature" is shrinking to months. If you can't prototype, test, and ship quickly, someone else will.

This means: you can build a reliable API, you understand caching and queuing, you know how to deploy a model behind a load balancer, you can write a database migration without sweating, and you can set up monitoring that tells you when your model is hallucinating in production before your users tell you on Twitter.

The engineering bar isn't "can you build a distributed training cluster." It's "can you take something from prototype to production in a week without it falling over."

4. Enough Design Sense to Make It Feel Good

AI products have a unique design challenge: they're probabilistic. They're wrong sometimes, and they need to communicate uncertainty, loading states, and failures gracefully. A regular web app either loads the data or shows an error. An AI product might return something that's 70% right, and the interface needs to communicate that without destroying user trust.

The streaming text effect in ChatGPT? That's a design decision born from understanding the model's token-by-token generation constraint. The progress indicators in Midjourney? Same thing. The best AI UX doesn't fight the model's nature — it embraces it.

The People Who Get Stuck

I see two failure modes constantly.

The pure ML engineer who builds an incredible model and then throws it over the wall. "Here's the API, it takes a JSON payload and returns a prediction." No thought given to how a user would actually interact with it, what happens when the model is uncertain, or how the product should degrade when the model fails. They've built a brain with no body.

The pure product person who specs features based on what they've seen in demos and blog posts. "We need a feature that automatically categorizes all incoming documents with 99% accuracy." They don't understand that 99% accuracy on a benchmark is not 99% accuracy on their messy, real-world data. They spec the impossible and then blame engineering when it doesn't work. They've designed a body for a brain that doesn't exist.

The AI Product Engineer lives in the gap between these two. It's not a comfortable place. You're too product-minded for the ML team and too technical for the product team. But it's exactly where the best AI products get designed.

This Is the Role the Industry Needs

We're past the phase where AI was a research novelty. We're in the phase where AI needs to become useful products that real people use every day. That transition requires people who can hold the model in one hand and the user in the other and figure out how they fit together.

The most dangerous person in AI right now is someone who can both build the model and ship the product. That's who I'm trying to be.

I don't have it all figured out. Every project teaches me something new about where these skills intersect. Building Sarathi taught me that ML constraints can drive product innovation. SALAMA taught me that product decisions can make ML outputs more trustworthy. SIMO Avatar taught me that the best engineering often happens in the seams between model and interface.

If you're an ML engineer reading this: learn to ship products. Not just models, not just APIs — products. Spend time watching users. Feel the pain of a confused customer. It will make your technical work 10x more impactful.

If you're a product engineer: learn enough ML to have calibrated intuition. You don't need a PhD. You need to have fine-tuned a model once, hit a wall, and understood why. That experience alone will make you a better product thinker in the AI era.

And if someone tells you to pick a lane — smile, nod, and keep building in both.