Multi-Model Routing in Data Pipelines: A Decision Framework

Shannon Lowder

16 Oct 2025 — 1 min read

A year ago, most production pipelines that used LLMs had a single model choice baked in. Today, with the open-weight landscape as competitive as it is and multiple model families offering genuinely different capability-cost tradeoffs, building a routing layer is worth the effort. Here's the framework I actually use.

Classify Your Tasks First

Before you can route, you need to understand what your pipeline is actually asking models to do. Most LLM calls in a data pipeline fall into a few categories: classification (is this record valid?), extraction (pull these fields from this text), generation (write a SQL query for this intent), diagnosis (why did this job fail?), and planning (what's the remediation sequence?). Each has different accuracy requirements, latency tolerances, and cost sensitivities.
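One way to make these categories concrete is to name them in code, so every LLM call in the pipeline has to declare what kind of work it is. This is a minimal sketch, not a prescribed design: the `TaskType` enum and the `parse_task_type` helper are hypothetical names introduced here for illustration.

```python
from enum import Enum


class TaskType(Enum):
    """The five task categories described above (names are illustrative)."""

    CLASSIFICATION = "classification"  # e.g. is this record valid?
    EXTRACTION = "extraction"          # e.g. pull these fields from this text
    GENERATION = "generation"          # e.g. write a SQL query for this intent
    DIAGNOSIS = "diagnosis"            # e.g. why did this job fail?
    PLANNING = "planning"              # e.g. what's the remediation sequence?


def parse_task_type(label: str) -> TaskType:
    """Turn a config or log label into a TaskType, failing loudly on typos."""
    try:
        return TaskType(label.strip().lower())
    except ValueError:
        valid = ", ".join(t.value for t in TaskType)
        raise ValueError(
            f"unknown task type {label!r}; expected one of: {valid}"
        ) from None
```

Failing on an unknown label is deliberate: a call that cannot be classified should not be silently routed anywhere.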

The Routing Table

Once you've classified your tasks, the routing table becomes fairly mechanical. High-volume classification routes to Haiku or a small open-weight model because cost dominates and accuracy is sufficient. Structured extraction also routes to Haiku for fast, consistent JSON output. Complex SQL generation routes to Sonnet or o3 because accuracy matters more than cost. Root cause diagnosis routes to Sonnet with thinking enabled because reasoning depth matters. Stakeholder reports route to Sonnet or GPT-5 because prose quality matters.
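Written down as data, the table above might look something like the sketch below. The `Route` dataclass, the task keys, and the lowercase model labels are assumptions made for illustration; the labels mirror the model names in the text and are not real provider model IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    candidates: tuple[str, ...]  # model labels from the text, not API model IDs
    reason: str                  # why this task routes here
    thinking: bool = False       # whether extended reasoning should be enabled


# Hypothetical task keys; one entry per row of the routing table in the text.
ROUTING_TABLE: dict[str, Route] = {
    "high_volume_classification": Route(
        ("haiku", "small-open-weight"), "cost dominates; accuracy is sufficient"
    ),
    "structured_extraction": Route(
        ("haiku",), "fast, consistent JSON output"
    ),
    "complex_sql_generation": Route(
        ("sonnet", "o3"), "accuracy matters more than cost"
    ),
    "root_cause_diagnosis": Route(
        ("sonnet",), "reasoning depth matters", thinking=True
    ),
    "stakeholder_report": Route(
        ("sonnet", "gpt-5"), "prose quality matters"
    ),
}

if __name__ == "__main__":
    # Print the table so it can be reviewed alongside the pipeline config.
    for task, r in ROUTING_TABLE.items():
        print(f"{task}: {' or '.join(r.candidates)} ({r.reason})")
```

Keeping the reason next to each entry means the rationale travels with the table, which helps when someone later asks why a task is on an expensive model.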

Make the Router Testable

Your routing logic should be a function you can unit test independently of the models. Hard-coding model names in individual pipeline nodes makes it painful to update when a better cheap model ships. A routing function that accepts a task type and returns a model identifier means one change propagates everywhere.
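A rough sketch of that idea follows. The `route` function, the `DEFAULT_ROUTES` mapping, and the model identifiers are placeholders invented for this example; substitute the task names and exact model IDs your own pipeline uses. The tests are plain functions with assertions, so they run under pytest or can be called directly, and none of them touches a model.

```python
# Placeholder identifiers: swap in the exact model IDs your provider expects.
DEFAULT_ROUTES = {
    "classification": "haiku",
    "extraction": "haiku",
    "sql_generation": "sonnet",
    "diagnosis": "sonnet",
}


def route(task_type: str, routes: dict = DEFAULT_ROUTES) -> str:
    """Return the model identifier for a task type; no model call happens here."""
    try:
        return routes[task_type]
    except KeyError:
        raise ValueError(
            f"no route configured for task type {task_type!r}"
        ) from None


def test_known_task_returns_configured_model():
    assert route("extraction") == "haiku"


def test_unknown_task_fails_loudly():
    try:
        route("translation")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an unconfigured task type")


def test_swapping_a_model_is_one_change():
    # Replacing one entry changes that task only; other routes are untouched.
    updated = {**DEFAULT_ROUTES, "classification": "new-cheap-model"}
    assert route("classification", updated) == "new-cheap-model"
    assert route("extraction", updated) == "haiku"
```

Pipeline nodes then call `route("extraction")` instead of naming a model, so adopting a better cheap model is an edit to one mapping.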

Also: log which model handled which request in your pipeline audit trail. When something goes wrong, you need to know whether the model made a bad decision or the router sent the request to the wrong model, and that log is what lets you tell the two apart. If you want help designing the routing layer, I'm here to help.
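As a sketch of what such an audit entry could contain, the snippet below writes one JSON line per routed request using only the standard library. The logger name `pipeline.audit`, the `audit_record` function, and the field names are assumptions for illustration; adapt them to whatever audit format your pipeline already has.

```python
import json
import logging
import uuid
from datetime import datetime, timezone
from typing import Optional

# Hypothetical logger name; attach whatever handler your audit trail uses.
audit_log = logging.getLogger("pipeline.audit")


def audit_record(task_type: str, model: str, request_id: Optional[str] = None) -> dict:
    """Log which model handled which request, and return the logged record."""
    record = {
        "request_id": request_id or str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "task_type": task_type,  # what the router was asked to handle
        "model": model,          # what the router actually chose
    }
    audit_log.info(json.dumps(record))
    return record
```

Recording the task type next to the chosen model is what separates the two failure modes: a wrong model for the task type points at the router, while the right model with a bad answer points at the model.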
