From Rubber Duck to Orchestrator: The Case for LLMs as Pipeline Decision-Makers

I want to trace the arc of the last four years, because I think the endpoint is different from where most people in this space are looking.

It started with TabNine completing Spark boilerplate. Then GPT-3 as a one-shot thinking tool. Then Copilot making AI ambient in the IDE. Then ChatGPT making the rubber duck sessions genuinely conversational. Then GPT-4 reasoning about dependencies in ways the earlier models couldn't. Then the metadata-driven config generation experiments, with their illuminating 80% ceiling.

Each step was incremental. Each step also pointed in the same direction. And the direction isn't "better code generation." It's something else.

The Question That Kept Coming Back

Every time I finished a ChatGPT design session on a pipeline architecture problem, I had the same thought: the model understood the problem well enough to help me decide what to build. It understood the data shapes, the dependency constraints, the downstream consumer requirements I'd described. It reasoned about the tradeoffs.

The thing it couldn't do was act on that understanding. I took the decision back to my editor and implemented it. The model supplied the reasoning; I supplied the action.

That boundary — reason but don't act — is where I've been probing. The question isn't "can a model generate better pipeline code?" The question is "can a model make the orchestration decision about which pipeline to run, when, and why?"

What That Requires

For an LLM to make an orchestration decision rather than just inform one, it needs three things:

Awareness of data state. What partitions exist, what's been processed, what's late, what's complete. This is metadata: the same structured information a well-instrumented pipeline already captures. The audit table pattern from earlier this year is, in effect, an LLM-readable description of data state.

A vocabulary of available actions. Which pipelines can run, what their dependencies are, what their expected outputs are. A pipeline inventory — exactly the kind of structured catalog a config-driven architecture already maintains.

A decision framework. Given state and available actions, what should happen next? This is the reasoning task the model is actually good at. Given that upstream partition A is late and downstream job B has an SLA in four hours, what's the right response? An LLM with the state and the action vocabulary can reason about this.
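To make those three inputs concrete, here is a minimal sketch of how state and action vocabulary might be packaged for the model. The schemas (PartitionState, PipelineSpec), the field names, the timestamps, and build_decision_prompt are all hypothetical, invented for illustration; a real audit table and inventory will look different, and the actual model call is deliberately left out.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class PartitionState:
    """One audit-table row (hypothetical schema)."""
    dataset: str
    partition: str
    status: str        # e.g. "complete", "late", "missing"
    expected_by: str   # ISO-8601 timestamp


@dataclass
class PipelineSpec:
    """One pipeline-inventory entry (hypothetical schema)."""
    name: str
    depends_on: List[str]
    produces: str
    sla: Optional[str] = None   # ISO-8601 deadline, if the job has one


def build_decision_prompt(state, inventory, now):
    """Serialize data state and available actions into one prompt string."""
    payload = {
        "now": now.isoformat(),
        "data_state": [asdict(p) for p in state],
        "available_pipelines": [asdict(p) for p in inventory],
    }
    return (
        "Given the data state and available pipelines below, "
        "decide what should run next and explain why.\n"
        + json.dumps(payload, indent=2)
    )


# The scenario from the text: partition A is late, job B has an SLA in four hours.
state = [PartitionState("A", "2023-06-01", "late", "2023-06-01T02:00:00+00:00")]
inventory = [PipelineSpec("job_b", ["A"], "B", sla="2023-06-01T10:00:00+00:00")]
prompt = build_decision_prompt(
    state, inventory, datetime(2023, 6, 1, 6, 0, tzinfo=timezone.utc)
)
```

The point of the sketch is that nothing here is new infrastructure. Both lists are plain serializations of metadata the pipeline already keeps, and the decision framework is whatever the model does with the resulting prompt.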

The Agent Model

This is the agent pattern that's been emerging in LLM tooling over the last year: an LLM with tool access that can observe state, reason about it, and take actions based on that reasoning — in a loop, until the goal is achieved or a human needs to step in.

Applied to data orchestration: an agent that reads the audit table, reads the pipeline inventory, identifies what's late or missing, determines what should run, and either triggers those runs directly or queues them for human approval. Not replacing Airflow — Airflow still handles the execution mechanics. Operating above Airflow, making the decisions about what Airflow should do next, in response to actual data state rather than a fixed schedule.
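One pass of that loop might look roughly like the sketch below. Every name in it is an assumption: read_state, read_inventory, decide, and trigger are hypothetical hooks, the dictionary keys are invented, and stub_decide is a simple rule standing in for the LLM call so the example runs on its own. In a real setup, trigger would hand off to Airflow rather than append to a list.

```python
from typing import Callable, List


def run_agent_cycle(
    read_state: Callable[[], List[dict]],
    read_inventory: Callable[[], List[dict]],
    decide: Callable[[List[dict], List[dict]], List[dict]],
    trigger: Callable[[str], None],
    approval_queue: List[dict],
) -> None:
    """One observe -> decide -> act pass over the current data state."""
    state = read_state()            # observe: audit table rows
    inventory = read_inventory()    # observe: available pipelines
    for decision in decide(state, inventory):
        # Default to human approval unless the decision explicitly opts out.
        if decision.get("needs_approval", True):
            approval_queue.append(decision)
        else:
            trigger(decision["pipeline"])


def stub_decide(state, inventory):
    """Stand-in for the LLM: propose pipelines whose inputs are all complete."""
    complete = {row["dataset"] for row in state if row["status"] == "complete"}
    return [
        {
            "pipeline": p["name"],
            "reason": "all inputs complete",
            "needs_approval": p.get("high_impact", False),
        }
        for p in inventory
        if set(p["depends_on"]) <= complete
    ]


state = [
    {"dataset": "orders", "status": "complete"},
    {"dataset": "payments", "status": "late"},
]
inventory = [
    {"name": "orders_daily", "depends_on": ["orders"]},
    {"name": "revenue_report", "depends_on": ["orders", "payments"]},
    {"name": "orders_backfill", "depends_on": ["orders"], "high_impact": True},
]
triggered: List[str] = []
queue: List[dict] = []
run_agent_cycle(lambda: state, lambda: inventory, stub_decide, triggered.append, queue)
```

Here orders_daily is triggered directly, orders_backfill is queued for a human, and revenue_report is not proposed at all because payments is late. Running the cycle repeatedly, on a timer or on audit-table changes, gives the loop; the decisions respond to data state rather than to a fixed schedule.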

Why This Is the Natural Extension of Everything Before It

The config-driven pipeline from 2015 separated "what data to process" from "how to process it." The metadata layer work separated "what happened" from "what should happen next." The LLM rubber duck sessions separated "reasoning about the problem" from "implementing the solution." Every one of these was a step toward making the reasoning layer explicit, structured, and separable from the execution layer.

An LLM orchestration agent is the natural assembly of those pieces: structured metadata as the LLM's observation space, the pipeline inventory as its action vocabulary, and the model's reasoning capability as the decision engine. The metadata-driven architecture I've been building since 2015 turns out to have been heading toward this all along; I just didn't know it.

I'm actively building toward this. There are hard problems: how to bound the agent's decision authority, how to make its reasoning auditable, how to handle the cases where the model's reasoning is wrong in ways that cascade. None of those are insurmountable. They're engineering problems, and engineering problems have engineering solutions.
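I don't have settled answers to those problems, but the shape of a first guardrail is easy to sketch. The example below is an assumption-heavy illustration, not what I've built: review_decision, the allowlist, the max_auto_runs budget, and the record fields are all hypothetical. It checks each proposed decision against the known pipelines, escalates anything without stated reasoning or beyond a run budget, and returns a record that could be written to an audit log.

```python
from datetime import datetime, timezone


def review_decision(decision, allowed_pipelines, max_auto_runs, runs_so_far):
    """Return (verdict, audit_record) for one model-proposed decision."""
    name = decision.get("pipeline")
    if name not in allowed_pipelines:
        verdict = "reject"      # outside the agent's action vocabulary
    elif not decision.get("reason"):
        verdict = "escalate"    # no stated reasoning, so nothing to audit
    elif runs_so_far >= max_auto_runs:
        verdict = "escalate"    # run budget spent; caps how far an error cascades
    else:
        verdict = "approve"
    record = {
        "at": datetime.now(timezone.utc).isoformat(),
        "decision": decision,
        "verdict": verdict,
    }
    return verdict, record


allowed = {"orders_daily", "revenue_report"}
ok, ok_record = review_decision(
    {"pipeline": "orders_daily", "reason": "inputs complete"}, allowed, 3, 0
)
unknown, _ = review_decision(
    {"pipeline": "drop_everything", "reason": "cleanup"}, allowed, 3, 0
)
over_budget, _ = review_decision(
    {"pipeline": "revenue_report", "reason": "SLA at risk"}, allowed, 3, 3
)
```

A check like this only bounds authority and leaves a trail. It does nothing to detect reasoning that is plausible but wrong, which is the harder part of the problem.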

More on this as it takes shape. As always, I'm here to help — and increasingly, so is the model.
