Adopting LangGraph: Plan, Implement, Review
Adopting a framework means committing to its mental model. The longer you use it, the more that mental model shapes how you think about the problem space. LangGraph's mental model — workflows as directed graphs with typed state flowing through nodes — turned out to be a good fit for the orchestration problem I was solving. Getting there took some adjustment.
The Graph Mental Model
If you've been building data pipelines for a long time, the graph model isn't foreign. A DAG in Airflow is a directed acyclic graph. Dependency resolution is graph traversal. Execution order is a topological sort of that graph. LangGraph applies the same model to AI task sequences, with one extension that matters later: its graphs are allowed to contain cycles.
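To make the topological-sort point concrete, here is a minimal sketch using Python's standard-library graphlib. The task names are hypothetical and are not taken from any real Airflow DAG.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"transform"},
    "load": {"transform", "validate"},
}

# static_order() yields every task only after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

A scheduler does more than this (retries, parallelism, scheduling windows), but the ordering guarantee is the same one.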
The difference from Airflow is in what flows between nodes. In Airflow, the DAG defines execution order and nodes communicate through XComs, essentially a key-value store with some awkward ergonomics. In LangGraph, the state is a typed object that flows through the graph. Each node receives the current state and returns an update; LangGraph merges that update into the state and hands the result to the next node. The state is the shared context for the entire workflow execution.
This matters for AI workflows because the "context" problem — maintaining relevant information across multiple model calls — maps directly to state management in the graph. What the first model call produces gets added to state. The second model call receives that state and can use it. No separate context management required; the graph handles it.
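The state-passing idea can be sketched without the library at all. The example below is a plain-Python stand-in, not LangGraph's API: WorkflowState, the two node functions, and run_linear are all hypothetical names, and the "model calls" are string formatting.

```python
from typing import Callable, TypedDict


class WorkflowState(TypedDict, total=False):
    task: str
    plan: str
    implementation: str


# A node takes the current state and returns only the keys it wants to change.
Node = Callable[[WorkflowState], dict]


def plan_node(state: WorkflowState) -> dict:
    # Stand-in for a model call that turns the task into a plan.
    return {"plan": f"steps for: {state['task']}"}


def implement_node(state: WorkflowState) -> dict:
    # The second "model call" sees what the first one produced.
    return {"implementation": f"code following [{state['plan']}]"}


def run_linear(nodes: list[Node], state: WorkflowState) -> WorkflowState:
    for node in nodes:
        state = {**state, **node(state)}  # merge the partial update into the state
    return state


final_state = run_linear([plan_node, implement_node], {"task": "add a health check"})
```

Neither node assembles context by hand. Each reads what it needs from the state and contributes its own output.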
The Plan-Implement-Review Loop
The core workflow I was building had a specific shape: given a task description, generate a plan; given a plan, generate an implementation; given an implementation, generate a review; if the review passes, done; if the review fails, revise the implementation and re-review. That's a graph with a loop.
Expressing that in LangGraph was more natural than I expected. The conditional edge — go to END if the review passes, go back to the implementation node if it fails — is the loop mechanism. The state carries both the review output and the iteration count, so the loop has an exit condition when the retry budget is exhausted.
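The routing decision itself is a small pure function of the state. The sketch below is an assumption about what such a function could look like: the field names, the budget of three, and the run_loop driver are hypothetical, and the driver is a library-free stand-in for the graph runtime. In LangGraph, a function of this shape is what would be passed to add_conditional_edges, along with a mapping from the returned labels to node names or END.

```python
from typing import TypedDict


class ReviewState(TypedDict):
    review_passed: bool
    iterations: int


MAX_ITERATIONS = 3  # assumed retry budget


def route_after_review(state: ReviewState) -> str:
    """Return the label of the next step after a review."""
    if state["review_passed"]:
        return "end"
    if state["iterations"] >= MAX_ITERATIONS:
        return "end"  # budget exhausted: stop even though the review failed
    return "implement"


def run_loop(review_results: list[bool]) -> ReviewState:
    """Drive implement -> review until the router says stop."""
    state: ReviewState = {"review_passed": False, "iterations": 0}
    results = iter(review_results)
    while True:
        # One pass through implement + review, with a scripted review outcome.
        state = {"review_passed": next(results), "iterations": state["iterations"] + 1}
        if route_after_review(state) == "end":
            return state


passed_on_third = run_loop([False, False, True])
gave_up = run_loop([False] * 5)
```

Keeping the iteration count in the state means a run that ends with review_passed still false can be told apart from one that succeeded.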
The piece that required the most iteration was the implementation node. Generating code from a plan is a single model call at the simple end. At the production end, it's a multi-step process: retrieve relevant context, refine the plan based on the context, generate code against the refined plan, post-process the code to ensure it matches project conventions. Each of those steps is itself a node, and the implementation "step" in the top-level graph became a subgraph.
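# The implementation subgraph.
# ImplementationState and the four node functions are defined elsewhere in the project.
from langgraph.graph import StateGraph, END

impl_graph = StateGraph(ImplementationState)

# One node per step of the implementation process
impl_graph.add_node("retrieve_context", retrieve_from_cortex_forge)
impl_graph.add_node("refine_plan", refine_plan_with_context)
impl_graph.add_node("generate_code", generate_code_from_plan)
impl_graph.add_node("post_process", apply_project_conventions)

# Linear flow: retrieve -> refine -> generate -> post-process
impl_graph.set_entry_point("retrieve_context")
impl_graph.add_edge("retrieve_context", "refine_plan")
impl_graph.add_edge("refine_plan", "generate_code")
impl_graph.add_edge("generate_code", "post_process")
impl_graph.add_edge("post_process", END)

# Compile into a runnable that the parent graph can add as a single node
implementation_node = impl_graph.compile()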
Composing subgraphs into larger graphs is where LangGraph's model starts to pay off. The parent graph treats the compiled subgraph as a single node. The complexity of the implementation step is hidden behind a clean interface. The parent graph doesn't care how implementation works; it cares about the state before and after.
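The "subgraph as a single node" idea comes down to every step, simple or composite, sharing the same state-in, state-out shape. The following is a plain-Python illustration of that property, not LangGraph code: compose and visit are hypothetical helpers, and each step only records its own name.

```python
from typing import Callable

State = dict
Step = Callable[[State], State]


def compose(*steps: Step) -> Step:
    """Bundle several steps into one callable with the same shape as a single step."""
    def run(state: State) -> State:
        for step in steps:
            state = step(state)
        return state
    return run


def visit(name: str) -> Step:
    """Build a placeholder step that only records that it ran."""
    def step(state: State) -> State:
        return {**state, "trace": state.get("trace", []) + [name]}
    return step


# The four-step implementation process, bundled into one step.
implementation = compose(
    visit("retrieve_context"),
    visit("refine_plan"),
    visit("generate_code"),
    visit("post_process"),
)

# The parent sees three steps; it never refers to the inner four.
parent = compose(visit("plan"), implementation, visit("review"))

result = parent({"task": "add a health check"})
```

Swapping the implementation for a one-call version, or a more elaborate one, would not change the parent definition.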
The Cortex Forge Integration
Integrating the knowledge retrieval system with the orchestration layer was the piece that made the whole stack feel coherent. The retrieval step — retrieve_from_cortex_forge in the subgraph above — is a node that queries the knowledge base using the current task context as the retrieval query and adds the results to the implementation state.
Every model call in the implementation subgraph sees the retrieved context. The plan refinement step uses it to adjust the plan for project-specific constraints. The code generation step uses it to apply the right naming conventions and patterns. The post-processing step uses it to verify project conventions are applied correctly.
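A retrieval node of this kind can be sketched as follows. Everything here is an assumption made for illustration: the KnowledgeBase interface, the search signature, the state field names, and the keyword-overlap matching are hypothetical and say nothing about how Cortex Forge actually works.

```python
from typing import Protocol, TypedDict


class KnowledgeBase(Protocol):
    # Hypothetical interface; the real knowledge system's API is not shown in this post.
    def search(self, query: str, limit: int) -> list[str]: ...


class ImplState(TypedDict, total=False):
    task: str
    retrieved_context: list[str]
    prompt: str


class InMemoryKnowledgeBase:
    """Toy stand-in: keyword overlap instead of real retrieval."""

    def __init__(self, notes: list[str]) -> None:
        self.notes = notes

    def search(self, query: str, limit: int) -> list[str]:
        words = set(query.lower().split())
        hits = [note for note in self.notes if words & set(note.lower().split())]
        return hits[:limit]


def make_retrieve_node(kb: KnowledgeBase, limit: int = 3):
    def retrieve_context(state: ImplState) -> dict:
        # Use the task description as the query and add the results to the state.
        return {"retrieved_context": kb.search(state["task"], limit)}
    return retrieve_context


def build_prompt(state: ImplState) -> dict:
    # A downstream step reads the retrieved context straight from the state.
    context = "\n".join(f"- {note}" for note in state.get("retrieved_context", []))
    return {"prompt": f"Project notes:\n{context}\n\nTask: {state['task']}"}


kb = InMemoryKnowledgeBase([
    "logging uses the structured logger wrapper",
    "database migrations live in db/migrations",
])
state: ImplState = {"task": "add logging to the importer"}
state = {**state, **make_retrieve_node(kb)(state)}
state = {**state, **build_prompt(state)}
```

Because the results live in the state, later steps need no retrieval logic of their own. They only read a field.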
This is the end-to-end integration I had been working toward since the knowledge system was first designed: retrieved context flowing automatically into model calls at the right points in the workflow, without manual assembly. It works. The quality improvement on project-specific tasks is visible and consistent.
What the First Production Run Showed
The first production run of the full orchestrated workflow — against a real Forgejo issue on a real project — produced output I would not have been embarrassed to submit as a first-pass implementation. Not ready to merge without review, but close. The plan was coherent with the project architecture. The implementation used project naming conventions. The review caught two legitimate issues that I confirmed were real problems, not model hallucinations.
That result took a lot of infrastructure to produce. The knowledge base, the retrieval layer, the orchestration graph, the model routing logic, the output verification: none of it was trivial to build. But the output was qualitatively different from what a single-shot model call produces, and the difference was directly traceable to the infrastructure.
The work wasn't done. It still isn't done. But the direction was validated.