The New Governance Stack for AI
The governance conversation in data engineering has historically centered on data: who can access what table, what's the lineage of this column, does this dataset comply with GDPR. That conversation is mature — the tooling is good, the processes are established, most serious data organizations have solved it or are actively working on it.
Governance for AI is a decade behind. And the regulatory pressure is arriving before the tooling is ready.
Why Governance Must Shift from Data to Models to Agents
The governance surface area has expanded in layers. First we needed to govern data: who sees what, where does it come from, how is it transformed. Then we needed to govern ML models: what trained this model, who approved it, what version is in production, is it drifting. Now we need to govern agents: what tools can this agent call, under what conditions, with what authority, and with what audit trail.
Each layer builds on the previous one. You can't govern a model without governing the data it was trained on. You can't govern an agent without governing the models it invokes. The governance stack grows upward, and most organizations are still building the foundation while being asked to also govern the top floors.
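To make the agent layer concrete, here is a minimal sketch of what "governed tool access" could look like in plain Python. It is an illustration under stated assumptions, not a Databricks or Unity Catalog API: `GovernedAgent`, `call_tool`, the grant conditions, and the two example tools are all hypothetical names invented for this example. The idea is simply that every tool call passes through one choke point that checks an explicit grant and writes an audit record whether or not the call is allowed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable

# A condition receives the call arguments and returns True to allow the call.
Condition = Callable[[dict[str, Any]], bool]


@dataclass
class GovernedAgent:
    """Hypothetical wrapper: an agent may only call tools it has been granted."""
    agent_id: str
    tools: dict[str, Callable[..., Any]]
    grants: dict[str, Condition]  # tool name -> condition; no entry means denied
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def call_tool(self, tool_name: str, **args: Any) -> Any:
        condition = self.grants.get(tool_name)
        allowed = condition is not None and condition(args)
        # Record every attempt, including denied ones, before doing anything.
        self.audit_log.append({
            "agent_id": self.agent_id,
            "tool": tool_name,
            "args": args,
            "allowed": allowed,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        if not allowed:
            raise PermissionError(f"{self.agent_id} may not call {tool_name}")
        return self.tools[tool_name](**args)


# Example tools (placeholders standing in for real integrations).
def lookup_order(order_id: str) -> str:
    return f"order {order_id}: shipped"


def issue_refund(order_id: str, amount: float) -> str:
    return f"refunded {amount:.2f} on order {order_id}"


agent = GovernedAgent(
    agent_id="support-agent",
    tools={"lookup_order": lookup_order, "issue_refund": issue_refund},
    grants={
        "lookup_order": lambda args: True,                  # always allowed
        "issue_refund": lambda args: args["amount"] <= 100,  # capped authority
    },
)

result = agent.call_tool("lookup_order", order_id="A1")
try:
    agent.call_tool("issue_refund", order_id="A1", amount=500.0)
    denied = False
except PermissionError:
    denied = True
```

After this runs, `agent.audit_log` holds two entries: one allowed lookup and one denied refund. A real system would persist that log somewhere tamper-resistant and tie grants to an identity system, but the shape of the problem is the same: explicit grants, conditions on authority, and a record of every step.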
A Blueprint for Modern AI Governance
The minimum viable AI governance framework has four components:
Model provenance — every production model has a documented lineage: training data source, training run ID, evaluation results, who approved promotion to production, and the date of promotion. Unity Catalog model registry handles most of this if you instrument correctly.
Access control — model endpoints have explicit access policies: which applications, service accounts, and users can invoke them. This is not "the endpoint is available internally" — it's a positive list of authorized callers with audit logging on every invocation.
Output monitoring — production AI systems have quality metrics tracked over time. For classification models, that's precision and recall against a weekly labeled sample. For generative models, that's factual consistency, appropriate refusals, and human review of sampled outputs. Monitoring is a defined operational responsibility, not an ad-hoc activity.
Incident response — when a model produces problematic output, there's a defined process: how quickly must it be taken offline, who has authority to remove it from production, what's the rollback path, and what's the communication plan for affected stakeholders.
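As one illustration of the output monitoring component, the sketch below computes precision and recall for a binary classifier against a labeled sample and flags any metric that falls below a floor. The function name `weekly_quality_check`, the `QualityReport` type, and the threshold values are assumptions made up for this example; the right floors depend on the model and the business risk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityReport:
    precision: float
    recall: float
    breaches: list[str]  # names of metrics that fell below their floor


def weekly_quality_check(
    labels: list[int],
    predictions: list[int],
    min_precision: float = 0.90,  # hypothetical floor
    min_recall: float = 0.85,     # hypothetical floor
) -> QualityReport:
    """Score binary predictions (1 = positive) against human labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must be the same length")

    pairs = list(zip(labels, predictions))
    true_pos = sum(1 for y, p in pairs if y == 1 and p == 1)
    false_pos = sum(1 for y, p in pairs if y == 0 and p == 1)
    false_neg = sum(1 for y, p in pairs if y == 1 and p == 0)

    # Guard against division by zero when a class is absent from the sample.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0

    breaches = []
    if precision < min_precision:
        breaches.append("precision")
    if recall < min_recall:
        breaches.append("recall")
    return QualityReport(precision, recall, breaches)


# A tiny labeled sample: 4 true positives, 1 false positive, 1 false negative.
sample_labels = [1, 1, 1, 0, 0, 1, 0, 1]
sample_predictions = [1, 1, 0, 0, 1, 1, 0, 1]
report = weekly_quality_check(sample_labels, sample_predictions)
```

Here both metrics come out at 0.8, so `report.breaches` lists both of them. A non-empty `breaches` list is the natural trigger for the incident response process: it turns "monitoring" from a dashboard someone might look at into a check with an owner and a defined next step.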
Where Databricks Is Ahead and Behind
Ahead: model registry governance through Unity Catalog (UC), lineage tracking from training through serving, and access control on serving endpoints. These are solid and production-ready.
Behind: agent governance is essentially unstructured right now. There's no UC-native concept of an agent with governed tool access. There's no audit trail for multi-step agent actions at the granularity that enterprise compliance requires. There's no defined incident response tooling for agent misbehavior. These are 2025 and 2026 problems, and they're arriving faster than the tooling.