The Next Generation of Data Intelligence Platforms
Databricks rebranded themselves as a "Data Intelligence Platform" at DAIS 2024, and they've been using the term consistently since. It's a more ambitious claim than "data lakehouse platform" — and it invites examination. What does "data intelligence" actually mean, and does Databricks' product deliver it?
My Critique of the Term
Intelligence, in the human sense, involves understanding, reasoning, and adaptation. A platform can facilitate these things by making the data available to systems that do the understanding and reasoning — but the platform itself isn't the intelligence. The intelligence is the combination of good data, good models, and people who ask the right questions.
What I think Databricks means by "data intelligence": a platform where data is fully governed and accessible, models can be trained and served on that data, and natural language interfaces make querying accessible beyond SQL practitioners. That's a coherent and valuable product vision. It's not intelligence in the strong sense, but it's a meaningful extension of what data platforms have historically provided.
The rebranding is partly positioning against Snowflake and BigQuery, which are making similar moves: both have AI/ML tiers and both are investing in LLM-based query interfaces. "Data Intelligence" is Databricks' claimed competitive differentiator. Whether it holds up depends on execution over the next 18 months.
What a Real Intelligence Layer Should Look Like
If I were designing for genuine data intelligence, not the marketing version, it would have three properties that the current platforms only partially exhibit.
Self-awareness — the platform should know what it knows and what it doesn't. When you ask a question it can't answer confidently, it should say so rather than generating a plausible-sounding wrong answer. Current LLM-based query interfaces are still too willing to hallucinate when the answer isn't in the data.
Proactive insight generation — not just answering questions but surfacing things you should know but didn't think to ask. "Revenue in the West region dropped 12% last week — here's the attribution analysis." This requires the platform to maintain ongoing awareness of patterns and anomalies, not just respond to queries.
Closed-loop learning — the system should get better at answering your questions over time based on feedback, not just based on static embeddings of your schema. When a query produces a wrong result and a user corrects it, that correction should improve future responses. The current generation of data intelligence products doesn't do this reliably.
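To make the three properties concrete, here is a minimal sketch of how I picture them, not a description of how Databricks or any other vendor implements anything. Every name in it (IntelligenceLayer, min_confidence, record_correction, weekly_drops) is hypothetical, and it assumes the caller already has a draft answer and some confidence score for it. Producing a well-calibrated score is the hard part, and this sketch does not attempt it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Answer:
    text: Optional[str]  # None means the layer declined to answer
    confidence: float
    note: str = ""


@dataclass
class IntelligenceLayer:
    # Hypothetical cutoff: below this score the layer abstains.
    min_confidence: float = 0.7
    # Hypothetical store of user corrections, keyed by normalized question.
    corrections: dict = field(default_factory=dict)

    @staticmethod
    def _key(question: str) -> str:
        # Lowercase and collapse whitespace so trivially different
        # phrasings of the same question share one key.
        return " ".join(question.lower().split())

    def answer(self, question: str, draft: str, confidence: float) -> Answer:
        key = self._key(question)
        # Closed-loop learning: a stored user correction wins over the draft.
        if key in self.corrections:
            return Answer(self.corrections[key], confidence, "from user correction")
        # Self-awareness: abstain rather than return a low-confidence draft.
        if confidence < self.min_confidence:
            return Answer(None, confidence, "not confident enough to answer")
        return Answer(draft, confidence)

    def record_correction(self, question: str, corrected: str) -> None:
        self.corrections[self._key(question)] = corrected


def weekly_drops(current: dict, previous: dict, threshold: float = 0.10) -> list:
    """Proactive insight: flag metrics that fell by at least `threshold`."""
    alerts = []
    for name, prev in previous.items():
        cur = current.get(name)
        if cur is None or prev <= 0:
            continue  # nothing to compare against
        change = (cur - prev) / prev
        if change <= -threshold:
            alerts.append(f"{name} dropped {abs(change):.0%} week over week")
    return alerts
```

With illustrative numbers, weekly_drops({"West revenue": 88.0}, {"West revenue": 100.0}) returns ["West revenue dropped 12% week over week"]. A real system would also need the attribution analysis behind the alert and a way to generalize a correction beyond an exact question match. Those are the parts I'd argue the current products haven't solved.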
The Verdict
Databricks in 2025 is building toward a data intelligence platform. They're not there yet. The components exist (Unity Catalog for governance, Mosaic AI for model management, Vector Search for retrieval, AI/BI for natural language querying), but the integration is still at the "components that can work together" stage rather than the "coherent intelligence layer" stage. The next 18 months of product development will determine whether the vision becomes real. I'm watching it closely. As always, I'm here to help.