The Return of Classical ML: Why It Still Matters

There's a quiet exhaustion settling into data teams that have spent 2024 chasing GenAI use cases that haven't shipped, building RAG pipelines that don't quite work well enough, and explaining to stakeholders why the AI chatbot isn't ready yet. Meanwhile, the gradient-boosted models and regression pipelines and time series forecasters that were running before any of this started are still running, still producing value, still accurate.

Classical ML didn't stop working because LLMs arrived. I want to say that out loud.

Where ML Still Outperforms GenAI Approaches

Structured data with well-defined prediction targets is still the domain where classical ML produces better results with less complexity than anything LLM-based. Credit risk scoring, demand forecasting, customer churn prediction, predictive maintenance, fraud detection — these are all problems where gradient boosting, random forests, and linear models produce highly accurate, interpretable, auditable results on tabular data.
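To make the tabular case concrete, here is a minimal sketch of one of the linear models mentioned above: a logistic regression trained directly on structured rows. Everything in it is an assumption for illustration. The churn data is synthetic, the feature names (tenure_years, support_tickets) are hypothetical, and the hand-rolled gradient descent only stands in for what a library such as scikit-learn or XGBoost would do in a real project.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_churn(n, seed=0):
    """Hypothetical tabular data: [tenure_years, support_tickets] -> churned (0/1)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        tenure = rng.uniform(0, 10)
        tickets = rng.uniform(0, 10)
        # Assumed ground truth: short tenure and many tickets raise churn risk.
        p = sigmoid(0.8 * tickets - 0.8 * tenure)
        rows.append([tenure, tickets])
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


def fit_logistic(rows, labels, lr=0.05, epochs=300):
    """Batch gradient descent on log loss; returns (weights, bias)."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


train_X, train_y = make_synthetic_churn(400, seed=0)
test_X, test_y = make_synthetic_churn(200, seed=1)
weights, bias = fit_logistic(train_X, train_y)

# Held-out accuracy at a 0.5 decision threshold.
accuracy = sum(
    (predict_proba(weights, bias, x) >= 0.5) == bool(y)
    for x, y in zip(test_X, test_y)
) / len(test_y)
print(f"weights={weights}, bias={bias:.3f}, accuracy={accuracy:.2f}")
```

The learned weights can be read directly: a negative weight on tenure and a positive weight on support tickets is the kind of statement an auditor can check, which is much of the appeal of these models on tabular data.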

There's no version of a RAG pipeline that outperforms a well-tuned XGBoost model on a structured classification task. The LLM doesn't know your historical transaction patterns better than a model trained directly on them. This sounds obvious. It's apparently not obvious to the teams I keep seeing spin up LLM-based solutions for problems that are fundamentally tabular prediction problems.

The Critique: GenAI Overshadowing Fundamentals

Data science teams are under pressure to work on GenAI because executives want to see GenAI projects. This means teams that were productively improving ML models — adding features, retraining on fresh data, improving calibration — are now context-switching to RAG pipelines and agent frameworks that aren't yet mature enough for production.

The ROI math is unflattering for GenAI right now: a team that spends six months improving their fraud detection model by 3% might prevent $2M in annual losses. A team that spends six months building an internal chatbot might produce a demo that's in POC purgatory by month eight. The classical ML investment has a clearer, faster, more predictable return.

My Prediction: ML + GenAI Hybrid Systems Dominate 2025

The teams that are going to be most effective in 2025 are the ones that treat ML and GenAI not as competing approaches but as complementary layers in the same system. Classical ML handles the structured prediction tasks it's been doing for years. GenAI handles the unstructured understanding, generation, and reasoning tasks where it genuinely adds value. The architecture looks like this: an ML model scores the risk, and a GenAI model writes the explanation for the human reviewer.
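The split of responsibilities described above can be sketched in a few lines. This is an illustration under stated assumptions, not a reference design: the coefficients, feature names, and prompt wording are all hypothetical, and placeholder_llm is a stub standing in for whatever GenAI client a team actually uses.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical coefficients standing in for a trained model's parameters.
WEIGHTS = {"amount_zscore": 1.2, "new_device": 0.9, "account_age_years": -0.4}
BIAS = -1.0


@dataclass
class RiskAssessment:
    score: float                     # produced by the ML layer only
    contributions: Dict[str, float]  # per-feature log-odds contributions
    explanation: str                 # produced by the GenAI layer only


def score_risk(features: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """ML layer: a logistic score plus per-feature contributions."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    return 1.0 / (1.0 + math.exp(-logit)), contributions


def build_prompt(score: float, contributions: Dict[str, float]) -> str:
    """Turn the model output into text the GenAI layer can explain."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    lines = [f"- {name}: {value:+.2f}" for name, value in ranked]
    return (
        "Explain this fraud risk score to a human reviewer.\n"
        f"Score: {score:.2f}\n"
        "Feature contributions (log-odds):\n" + "\n".join(lines)
    )


def assess(features: Dict[str, float],
           generate_text: Callable[[str], str]) -> RiskAssessment:
    """Hybrid pipeline: the score comes from the model, the wording from GenAI."""
    score, contributions = score_risk(features)
    explanation = generate_text(build_prompt(score, contributions))
    return RiskAssessment(score, contributions, explanation)


def placeholder_llm(prompt: str) -> str:
    # Stub for a real GenAI call; it only echoes the prompt it was given.
    return "Placeholder explanation based on:\n" + prompt


result = assess(
    {"amount_zscore": 2.5, "new_device": 1.0, "account_age_years": 0.5},
    placeholder_llm,
)
print(f"{result.score:.2f}")
print(result.explanation)
```

The design choice worth noting is that the GenAI layer never produces or alters the score. It only receives the model's output as text, so the prediction stays auditable even if the explanation layer is swapped out or fails.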

This is already how the best enterprise AI systems I've seen are built. It's not glamorous at a conference where everyone is showing pure GenAI demos, but it's what works in production. The hybrid system isn't a compromise; it's the right design.
