Data Factory in Microsoft Fabric: What's the Same and What's New

The Technical Comparison You Actually Need

The Fabric announcement was big on positioning and short on technical specifics. Let me do the comparison properly: standalone Azure Data Factory versus Data Factory in Microsoft Fabric, capability by capability. What carries forward, what changes, and what doesn't exist in Fabric yet.

What's Identical

The core pipeline authoring experience is unchanged. Same drag-and-drop canvas. Same activity palette. Same JSON under the hood. If you have muscle memory for ADF Studio, it works in Fabric Data Factory.

Activities

Copy Activity, Data Flow, ForEach, Until, If Condition, Switch, Lookup, Get Metadata, Web Activity, Azure Function, Stored Procedure, Notebook (Spark), Set Variable, Append Variable, Wait, Fail, Delete, Filter, Execute Pipeline — all present, same configuration options, same behavior.

Connector Library

The same connector library that powers ADF Copy Activity powers Fabric Data Factory Copy Activity. Same 100+ connectors. Same connector configuration options. Same authentication methods. This is shared infrastructure; Microsoft maintains it in one place.

Expression Language

Every expression you've written in ADF works in Fabric Data Factory. @pipeline().parameters, @activity().output, @concat(), @formatDateTime(), @json(), all of it. Same functions, same syntax, same behavior. No rewrite.
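
To make that concrete, here is a minimal sketch in Python that builds the JSON for a Set Variable activity using a few of those functions. The activity name, the variable name, and the tableName parameter are made up for illustration, and the JSON shape follows the ADF pipeline format as I understand it. Treat it as a sketch, not an exported pipeline definition.

```python
import json


def expression(text):
    """Wrap a string as dynamic content, the way pipeline JSON marks expressions."""
    return {"value": text, "type": "Expression"}


# Hypothetical activity: every name below is illustrative, not from a real pipeline.
set_path_activity = {
    "name": "SetOutputPath",
    "type": "SetVariable",
    "typeProperties": {
        "variableName": "outputPath",
        # Builds a dated folder path from a pipeline parameter.
        "value": expression(
            "@concat('raw/', pipeline().parameters.tableName, '/', "
            "formatDateTime(utcnow(), 'yyyy/MM/dd'))"
        ),
    },
}

print(json.dumps(set_path_activity, indent=2))
```

The point of the sketch is that the expression string itself is what you carry over. It is the part that, per the comparison above, needs no rewrite.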

Parameterization Model

Pipeline parameters, activity-level dynamic content, parameter passing via Execute Pipeline — same model. The metadata-driven framework pattern (control table → Lookup → ForEach → parameterized pipeline) works identically in Fabric Data Factory.
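
If you haven't built the metadata-driven pattern before, the control flow is easier to see as plain code. The sketch below is a Python analogy, not Data Factory code: the control table rows, the function names, and the parameter names are all hypothetical stand-ins for the Lookup, ForEach, and Execute Pipeline activities.

```python
from typing import Callable

# Hypothetical control table: in a real framework these rows live in a database table.
CONTROL_TABLE = [
    {"source_schema": "sales", "source_table": "orders", "enabled": True},
    {"source_schema": "sales", "source_table": "customers", "enabled": True},
    {"source_schema": "hr", "source_table": "employees", "enabled": False},
]


def lookup(rows: list[dict]) -> list[dict]:
    """Stand-in for the Lookup activity: return only the enabled control rows."""
    return [row for row in rows if row["enabled"]]


def copy_table(source_schema: str, source_table: str) -> str:
    """Stand-in for the parameterized child pipeline; returns what it would load."""
    return f"{source_schema}.{source_table}"


def run_framework(rows: list[dict], child: Callable[..., str]) -> list[str]:
    """Stand-in for ForEach + Execute Pipeline: one child run per control row."""
    results = []
    for item in lookup(rows):
        # Each row's columns become the child pipeline's parameters.
        results.append(
            child(
                source_schema=item["source_schema"],
                source_table=item["source_table"],
            )
        )
    return results


loaded = run_framework(CONTROL_TABLE, copy_table)
print(loaded)  # ['sales.orders', 'sales.customers']
```

Adding a new table to the load means adding a row to the control table, not editing the pipeline. That property is what carries over unchanged.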

Trigger Types

Schedule triggers, tumbling window triggers, event-based triggers (storage event, custom event) — all present. Same configuration options.

What's New in Fabric Data Factory

OneLake Integration

OneLake is Fabric's shared storage layer — one logical data lake per Fabric tenant, built on ADLS Gen2, with Delta Lake as the default format. Fabric Data Factory Copy Activity can write directly to OneLake lakehouses and warehouses as a destination, with OneLake as a first-class sink type rather than "ADLS Gen2 path."

In practice: when your pipeline's output is going into a Fabric lakehouse (for downstream Spark processing or Power BI) rather than a standalone ADLS Gen2 account, the OneLake integration simplifies the sink configuration. One linked service, semantically meaningful paths, automatic Delta Lake format handling.
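
For a sense of what "semantically meaningful paths" looks like, here is a small Python sketch that builds a OneLake URI for a lakehouse table. The abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>.Lakehouse/... layout is the addressing scheme as I understand it from the OneLake documentation. The workspace, lakehouse, and table names are made up, and the helper does no validation or escaping.

```python
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_table_uri(workspace: str, lakehouse: str, table: str) -> str:
    """Build the URI for a table in a lakehouse (assumed OneLake path layout)."""
    # Assumption: names are used as-is; names with special characters are not handled.
    return f"abfss://{workspace}@{ONELAKE_HOST}/{lakehouse}.Lakehouse/Tables/{table}"


# Hypothetical names, purely for illustration.
uri = onelake_table_uri("analytics", "sales_lakehouse", "orders")
print(uri)
```

Compare that with a standalone ADLS Gen2 path, where the storage account and container names tell you nothing about which workspace or lakehouse the data belongs to.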

Native Fabric Experience Integration

Triggering a Spark notebook in Fabric Data Engineering from a Fabric Data Factory pipeline requires no linked service configuration — you're in the same workspace. The Notebook activity binds to notebooks in the current workspace directly. This removes the linked service + secret management overhead that ADF notebook execution required (Databricks workspace URL, access token, cluster selection).

Same for Fabric Data Warehouse stored procedures — no linked service, direct workspace binding.

Power Query Dataflows as a Pipeline Activity

Power Query dataflows (the M formula / Power Query Online transformation experience) are now a first-class pipeline activity in Fabric Data Factory. In standalone ADF, the equivalent was "Wrangling Data Flow" (later renamed the Power Query activity), a limited version of Power Query that runs on the Data Flow Spark runtime. In Fabric, Power Query dataflows are independent items in the workspace, and you can trigger them from pipelines. This gives non-Spark transformation a more prominent role in the Fabric pipeline model.

Workspace-Level Git Integration

This is the biggest operational change. In standalone ADF, git integration is configured per factory — the factory connects to a specific repo, branch, and folder. The CI/CD model uses the adf_publish branch and ARM templates.

In Fabric, git integration is configured at the workspace level. The workspace syncs to a git repo (Azure DevOps or GitHub). All Fabric items in the workspace — pipelines, notebooks, datasets, reports — are synced to git together. No more adf_publish branch. No more ARM templates. The deployment model is direct workspace item sync.

This is genuinely simpler. No generated ARM templates, no 5000-line machine-generated JSON to diff, no npm publish step. The tradeoff: you give up the ARM template deployment model, which means existing Azure DevOps pipelines built around ADF ARM template deployment need to be redesigned for Fabric's workspace sync model.

What Fabric Data Factory Doesn't Have (Yet)

Azure-SSIS Integration Runtime

Not available in Fabric. If you need to run legacy SSIS packages in managed cloud infrastructure, you stay on standalone ADF. This is the clearest reason to maintain standalone ADF for specific workloads.

Self-Hosted Integration Runtime: Different Setup

Self-Hosted IR exists in Fabric Data Factory, but the setup process and management experience are different from standalone ADF. If you have Self-Hosted IR connecting to on-premises sources, test the Fabric setup process before migrating — it's not a direct lift.

The Migration Decision

Fabric Data Factory is in preview, with GA expected in November 2023. My guidance: don't migrate production workloads until GA. Run proof-of-concept workloads in Fabric now to understand the git integration model change (which is the most significant operational difference), the SSIS-IR gap, and the Self-Hosted IR setup differences.

For new workloads starting after Fabric GA: evaluate Fabric Data Factory first, especially if you're also using other Fabric experiences (Spark, Warehouse, Power BI). The co-location benefits are real.

For existing ADF workloads: migrate on a deliberate timeline after GA, not before. I'll cover the migration specifics in the next post.
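
If it helps to see the guidance as a checklist, here is one way to encode it in Python. This is my own simplification of the advice in this post, not an official decision tool. The Workload fields and the wording of each recommendation are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a pipeline workload under review."""
    name: str
    is_new: bool
    uses_ssis_ir: bool
    uses_self_hosted_ir: bool


def recommend(workload: Workload, fabric_is_ga: bool) -> str:
    """Return a recommendation following the guidance in this post."""
    # Azure-SSIS IR has no Fabric equivalent, so this check comes first.
    if workload.uses_ssis_ir:
        return "stay on standalone ADF"
    # Before GA: proof-of-concept work only, no production migration.
    if not fabric_is_ga:
        return "proof of concept only"
    if workload.is_new:
        return "evaluate Fabric Data Factory first"
    if workload.uses_self_hosted_ir:
        return "test Self-Hosted IR setup in Fabric, then migrate on a planned timeline"
    return "migrate on a planned timeline"


legacy = Workload("ssis-packages", is_new=False, uses_ssis_ir=True, uses_self_hosted_ir=False)
print(recommend(legacy, fabric_is_ga=True))  # stay on standalone ADF
```

The ordering of the checks matters: the SSIS-IR gap overrides everything else, and GA status gates any production move.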
