Delta Sharing: The Open Protocol for Cross-Organization Data Exchange

Databricks announced Delta Sharing at Data + AI Summit last month, and I've been running it in a test environment since. If you haven't seen the announcement: Delta Sharing is an open protocol for sharing live data across organizations without copying it. The recipient gets a read endpoint, not a data dump. The data stays in your storage: you control access, you can revoke it, and the recipient always sees current data.

It's live data sharing without the horror of maintaining a manual export process or giving someone credentials to your actual data lake.

The Problem It Solves

Before Delta Sharing, sharing data with external partners meant one of a few bad options: export to a flat file and send it (stale the moment it leaves), set up a read-only database user and hope they don't abuse it, or build a custom API layer over your data. All of these have maintenance overhead. All of them are either stale or require you to trust the recipient with credentials.

Delta Sharing is different. The recipient doesn't need Databricks. They don't need access to your storage account. They get a credential that grants read access to specific tables and nothing else, and they query through an HTTP endpoint using the open Delta Sharing protocol. The underlying files stay in your lake.

Setting Up a Share

-- Create a share (Databricks SQL)
CREATE SHARE partner_analytics_share
COMMENT 'Monthly aggregated order data for PartnerCo';

-- Add tables to the share
ALTER SHARE partner_analytics_share
ADD TABLE catalog.gold.monthly_order_summary;

-- You can also restrict which partitions they see
-- (each parenthesized spec is one partition; separate several with commas)
ALTER SHARE partner_analytics_share
ADD TABLE catalog.gold.regional_sales_summary
  PARTITION (region = 'West'), (region = 'Central');

Creating a Recipient

-- Create the recipient
CREATE RECIPIENT partnerco_data_team
COMMENT 'PartnerCo analytics team — read-only access to order summaries';

-- Grant them access to the share
GRANT SELECT ON SHARE partner_analytics_share TO RECIPIENT partnerco_data_team;

-- Get the activation link — share this with the recipient, not your credentials
DESCRIBE RECIPIENT partnerco_data_team;
-- Returns: activation_link | activation_url | ...

The activation link is a one-time URL. When the recipient activates it, they get a profile file containing their bearer token and the sharing server URL. You never share your storage credentials — just that activation link.
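For a sense of what the recipient ends up holding, here is a minimal sketch of reading a profile file in Python. The field names (`shareCredentialsVersion`, `endpoint`, `bearerToken`) follow the published profile file format as I understand it; the helper `load_profile`, the sample endpoint, and the placeholder token are made up for illustration.

```python
import json
import os
import tempfile

# Fields a client needs in order to talk to the sharing server.
REQUIRED_FIELDS = ("shareCredentialsVersion", "endpoint", "bearerToken")


def load_profile(path):
    """Read a Delta Sharing profile file and check the required fields."""
    with open(path, encoding="utf-8") as f:
        profile = json.load(f)
    missing = [name for name in REQUIRED_FIELDS if name not in profile]
    if missing:
        raise ValueError(f"profile is missing fields: {missing}")
    return profile


# Illustrative contents only; a real file comes from the activation link.
sample = {
    "shareCredentialsVersion": 1,
    "endpoint": "https://sharing.example.com/delta-sharing/",
    "bearerToken": "<token>",
}

# Write the sample to a temporary file, then read it back like a client would.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "profile.share")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    profile = load_profile(path)

print(profile["endpoint"])
```

The file is plain JSON, so treat it like any other secret: whoever holds it can read the shared tables until the token expires or you revoke the recipient.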

How the Recipient Reads the Data

The recipient doesn't need Databricks. The Delta Sharing protocol is open, and there are clients for Python, Spark, pandas, and Power BI. A data engineer at the partner organization can read your shared table with:

import delta_sharing

# Load from the profile file they received
profile_file = "/path/to/profile.share"
client = delta_sharing.SharingClient(profile_file)

# List available shares and tables
shares = client.list_shares()
tables = client.list_all_tables()

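# Table URL format: <profile-file>#<share>.<schema>.<table>
# The schema is the one the table was shared from (gold here), unless the
# provider renamed it when adding the table to the share.
table_url = f"{profile_file}#partner_analytics_share.gold.monthly_order_summary"

# Load as a pandas DataFrame; no Spark required
df = delta_sharing.load_as_pandas(table_url)

# Or load as a Spark DataFrame (needs a Spark session with the
# delta-sharing-spark connector available)
spark_df = delta_sharing.load_as_spark(table_url)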
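Underneath the client library, the protocol is ordinary HTTPS with a bearer token. The sketch below builds (but does not send) the request that lists the tables in a shared schema, using the `/shares/{share}/schemas/{schema}/tables` path from the protocol spec. The function name `build_list_tables_request`, the example endpoint, the placeholder token, and the schema name `gold` are assumptions for illustration.

```python
import urllib.request


def build_list_tables_request(endpoint, bearer_token, share, schema):
    """Build, without sending, the request that lists tables in a shared schema."""
    base = endpoint.rstrip("/")  # the profile's endpoint may end with a slash
    url = f"{base}/shares/{share}/schemas/{schema}/tables"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {bearer_token}"},
        method="GET",
    )


# Hypothetical values; in practice they come from the recipient's profile file.
req = build_list_tables_request(
    "https://sharing.example.com/delta-sharing/",
    "<token>",
    "partner_analytics_share",
    "gold",
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the call against a real server. Nothing in it is Databricks-specific, which is why clients in other languages and tools are practical to write.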

Revoking Access

-- Remove a recipient from a share
REVOKE SELECT ON SHARE partner_analytics_share FROM RECIPIENT partnerco_data_team;

-- Or drop the recipient entirely
DROP RECIPIENT partnerco_data_team;

Revocation takes effect on the recipient's next request to the sharing server: the query is rejected instead of returning data. The one caveat is that any short-lived file URLs the server has already handed out keep working until they expire. No need to rotate storage credentials, rebuild export processes, or contact anyone. One SQL statement ends the sharing relationship.

The protocol is open: the spec is published on GitHub, and Databricks isn't the only company that can build server or client implementations. That's worth noting. This isn't vendor lock-in wrapped in nice packaging; it's a genuine open standard.
