Unity Catalog Migrations: Moving Tables from Hive Metastore Without Breaking Pipelines

Your Hive metastore has been accumulating tables for years. Hundreds of them, maybe more — bronze ingestion tables, silver conformance tables, gold aggregations, ad-hoc analyst tables, experiment tables that never got cleaned up. Unity Catalog is now on the horizon and you need to move them. The good news: most of the move is mechanical. The bad news: there are enough edge cases to hurt you if you don't plan for them.

Understanding What You're Actually Migrating

There are two fundamentally different things living in your Hive metastore:

Tables with data in DBFS managed storage — these are tables where Databricks manages the underlying files. When you created them with spark.write.format("delta").saveAsTable("my_table") without specifying a LOCATION, the data went to dbfs:/user/hive/warehouse/. In Unity Catalog, DBFS managed storage isn't supported — you need to migrate the data to external storage.

External tables with data in ADLS/S3 — these are tables where you specified a LOCATION pointing to cloud storage. These are much easier to migrate: you mostly need to recreate the metadata in Unity Catalog pointing at the same storage path.

-- Check what you have before you start
SELECT
table_schema,
table_name,
table_type,
-- External tables have a LOCATION; managed tables don't (or point to DBFS)
location
FROM information_schema.tables
WHERE table_catalog = 'hive_metastore'
ORDER BY table_schema, table_name;

Migrating External Tables

For external tables in ADLS/S3, the migration path is:

  1. Ensure the target UC catalog and schema exist
  2. Create an external location in UC that covers the storage path
  3. Use the SYNC command or recreate the table definition

-- Option 1: CREATE TABLE LIKE (preserves schema, you specify new location)
CREATE TABLE prod_analytics.silver.customer_orders
LIKE hive_metastore.silver.customer_orders
LOCATION 'abfss://[email protected]/customer_orders/';

-- Option 2: If data is already at an external location, just register it
CREATE TABLE prod_analytics.silver.customer_orders
USING DELTA
LOCATION 'abfss://[email protected]/customer_orders/';

-- Verify the migration
DESCRIBE EXTENDED prod_analytics.silver.customer_orders;

Migrating DBFS Managed Tables

This is the harder case. You need to physically move the data from DBFS to cloud storage, then register the table in UC:

import subprocess

def migrate_managed_table(
source_db: str,
table_name: str,
target_catalog: str,
target_schema: str,
target_location: str
) -> None:
# Step 1: Read from existing table
source_df = spark.table(f"hive_metastore.{source_db}.{table_name}")

# Step 2: Write to external location
source_df.write.format("delta").mode("overwrite").save(target_location)

# Step 3: Register in Unity Catalog
spark.sql(f"""
CREATE TABLE IF NOT EXISTS {target_catalog}.{target_schema}.{table_name}
USING DELTA
LOCATION '{target_location}'
""")

# Step 4: Verify row counts match
source_count = source_df.count()
target_count = spark.table(f"{target_catalog}.{target_schema}.{table_name}").count()
assert source_count == target_count, f"Row count mismatch: {source_count} vs {target_count}"
print(f"Migrated {table_name}: {source_count} rows")

The Permissions Gap

Hive metastore permissions and Unity Catalog permissions are different systems. When you migrate a table to UC, the old Hive GRANT statements don't carry over. You'll need to re-grant permissions in UC. Before you go live, audit who has access to what in Hive and rebuild those grants in the UC permission model:

-- Rebuild grants in Unity Catalog
GRANT SELECT ON TABLE prod_analytics.silver.customer_orders TO `[email protected]`;
GRANT MODIFY ON TABLE prod_analytics.silver.customer_orders TO `[email protected]`;

The migration itself is rarely the hard part. The hard part is the permissions audit, the downstream code changes (three-tier namespace instead of two-tier), and communicating to the teams whose notebooks are about to get broken by a table rename. Plan for that coordination time. As always, I'm here to help.

Read more