Databricks Asset Bundles: Early Look at Infrastructure as Code for Your Workspace

Databricks announced Asset Bundles at Data + AI Summit last month. If you've been deploying Databricks workloads by hand — dragging notebooks through the UI, running Databricks CLI commands in CI/CD scripts, or writing custom REST API calls to deploy jobs — Asset Bundles is worth paying attention to.

The short version: it's infrastructure as code for your Databricks workspace, using a YAML-based project structure that maps to Databricks resources. Think Terraform for the workspace layer, with a first-party deployment CLI.

What an Asset Bundle Is

A bundle is a directory with a databricks.yml file at its root that declares your workspace resources: jobs, pipelines, notebooks, Python packages, and other artifacts. The CLI reads the bundle definition and deploys it to a target workspace. Each target environment (dev, staging, prod) gets its own configuration from the same bundle source.

bundle:
  name: order_processing_pipeline

workspace:
  host: ${DATABRICKS_HOST}
  root_path: /Shared/.bundles/${bundle.name}

variables:
  cluster_size:
    description: "Cluster node type for job clusters"
    default: Standard_DS3_v2
  env:
    description: "Environment name"
    default: dev

resources:
  jobs:
    daily_order_pipeline:
      name: "Daily Order Processing [${var.env}]"
      schedule:
        quartz_cron_expression: "0 0 2 * * ?"
        timezone_id: "UTC"
      tasks:
        - task_key: ingest_orders
          notebook_task:
            notebook_path: ./notebooks/01_ingest_orders
            base_parameters:
              env: ${var.env}
          new_cluster:
            spark_version: 12.2.x-scala2.12
            node_type_id: ${var.cluster_size}
            num_workers: 2
        - task_key: transform_silver
          depends_on:
            - task_key: ingest_orders
          notebook_task:
            notebook_path: ./notebooks/02_transform_orders
          new_cluster:
            spark_version: 12.2.x-scala2.12
            node_type_id: ${var.cluster_size}
            num_workers: 4

Deploying the Bundle

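Targets sit at the top level of databricks.yml. Each one overrides the workspace host and the variable values for that environment:

targets:
  dev:
    workspace:
      host: https://dev-workspace.azuredatabricks.net
    variables:
      cluster_size: Standard_DS3_v2
      env: dev
  prod:
    workspace:
      host: https://prod-workspace.azuredatabricks.net
    variables:
      cluster_size: Standard_DS4_v2
      env: prod

With the targets defined, the CLI handles validation, deployment, and runs:

# Validate the bundle (checks syntax and references)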
databricks bundle validate --target dev

# Deploy to dev
databricks bundle deploy --target dev

# Run a job defined in the bundle
databricks bundle run daily_order_pipeline --target dev

# Deploy to prod
databricks bundle deploy --target prod
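
If you're wiring these commands into CI, a small wrapper keeps the validate-then-deploy order explicit. Here's a minimal sketch, not an official pattern. The deploy_bundle function name and the DATABRICKS_CLI override variable are my own inventions for illustration, and it assumes the databricks CLI is on the PATH with authentication already configured on the runner.

```shell
# Hypothetical override so the wrapper can be pointed at a stub in tests;
# defaults to the real `databricks` binary.
DATABRICKS_CLI="${DATABRICKS_CLI:-databricks}"

# deploy_bundle TARGET
# Validates the bundle for TARGET and deploys only if validation succeeds.
deploy_bundle() {
  if [ "$#" -ne 1 ]; then
    echo "usage: deploy_bundle <target>" >&2
    return 2
  fi
  "$DATABRICKS_CLI" bundle validate --target "$1" &&
    "$DATABRICKS_CLI" bundle deploy --target "$1"
}
```

A CI job could then call deploy_bundle dev on pull requests and deploy_bundle prod after a merge. Which branch maps to which target is up to your pipeline.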

Notebooks as Bundle Artifacts

Notebooks referenced in the bundle are uploaded from the local filesystem when you deploy. The source notebook lives in your Git repo (./notebooks/01_ingest_orders.py), and the bundle CLI uploads it under the workspace root_path defined in the bundle. That gives notebooks a Git-native workflow, which previously required either Databricks Repos or manual imports.

Current State: Developer Preview

Asset Bundles are in developer preview, and the feature set is still evolving. Some resource types aren't fully supported yet, and the documentation is sparse in places. I'd use it now for greenfield projects and for teams that are comfortable working with preview features. For production workloads that are already deployed and stable, I'd wait another quarter or two for the feature set to stabilize. The direction is right; give it time to bake.
