ADF + Databricks → Medallion Architecture (Bronze/Silver/Gold)

Orchestrate ELT from ADLS through Databricks into Bronze/Silver/Gold Delta tables that are partitioned, Z-Ordered, and maintained on a schedule.

Overview

Files land in the ADLS raw zone, and ADF ingests them into Bronze as Delta tables. Databricks notebooks read Bronze and apply cleaning and conformance to produce Silver (row-level quality checks, correct data types, SCD-ready joins). A second notebook aggregates and enriches Silver into Gold for BI/ML. Tables are partitioned (e.g., by date) and Z-Ordered on commonly filtered columns. Periodic `OPTIMIZE` compacts small files to keep queries fast, and periodic `VACUUM` removes data files the table no longer references to keep storage in check. ADF triggers start the pipeline, which runs Bronze → Silver → Gold in order and logs each run.
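The row-level logic behind the Bronze → Silver → Gold flow can be sketched in plain Python so it runs anywhere. This is only an illustration of the idea, not the notebooks themselves: in Databricks the same steps would be DataFrame operations on Delta tables. The sample rows, the column names (`order_id`, `amount`, `updated_at`, `order_date`), and the functions `to_silver` and `to_gold` are all hypothetical.

```python
from datetime import date, datetime

# Hypothetical Bronze rows: every field arrives as a string and duplicates are possible.
BRONZE_ROWS = [
    {"order_id": "1", "amount": "10.50", "updated_at": "2024-01-01T08:00:00"},
    {"order_id": "1", "amount": "12.00", "updated_at": "2024-01-02T09:30:00"},
    {"order_id": "2", "amount": "oops", "updated_at": "2024-01-02T10:00:00"},
    {"order_id": "3", "amount": "7.25", "updated_at": "2024-01-03T11:15:00"},
]


def to_silver(rows):
    """Cast types, drop rows that fail casting, and keep the latest row per order_id."""
    latest = {}
    for row in rows:
        try:
            typed = {
                "order_id": int(row["order_id"]),
                "amount": float(row["amount"]),
                "updated_at": datetime.fromisoformat(row["updated_at"]),
            }
        except (KeyError, ValueError):
            continue  # row-level quality: reject rows that cannot be typed
        current = latest.get(typed["order_id"])
        if current is None or typed["updated_at"] > current["updated_at"]:
            latest[typed["order_id"]] = typed
    # Derive the date column that a Silver table could be partitioned by.
    return [
        {**r, "order_date": r["updated_at"].date()}
        for r in sorted(latest.values(), key=lambda r: r["order_id"])
    ]


def to_gold(silver_rows):
    """Aggregate Silver rows into a daily revenue total (a simple KPI)."""
    totals = {}
    for r in silver_rows:
        totals[r["order_date"]] = totals.get(r["order_date"], 0.0) + r["amount"]
    return totals


silver = to_silver(BRONZE_ROWS)
gold = to_gold(silver)
```

Here the duplicate for order 1 collapses to its latest version and the untypable order 2 is rejected, so Gold is computed only from rows that passed the Silver checks.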

Outcome

  • Lakehouse-ready data on Azure using the Medallion pattern.
  • Reliable Delta tables with ACID, schema enforcement, and time travel.
  • Faster queries via partitioning, Z-Ordering, and periodic `OPTIMIZE`, with `VACUUM` to reclaim storage.
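As a rough sketch of the maintenance step, the helper below builds the `OPTIMIZE ... ZORDER BY` and `VACUUM ... RETAIN` statements for one table. The function name `maintenance_sql`, the table `silver.orders`, and the column `customer_id` are hypothetical. The 168-hour floor mirrors Delta's default 7-day retention threshold, and the sketch refuses to go below it because `VACUUM` also bounds how far back time travel can reach.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
DEFAULT_RETENTION_HOURS = 168  # 7 days, Delta's default retention threshold


def _check(name):
    """Allow only simple identifiers so names are safe to interpolate into SQL."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def maintenance_sql(table, zorder_cols=(), retain_hours=DEFAULT_RETENTION_HOURS):
    """Return the OPTIMIZE and VACUUM statements for one Delta table."""
    qualified = ".".join(_check(part) for part in table.split("."))
    if retain_hours < DEFAULT_RETENTION_HOURS:
        raise ValueError("retention below 168 hours shortens time travel; refusing")
    optimize = f"OPTIMIZE {qualified}"
    if zorder_cols:
        cols = ", ".join(_check(c) for c in zorder_cols)
        optimize += f" ZORDER BY ({cols})"
    return [optimize, f"VACUUM {qualified} RETAIN {retain_hours} HOURS"]


statements = maintenance_sql("silver.orders", ["customer_id"])
```

In a Databricks notebook, each returned statement would typically be passed to `spark.sql(...)` from a scheduled job.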

What you’ll build

  • ADLS Gen2 layout: raw (landing), bronze, silver, gold (clear folder/table conventions).
  • ADF pipeline to ingest CSV/JSON into Bronze (landing → Bronze Delta).
  • Databricks notebooks to transform Bronze → Silver (deduplication, schema enforcement, joins) and Silver → Gold (aggregations/KPIs).
  • Delta performance ops: partitioning, Z-Order, `OPTIMIZE` + `VACUUM` schedule.
  • (Optional) Lightweight DQ checks (row counts, null %, simple constraints).
  • (Optional) ADF triggers for end-to-end orchestration + dependencies.
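The optional lightweight DQ checks in the list above could look something like this minimal sketch, written in plain Python over a list of row dictionaries. The function name `dq_report`, its thresholds, and the sample columns are assumptions for illustration; in a notebook the same counts would come from DataFrame aggregations.

```python
def dq_report(rows, required_cols, min_rows=1, max_null_pct=5.0):
    """Check row count and per-column null percentage against simple thresholds."""
    total = len(rows)
    null_pct = {}
    for col in required_cols:
        nulls = sum(1 for r in rows if r.get(col) is None)
        # An empty input counts as 100% null so it can never pass silently.
        null_pct[col] = 100.0 * nulls / total if total else 100.0

    failures = []
    if total < min_rows:
        failures.append(f"row_count {total} < {min_rows}")
    for col, pct in null_pct.items():
        if pct > max_null_pct:
            failures.append(f"{col} null % {pct:.1f} > {max_null_pct}")

    return {
        "row_count": total,
        "null_pct": null_pct,
        "passed": not failures,
        "failures": failures,
    }


# Hypothetical Silver sample: one of four rows is missing `amount`.
sample = [
    {"id": 1, "amount": 1.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 2.0},
    {"id": 4, "amount": 3.0},
]
report = dq_report(sample, ["id", "amount"], max_null_pct=10.0)
```

A notebook could raise an exception when `report["passed"]` is false, so the failing step shows up in the ADF run log and downstream steps do not run.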