← Back

On-Prem RDBMS → ADF → ADLS → Synapse (Batch Ingestion + ELT)

Pull from on-prem SQL/Oracle with ADF (SHIR), land to ADLS, and load/transform in Synapse with COPY INTO + ELT.

onprem-adf-adls-synapse

Overview

ADF (with Self-Hosted IR) securely connects to your on-prem RDBMS and extracts full or delta sets into ADLS Gen2. Files are stored in a raw zone, then (optionally) standardized to Parquet in a curated zone. Synapse ingests from ADLS using COPY INTO (or PolyBase/external tables) into staging; ELT SQL transforms produce facts/dims with appropriate partitioning/indexing. ADF triggers schedule the flow and log runs; Key Vault stores credentials; simple row-count checks keep the pipeline trustworthy.

Outcome

  • Real-world migration path from on-prem DBs to Azure analytics.
  • Durable lake-first pattern (ADLS raw → curated) feeding Synapse.
  • Repeatable delta loads on a schedule with centralized monitoring.

What you’ll build

  • ADF pipelines using Self-Hosted IR to extract from SQL Server/Oracle.
  • Lake layout on ADLS Gen2: raw/curated (Parquet/CSV, partitioned).
  • Synapse staging + modeled tables; loads via COPY INTO (or PolyBase/external tables if preferred).
  • Incremental load pattern (watermark/`last_updated`) with validation (row counts).
  • (Optional) Secrets in Key Vault, alerts/monitoring in ADF/Synapse.
  • (Optional) Orchestration dependencies (raw → stage → model) with ADF triggers.