BigQuery → CSV Export → Vendor Delivery (Reverse ETL)
- Difficulty: Intermediate
- Tech stack: BigQuery, GCS (signed URLs), Airflow/Composer, SFTP (optional)
- Estimated time: 1-2 hrs
Overview
A scheduled job runs a BigQuery query to create a segment table/view, then exports it to GCS as CSV (optionally gzipped). A small task generates a manifest (row count, checksum) and applies a strict filename pattern like `vendor_name/exports/segment_YYYYMMDD.csv.gz`. Delivery is completed either by sharing a signed URL or by SFTP upload to the vendor. Files and manifests are archived with logs for auditability; failures alert via email/Slack.
Outcome
- Turn warehouse tables into vendor-ready CSV extracts on a cadence.
- Integrity & compliance with filename conventions, manifests, and checksums.
- Automated delivery via Composer (GCS drops, signed URLs, or SFTP).
What you’ll build
- BigQuery SQL to materialize segments (e.g., high-value buyers).
- Export step to GCS as CSV (optional gzip), with date-stamped naming.
- Manifest + checksum generator (row count, MD5/SHA256).
- SFTP push (shell) to vendor dropbox
- (Optional) email notification with file manifest
- Composer DAG (or Cron) to orchestrate export → validate → deliver → archive.