Redshift → UNLOAD → S3 → Partner SFTP (Reverse ETL)

Architecture Diagram

redshift-unload-s3-sftp

Overview

A scheduled workflow runs a Redshift query (or uses a view), then `UNLOAD`s results to S3 (compressed CSV or Parquet; optionally KMS-encrypted). A packaging task renames files with a strict pattern (e.g., `partnerA/orders_YYYYMMDD.csv.gz`), emits a manifest (row count, byte size), and generates a checksum; PGP encryption can be applied when required. A delivery task uploads to the partner’s SFTP with retries and verifies completion. All artifacts and logs are persisted to S3 for audit, and alerts fire on failure.

What You Will Build

  • Redshift SQL for curated extracts (facts + dims → query/view).
  • `UNLOAD` job to S3 (CSV or Parquet, compressed; optional KMS encryption).
  • Packaging step: date-stamped filenames, manifest + checksum (MD5/SHA256), optional PGP encryption.
  • Delivery step: SFTP upload (OpenSSH/Paramiko) with retries & exit-code checks.
  • Archive in S3 (`exports/history/`) and delivery logs (S3/DynamoDB).
  • Orchestration via Step Functions or MWAA (Airflow); success/failure notifications.
  • (Optional) Swap SFTP push with AWS Transfer Family (S3-backed) to let partners pull.

Tech Stack

Redshift (UNLOAD), S3, SFTP (OpenSSH/Paramiko), Step Functions / MWAA, KMS/PGP (optional)

Learning Outcomes

Reliable partner deliveries from warehouse to SFTP on a cadence.
Compliance-friendly extracts with compression, encryption, and manifests.
Operational transparency via logs/metrics and archives.

Recommended Before This

Reliable partner deliveries from warehouse to SFTP on a cadence.
Compliance-friendly extracts with compression, encryption, and manifests.
Operational transparency via logs/metrics and archives.