Redshift → UNLOAD → S3 → Partner SFTP (Reverse ETL)
- Difficulty: Intermediate
- Tech stack: Redshift (UNLOAD), S3, SFTP (OpenSSH/Paramiko), Step Functions / MWAA, KMS/PGP (optional)
- Estimated time: 2 hrs
Overview
A scheduled workflow runs a Redshift query (or uses a view), then `UNLOAD`s results to S3 (compressed CSV or Parquet; optionally KMS-encrypted). A packaging task renames files with a strict pattern (e.g., `partnerA/orders_YYYYMMDD.csv.gz`), emits a manifest (row count, byte size), and generates a checksum; PGP encryption can be applied when required. A delivery task uploads to the partner’s SFTP with retries and verifies completion. All artifacts and logs are persisted to S3 for audit, and alerts fire on failure.
Outcome
Reliable partner deliveries from warehouse to SFTP on a cadence.
Compliance-friendly extracts with compression, encryption, and manifests.
Operational transparency via logs/metrics and archives.
What you’ll build
- Redshift SQL for curated extracts (facts + dims → query/view).
- `UNLOAD` job to S3 (CSV or Parquet, compressed; optional KMS encryption).
- Packaging step: date-stamped filenames, manifest + checksum (MD5/SHA256), optional PGP encryption.
- Delivery step: SFTP upload (OpenSSH/Paramiko) with retries & exit-code checks.
- Archive in S3 (`exports/history/`) and delivery logs (S3/DynamoDB).
- Orchestration via Step Functions or MWAA (Airflow); success/failure notifications.
- (Optional) Swap SFTP push with AWS Transfer Family (S3-backed) to let partners pull.