Vertex AI → Predict → BigQuery (ML Pipeline)

Run batch predictions with Vertex AI on BigQuery features and write results back to BigQuery for reporting.

Difficulty: Intermediate
Tech stack: Vertex AI (AutoML/Custom), BigQuery, GCS, Cloud Composer (optional), Looker or Looker Studio, formerly Data Studio (optional)
Estimated time: 2 hrs

Overview

You’ll curate a BigQuery features table, then point a Vertex AI model (AutoML or custom) at those features for batch prediction. Predictions land in GCS as files, which are then loaded back into a BigQuery table (e.g., `predictions_churn_daily`) with keys, timestamps, and scores. A scheduler (Cloud Composer) runs the flow on a regular cadence and can also kick off model retraining when enough new data accumulates, keeping predictions fresh for BI and downstream activation.
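
As a rough illustration of the "keys, timestamps, and scores" shape, the sketch below flattens one line of a JSONL prediction file into a row suitable for a table like `predictions_churn_daily`. It assumes each line holds an `instance` object and a classification-style `prediction` with parallel `classes` and `scores` lists; the actual layout depends on the model type, so treat the field names (including the hypothetical `customer_id` key) as placeholders to adapt.

```python
import json
from datetime import datetime, timezone


def flatten_prediction(line, key_field="customer_id", scored_at=None):
    """Turn one JSONL prediction line into a flat row (key, label, score, timestamp).

    Assumed input layout (verify against your own job output):
      {"instance": {<feature columns>}, "prediction": {"classes": [...], "scores": [...]}}
    """
    record = json.loads(line)
    instance = record["instance"]
    prediction = record["prediction"]

    # Pick the class with the highest score.
    best_score, best_class = max(zip(prediction["scores"], prediction["classes"]))

    # Stamp the row with the scoring time (UTC) unless the caller supplies one.
    scored_at = scored_at or datetime.now(timezone.utc)

    return {
        key_field: instance[key_field],
        "predicted_label": best_class,
        "score": best_score,
        "scored_at": scored_at.isoformat(),
    }


if __name__ == "__main__":
    sample = (
        '{"instance": {"customer_id": "c-1", "tenure_months": 3}, '
        '"prediction": {"classes": ["0", "1"], "scores": [0.2, 0.8]}}'
    )
    print(flatten_prediction(sample))
```

Flattening before the load keeps the reporting table narrow and easy to join on the key; loading the raw nested records and flattening in SQL is an equally reasonable choice.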

Outcome

Production-style batch ML that runs reliably as part of your data pipeline.
Seamless handoff: features from BigQuery → predictions back to BigQuery.
Operational cadence with scheduled prediction + periodic retraining.
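
One possible way to decide when the periodic retraining should fire is a small policy check that the scheduler evaluates on each run. This is only a sketch: the thresholds (`min_new_rows`, `max_days_between_runs`) and the function name are hypothetical and not prescribed by this project.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetrainPolicy:
    # Hypothetical thresholds; tune them to your data volume and drift tolerance.
    min_new_rows: int = 10_000
    max_days_between_runs: int = 30


def should_retrain(new_rows, last_trained, today, policy=RetrainPolicy()):
    """Return True when enough new data has accumulated or the model is too old."""
    days_since_training = (today - last_trained).days
    enough_data = new_rows >= policy.min_new_rows
    too_old = days_since_training >= policy.max_days_between_runs
    return enough_data or too_old


if __name__ == "__main__":
    print(should_retrain(new_rows=12_500, last_trained=date(2024, 1, 1), today=date(2024, 1, 8)))
```

The row count would typically come from a query against the features table, filtered to rows added since the last training run.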

What you’ll build

A features table in BigQuery (joins/aggregations over raw data).
Vertex AI dataset/model (AutoML or pre-trained custom model).
A batch prediction job (source from BigQuery or GCS) that writes outputs to GCS.
A load step that ingests predictions (scores, labels) back into BigQuery.
(Optional) Looker or Looker Studio (formerly Data Studio) report on top of the predictions.
(Optional) Composer DAG to orchestrate export → predict → load → report refresh.
(Optional) Periodic retraining job that reuses the latest features.
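
The predict and load steps above could be sketched roughly as follows, using the `google-cloud-aiplatform` and `google-cloud-bigquery` client libraries. All project, dataset, table, and bucket names are placeholders, and the job naming scheme is an assumption. The cloud calls are imported lazily so the argument-building helper can be exercised without credentials.

```python
def build_batch_predict_kwargs(project, dataset, features_table, gcs_output_prefix, run_date):
    """Build keyword arguments for aiplatform.Model.batch_predict (names are placeholders)."""
    return {
        "job_display_name": f"churn-batch-{run_date}",
        # Read instances directly from the BigQuery features table.
        "bigquery_source": f"bq://{project}.{dataset}.{features_table}",
        "instances_format": "bigquery",
        # Write prediction files under a dated GCS prefix.
        "gcs_destination_prefix": f"{gcs_output_prefix.rstrip('/')}/{run_date}",
        "predictions_format": "jsonl",
        "sync": True,  # block until the job finishes
    }


def run_batch_prediction(model_resource_name, kwargs, project, location):
    """Submit the batch prediction job; requires google-cloud-aiplatform and credentials."""
    from google.cloud import aiplatform  # imported lazily

    aiplatform.init(project=project, location=location)
    model = aiplatform.Model(model_resource_name)
    return model.batch_predict(**kwargs)


def load_predictions(source_uri, destination_table):
    """Append JSONL prediction files from GCS into a BigQuery table."""
    from google.cloud import bigquery  # imported lazily

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    load_job = client.load_table_from_uri(source_uri, destination_table, job_config=job_config)
    load_job.result()  # wait for the load to complete
    return load_job.output_rows


if __name__ == "__main__":
    print(build_batch_predict_kwargs(
        "my-project", "ml", "features_churn", "gs://my-bucket/predictions/", "2024-01-01"
    ))
```

The exact file names Vertex AI writes under the destination prefix vary by model type, so the `source_uri` passed to `load_predictions` (typically a wildcard such as `gs://.../*`) should be checked against a real job's output. Custom-trained models may also need a `machine_type` argument.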
