You’ll configure Sqoop to pull from an OLTP database into HDFS; first a full load, then incremental loads using a watermark column. Data is stored in columnar formats and surfaced in Hive (external/managed, partitioned). Jobs run via shell/cron for repeatable batch ingestion.