Day 6 – Building Blocks
Thursday, June 12 — Building Blocks
🛠️ What I Worked On
- Finalizing the Airflow setup: DAGs, environment config, and local testing. The Airflow 3 UI is quite different from the earlier versions shown in the videos.
🚀 To do:
Build a full-fledged ETL pipeline from scratch. Here’s what I want to include:
- Source: External APIs as data sources (JSON/CSV responses)
- Ingestion: Use AWS Lambda / API Gateway or direct cron-based fetch
- Streaming (optional): AWS Kinesis for real-time ingestion (stretch goal)
- Processing & Orchestration: Airflow + possibly AWS Glue
- Storage & Query: Snowflake as the data warehouse
- Transformations: Use dbt or Snowflake’s native SQL/Snowpark
- Dashboard: Lightweight Streamlit app or Snowsight for visualization
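
To make the plan above a bit more concrete, here is a rough sketch of the core extract, transform, and load steps in plain Python. Everything in it is an assumption for illustration: the function names, the `id` and `value` fields, and the `readings` table are hypothetical, and an in-memory SQLite database stands in for Snowflake so the sketch runs without any cloud accounts. In the real pipeline, each function would likely become an Airflow task, and the load step would target the warehouse instead.

```python
import csv
import io
import json
import sqlite3


def parse_payload(body: str, content_type: str) -> list[dict]:
    """Normalize a JSON or CSV API response body into a list of row dicts."""
    if "json" in content_type:
        data = json.loads(body)
        # Accept either a single object or an array of objects.
        return data if isinstance(data, list) else [data]
    if "csv" in content_type:
        return list(csv.DictReader(io.StringIO(body)))
    raise ValueError(f"unsupported content type: {content_type}")


def transform(rows: list[dict]) -> list[tuple[str, float]]:
    """Keep rows that have an id and a numeric value (hypothetical fields)."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append((str(row["id"]), float(row["value"])))
        except (KeyError, TypeError, ValueError):
            continue  # skip malformed records rather than fail the whole batch
    return cleaned


def load(conn: sqlite3.Connection, records: list[tuple[str, float]]) -> int:
    """Upsert records into the stand-in table and return its row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (id TEXT PRIMARY KEY, value REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO readings (id, value) VALUES (?, ?)", records
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]


if __name__ == "__main__":
    # Canned response bodies stand in for real API calls.
    conn = sqlite3.connect(":memory:")  # SQLite as a local stand-in for Snowflake
    rows = parse_payload('[{"id": "a", "value": "1.5"}, {"id": "b"}]', "application/json")
    rows += parse_payload("id,value\nc,2\n", "text/csv")
    print(load(conn, transform(rows)), "rows loaded")
```

Keeping the three steps as separate functions means each one can be tested locally before it is wired into a DAG, and the primary-key upsert makes reruns of the same batch idempotent.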