Saturday, June 14 - How It’s Going

🛠️ How ChatGPT wants me to learn:


🔁 Your Daily Strategy

  • Each day, divide your time like this:

Coding + SQL Practice (90 mins):

  • Practice real-world SQL and Python problems (LeetCode, StrataScratch, or mock datasets).
  • Use Snowflake or SQLite locally if needed.

System Design + Data Engineering Concepts (90 mins):

  • Pick 1–2 topics: e.g., Airflow DAGs, streaming architecture, or designing a data pipeline.
  • Write or draw the architecture and explain it aloud.

Mock Interview / Recap / Flashcards (60–90 mins):

  • Summarize what you learned.
  • Use a mirror or record yourself to practice articulating your answers.
  • Prepare flashcards for services like Kinesis, SQS, etc.

Review New Tools / Weak Areas (Optional 30–60 mins):

  • Learn AWS services you don’t know.
  • Read about Kafka, DBT, or Flink, but only at a surface level unless they come up in your role.

📚 Topics to Focus On (Prioritized)

⭐⭐⭐SQL + Python

  • How To Approach It : Practice querying Snowflake tables, window functions, CTEs, pivot/unpivot. Build a mini ETL in Python (e.g., CSV → clean → Snowflake).
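A minimal sketch of the "CSV → clean → load" idea, assuming SQLite as a local stand-in for Snowflake (the plan above allows SQLite locally). The sample data, the `orders` table, and the function names are all made up for illustration. The final query practices a CTE plus a window function, which needs SQLite 3.25 or newer.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for a CSV file on disk.
RAW_CSV = """\
order_id,customer,amount
1,alice,10.50
2,bob,
3,alice,4.50
4,bob,20.00
"""


def extract(text):
    """Parse CSV text into a list of dicts (one per row)."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Drop rows with a missing amount and cast fields to proper types."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip().lower(), float(row["amount"]))
        )
    return cleaned


def load(conn, rows):
    """Create the target table if needed and insert the cleaned rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
load(conn, clean(extract(RAW_CSV)))

# CTE + window function: running total of spend per customer.
QUERY = """
WITH valid_orders AS (
    SELECT order_id, customer, amount FROM orders WHERE amount > 0
)
SELECT customer, order_id, amount,
       SUM(amount) OVER (PARTITION BY customer ORDER BY order_id) AS running_total
FROM valid_orders
ORDER BY customer, order_id
"""
result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

Moving this to Snowflake would mostly mean swapping the connection and the load step; the extract and clean functions would stay the same.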

⭐⭐⭐Data Pipeline Design (ETL)

  • How To Approach It : Draw DAGs (Airflow), discuss failure handling, retries, monitoring, backfills. Be ready to design a batch or streaming pipeline.
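To make "failure handling and retries" concrete, here is a small sketch of retrying a task with exponential backoff in plain Python. It is not how Airflow implements retries; `run_with_retries` and `flaky_extract` are hypothetical names used only for this example.

```python
import time


def run_with_retries(task, max_attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call task(); on failure wait, then retry, up to max_attempts total calls."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to monitoring
            delay = base_delay * 2 ** (attempt - 1)  # delay doubles each attempt
            print(f"attempt {attempt} failed ({exc}); retrying in {delay:.2f}s")
            sleep(delay)


calls = {"count": 0}


def flaky_extract():
    """Hypothetical task that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("source unavailable")
    return "42 rows"


outcome = run_with_retries(flaky_extract)
print(outcome, "after", calls["count"], "calls")
```

In an interview answer, the points worth calling out are the cap on attempts, the growing delay, and re-raising on the last attempt so the failure is visible rather than swallowed.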

⭐⭐⭐Snowflake Internals

  • How To Approach It : Learn about Virtual Warehouses, Caching, Clustering, Query Optimizer, Storage Layers. Be able to tune a slow Snowflake query.

⭐⭐Airflow

  • How To Approach It : Understand DAGs, tasks, operators, XComs, retries, scheduling.
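The core idea behind a DAG is that each task runs only after its upstream tasks finish. A rough sketch using Python's standard-library `graphlib` (Python 3.9+) rather than Airflow itself; the task names are invented for illustration.

```python
from graphlib import TopologicalSorter

# Each key is a task; its value is the set of upstream tasks it depends on.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "load": {"clean"},
    "quality_check": {"load"},
    "report": {"load"},
    "notify": {"quality_check", "report"},
}

# static_order() yields the tasks in an order that respects every dependency.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

`quality_check` and `report` do not depend on each other, so either may come first; a scheduler could run them in parallel. A cycle in the graph would raise `graphlib.CycleError`, which is why pipelines must be acyclic.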

⭐⭐AWS (Kinesis, SQS, SNS)

  • How To Approach It : Learn core use-cases. Know how you’d stream data from Kafka → Kinesis → Snowflake.

⭐⭐Docker + Cron + Linux

  • How To Approach It : Know how to containerize a job, set cron for it, monitor logs.

⭐⭐System Design

  • How To Approach It : Know how to scale a data pipeline, design for fault-tolerance, data quality checks, backfilling.
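Two of these ideas, backfilling and data quality checks, are easy to sketch in a few lines. This assumes daily partitions and a simple "required fields must be present" rule; the function names, dates, and sample rows are hypothetical.

```python
from datetime import date, timedelta


def backfill_dates(start, end):
    """Return every daily partition date from start to end, inclusive."""
    days = (end - start).days
    return [start + timedelta(days=i) for i in range(days + 1)]


def check_quality(rows, required_fields):
    """Return a list of problems found; an empty list means the batch passes."""
    problems = []
    if not rows:
        problems.append("no rows loaded")
    for i, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) in (None, ""):
                problems.append(f"row {i}: missing {field}")
    return problems


# Re-run the pipeline once per missed day.
partitions = backfill_dates(date(2025, 6, 1), date(2025, 6, 3))

# Validate a batch before publishing it downstream.
issues = check_quality(
    [{"id": 1, "amount": 9.5}, {"id": 2, "amount": None}],
    ["id", "amount"],
)
print(partitions)
print(issues)
```

A backfill is only safe if each per-day run is idempotent, meaning that re-running a date overwrites that partition instead of appending duplicates.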

⭐Mongo, DBT, Flink, Kafka

  • How To Approach It : Know what they are, and basic use-cases. For DBT, understand how to write and run a model.

⭐GenAI / DS

  • How To Approach It : Be ready to explain how you’d integrate GenAI into a data platform (e.g., metadata summarization, anomaly detection, natural language querying).

⭐Basic DSA

  • How To Approach It : Focus only on arrays, strings, hashmaps, sorting, and recursion.
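As one example of the kind of problem this covers, grouping anagrams touches strings, sorting, and hashmaps at once. A short sketch (the function name and input words are just for illustration):

```python
from collections import defaultdict


def group_anagrams(words):
    """Group words that contain exactly the same letters."""
    groups = defaultdict(list)
    for word in words:
        key = "".join(sorted(word))  # anagrams share the same sorted letters
        groups[key].append(word)
    return list(groups.values())


grouped = group_anagrams(["tea", "eat", "tan", "ate", "nat", "bat"])
print(grouped)
```

Sorting each word costs O(k log k) for a word of length k, and the hashmap lookup is O(1) on average, so the whole pass is O(n · k log k) for n words.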

What I studied/researched: