Day 9 – Quiet Consistency
Sunday, June 15
🧩 Project Overview
Built a real-time streaming pipeline using:
- Source: KDG (Kinesis Data Generator)
- Stream: Amazon Kinesis Data Stream
- Processor: AWS Lambda (Python)
- Sink: Amazon S3
- Output Format: JSON
- Configured KDG to simulate real-time streaming events.
- Built a Lambda function, triggered by Kinesis events, that parses incoming records and stores them as JSON in S3 (a minimal sketch follows this list).
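The handler itself is short. Here's a minimal sketch, assuming a sink bucket named my-pipeline-bucket (hypothetical) and the base64 encoding Kinesis applies to record payloads:

```python
# Minimal sketch of the Lambda processor. The bucket name is hypothetical;
# Kinesis delivers each record's payload base64-encoded.
import base64
import json
import uuid

import boto3

s3 = boto3.client("s3")
BUCKET = "my-pipeline-bucket"  # hypothetical sink bucket

def lambda_handler(event, context):
    parsed = []
    for record in event["Records"]:
        # Decode the base64 payload, then parse the JSON the generator produced
        payload = base64.b64decode(record["kinesis"]["data"])
        parsed.append(json.loads(payload))
    # Write the whole batch as a single JSON file per invocation
    key = f"events/{uuid.uuid4()}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(parsed).encode("utf-8"))
    return {"records_written": len(parsed), "key": key}
```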
📌 Next Up
- Set up Snowpipe to auto-load JSON files from S3 into Snowflake for downstream analytics (rough sketch after this list).
- Add schema enforcement and failure notifications.
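Not built yet, but a minimal sketch of what the pipe might look like, assuming an external stage @s3_events_stage and a target table raw_events already exist (both hypothetical names), run through the Snowflake Python connector:

```python
# Minimal sketch: create a Snowpipe that auto-ingests JSON files landing in S3.
# Stage, table, and connection details are hypothetical placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",  # placeholder credentials
    user="my_user",
    password="...",        # placeholder
)

conn.cursor().execute("""
    CREATE PIPE IF NOT EXISTS raw_events_pipe
      AUTO_INGEST = TRUE
    AS
    COPY INTO raw_events
    FROM @s3_events_stage
    FILE_FORMAT = (TYPE = 'JSON')
""")
```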
🧠 Key Learnings
- How event source mapping wires Kinesis records into Lambda invocations (sketch below).
- How checkpointing and batch windowing affect event processing.
- How to monitor delivery and processing with CloudWatch Logs.
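The mapping is where batch size and the batching window live. A minimal boto3 sketch (stream ARN and function name are hypothetical):

```python
# Minimal sketch: create the event source mapping that connects the Kinesis
# stream to the Lambda function. ARN and function name are hypothetical.
import boto3

lambda_client = boto3.client("lambda")

mapping = lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:kinesis:us-east-1:123456789012:stream/my-stream",
    FunctionName="process-kinesis-events",
    StartingPosition="LATEST",
    BatchSize=100,                     # up to 100 records per invocation
    MaximumBatchingWindowInSeconds=5,  # or wait up to 5s to fill a batch
)
print(mapping["UUID"])
```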
📚 What I studied/researched:
- AWS Kinesis ETL Setup
- Kinesis Data S3 ETL
- SnowPro Core Applications
- Snowflake ETL Orchestration Guide
- dbt Core vs dbt Cloud; dbt explained
- Data Engineering Concepts Explained
- Hadoop Spark AWS Summary
- Some PySpark + pandas
- Docker Desktop, the Linux kernel, container runtimes, PEM files, breakpoints, APIs in data engineering
- The SQL CHECK keyword (quick example below)
- One-hot encoding (quick pandas example below)
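Since CHECK came up, a quick illustration via sqlite3 (table and column names are made up):

```python
# A CHECK constraint rejects rows that violate a condition at write time.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        amount REAL CHECK (amount >= 0)  -- reject negative amounts
    )
""")

conn.execute("INSERT INTO events (amount) VALUES (10.5)")    # OK
try:
    conn.execute("INSERT INTO events (amount) VALUES (-1)")  # violates CHECK
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```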
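And a quick pandas illustration of one-hot encoding (column and values are made up):

```python
import pandas as pd

df = pd.DataFrame({"region": ["us-east-1", "eu-west-1", "us-east-1"]})

# get_dummies expands one categorical column into one indicator
# column per distinct value
encoded = pd.get_dummies(df, columns=["region"])
print(encoded)  # columns: region_eu-west-1, region_us-east-1
```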
Still moving. Still learning.