DEV Community

# dataengineering

Posts

AWS Lambda and AWS Glue Python Shell in the Context of Lightweight ETL

3
Comments
7 min read
SQL: Doing GROUP BY in CsvPath

Comments
5 min read
🔥 Day 3: RDDs - The Foundation of Spark

Comments
2 min read
🔥 Day 4: RDD Internals - Partitions, Shuffles & Repartitioning Demystified

Comments
2 min read
The Developer's Guide to Normalizing Historical Airline Flight Data for Machine Learning

Comments
6 min read
Overview of Real-Time Data Synchronization from MySQL to VeloDB

5
Comments
5 min read
Stop Writing df.describe(): Automate EDA with D-Tale (The Lazy Engineer's Way)

Comments
3 min read
CHW Monthly Activity Aggregation: Turning Visit Logs into Insight

Comments
5 min read
🔥 Day 2: Understanding Spark Architecture - How Spark Executes Your Code Internally

Comments
2 min read
When models suggest deprecated Pandas APIs: a small mistake that cascades

Comments
3 min read
Marmot: Data catalog without the complex infrastructure

1
Comments
3 min read
TDD for dbt: unit testing the way it should be

2
Comments
12 min read
When code-gen suggests deprecated Pandas APIs: a case study in subtle breakage

Comments
3 min read
Schema, COPY, MERGE, and Immutability — A First-Principles Guide for Data Engineers

Comments
5 min read
HackerRank 'The Pads' MySQL

Comments
3 min read