DEV Community

# dataengineering

Posts

What is the difference between ETL and ETL?

9 min read
dbt snapshots: moving from merges to native history

5 min read
PySpark to Pandas/scikit-learn: A Practical Migration Guide for Data Engineers Learning ML

7 min read
Apache Parquet File Anatomy: Row Groups, Column Chunks, Pages, and Metadata Explained 🧱📦

8 min read
🚀 DB Explorer 3.0.1 — The AI‑First SQL Editor You’ll Want to Try

1 min read
My first data pipeline

1 min read
ETL vs ELT: Which One Should You Use and Why?

6 min read
Entity Resolution at Scale: Matching Products Across Amazon, Reddit, and RTINGS

4 min read
Apache Data Lakehouse Weekly: April 3–9, 2026

7 min read
AWS Lake Formation: Why Your Data Lake Permissions Are Probably a Mess (And How to Fix That)

3 min read
ETL VS ELT: WHICH ONE SHOULD YOU USE AND WHY?

5 min read
Airflow vs Prefect vs Dagster: Picking the Right Orchestrator in 2026

6 min read
Advanced SQL Techniques for Data Analytics Every Data Analyst Should Know

6 min read
Your Customer Table Has Duplicates You Can't See With SQL: How I Built a Cross-Platform Identity Resolution Layer for a Dark Kitchen Data Platform

8 min read
How to Bypass the Pandas "Object Tax": Building an 8x Faster CSV Engine in C

2 min read