Data Engineering | Intermediate | ⏱ 5–7 hours
Databricks DQX: Data Quality at Scale
Use Databricks DQX to run data quality checks inside notebooks and pipelines: validate schemas, check completeness, and enforce custom rules.
Tags: Databricks, Data Quality, DQX, PySpark
What you’ll build
A Databricks notebook that demonstrates DQX checks on a real dataset: schema validation, null checks, value-range assertions, and custom rules. The notebook integrates with Delta Lake pipelines for continuous quality monitoring.
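To make the kinds of checks listed above concrete, here is a minimal, library-free Python sketch of a null check, a value-range assertion, and a custom business rule. It does not use the DQX API: `Rule`, `is_not_null`, `in_range`, `apply_rules`, the `error`/`warn` severity labels, and the sample `orders` rows are all hypothetical names and made-up data chosen for illustration. In the project itself, the same ideas would be expressed with DQX rules over Spark DataFrames.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule type for this sketch; not a DQX class."""
    name: str
    criticality: str               # "error" quarantines the row, "warn" only annotates it
    check: Callable[[Row], bool]   # returns True when the row passes


def is_not_null(column: str) -> Callable[[Row], bool]:
    """Null check: the column must be present and not None."""
    return lambda row: row.get(column) is not None


def in_range(column: str, low: float, high: float) -> Callable[[Row], bool]:
    """Value-range assertion: low <= value <= high (None fails)."""
    def check(row: Row) -> bool:
        value = row.get(column)
        return value is not None and low <= value <= high
    return check


def apply_rules(rows: list[Row], rules: list[Rule]) -> tuple[list[Row], list[Row]]:
    """Annotate each row with failed rule names and split valid from quarantined rows."""
    valid, quarantined = [], []
    for row in rows:
        errors = [r.name for r in rules if r.criticality == "error" and not r.check(row)]
        warnings = [r.name for r in rules if r.criticality == "warn" and not r.check(row)]
        annotated = {**row, "_errors": errors, "_warnings": warnings}
        (quarantined if errors else valid).append(annotated)
    return valid, quarantined


rules = [
    Rule("order_id_not_null", "error", is_not_null("order_id")),
    Rule("amount_in_range", "error", in_range("amount", 0, 10_000)),
    # Custom business rule: a refunded order should carry a reason.
    Rule("refund_has_reason", "warn",
         lambda row: row.get("status") != "refunded" or bool(row.get("reason"))),
]

# Made-up sample data for the sketch.
orders = [
    {"order_id": 1, "amount": 120.0, "status": "paid", "reason": None},
    {"order_id": None, "amount": 50.0, "status": "paid", "reason": None},
    {"order_id": 3, "amount": -5.0, "status": "paid", "reason": None},
    {"order_id": 4, "amount": 80.0, "status": "refunded", "reason": None},
]

valid, quarantined = apply_rules(orders, rules)
```

With this sample data, orders 1 and 4 land in `valid` (order 4 carries a `refund_has_reason` warning), while the rows with a missing `order_id` and a negative `amount` land in `quarantined`. Keeping failed rule names on each row, rather than silently dropping bad rows, is what makes later reporting possible.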
Skills you’ll practice
- Databricks DQX API: rules, checks, and result reporting
- Embedding data quality into Delta Live Tables or notebooks
- Custom rule definitions for business logic validation
- Monitoring quality trends over time
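As one possible reading of "monitoring quality trends over time", the sketch below tracks a per-run pass rate and flags runs that fall noticeably below the average of the preceding runs. It is plain Python rather than DQX or PySpark code; `RunSummary`, `flag_regressions`, the window size, the tolerance, and the sample history are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RunSummary:
    """Hypothetical per-run quality summary; not a DQX type."""
    run_date: str
    total_rows: int
    failed_rows: int

    @property
    def pass_rate(self) -> float:
        # An empty run is treated as fully passing to avoid dividing by zero.
        return 1.0 if self.total_rows == 0 else 1 - self.failed_rows / self.total_rows


def flag_regressions(history: list[RunSummary], window: int = 3,
                     tolerance: float = 0.02) -> list[str]:
    """Return run dates whose pass rate is more than `tolerance` below
    the mean pass rate of the previous `window` runs."""
    flagged = []
    for i in range(window, len(history)):
        baseline = mean(run.pass_rate for run in history[i - window:i])
        if history[i].pass_rate < baseline - tolerance:
            flagged.append(history[i].run_date)
    return flagged


# Made-up history for the sketch: the last run has a sharp rise in failures.
history = [
    RunSummary("2024-01-01", 1000, 10),
    RunSummary("2024-01-02", 1000, 12),
    RunSummary("2024-01-03", 1000, 9),
    RunSummary("2024-01-04", 1000, 11),
    RunSummary("2024-01-05", 1000, 80),
]

flagged = flag_regressions(history)
```

Here only the final run is flagged, because its pass rate drops well below the rolling baseline while the earlier runs stay within the tolerance. In a notebook, the summaries would more plausibly be read from a Delta table of check results than built by hand.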