Surfalytics
Data Engineering · Beginner · ⏱ 8–10 hours

Getting Started with Databricks

A structured introduction to the Databricks platform: notebooks, clusters, Delta Lake, and Unity Catalog. Built with the community.

Databricks, Delta Lake, PySpark, Unity Catalog

What you’ll build

A guided exploration of a Databricks workspace that covers the core platform capabilities: running notebooks, managing clusters, working with Delta Lake tables, governing data with Unity Catalog, and building a simple pipeline from raw files to a clean table.
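To make the "raw files to a clean table" step concrete, here is a minimal PySpark sketch. It assumes the raw data is a CSV file with a header row stored in a Unity Catalog volume. The catalog, schema, volume, file, and column names (`main`, `demo`, `raw`, `orders.csv`, `order_id`) are hypothetical placeholders, not names taken from the project.

```python
def volume_path(catalog: str, schema: str, volume: str, filename: str) -> str:
    """Build a Unity Catalog volume path: /Volumes/<catalog>/<schema>/<volume>/<file>."""
    return f"/Volumes/{catalog}/{schema}/{volume}/{filename}"


def table_name(catalog: str, schema: str, table: str) -> str:
    """Build a three-level Unity Catalog table name: <catalog>.<schema>.<table>."""
    return f"{catalog}.{schema}.{table}"


def raw_to_clean(spark, raw_path: str, target_table: str, key_column: str) -> None:
    """Read a raw CSV file, apply basic cleaning, and save the result as a Delta table."""
    # Imported here so the two helpers above also work where PySpark is not installed.
    from pyspark.sql import functions as F

    # Read the raw file; header and schema inference are assumptions about the data.
    raw = spark.read.option("header", True).option("inferSchema", True).csv(raw_path)

    # Basic cleaning: drop exact duplicate rows and rows with a missing key.
    clean = raw.dropDuplicates().filter(F.col(key_column).isNotNull())

    # Overwrite the target Delta table with the cleaned rows.
    clean.write.format("delta").mode("overwrite").saveAsTable(target_table)


# In a Databricks notebook, `spark` is already defined, so a call might look like:
# raw_to_clean(
#     spark,
#     volume_path("main", "demo", "raw", "orders.csv"),  # hypothetical names
#     table_name("main", "demo", "orders_clean"),
#     "order_id",
# )
```

Overwrite mode keeps the sketch rerunnable while you experiment. A real pipeline would more likely append or merge new data.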

Skills you’ll practice

  • Databricks workspace navigation: repos, jobs, clusters
  • Delta Lake fundamentals: ACID transactions, versioning, OPTIMIZE
  • Unity Catalog: schemas, volumes, and access control
  • Databricks Workflows: multi-task job orchestration
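
The skills above can be previewed with a short sketch. The helpers below only build SQL strings and a job definition; in a notebook you would pass the SQL to `spark.sql(...)`. The table, group, and notebook names are hypothetical. The job dictionary is shaped like a Jobs API create payload, but it leaves out compute settings, so treat it as an illustration rather than a complete request.

```python
def delta_statements(table: str, version: int, zorder_column: str) -> dict:
    """Return Delta Lake SQL for table history, time travel, and file compaction."""
    return {
        "history": f"DESCRIBE HISTORY {table}",
        "time_travel": f"SELECT * FROM {table} VERSION AS OF {version}",
        "optimize": f"OPTIMIZE {table} ZORDER BY ({zorder_column})",
    }


def grant_select(table: str, principal: str) -> str:
    """Return a Unity Catalog statement granting read access on a table."""
    return f"GRANT SELECT ON TABLE {table} TO `{principal}`"


def two_task_job(name: str, ingest_notebook: str, clean_notebook: str) -> dict:
    """Build a two-task job definition in which `clean` runs only after `ingest`."""
    return {
        "name": name,
        "tasks": [
            {
                "task_key": "ingest",
                "notebook_task": {"notebook_path": ingest_notebook},
            },
            {
                "task_key": "clean",
                # The dependency is what makes this a multi-task workflow.
                "depends_on": [{"task_key": "ingest"}],
                "notebook_task": {"notebook_path": clean_notebook},
            },
        ],
    }


# Hypothetical names, for illustration only.
statements = delta_statements("main.demo.orders_clean", 0, "order_id")
grant = grant_select("main.demo.orders_clean", "analysts")
job = two_task_job("orders_pipeline", "/Repos/demo/ingest", "/Repos/demo/clean")
```

Keeping the statements as plain strings makes them easy to inspect before running them against a real workspace.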