DP-750: Implement data engineering solutions using Azure Databricks

4 Days

Intermediate

Regular Price: ~~$2400.00~~
Offer Price: $1999.00

Course Overview

Master end-to-end data engineering with Azure Databricks and Unity Catalog. This course moves from foundational setup to production deployment, covering environment configuration and enterprise-grade governance. Learn to build robust ingestion pipelines, implement security with Unity Catalog, and deploy optimized workloads. By the end, you will have the practical skills to implement, secure, and maintain scalable lakehouse solutions that meet rigorous enterprise requirements.

Course Outline

Learning Path 1: Explore Azure Databricks

Get started with Azure Databricks
Identify Azure Databricks workloads
Understand key concepts
Data governance using Unity Catalog and Microsoft Purview

Learning Path 2: Select and Configure Compute in Azure Databricks

Choose an appropriate compute type
Configure compute performance
Configure compute features
Install libraries for compute
Configure compute access

Learning Path 3: Create and organize objects in Unity Catalog

Apply naming conventions
Create catalog
Create schema
Create tables and views
Create volumes
Implement DDL operations
Implement foreign catalog
Configure AI/BI Genie instructions

Learning Path 4: Secure Unity Catalog objects

Understand query lifecycle
Implement access control strategies
Understand fine-grained access control
Implement row filtering and column masking
Access Azure Key Vault secrets
Authenticate data access with service principals
Authenticate resource access with managed identities

Learning Path 5: Govern Unity Catalog objects

Create and preserve table definitions
Configure ABAC with tags and policies
Apply data retention policies
Set up and manage data lineage
Configure audit logging
Design secure Delta Sharing strategy

Learning Path 6: Design and implement data modeling with Azure Databricks

Design ingestion logic and data source configuration
Choose a data ingestion tool
Choose a data table format
Design and implement a data partitioning scheme
Choose a slowly changing dimension (SCD) type
Implement a slowly changing dimension (SCD) type 2
Design and implement a temporal (history) table to record changes over time
Choose granularity on a column or table based on requirements
Choose managed vs unmanaged tables
Design and implement a clustering strategy

Learning Path 7: Ingest data into Unity Catalog

Ingest data with Lakeflow Connect
Ingest data with notebooks
Ingest data with SQL methods
Ingest data with CDC feed
Ingest data with Spark Structured Streaming
Ingest data with Auto Loader
Ingest data with Lakeflow Spark Declarative Pipelines

Learning Path 8: Cleanse, transform, and load data into Unity Catalog

Profile data
Choose column data types
Resolve duplicates and nulls
Transform data with filters and aggregations
Transform data with joins and set operators
Transform data with denormalization and pivots
Load data with merge, insert, and append

Learning Path 9: Implement and manage data quality constraints with Azure Databricks

Implement validation checks
Implement data type checks
Detect and manage schema drift
Manage data quality with pipeline expectations

Learning Path 10: Design and implement data pipelines with Azure Databricks

Design order of operations for a pipeline
Choose notebook vs Lakeflow Pipelines
Design Lakeflow job logic
Design error handling in pipelines and jobs
Create pipeline with notebook
Create pipeline with Lakeflow Spark Declarative Pipelines

Learning Path 11: Implement Lakeflow Jobs with Azure Databricks

Create job setup and configuration
Configure job triggers
Schedule a job
Configure job alerts
Configure automatic restarts

Learning Path 12: Implement development lifecycle processes in Azure Databricks

Apply Git version control best practices
Manage branching and pull requests
Implement testing strategy
Configure and package Declarative Automation Bundles
Deploy bundle with Databricks CLI

Learning Path 13: Monitor, troubleshoot, and optimize workloads in Azure Databricks

Monitor and manage cluster consumption
Troubleshoot and repair Lakeflow Jobs
Troubleshoot Spark jobs and notebooks
Implement log streaming with Azure Log Analytics

Course Objectives

By the end of this course, learners will be able to:

Set up and configure Azure Databricks environments and compute resources
Implement data governance and security using Unity Catalog
Design and build scalable data ingestion pipelines (batch and streaming)
Transform and process data into analytics-ready formats
Design lakehouse architectures and efficient data models
Deploy, monitor, and maintain data pipelines and workloads
Optimize performance and manage enterprise-scale data solutions
Apply best practices for data quality, security, and governance

Prerequisites

Before taking this course, learners should have:

Fundamental knowledge of data analytics concepts
Basic understanding of cloud storage and Azure fundamentals
Familiarity with SQL and data organization principles
Experience with Python (including notebooks)
Understanding of data engineering or data warehouse concepts (recommended)
Familiarity with Azure Databricks, Unity Catalog, and data access patterns (helpful)
Basic knowledge of Azure security concepts (Microsoft Entra ID)
Familiarity with Git and version control fundamentals

For any custom schedule, please email us at info@gtechlearn.com or call us at 1-844-355-9898 (Toll Free - North America) or 1800 212 9096 (Toll Free - India).

This course includes:

Official MS Learn Courseware

Exam Prep

Achievement Badge from Microsoft

Course Completion Certificate

Post Training Support

Experienced & Certified Instructors

Train from Anywhere

Interactive Hands-On Labs

Personalized Learning Plans

Flexible Scheduling

Accredited Training

Cost-Effective Pricing