🎥 Deploying a Databricks Asset Bundle with Azure DevOps Pipelines

Video Objectives

In this post we will deploy a Databricks Asset Bundle (DAB) from a Git repository hosted on Azure DevOps using Azure DevOps pipelines. In summary, we will learn how to:

- Grant Databricks access to your Azure DevOps Git repository.
- Define a simple DAB that deploys a Databricks notebook.
- Use the Databricks CLI to validate and deploy DABs.
- Write an Azure DevOps pipeline to deploy this DAB.
- Pass parameters from the DAB into the Databricks notebook.

Concerning the last point, it is not uncommon for your code to differ slightly between Databricks environments (dev, test, prod). For example, you may have an Azure key vault my_key_vault_dev for the development workspace and my_key_vault_prod for the production workspace. We will see how to pass this workspace-dependent data from the DAB to Databricks notebooks via widgets. ...
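To make the setup concrete, here is a minimal sketch of what such a bundle could look like. All names (the bundle, job, notebook path, workspace URLs, and the key_vault_name variable) are illustrative assumptions rather than the exact code from the post; the point is how a per-target value flows into the notebook as a widget via base_parameters.

```yaml
# databricks.yml - illustrative sketch of a minimal DAB
bundle:
  name: my_bundle

variables:
  key_vault_name:
    description: Key vault to use in this environment
    default: my_key_vault_dev

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://adb-1111111111111111.1.azuredatabricks.net
  prod:
    mode: production
    workspace:
      host: https://adb-2222222222222222.2.azuredatabricks.net
    variables:
      key_vault_name: my_key_vault_prod  # per-environment override

resources:
  jobs:
    my_job:
      name: my_job
      tasks:
        - task_key: run_notebook
          notebook_task:
            notebook_path: ./my_notebook.py
            base_parameters:
              # arrives in the notebook as a widget:
              # dbutils.widgets.get("key_vault_name")
              key_vault_name: ${var.key_vault_name}
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: Standard_DS3_v2
            num_workers: 1
```

The Azure DevOps pipeline then only needs the Databricks CLI to validate and deploy the bundle. The sketch below assumes authentication via pipeline variables (DATABRICKS_HOST and DATABRICKS_TOKEN) and the public CLI install script; the post's actual pipeline may differ in these details.

```yaml
# azure-pipelines.yml - illustrative deployment pipeline
trigger:
  branches:
    include:
      - main

pool:
  vmImage: ubuntu-latest

steps:
  # Install the Databricks CLI
  - script: curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
    displayName: Install Databricks CLI

  # Validate and deploy the bundle to the dev target
  - script: |
      databricks bundle validate -t dev
      databricks bundle deploy -t dev
    displayName: Validate and deploy the bundle
    env:
      DATABRICKS_HOST: $(DATABRICKS_HOST)    # pipeline variables or a variable group
      DATABRICKS_TOKEN: $(DATABRICKS_TOKEN)
```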

Hosting Great Expectations Data Docs on Azure Blob Storage

Resources

Check out the complete code on GitHub. Browse the GX Data Docs on Azure Blob Storage.

Use Case

Last week I explored Soda as a data quality testing framework for my large enterprise client. This week I'm exploring a more mature alternative called Great Expectations, or GX for short. GX generates neat HTML reports called Data Docs that give an overview of your data quality test results. The client wants to share these reports with the team - but not with the world! As the client is already using Azure, hosting the report files on Azure Blob Storage seems like a good solution. ...
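For reference, here is a minimal sketch of the Data Docs site configuration in great_expectations.yml backed by Azure Blob Storage. The site name, container name, and the environment variable holding the connection string are assumptions for illustration, not necessarily the values used in the post.

```yaml
# great_expectations.yml (excerpt) - illustrative Data Docs site on Azure Blob Storage
data_docs_sites:
  az_site:
    class_name: SiteBuilder
    store_backend:
      class_name: TupleAzureBlobStoreBackend
      container: data-docs                                    # assumed private container
      connection_string: ${AZURE_STORAGE_CONNECTION_STRING}   # substituted from the environment
    site_index_builder:
      class_name: DefaultSiteIndexBuilder
```

Rebuilding the docs (for example with `great_expectations docs build`, or `context.build_data_docs()` from Python) would then upload the generated HTML to that container, where the storage account's access controls keep it restricted to the team.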