Databricks for Energy & Utilities

Hi, I’m Denis Gontcharov.

I’m a data engineer who helps product owners in energy & utilities ingest time series data from legacy SCADA systems into Azure Databricks, so they can build reliable data products for operational insights and regulatory compliance.

Learn Spark the slow way. I send a newsletter whenever I learn something new in Spark or Databricks. Follow along as I fumble through Spark’s twisted labyrinth and share all the bruises.

🎥 Deploying a Databricks Asset Bundle with Azure DevOps Pipelines

Video Objectives

In this post we will deploy a Databricks Asset Bundle (DAB) from a Git repository hosted on Azure DevOps using Azure DevOps pipelines. In summary, we will learn how to:

- Grant Databricks access to your Azure DevOps Git repository.
- Define a simple DAB that deploys a Databricks notebook.
- Use the Databricks CLI to validate and deploy DABs.
- Write an Azure DevOps pipeline to deploy this DAB.
- Pass parameters from the DAB into the Databricks notebook.

Concerning the last point, it’s not uncommon for your code to differ slightly in each Databricks environment (dev, test, prod). For example, you may have an Azure key vault my_key_vault_dev for the development workspace and my_key_vault_prod for the production workspace. We will see how to pass this workspace-dependent data from the DAB to Databricks notebooks via widgets. ...
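To make the widget idea concrete, here is a minimal Python sketch of the notebook side. It assumes the bundle passes a parameter under the hypothetical widget name key_vault_name; that name and the helper resolve_key_vault are illustrative and not taken from the video. Inside a Databricks notebook the reader function would be dbutils.widgets.get; here a plain dict stands in for it so the snippet also runs outside Databricks.

```python
from typing import Callable


def resolve_key_vault(get_widget: Callable[[str], str]) -> str:
    """Return the key vault name supplied for the current workspace.

    `get_widget` is any function that maps a widget name to its string value.
    The widget name "key_vault_name" is an assumption made for this sketch.
    """
    name = get_widget("key_vault_name").strip()
    if not name:
        raise ValueError("widget 'key_vault_name' is empty; check the bundle target's parameters")
    return name


# Inside a Databricks notebook you would pass the real widget reader:
#   key_vault = resolve_key_vault(dbutils.widgets.get)
# Outside Databricks, a dict stands in for the parameters of one target.
dev_params = {"key_vault_name": "my_key_vault_dev"}
prod_params = {"key_vault_name": "my_key_vault_prod"}

print(resolve_key_vault(dev_params.__getitem__))
print(resolve_key_vault(prod_params.__getitem__))
```

Keeping the lookup behind a small function means the notebook code stays identical across dev, test, and prod; only the value supplied by the bundle target changes.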

🎧 Industrial Data Quality Podcast E5: Concrete AI Applications in Heavy Industry with John Walmsley

In this episode of the Industrial Data Quality Podcast, I talk with John Walmsley of Aluminate Technologies about what AI actually does in heavy industry today, cutting through the hype to explore real applications and challenges. John brings experience from semiconductors to medical devices to AI in heavy industry. The conversation covers three levels of industrial AI: continuous monitoring, multi-sensor analysis, and autonomous optimization. Using aluminum industry examples, we explore why AI projects get stuck in the pilot phase and what it takes to scale solutions enterprise-wide. ...

🎥 Configure and Deploy Databricks Asset Bundle

In this new video I share how to overcome Azure CPU quota limits, a common roadblock for Databricks practitioners deploying Databricks Asset Bundles on Azure for the first time.

Problem

If you’re experimenting with Databricks projects, Azure’s default CPU quotas often fall short of what the jobs and pipelines in the Databricks Asset Bundle Python template actually need to run. ...

Installing Espanso on Void Linux with Gnome on Wayland

I’m a big fan of text expanders, espanso being my favorite one. As Xorg is no longer maintained, I recently switched to Wayland and the Gnome desktop environment. Unfortunately, I found out that espanso’s Wayland support is experimental. This means espanso has to be installed from source. I browsed some forums and saw that installing espanso on Wayland is tricky. After some tinkering, I managed to get espanso running on my system:

Void Linux x86_64 6.12.30_1
Gnome 47.4 on Wayland

Install rustup

Install rust via rustup. ...

🎧 Industrial Data Quality Podcast E4: Evolution of OT Data Integration with Lonnie Bowling

I’m excited to share the latest episode of the Industrial Data Quality Podcast featuring Lonnie Bowling, founder of Diemus Consulting. We explore the evolution of OT data integration, from basic SCADA trend-viewing to today’s cloud-based AI analytics. Lonnie shares how historians like OSIsoft PI grew from simple tools into enterprise-wide platforms, and we discuss the ongoing industry consolidation (AVEVA’s acquisition of OSIsoft) that is reshaping the landscape. Despite technological advances, we’re still wrestling with old problems: proprietary formats, inconsistent naming conventions, and questionable data quality persist as companies migrate to platforms like Databricks for advanced analytics. ...