<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Databricks Asset Bundle on Denis Gontcharov</title>
    <link>https://gontcharov.eu/tags/databricks-asset-bundle/</link>
    <description>Recent content in Databricks Asset Bundle on Denis Gontcharov</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <lastBuildDate>Tue, 22 Jul 2025 11:43:12 +0200</lastBuildDate><atom:link href="https://gontcharov.eu/tags/databricks-asset-bundle/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>🎥 Deploying a Databricks Asset Bundle with Azure DevOps Pipelines</title>
      <link>https://gontcharov.eu/posts/youtube/databricks-dab-azure-devops-pipelines/</link>
      <pubDate>Tue, 22 Jul 2025 11:43:12 +0200</pubDate>
      
      <guid>https://gontcharov.eu/posts/youtube/databricks-dab-azure-devops-pipelines/</guid>
      <description>&lt;h1 id=&#34;video&#34;&gt;Video&lt;/h1&gt;
&lt;div style=&#34;position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;&#34;&gt;
      &lt;iframe allow=&#34;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share&#34; loading=&#34;eager&#34; referrerpolicy=&#34;strict-origin-when-cross-origin&#34; src=&#34;https://www.youtube.com/embed/jVxip1rm3SA?autoplay=0&amp;amp;controls=1&amp;amp;end=0&amp;amp;loop=0&amp;amp;mute=0&amp;amp;start=0&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;&#34; title=&#34;YouTube video&#34;&gt;&lt;/iframe&gt;
    &lt;/div&gt;

&lt;h1 id=&#34;objectives&#34;&gt;Objectives&lt;/h1&gt;
&lt;p&gt;In this post we will deploy a Databricks Asset Bundle or DAB from a Git repository hosted on Azure DevOps using Azure DevOps pipelines. In summary, we will learn how to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Grant Databricks access to your Azure DevOps Git repository.&lt;/li&gt;
&lt;li&gt;Define a simple DAB that deploys a Databricks notebook.&lt;/li&gt;
&lt;li&gt;Use the Databricks CLI to validate and deploy DABs.&lt;/li&gt;
&lt;li&gt;Write an Azure DevOps pipeline to deploy this DAB.&lt;/li&gt;
&lt;li&gt;Pass parameters from the DAB into the Databricks notebook.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Concerning the last point, it&amp;rsquo;s not uncommon that your code differs slightly in each Databricks environment (dev, test, prod). For example, you may have an Azure key vault &lt;code&gt;my_key_vault_dev&lt;/code&gt; for the development workspace and &lt;code&gt;my_key_vault_prod&lt;/code&gt; for the production workspace. We will see how to pass this workspace-dependent data from the DAB to Databricks Notebooks via widgets.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h1 id="video">Video</h1>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/jVxip1rm3SA?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<h1 id="objectives">Objectives</h1>
<p>In this post we will deploy a Databricks Asset Bundle or DAB from a Git repository hosted on Azure DevOps using Azure DevOps pipelines. In summary, we will learn how to:</p>
<ul>
<li>Grant Databricks access to your Azure DevOps Git repository.</li>
<li>Define a simple DAB that deploys a Databricks notebook.</li>
<li>Use the Databricks CLI to validate and deploy DABs.</li>
<li>Write an Azure DevOps pipeline to deploy this DAB.</li>
<li>Pass parameters from the DAB into the Databricks notebook.</li>
</ul>
<p>Concerning the last point, it&rsquo;s not uncommon that your code differs slightly in each Databricks environment (dev, test, prod). For example, you may have an Azure key vault <code>my_key_vault_dev</code> for the development workspace and <code>my_key_vault_prod</code> for the production workspace. We will see how to pass this workspace-dependent data from the DAB to Databricks Notebooks via widgets.</p>
<h1 id="project-overview">Project Overview</h1>
<p>The project directory in the Git repository consists of just three files and a README:</p>
<pre tabindex="0"><code class="language-stdout" data-lang="stdout">.
├── README.md --&gt; Documentation
├── azure_devops_pipeline.yml --&gt; Azure DevOps pipeline YAML file
├── databricks.yml --&gt; The DAB YAML file with a notebook task
└── demo_notebook.ipynb --&gt; The minimal Databricks notebook
</code></pre><p>On a high level, we define a Databricks notebook. This notebook will be executed as part of a Databricks job defined in the DAB. This DAB will be automatically deployed to our Databricks workspace using the Azure DevOps Pipeline.</p>
<h1 id="databricks-notebook">Databricks Notebook</h1>
<p>The notebook that is executed by the workflow consists of just two lines:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="n">value</span> <span class="o">=</span> <span class="n">dbutils</span><span class="o">.</span><span class="n">widgets</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&#34;demo_parameter&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
</span></span></code></pre></div><p>We simply read and print a value from a <a href="https://docs.databricks.com/aws/en/jobs/parameter-use">Databricks notebook parameter</a>. This value is set in the DAB file, and can therefore differ for each environment (e.g. development, test, production). For example, the <code>git_branch</code> for our hypothetical <em>&ldquo;dev&rdquo;</em> environment could be <em>&ldquo;develop&rdquo;</em>.</p>
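<p>When you open the notebook interactively, outside of the job, the widget may not exist yet and <code>dbutils.widgets.get</code> has nothing to read. A common pattern is to declare the widget with a fallback value first. The sketch below is a minimal illustration of that pattern, not part of the original project: <code>read_parameter</code> and <code>FakeWidgets</code> are hypothetical names, and <code>FakeWidgets</code> only imitates <code>dbutils.widgets</code> so the example runs outside Databricks. It assumes that a value passed by the job overrides the widget default.</p>

```python
def read_parameter(widgets, name, default=""):
    """Declare a text widget with a fallback value, then read it."""
    # Assumption: a value passed through the job's base_parameters
    # takes precedence over the default declared here.
    widgets.text(name, default)
    return widgets.get(name)


class FakeWidgets:
    """Hypothetical stand-in for dbutils.widgets, for local runs only."""

    def __init__(self, job_parameters=None):
        self._values = dict(job_parameters or {})

    def text(self, name, default_value, label=None):
        # Keep an existing (job-provided) value, otherwise use the default.
        self._values.setdefault(name, default_value)

    def get(self, name):
        return self._values[name]


# Interactive run: no job parameter, so the default is returned.
print(read_parameter(FakeWidgets(), "demo_parameter", "local default"))

# Job run: the value set in the DAB is returned.
job_widgets = FakeWidgets({"demo_parameter": "Hello, World!"})
print(read_parameter(job_widgets, "demo_parameter", "local default"))
```

<p>In a real notebook you would pass <code>dbutils.widgets</code> instead of <code>FakeWidgets</code>.</p>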
<h1 id="databricks-asset-bundle-dab">Databricks Asset Bundle (DAB)</h1>
<p>Having defined the notebook above, we now define a Databricks job that executes the notebook as a notebook task.</p>
<h2 id="databricks-asset-bundle-yaml">Databricks Asset Bundle YAML</h2>
<p>The code below defines the Databricks job. Pay attention to the following important elements:</p>
<ol>
<li>The DAB defines two variables, <code>git_branch</code> and <code>demo_parameter_value</code>. The values for these two variables are defined in the target <code>free</code><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>.</li>
<li>We define a text parameter <code>demo_parameter</code> for the notebook and assign it a value via <code>${var.demo_parameter_value}</code> by referring to the variable created in the previous point.</li>
<li>We use the <code>git_branch</code> variable from the first point to pull the code from the head of the main branch of the Git repository (instead of from a Databricks workspace). The <code>git_url</code> points to our Azure DevOps Git repository<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>.</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">bundle</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;DAB-Demo&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">uuid</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;05622722-fb3a-4a17-8f1f-c3c1d37ececb&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">variables</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">git_branch</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;Git branch to use for job source code&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">demo_parameter_value</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;Text value to pass as a Databricks notebook parameter&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">presets</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">tags</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">application</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;Demo Notebook&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">targets</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">free</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">mode</span><span class="p">:</span><span class="w"> </span><span class="l">development</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">workspace</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="l">https://dbc-e667f434-e97e.cloud.databricks.com</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">variables</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">git_branch</span><span class="p">:</span><span class="w"> </span><span class="l">main</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">demo_parameter_value</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;Hello, World!&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">resources</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">jobs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">run_demo_notebook</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">run_demo_notebook_job</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">tasks</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">task_key</span><span class="p">:</span><span class="w"> </span><span class="l">run_demo_notebook_task</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">notebook_task</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">notebook_path</span><span class="p">:</span><span class="w"> </span><span class="l">demo_notebook</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">base_parameters</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">              </span><span class="nt">demo_parameter</span><span class="p">:</span><span class="w"> </span><span class="l">${var.demo_parameter_value}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nt">source</span><span class="p">:</span><span class="w"> </span><span class="l">GIT</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">git_source</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">git_url</span><span class="p">:</span><span class="w"> </span><span class="l">https://gontcharovd@dev.azure.com/gontcharovd/databricks-dab-demo/_git/databricks-dab-demo</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">git_provider</span><span class="p">:</span><span class="w"> </span><span class="l">azureDevOpsServices</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">git_branch</span><span class="p">:</span><span class="w"> </span><span class="l">${var.git_branch}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">schedule</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">quartz_cron_expression</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;0 0 7 * * ?&#34;</span><span class="w">  </span><span class="c"># Daily at 7:00 AM UTC</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">timezone_id</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;UTC&#34;</span><span class="w">
</span></span></span></code></pre></div><h2 id="authorize-databricks-to-pull-code-from-azure-devops-repo">Authorize Databricks to pull code from Azure DevOps repo</h2>
<p>Databricks needs to authenticate with Azure DevOps to pull the Git repository into the workspace. This requires creating a Personal Access Token (PAT) in Azure DevOps.</p>
<p>In Azure DevOps, navigate to &ldquo;user settings&rdquo; in the top-right corner (next to your account profile picture). Click on &ldquo;Personal access tokens&rdquo;. Create a new token with read/write access for Code for your organization or project. Copy the value.</p>
<p>In Databricks, click on your account profile picture in the top-right corner. Go to &ldquo;Settings&rdquo; and then to &ldquo;Linked accounts&rdquo;. Click on &ldquo;Add Git credential&rdquo;. Fill out the fields (picture below) and paste the PAT value you copied earlier.</p>
<p><img loading="lazy" src="/posts/youtube/databricks-dab-azure-devops-pipelines/authentication.png" type="" alt=""  /></p>
<h2 id="manual-dab-deployment">Manual DAB Deployment</h2>
<p>Now that we have defined the DAB and authorized Databricks to access our Azure DevOps repo, we can deploy the DAB and run the created Databricks job. As a first step, we will deploy manually using the <a href="https://learn.microsoft.com/en-us/azure/databricks/dev-tools/cli/install">Databricks CLI</a>.</p>
<p>After installation, log in and create a profile named &ldquo;free&rdquo;. Replace the <code>host</code> URL with the correct link to your Databricks (free) workspace.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">databricks auth login
</span></span></code></pre></div><p>Let&rsquo;s validate the bundle:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">databricks bundle validate -t free
</span></span></code></pre></div><p>Output:</p>
<pre tabindex="0"><code class="language-stdout" data-lang="stdout">Name: DAB-Demo
Target: free
Workspace:
  Host: https://dbc-e667f434-e97e.cloud.databricks.com
  User: denis@gontcharov.eu
  Path: /Workspace/Users/denis@gontcharov.eu/.bundle/DAB-Demo/free

Validation OK!
</code></pre><p>Everything looks good. Let&rsquo;s deploy the bundle:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">databricks bundle deploy -t free
</span></span></code></pre></div><p>Output:</p>
<pre tabindex="0"><code class="language-stdout" data-lang="stdout">Uploading bundle files to /Workspace/Users/denis@gontcharov.eu/.bundle/DAB-Demo/free/files...
Deploying resources...
Updating deployment state...
Deployment complete!
</code></pre><h2 id="running-the-workflow">Running the workflow</h2>
<p>We can see the final workflow in the Jobs &amp; Pipelines view in the Databricks UI:</p>
<p><img loading="lazy" src="/posts/youtube/databricks-dab-azure-devops-pipelines/workflow.png" type="" alt=""  /></p>
<p>Click on the &ldquo;Play&rdquo; button to execute the job:</p>
<p><img loading="lazy" src="/posts/youtube/databricks-dab-azure-devops-pipelines/notebook_run.png" type="" alt=""  /></p>
<p>Notice how the value <em>&ldquo;Hello, World!&rdquo;</em> came from the DAB file.</p>
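<p>The manual steps above can also be scripted. The sketch below is one possible way to do that and is not part of the original project: <code>bundle_commands</code> and <code>run_all</code> are hypothetical helpers. It builds the same <code>validate</code> and <code>deploy</code> commands we ran by hand and, optionally, a <code>databricks bundle run</code> command as an alternative to clicking the &ldquo;Play&rdquo; button. The job key <code>run_demo_notebook</code> is the resource key from the DAB. The example only prints the commands; calling <code>run_all</code> would execute them and assumes the Databricks CLI is installed and authenticated.</p>

```python
import subprocess


def bundle_commands(target, job_key=None):
    """Build the CLI calls for one target, in the order they should run."""
    commands = [
        ["databricks", "bundle", "validate", "-t", target],
        ["databricks", "bundle", "deploy", "-t", target],
    ]
    if job_key is not None:
        # Optionally trigger the deployed job by its resource key.
        commands.append(["databricks", "bundle", "run", "-t", target, job_key])
    return commands


def run_all(commands, runner=subprocess.run):
    """Run each command in turn; check=True stops at the first failure."""
    for command in commands:
        runner(command, check=True)


# Print the commands instead of running them, so the sketch is safe to try.
for command in bundle_commands("free", job_key="run_demo_notebook"):
    print(" ".join(command))
```

<p>Stopping at the first failure means that a bundle that does not validate is never deployed.</p>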
<h1 id="azure-devops-pipeline">Azure DevOps Pipeline</h1>
<p>Now that we have verified that manual deployment works, we want to automate the deployment process. Concretely, we want to redeploy the DAB whenever a commit or merge is made on the main branch. This is accomplished by an Azure DevOps pipeline that we will configure in the next part.</p>
<h2 id="pipeline-yaml">Pipeline YAML</h2>
<p>The code below defines the Azure DevOps pipeline that deploys the resources defined in the DAB to the &ldquo;free&rdquo; target. Notice the following points:</p>
<ol>
<li>The pipeline is triggered whenever one of the files <em>demo_notebook.ipynb</em>, <em>databricks.yml</em>, or <em>azure_devops_pipeline.yml</em> changes on the <code>main</code> branch.</li>
<li>The <code>condition</code> statement is important to trigger a particular job for a particular branch.</li>
<li>The job steps rely on two variables, <code>DATABRICKS_TOKEN</code> and <code>DATABRICKS_WORKSPACE</code>, defined in the variable group <code>databricks-free-variable-group</code>. We will define these variables later.</li>
</ol>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yml" data-lang="yml"><span class="line"><span class="cl"><span class="nt">trigger</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">branches</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">include</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">main</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">paths</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">include</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">demo_notebook.ipynb</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">databricks.yml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">azure_devops_pipeline.yml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">jobs</span><span class="p">:</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">job</span><span class="p">:</span><span class="w"> </span><span class="l">DeployFree</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">displayName</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;Deploy to free Databricks workspace&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">condition</span><span class="p">:</span><span class="w"> </span><span class="l">eq(variables[&#39;Build.SourceBranch&#39;], &#39;refs/heads/main&#39;)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">variables</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">group</span><span class="p">:</span><span class="w"> </span><span class="l">databricks-free-variable-group</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">script</span><span class="p">:</span><span class="w"> </span><span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="cl"><span class="sd">          curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">displayName</span><span class="p">:</span><span class="w"> </span><span class="s1">&#39;Install Databricks CLI&#39;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">task</span><span class="p">:</span><span class="w"> </span><span class="l">Bash@3</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">displayName</span><span class="p">:</span><span class="w"> </span><span class="s1">&#39;Validate Databricks Bundle for $(DATABRICKS_WORKSPACE)&#39;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">inputs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">targetType</span><span class="p">:</span><span class="w"> </span><span class="s1">&#39;inline&#39;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">script</span><span class="p">:</span><span class="w"> </span><span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="cl"><span class="sd">            export DATABRICKS_TOKEN=&#34;$(DATABRICKS_TOKEN)&#34;
</span></span></span><span class="line"><span class="cl"><span class="sd">            databricks bundle validate -t $(DATABRICKS_WORKSPACE)
</span></span></span><span class="line"><span class="cl"><span class="sd">            </span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">task</span><span class="p">:</span><span class="w"> </span><span class="l">Bash@3</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">displayName</span><span class="p">:</span><span class="w"> </span><span class="s1">&#39;Deploy Databricks Bundle to $(DATABRICKS_WORKSPACE)&#39;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">inputs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">targetType</span><span class="p">:</span><span class="w"> </span><span class="s1">&#39;inline&#39;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">script</span><span class="p">:</span><span class="w"> </span><span class="p">|</span><span class="sd">
</span></span></span><span class="line"><span class="cl"><span class="sd">            export DATABRICKS_TOKEN=&#34;$(DATABRICKS_TOKEN)&#34;
</span></span></span><span class="line"><span class="cl"><span class="sd">            databricks bundle deploy -t $(DATABRICKS_WORKSPACE)</span><span class="w">
</span></span></span></code></pre></div><p>The job consists of three steps:</p>
<ol>
<li>First we install the Databricks CLI on the Azure DevOps pipeline agent that runs the job.</li>
<li>We then validate the DAB like we did manually in the previous part.</li>
<li>Finally, we use the same command that we ran manually in the previous part to deploy the DAB.</li>
</ol>
<p>Note that the Databricks CLI authentication takes place using the environment variable <code>DATABRICKS_TOKEN</code>. We specify the target using the <code>-t</code> flag and the variable <code>DATABRICKS_WORKSPACE</code>. Make sure to push this code to your Azure DevOps repository.</p>
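<p>Secret variables in Azure DevOps are not exposed to scripts as environment variables automatically, which is why each step exports <code>DATABRICKS_TOKEN</code> explicitly. If you ever wrap the CLI calls in a script of your own, a small preflight check can make a missing setting easier to spot. The sketch below is a hypothetical addition, not part of the original pipeline: <code>missing_settings</code> is an illustrative name, and the check assumes both values are available in the process environment.</p>

```python
import os

# The two settings the pipeline steps rely on.
REQUIRED = ("DATABRICKS_TOKEN", "DATABRICKS_WORKSPACE")


def missing_settings(env, required=REQUIRED):
    """Return the names of required settings that are unset or empty."""
    return [name for name in required if not env.get(name)]


missing = missing_settings(os.environ)
if missing:
    print("Missing settings: " + ", ".join(missing))
else:
    print("All settings present, ready to call the Databricks CLI.")
```

<p>The function takes the environment as an argument so it can be checked against a plain dictionary, without touching the real environment.</p>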
<h2 id="authorize-azure-devops-to-deploy-dabs">Authorize Azure DevOps to deploy DABs</h2>
<p>Remember how we had to authorize Databricks to access Azure DevOps Repos? Now we have to do the same but in the opposite direction: Azure DevOps needs to be authorized to deploy DABs in our Databricks workspace. This requires creating a Databricks PAT and storing it in Azure DevOps.</p>
<p>Go to the Databricks UI and create a Databricks PAT by clicking on your user profile picture in the top-right corner. Click on &ldquo;Settings&rdquo;, go to &ldquo;Developer&rdquo; and click on &ldquo;Manage&rdquo; under Access Tokens. Generate a new token and copy the value.</p>
<p>Navigate to Azure DevOps and open the &ldquo;Pipelines&rdquo; tab. Go to &ldquo;Library&rdquo; and create a new variable group <code>databricks-free-variable-group</code>. Create a new secret variable <code>DATABRICKS_TOKEN</code> and paste the copied PAT value. Create a second (non-secret) variable <code>DATABRICKS_WORKSPACE</code> and write the value &ldquo;free&rdquo;. This will be the target Databricks workspace in which we will deploy the DAB resources.</p>
<h2 id="creating-the-azure-devops-pipeline">Creating the Azure DevOps Pipeline</h2>
<p>Pushing the pipeline YAML code to the Azure DevOps repo is not sufficient. We have to manually create the pipeline.</p>
<p>In Azure DevOps, navigate back to the &ldquo;Pipelines&rdquo; tab. Click the &ldquo;Create Pipeline&rdquo; button. Select &ldquo;Azure Repos&rdquo; and choose &ldquo;Existing Azure Pipelines YAML file&rdquo;. Select the YAML file containing your Azure DevOps pipeline code.</p>
<h2 id="running-the-azure-devops-pipeline">Running the Azure DevOps Pipeline</h2>
<p>Navigate to the newly created pipeline and click on &ldquo;Run pipeline&rdquo;. When you run the pipeline the first time, it will request permissions to use the variable group. Click on &ldquo;Permit&rdquo;. We see that the three steps of the job completed successfully:</p>
<p><img loading="lazy" src="/posts/youtube/databricks-dab-azure-devops-pipelines/pipeline.png" type="" alt=""  /></p>
<p>That&rsquo;s it! We can now make changes to our pipeline, push our changes to the remote repository, and automatically see them in the Databricks UI.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p><a href="https://docs.databricks.com/aws/en/getting-started/free-edition">Databricks Free Edition</a> only allows one environment (that we call free). In a real application, we would define multiple targets, e.g. dev, test, and prod.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Even though this code is shared as a GitHub repository, the Azure DevOps pipeline will only work with an Azure DevOps Repo. You must create this repo yourself.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    
    <item>
      <title>🎥 Configure and Deploy Databricks Asset Bundle</title>
      <link>https://gontcharov.eu/posts/youtube/databricks-asset-bundle/</link>
      <pubDate>Tue, 27 May 2025 18:59:04 +0200</pubDate>
      
      <guid>https://gontcharov.eu/posts/youtube/databricks-asset-bundle/</guid>
      <description>&lt;p&gt;In this new video I share how to overcome Azure CPU quota limits with Databricks Asset Bundles, a common roadblock many Databricks practitioners face when deploying Databricks Asset Bundles on Azure for the first time.&lt;/p&gt;
&lt;br&gt;
&lt;div style=&#34;position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;&#34;&gt;
      &lt;iframe allow=&#34;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share&#34; loading=&#34;eager&#34; referrerpolicy=&#34;strict-origin-when-cross-origin&#34; src=&#34;https://www.youtube.com/embed/WTXgjis338U?autoplay=0&amp;amp;controls=1&amp;amp;end=0&amp;amp;loop=0&amp;amp;mute=0&amp;amp;start=0&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;&#34; title=&#34;YouTube video&#34;&gt;&lt;/iframe&gt;
    &lt;/div&gt;

&lt;br&gt;
&lt;h1 id=&#34;problem&#34;&gt;Problem&lt;/h1&gt;
&lt;p&gt;If you&amp;rsquo;re playing around with Databricks projects, Azure&amp;rsquo;s default CPU quota limits often fall short of what Databricks Asset Bundle Python template jobs and pipelines actually need to run.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>In this new video I share how to overcome Azure CPU quota limits with Databricks Asset Bundles, a common roadblock many Databricks practitioners face when deploying Databricks Asset Bundles on Azure for the first time.</p>
<br>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/WTXgjis338U?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<br>
<h1 id="problem">Problem</h1>
<p>If you&rsquo;re playing around with Databricks projects, Azure&rsquo;s default CPU quota limits often fall short of what Databricks Asset Bundle Python template jobs and pipelines actually need to run.</p>
<h1 id="solution">Solution</h1>
<p>I demonstrate how to get your entire workflow running end-to-end using the Databricks VS Code extension, plus share how to adjust Azure CPU quotas to execute the job.</p>
<p>In the video, I also break down the complex Python project template structure (because let&rsquo;s be honest - there are a lot of files and directories!) and show you the actual results running in Databricks.</p>
<p>Perfect for data engineers working with Databricks who want to streamline their software development process.</p>
]]></content:encoded>
    </item>
    
  </channel>
</rss>
