Data good enough for operations is not necessarily analytics-ready.
Exactly one month ago I published the first episode of my Industrial Data Quality Podcast. In my opinion, industrial data quality is a topic of critical importance that is all too often overlooked, especially with all the buzz around AI.
In the most recent episode, I had the pleasure of inviting guest speaker Thomas Dhollander, co-founder of Timeseer.AI. Together we explored critical challenges in industrial time series data reliability and observability.
...
In this video I explain how to read data from an Airtable into a PySpark dataframe on Databricks.
Airtable is a popular spreadsheet tool used at many enterprises. It offers additional features compared to other tools such as Microsoft Excel and Google Sheets.
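As a rough illustration of the idea (not the code from the video), here is a minimal sketch that pages through the Airtable REST API and flattens the records into rows that Spark can ingest. The helper names `fetch_all_records` and `flatten_records`, the base id, the table name and the token are all placeholders I made up for this example, and it assumes field values are simple scalars.

```python
import json
import urllib.parse
import urllib.request


def flatten_records(records):
    """Turn Airtable records into flat dicts that all share the same keys.

    Airtable leaves empty fields out of a record, so missing ones are
    filled with None to give every row the same columns.
    """
    columns = sorted({name for r in records for name in r.get("fields", {})})
    return [
        {"airtable_id": r["id"], **{c: r.get("fields", {}).get(c) for c in columns}}
        for r in records
    ]


def fetch_all_records(base_id, table_name, token):
    """Page through a table until the API stops returning an `offset`."""
    table = urllib.parse.quote(table_name, safe="")
    url = f"https://api.airtable.com/v0/{base_id}/{table}"
    records, offset = [], None
    while True:
        query = f"?offset={urllib.parse.quote(offset, safe='')}" if offset else ""
        request = urllib.request.Request(
            url + query, headers={"Authorization": f"Bearer {token}"}
        )
        with urllib.request.urlopen(request) as response:
            page = json.load(response)
        records.extend(page["records"])
        offset = page.get("offset")
        if not offset:
            return records


# On Databricks, where `spark` is predefined (ids and token are placeholders):
# rows = flatten_records(fetch_all_records("appXXXXXXXX", "My table", token))
# df = spark.createDataFrame(rows)
```

Keeping the flattening step separate from the HTTP call makes it easy to test without network access, and it is the place to add any type conversions before handing the rows to Spark.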
Today I published a 30-minute talk about my career in data thus far. I've made many mistakes along the way, but learned a tonne from them. Although I made many jumps over the years, I'm happy I always stuck around the central theme of data. Perhaps my talk can give you some inspiration if you feel stuck?
Topics include:
How I ended up in data after graduating in Materials Science and working in aluminium.
How I started freelancing, but ended up employed again.
Why I decided to freelance again.
What I did better the second time.
Why I'm focusing on Databricks in the future.
Listen to the podcast here:
...
Welcome to my podcast! In this very first episode I introduce the topics of this podcast and explain my background in data.
Follow the show About Denis Gontcharov
In this video I demonstrate how to perform data quality checks on a Delta table in Databricks using Soda Core.
Soda Core is the open-source Python package developed by Soda. It can be compared to Great Expectations, but is much simpler in my opinion. I enjoy using Soda in my professional projects and will continue exploring this framework.
...
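To give a flavour of what such a check can look like, here is a minimal sketch, not the code from the video. It assumes the `soda-core-spark-df` package is installed on the cluster and that the Delta table has been registered as a temporary view. The helper names `build_checks` and `run_scan`, the data source name `spark_df` and the table and column names are my own placeholders.

```python
def build_checks(table, required_columns):
    """Build a small SodaCL document: the table is non-empty and the
    required columns contain no missing values."""
    lines = [f"checks for {table}:", "  - row_count > 0"]
    lines += [f"  - missing_count({column}) = 0" for column in required_columns]
    return "\n".join(lines) + "\n"


def run_scan(spark, checks, data_source="spark_df"):
    """Run the checks against the given Spark session and return the scan."""
    # Imported lazily so the helper above works without Soda installed.
    from soda.scan import Scan

    scan = Scan()
    scan.set_data_source_name(data_source)
    scan.add_spark_session(spark, data_source_name=data_source)
    scan.add_sodacl_yaml_str(checks)
    scan.execute()
    return scan


# On Databricks, where `spark` is predefined (table name is a placeholder):
# spark.table("my_schema.orders").createOrReplaceTempView("orders")
# scan = run_scan(spark, build_checks("orders", ["order_id"]))
# print(scan.get_logs_text())
# scan.assert_no_checks_fail()
```

Generating the SodaCL text from a list of columns is just one option; for a handful of tables, writing the YAML by hand is probably simpler.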