I’ve worked for several large enterprises that are migrating towards Databricks. This adoption consists of two work streams:
- Development of the platform
- Adoption by users
While enterprises generally put sufficient energy and resources toward the former, the latter lags behind: insufficient users are actively developing on Databricks.
In my opinion there are a couple of reasons for that. The most important one being a lack of skills. Although Databricks abstracts away much of the difficulties of working with Spark, certain fundamental knowledge must be understood to leverage the platform effectively.
Often users ask for “a CSV data extract” or “can I just convert this dataframe to Pandas”? Smart enterprises will be sure to actively invest energy and resources into onboarding new users.
How? Things that worked well for me in the past were:
- Build any new data project exclusively on Databricks - no exceptions.
- In-house knowledge-transfer and introductory Databricks workshops.
- Designated “Databricks Champions” with dedicated time to support new users.
- A platform worth using (see point 1 above): if your Databricks misses key data, users won’t use it. Fix that first.