This article is a step-by-step guide to connecting Excel to Databricks.
Get started with this process only if you:
(Good luck if you aren’t a techie!)
Find these restrictions limiting? Unfortunately, there's no way to bypass them in Excel.
(Hint: You can easily import and analyze large datasets on Gigasheet, a spreadsheet built for big data. Jump to the end to learn more.)
Either way, let’s dive in.
Ensure you have these in place before you begin:
Follow these steps to connect Excel and Databricks using OAuth 2.0:
Search for “ODBC Data Sources” on your device and open the app.
Go to the System DSN tab and select “Simba Spark”. Click “Configure”.
Select the mechanism:
Choose OAuth options:
Configure the HTTP path:
Set advanced options:
Now, open Excel and go to the “Data” tab.
Select “Get Data”:
Choose the DSN you just configured and click “OK”. Next, authenticate yourself in the browser pop-up window.
Lastly, choose the columns you want to import and click “Next”.
Select “Return Data to Microsoft Excel” and press “Finish”.
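If you'd rather script the connection than click through the DSN dialog, the same settings can be written as an ODBC connection string. Here's a minimal Python sketch. The key names and values (AuthMech=11 and Auth_Flow=2 for browser-based OAuth, ThriftTransport=2, SSL=1, port 443) follow the Databricks ODBC driver documentation as we understand it, so check them against your driver version. The driver name, hostname, and HTTP path are placeholders, and `build_connection_string` is a hypothetical helper, not part of any library.

```python
def build_connection_string(host: str, http_path: str,
                            driver: str = "Simba Spark ODBC Driver") -> str:
    """Assemble a DSN-less ODBC connection string for Databricks with OAuth.

    All values are assumptions to verify against your own setup.
    """
    settings = {
        "Driver": "{" + driver + "}",  # installed driver name (assumed)
        "Host": host,                  # workspace hostname (placeholder)
        "Port": 443,
        "HTTPPath": http_path,         # from your cluster or SQL warehouse
        "SSL": 1,
        "ThriftTransport": 2,          # HTTP transport
        "AuthMech": 11,                # OAuth 2.0
        "Auth_Flow": 2,                # browser-based authorization
    }
    return ";".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    # Placeholder values; replace them with your own workspace details.
    print(build_connection_string(
        "your-workspace.cloud.databricks.com",
        "/sql/1.0/warehouses/your-warehouse-id",
    ))
```

A string like this could then be handed to an ODBC client library instead of a named DSN. Keeping the settings in one dictionary makes it easy to compare them with what you entered in the configuration dialog.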
After the seemingly endless steps, you can finally import data into Excel. Unless you have more than 1,048,576 rows of data. Then, you can’t.
Reminder: Excel has a row limitation!
Even with a few thousand rows, Excel becomes painfully slow.
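To see what the 1,048,576-row cap means for your own table, a quick calculation helps. The sketch below checks whether a dataset fits on one worksheet and, if not, how many sheets you'd have to split it across. It assumes one header row per sheet (our assumption, not an Excel requirement), and the function names are made up for illustration.

```python
import math

EXCEL_MAX_ROWS = 1_048_576  # maximum rows on a single Excel worksheet


def fits_in_excel(data_rows: int, header_rows: int = 1) -> bool:
    """Return True if the data plus header rows fit on one worksheet."""
    return data_rows + header_rows <= EXCEL_MAX_ROWS


def sheets_needed(data_rows: int, header_rows: int = 1) -> int:
    """Return how many worksheets are needed if each repeats the header."""
    rows_per_sheet = EXCEL_MAX_ROWS - header_rows
    return max(1, math.ceil(data_rows / rows_per_sheet))


if __name__ == "__main__":
    for rows in (50_000, 1_048_576, 5_000_000):
        print(rows, fits_in_excel(rows), sheets_needed(rows))
```

For example, a 5-million-row Databricks table would not fit on one sheet and would need five sheets under these assumptions.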
If you use Databricks, you’re accustomed to speed and accuracy. When it comes to big data, Excel simply can’t meet your expectations.
You need a platform like Gigasheet that is specifically designed for big data.
Gigasheet is a big data spreadsheet that helps businesses upload, analyze, and collaborate on up to a billion rows of data.
This no-code platform is a database disguised as a spreadsheet, at cloud scale. It connects to Databricks in only 3 steps. The best part? You can manage large datasets without worrying about data cutoffs or technical complexities.
Why Gigasheet? Well, it
It’s easy to connect Gigasheet and Databricks. Here’s how you can do it:
Log in to your Gigasheet account (if you don’t have one, sign up for free).
Navigate to “Import from Platform”.
Select “Databricks Personal Access Token”.
Enter the necessary connection details for your Databricks environment.
Enter the name of the table you want to import to Gigasheet.
In a few minutes, you’ll see a screen like this 👇🏻. Voila, you’re done!
Excel is a popular choice for data management. It’s ideal for day-to-day operations that involve minimal reporting and analytics. However, it struggles with big data. It lags and delays your work.
You need a platform that’s quick to set up, easy to use, and advanced enough to handle complex datasets.
A spreadsheet (also a database) for big data analytics.