Automate Data Delivery Using the Gigasheet API

Rumor has it that, despite previous pleas, a few of you are still emailing large files around. I have no one to blame but myself. So let’s talk Data Delivery again, this time using the Gigasheet API to automate the process.

‍

Yes, that’s right, the Gigasheet API is here! We’ve heard the cries, we’ve read the letters. The API documentation is being constantly updated, and now anyone can request an API key. If you have a need to programmatically interact with Gigasheet, please drop us a line!

‍

In the last blog post, we started with a master data set. We took that data set, built a filter to match the client’s request, saved off that result set as a new sheet, and shared that new sheet with the client. Instead of sending an email, the client gets access to the data, which can be filtered, grouped, visualized, and exported on the client’s demand.

‍

In this blog post, let’s go a step further. Let’s imagine that we have received more data to add to the master data set. We will want to rerun the query we used to build the sheet we previously delivered, this time creating a new data delivery off the newly updated data. And again, we want to share it with the client.

‍

Simple! This entire process can be done in one script; for the purposes of this blog post, we’ll break the workflow into discrete actions.

‍

‍

Data Delivery Using Gigasheet

Step 1: Update the dataset

With the UI, all uploads come in as new files. But, using the API, you can append rows to the bottom of an existing sheet. You do this with the targetHandle parameter on the Upload.

‍

We start with master_list.json, a 3.9 million row file with all our (anonymized) contact information.

‍

There are many ways to get the file handle, but easiest is via the sheet URL. As shown above, the file handle for this sheet is “0001a6f4_1844_4d3c_933f_e58c36f1b99f”. Simple then to set that as the targetHandle.

‍

The data to be appended to this sheet is in an S3 bucket. Named “march_addendum.json”, it follows the same column layout as master_list.json, my master data set.

I create a pre-signed link using the boto3.client python object, and pass that pre-signed url to the upload/url API endpoint. Looks something like this:

‍

endpoint = "https://api.gigasheet.com/upload/url"

GIGASHEET_API_KEY = "myAPIkey_redacted"

‍

headers = {

"accept": "application/json",

"content-type": "application/json",

"X-GIGASHEET-TOKEN": GIGASHEET_API_KEY

}

‍

payload = {

"url": march_addendum_presigned_url,

"name": "master_list.json",

"targetHandle": "0001a6f4_1844_4d3c_933f_e58c36f1b99f"

}

‍

response = requests.post(endpoint, headers=headers,json=payload)

‍

It’s that easy. Run the python, and the file updates in the UI:

‍

‍

The same file (and same 0001a6f4_1844_4d3c_933f_e58c36f1b99f file handle) now has 4,083,585 rows, meaning the march_addendum data added 225,890 rows to my existing dataset.

Step 2: Filter via API

Filtering can be a little tricky. By far, the easiest way to filter via API is using a Saved Filter that you’ve previously created. Luckily, we can use the previous blog post as a way forward.

‍

In that post, we had one client who was doing solar installations in Phoenix. The client was looking for leads to target. We built a filter that looked like this:

‍

‍

Using the State and Self-Reported Interest columns from our master dataset, we were able to identify exactly who to contact. Now, let’s save the filter.

‍

‍

In the upper right corner of the filter box, I save the filter as “Arizona Green Technologies”.

‍

When filtering with the API using a previously saved filter, first get the saved filter ID, and then use that to build the filter_state to be applied to a sheet.

‍

To get the saved filter ID:

‍

url = "https://api.gigasheet.com/filter-templates"

GIGASHEET_API_KEY = "myAPIkey_redacted"

headers = {"accept": "application/json","X-GIGASHEET-TOKEN": GIGASHEET_API_KEY}

response = requests.get(url, headers=headers)

print(response.text)

‍

This will return my saved filters. The response includes data like this:

‍

"Name": "Arizona Green Technologies",

"Id": "a89334b8_9c95_48fe_bc8d_b357ffa132ab",

"Owner": "steve.schohn@gigasheet.com",

‍

You’ll use that “Id” to get the filter_model, also using the sheet handle.

‍

filter_id = “a89334b8_9c95_48fe_bc8d_b357ffa132ab”

sheet_handle = “0001a6f4_1844_4d3c_933f_e58c36f1b99f”

url = "https://api.gigasheet.com/filter-templates/"+filter_id+"/on-sheet/"+sheet_handle

headers = {"accept": "application/json","X-GIGASHEET-TOKEN": GIGASHEET_API_KEY}

response = requests.get(url, headers=headers)

filterModel = response.json()["filterModel"]

filtered_state = {"filterModel":filterModel}

‍

The filtered_state variable now includes all the filter logic you need to proceed.

‍

‍

Step 3: Save and Share the Results

‍

To save a new copy of the sheet with the filter logic applied, you actually use an export-and-import step.

‍

Create an export of file with the filtered_state filter logic being applied using the export endpoint:

‍

body = {

'filename': "Arizona_Solar_March_Update.csv",

'gridState': filtered_state

}

url = "https://api.gigasheet.com/dataset/"+sheet_handle+"/export"

response = requests.post(url, json=body, headers=headers)

export_handle = response.json()["handle"]

‍

You’ll get back the handle of the export. Once the export is complete, use that handle to get the temporary presigned-link to reimport:

‍

time.sleet(20)

url = "https://api.gigasheet.com/dataset/"+export_handle+"/download-export"

response = requests.get(url, headers=headers)

presignedURL = response.json()["presignedUrl"]

‍

And upload that URL into Gigasheet:

‍

upload_url = "https://api.gigasheet.com/upload/url"

payload = {"url": presignedURL}

response = requests.post(upload_url, json=payload, headers=headers)

print(response.text)

‍

Once in Gigasheet, user the share endpoint to share:

‍

recipients = ["arizona_recipient@gigasheet.com"]

message = "Here is the latest Solar Panel target list, to include the March update."

share_url = "https://api.gigasheet.com/file/"+new_file_handle+"/share/file"

body = {

'emails':recipients,

"permissions": [0],

'message':message

}

response = requests.put(share_url, json=body, headers=headers)

‍

And optionally, you can then delete the export you created:

‍

time.sleep(20)

url = "https://api.gigasheet.com/delete/"+export_handle

response = requests.delete(url, headers=headers)

‍

‍

Using Gigasheet as a Data Delivery Platform

‍

Let’s look back at what we did to make automated data delivery possible:

Appended data located in an S3 bucket to a pre-existing master file in Gigasheet
Applied a pre-existing Saved Filter to the sheet with the appended rows
Created a fresh copy of that sheet
Shared the new sheet with the client
Cleanup up the library
Lived, loved, mostly laughed

Easy enough to set this up on a cron schedule, and import and export data automatically.

‍

How To

Automate Data Delivery Using the Gigasheet API