Connecting your data to Tableau Prep Builder with Supermetrics API [a step-by-step tutorial]
In order to build an actionable and insightful report in Tableau, you need to have well-prepared and clean data.
However, since businesses have data in many different places and formats, cleaning data for analysis can become a heavy process.
That is, until Tableau introduced Tableau Prep Builder to empower data users — including analysts and marketers who may not have deep analytical skills — to transform their data.
In this article, I’ll show you how to connect your data to Tableau Prep Builder using Supermetrics API and share some ways you can leverage your data there.
Here are the steps we’ll go through:
- Install Tableau Prep Builder
- Install and set up TabPy
- Create a small Python script that will contain the Supermetrics API call
- Create the flows in Tableau Prep Builder, which will create Tableau extract files for us
When we’re done, you’ll be able to combine the output of Supermetrics API queries with other data sources via Union and schedule an incremental refresh of a Tableau extract with data from our API.
Sounds good? Let’s get started!
Step 1: Install Tableau Prep Builder
Tableau Prep Builder is a data transformation tool that Tableau released in 2018 to help its users clean and prepare data for analysis.
The best thing about Tableau Prep Builder is that it lets you visualize all your data transformation operations. That way, you’ll be able to follow what’s happening with your data and identify errors quickly. The result is the output of your data set in Tableau Hyper Extract — which is optimized for building reports in Tableau.
Since Tableau Prep Builder is available in the Tableau suite, you may already have access to it.
Setting it up is quite easy too. All you need to do is download it from the Tableau website, install it on your computer, and finally, enter your license key or start the free trial.
Step 2: Install TabPy
TabPy is the second component that we need for this solution.
TabPy is a Python extension for Tableau. It lets you run Python scripts right inside Tableau dashboards or, most importantly for our case here, inside Tableau Prep Builder.
You can find the installation instructions on the package’s GitHub. Of course, you will need a recent version of Python for it to work. Once the installation is complete, you can start using TabPy by running the background process in a terminal window. It’ll tell you that it’s running and listening for incoming requests from Tableau on port 9004.
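If you want to confirm from code that TabPy is up before wiring it into Tableau Prep Builder, a quick socket check does the job. This is just a sketch; the host and port assume TabPy’s default of localhost:9004.

```python
import socket

def is_tabpy_listening(host="localhost", port=9004, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Will be True only while the TabPy process is running
running = is_tabpy_listening()
```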
The final piece of prep for TabPy is to configure the connection to it in Tableau Prep Builder.
You can find the required option in Tableau Prep Builder under ‘Help’ > ‘Settings and Performance’ > ‘Manage Analytics Extension Connection’.
For this guide, we assume you are setting this up on your local machine, so all you need to do is make sure that TabPy is still running and that the port you see in the window matches the port that TabPy is listening on.
Click the ‘Sign in’ button, and you’re done!
Step 3: Set up the Python script
It’s time to set up the Python script.
In case you are worried about that step, don’t be. It’s pretty straightforward. The script we need will just have a few lines of code, most of which are the API link from Supermetrics.
You can even just copy and paste this into a text file and replace the ‘[PLACEHOLDER FOR YOUR API LINK]’ with your own link:
import requests
import pandas as pd

def get_data_to_flow(input):
    response = requests.get("[PLACEHOLDER FOR YOUR API LINK]")
    results = response.json()
    # The first row of 'data' holds the column names; the remaining rows hold values
    return pd.DataFrame(results['data'][1:], columns=results['data'][0])
Save this as a file with the .py extension somewhere where you will easily find it again. We’ll be using this in the following steps.
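If you’d like to sanity-check the DataFrame construction without hitting the API, you can run the same transformation on a mocked response. The payload below only mimics the general shape of a Supermetrics JSON response (a header row first, value rows after); the field names are placeholders, not actual API output.

```python
import pandas as pd

# Mocked API response: 'data' holds a header row followed by value rows
results = {
    "data": [
        ["Date", "Campaign", "Impressions"],
        ["2023-01-01", "Brand", 1200],
        ["2023-01-02", "Brand", 950],
    ]
}

# The same transformation the script performs on the real response:
# skip the header row for values, use it for column names
df = pd.DataFrame(results["data"][1:], columns=results["data"][0])
print(df.shape)  # (2, 3)
```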
Let’s generate an API link in Supermetrics Query Manager next, and while we’re there, we will also download a sample of our data in CSV format.
To do this, log in to the team site.
Note that you can access the Supermetrics Query Manager with a valid Supermetrics API license. If you don’t have a license yet, start your 14-day free trial.
Then, choose ‘Integrations’ > ‘Query manager’. You’ll see a sidebar on the left of your screen. Start building your query there. In this example, I’ll use Google Ads as a data source. Continue building your query by filling in these fields:
- Select dates
- Select metrics
- Split by dimensions
- Filter & Options
Once you’re done, click ‘Run’. You’ll see your data in a preview table and raw JSON format.
Here’s a tutorial on how to get started with Supermetrics API.
Tableau Prep will need this sample to understand the data structure that will come in the API response.
Note that the format selection is set to ‘JSON’ and not ‘Tableau’ because we’ll let the Python script, which is written to work with the response in JSON format, handle that API call. Then I exported the results as a CSV file (1) and copied the API URL into the prepared Python script file (2).
Don’t forget to save the script file after pasting the Supermetrics API link!
Step 4: Bring it all together in Tableau Prep Builder
We’re almost there!
To get data to flow from Supermetrics API into your Tableau Prep flow, we now have to connect the CSV file we downloaded from the previous step.
Since Tableau Prep Builder is quite picky about data types, it won’t accept the output of our Python script without this step. The CSV file has the same structure as the data from Supermetrics API, which tells Tableau Prep Builder how to interpret the script’s output.
After selecting the file, Tableau Prep Builder will start a new flow and read in the content of the CSV. You can see in the infobox that it has parsed the field types correctly as date, string, and number, respectively. And that’s exactly what we want.
We need to add a script as the next step in the flow. Do that by clicking the plus sign next to the box representing the CSV file and selecting ‘Script’.
This is where we get to use TabPy. So make sure that the radio button for ‘Tableau Python (TabPy) Server’ is activated and that TabPy is still running on your machine. Then browse to the script file you’ve created in the previous step.
We also need to tell Tableau Prep Builder which functions from that file we want to use here. Our tiny script only has one function, so if you’re following along with this guide, type the function name ‘get_data_to_flow’ into the field.
Once these settings are done, Tableau Prep should immediately start executing the Supermetrics API query, and data will appear in the preview window.
Congratulations! You have just successfully loaded data from Supermetrics API to your Tableau Prep flow.
Leverage your data in Tableau Prep Builder
You can easily control what data to load through Supermetrics API by editing the API link. Otherwise, you can use this data input element in the Tableau flow as you like.
If you’re using our short URLs, you can find out how to edit parameters in this documentation article.
As long as you don’t change the shape of the data by adding or deleting columns, you can play around with the results. If you need to change the shape of the data, remember to change the CSV file accordingly so that Tableau Prep can interpret the API response.
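As an illustration of a “shape-safe” tweak, here’s a pandas sketch of filtering rows inside the script’s function without adding or dropping columns, so the CSV schema Tableau Prep relies on stays valid. The column names are assumed examples, not fields from your actual query.

```python
import pandas as pd

# Stand-in for the DataFrame the script would return from the API response
df = pd.DataFrame(
    {"Date": ["2023-01-01", "2023-01-02"], "Impressions": [0, 350]}
)

# Row filtering changes the number of rows, not the columns,
# so the schema Tableau Prep learned from the sample CSV still matches
filtered = df[df["Impressions"] > 0]
print(list(filtered.columns) == list(df.columns))  # True
```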
Now, let’s explore some functions in Tableau Prep Builder you can use to leverage your data for analysis.
Append data from Supermetrics to a CSV dataset (UNION)
The first thing we can do is perform a UNION operation. For that, instead of just supplying the sample CSV file for Tableau Prep Builder to interpret the API output, we will also use it as a means to import static data.
Imagine having a set of historical data that won’t change any more. A common use case: you export last year’s data for a metric to use in a Tableau report, and append a daily-updated dataset to it to keep the report current.
For example, I configured a query in Query Manager to get Google Ads data for 2020 and downloaded the results as CSV.
Then, I changed the date selector from ‘Last year’ to ‘Year to date’ and copied the API link for this query into the Python script file. Here’s what the script file looks like. Note that my unique API key has been redacted.
Now we go through the same steps as before. We need to add a UNION operation and an ‘Output’ in the flow:
- Start a new flow in Tableau Prep by first connecting to the CSV file.
- Add a ‘Script’ step after the CSV with TabPy serving the API call function.
- Add a ‘Union’ step after the script, then click and drag the CSV step onto it, forming a triangle.
- Add an ‘Output’ step after the UNION.
Running the flow now will give us a hyper-extract that can be used as a data source for your Tableau project. And because the API call has a relative date range (year to date), you can simply keep re-running the flow to update the extract!
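Conceptually, the UNION step stacks two same-shaped tables on top of each other, which you can picture in pandas. The column names and values here are made up for illustration.

```python
import pandas as pd

# Static historical data, as it would come from the CSV input
historical = pd.DataFrame(
    {"Date": ["2020-12-30", "2020-12-31"], "Clicks": [10, 12]}
)

# Fresh rows, as they would come from the script step's API call
fresh = pd.DataFrame({"Date": ["2021-01-01"], "Clicks": [8]})

# UNION: append rows, matching columns by name
combined = pd.concat([historical, fresh], ignore_index=True)
print(len(combined))  # 3
```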
You can also choose to save the output file to your local hard drive or directly publish it to a Tableau Server or Tableau Online to share with other Tableau users in your organization.
Incremental extract refresh with data from Supermetrics
The method described above will work great for small to medium-sized amounts of data. However, once your queries to the API become larger and more demanding — in terms of runtime resources — it’s a good idea to perform an incremental refresh of the data set instead of requesting the full time range on each flow run and overwriting the previous extract.
Fortunately, Tableau Prep comes with a built-in option for incremental updates of an existing extract file. You just need to specify which column in the data table should be used for determining the new rows of data coming from the refresh. The flow will then take care of the rest for you.
However, Tableau Prep Builder does not support incremental refresh for data that comes in through a script step.
Here’s how to work around the limitations and create a working solution for incremental data refresh.
We will split the process into two parts:
First, we create a Tableau Hyper Extract that will contain the update. Then, in the second step, we apply that to the actual data source. This matters because incrementally updating an extract is supported when another extract file is the source of the refresh data.
To get there, we create a flow that includes a script again. We use a relative time range in our API call that will keep the amount of data we request lightweight. I use ‘the last 10 days’ in this example.
To create the extract file that will be our ‘Refresher’, we have to use what we’ve learned so far about scripts in Tableau Prep Builder.
Here, I used an existing extract file as the input before the script step to demonstrate that it also works and that the initial CSV file we used is just a necessary step to run the script.
Without the UNION operation, the output of this flow will simply be whatever our script step is pulling via Supermetrics API.
Having saved this output file on my drive as ‘increment.hyper’, I can now set up the final flow that will update a published extract used for several reports.
This might look counterintuitive — it sure was for me.
We are connecting to the increment file in Tableau Prep Builder, and all we do is add an ‘Output’ that points to the file that we already have and want to update.
Here, I have an optional cleaning step in the flow that separates the two. The important bit here is to enable incremental refresh on the ‘Input’, our ‘increment.hyper’ file. We need to set the field that will be used to detect new rows coming into the flow. In our case, that’s ‘Date’.
Then, we’ll pick the ‘Output’ that points at the destination we wish to update and match the ‘Date’ field from the ‘Input’ with the corresponding ‘Date’ field in the ‘Output’.
This setup gives us a new option of running the flow as an incremental refresh. Great!
This method preserves whatever data we already have in the output file. It can be a huge extract with millions of rows — we are just adding new rows to it in this procedure. Once you have created these flows, it takes just a few seconds to run them and update the data. After all, we are now only querying the Supermetrics API for 10 days’ worth of data.
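The incremental logic can be sketched in pandas: only rows from the increment that are newer than what the destination already holds get appended. The ‘Date’ column matches the field we chose above; the data itself is made up for illustration.

```python
import pandas as pd

# Stand-ins for the destination extract and the 'increment.hyper' data
existing = pd.DataFrame(
    {"Date": ["2023-01-01", "2023-01-02"], "Cost": [5.0, 7.5]}
)
increment = pd.DataFrame(
    {"Date": ["2023-01-02", "2023-01-03"], "Cost": [7.5, 6.0]}
)

# Keep only increment rows newer than the latest date already in the extract
# (string comparison works here because the dates are ISO-formatted)
cutoff = existing["Date"].max()
new_rows = increment[increment["Date"] > cutoff]

updated = pd.concat([existing, new_rows], ignore_index=True)
print(len(updated))  # 3
```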
Using Supermetrics API as a data source in Tableau Prep Builder helps you perform complex data modeling tasks that usually require a data warehouse.