[ Updated Oct 9, 2023 ]
Data blending is a great way to explore and make the most of your data in Looker Studio (formerly Data Studio). However, this feature also comes with some limitations that could slow your report down at best and affect your data accuracy at worst.
In this post, we’ll look at two workarounds using Google Sheets and Supermetrics.
Skip ahead >>
- The basics of data blending in Looker Studio
- The limitations of data blending in Looker Studio
- How to overcome data blending limitations using Google Sheets
- How to overcome data blending limitations using Supermetrics
- Video: how data blending works in Supermetrics
The basics of data blending in Looker Studio
Let’s look at what data blending is and the logic of data blending in Looker Studio.
What is data blending?
Data blending is the process of merging different data sources into one single dataset. Data blending works when your joined data sources share at least one common dimension or a ‘join key’.
Different types of join in Looker Studio
To blend data in Looker Studio, pick a join operator and join key. Your blended table shows different results depending on the join operator you pick. Currently, Looker Studio supports five join operators:
- Inner join means combining data from both sources — matching it where the join keys are the same and dropping the data that doesn’t match.
- Outer join means taking all the data from both sources, matching it where the join keys are the same, and padding the non-matching columns with empty values in the joined table.
- Left join means taking all the data from the left table and the matching data from the right table where the join keys are the same.
- Right join means taking all the data from the right table and the matching data from the left table where the join keys are the same.
- Cross join is a special operation that results in a table with all possible row combinations from both tables. It doesn’t require join keys. Each row from table A is joined with every row from table B, resulting in a table with the number of rows A*B.
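To make the five join operators concrete, here is a minimal Python sketch of the same logic on two tiny tables. It only illustrates the join semantics described above and is not how Looker Studio implements blending. The sample rows and the `join` helper are made up for this example.

```python
from itertools import product

# Hypothetical sample rows: ad cost by date and conversions by date.
ads = [
    {"date": "2023-10-01", "cost": 120.0},
    {"date": "2023-10-02", "cost": 95.0},
]
conversions = [
    {"date": "2023-10-02", "transactions": 4},
    {"date": "2023-10-03", "transactions": 7},
]


def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`. `how` is inner, left, right, outer, or cross."""
    if how == "cross":
        # Every left row is paired with every right row; no join key is used.
        # In this sketch, shared column names take the right row's value.
        return [{**left_row, **right_row} for left_row, right_row in product(left, right)]

    left_cols = {col for row in left for col in row}
    right_cols = {col for row in right for col in row}
    rows = []
    matched_right = set()
    for left_row in left:
        hits = [(i, r) for i, r in enumerate(right) if r[key] == left_row[key]]
        for i, right_row in hits:
            matched_right.add(i)
            rows.append({**left_row, **right_row})
        if not hits and how in ("left", "outer"):
            # Keep the unmatched left row and pad the right-only columns with None.
            rows.append({**{col: None for col in right_cols}, **left_row})
    if how in ("right", "outer"):
        for i, right_row in enumerate(right):
            if i not in matched_right:
                # Keep the unmatched right row and pad the left-only columns with None.
                rows.append({**{col: None for col in left_cols}, **right_row})
    return rows


for how in ("inner", "left", "right", "outer", "cross"):
    print(how, len(join(ads, conversions, "date", how)))
```

With these sample rows only one date appears in both tables, so the inner join keeps a single row while the outer join keeps three. The cross join returns 2 × 2 = 4 rows regardless of the dates.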
The limitations of data blending in Looker Studio
While data blending is a great feature, it also comes with many limitations that could slow down your reports or affect your results. Limitations of data blending in Looker Studio include:
- No control over what’s happening under the hood
- Slow processing and loading speeds
- A limited number of blended sources
No control over what’s happening under the hood
Traditionally, when you join data in a spreadsheet, you can use different formulas to tell the computer precisely what data you want to retrieve.
This lets you see what’s happening with your data in each step. If an error occurs, you can always go back to the raw data and trace the problem.
But with Looker Studio, the join happens under the hood, so if the blended data has errors, you can’t easily trace what caused them.
Slow processing and loading speeds
You’ve probably noticed that Looker Studio can take its sweet time loading your reports. Things get worse when you bring data blending into the picture.
Whenever you create a blended data source, Google has to go through different APIs to retrieve data. And that process requires quite a bit of computational power.
The more blended data sources you add, the slower your dashboard will be.
A limited number of blended sources
Another frustrating limitation is that you can blend a maximum of five data sources. While this number sounds like a lot, it isn’t. Many advanced, in-depth reports need data from more than five sources, and you’ll easily hit the limit if you want to create a very detailed table with many columns.
So should you just save yourself from all the trouble and avoid data blending?
In fairness, Looker Studio does a splendid job with simple and light blending. So if you want to blend a couple of data sources with a simple join key like date, you can stick with Looker Studio.
On the other hand, if you want to gain more control over your data and do more advanced blending, Google Sheets is the way to go.
How to overcome data blending limitations in Looker Studio using Google Sheets
When data blending in Looker Studio becomes a bit of a hassle, you can blend your data in Google Sheets and bring it back together in Looker Studio for reporting.
This approach gives you more flexibility with your data. You can take advantage of the Google Sheets formulas to enrich your data. Additionally, it’s much faster to load blended data from a Google Sheet than from several sources.
You can also use Supermetrics to pull data into Google Sheets automatically, leaving you more time for what you’re good at: analyzing the data and getting meaningful insights. Start trying it out for yourself for free with our 14-day trial.
Let’s look at some tips for joining data in Google Sheets.
Manage your data in Google Sheets
It can get messy quickly when you bring data from different sources to Google Sheets for blending. Dividing them into separate tabs is a good way to stay organized with your data.
The ‘raw data’ tab is where you store all your unformatted raw data from your data sources. In this example report, we use Supermetrics to pull data from Facebook Ads, Microsoft Ads, and Google Ads into three separate tabs.
The ‘blended data’ tab is where the magic happens. You can match your data together and perform some calculations to get more insights from your data.
The ‘reporting data’ tab is where you put the last piece of the puzzle. When you’re done enriching and transforming the data, you can present them in a separate tab where it’s easier to monitor.
Additionally, you can connect the ‘reporting data’ tab to Looker Studio to bring the final results to your dashboard. You can find the Google Sheets connector in the connector gallery.
Three useful functions for joining data in Google Sheets
VLOOKUP
VLOOKUP is one of the most used functions for data joining. It lets you search for a value in one table and use it in another table.
The syntax for VLOOKUP is:
VLOOKUP(search_key, range, index, [is_sorted])
- search_key: the value you want to look up.
- range: the range that contains the value you want to look up. Note that VLOOKUP only searches the first column of your range.
- index: the column number (within your chosen range) that contains the value to return.
- is_sorted: this parameter is optional. Here, you can specify if you want to receive an exact match (FALSE) or the nearest match (TRUE). In the case of data joining, you’ll want to set it to FALSE for an exact match.
You’re telling Google Sheets what value you want to search for, where you want to search for it, the column number in the range that has the value to return, and finally, if you want to receive an exact match (FALSE) or the nearest match (TRUE).
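If it helps to see the lookup logic spelled out, here is a minimal Python sketch of what an exact-match VLOOKUP does. The `vlookup` helper and the sample table are hypothetical. They mimic the behavior described above rather than reproduce Google Sheets internals.

```python
def vlookup(search_key, table, index):
    """Exact-match lookup, like VLOOKUP(search_key, range, index, FALSE).

    Scans the first column of `table` for `search_key` and returns the value
    from column `index` (1-based, as in Sheets) of the first matching row.
    """
    for row in table:
        if row[0] == search_key:
            return row[index - 1]
    return None  # Sheets shows #N/A here; None is this sketch's stand-in.


# Hypothetical range: source name in the first column, clicks in the second.
clicks_by_source = [
    ["facebook", 1200],
    ["google", 3400],
    ["microsoft", 560],
]

print(vlookup("google", clicks_by_source, 2))
print(vlookup("tiktok", clicks_by_source, 2))
```

The second call finds no match, which is the case Sheets reports as #N/A.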
Let’s say you have two tables:
- A marketing table with data about date, source, medium, campaign, impressions, cost, and clicks
- A conversion table with data about date, source, medium, transactions, and revenue.
There are two steps to joining them.
First, you need to create composite keys for both tables using the TEXTJOIN function. A composite key uniquely identifies each row of the table. Without composite keys, you’re likely to run into one-to-many relationships. You can then use the composite keys as join keys for VLOOKUP.
Your composite keys will include the campaigns’ date, source, medium, and campaign (which means campaign name in this case). It’ll look something like this.
Next, use VLOOKUP to join two tables. For example, the formula for combining transaction data with the marketing table is:
Tip: Using absolute references keeps the lookup range fixed, which makes it easier for you to drag the formula across your spreadsheet.
Simply put, Google searches the first column for the composite keys and returns the corresponding transactions.
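The two steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the formula from the example sheet. The sample rows, the `|` delimiter, and the `composite_key` helper are made up, and the sketch assumes the conversion table also has a campaign column so that both tables can build the same key.

```python
# Fields that make up the composite key, in a fixed order.
KEY_FIELDS = ("date", "source", "medium", "campaign")

# Hypothetical marketing rows.
marketing = [
    {"date": "2023-10-01", "source": "google", "medium": "cpc",
     "campaign": "brand", "impressions": 1000, "cost": 50.0, "clicks": 40},
    {"date": "2023-10-01", "source": "facebook", "medium": "paid_social",
     "campaign": "retargeting", "impressions": 800, "cost": 30.0, "clicks": 25},
]

# Hypothetical conversion rows (assumed to include a campaign column).
conversions = [
    {"date": "2023-10-01", "source": "google", "medium": "cpc",
     "campaign": "brand", "transactions": 3, "revenue": 210.0},
]


def composite_key(row, delimiter="|"):
    """Step 1: join the key fields into one string, similar to TEXTJOIN."""
    return delimiter.join(str(row[field]) for field in KEY_FIELDS)


# Step 2: look up transactions by composite key, similar to an exact-match VLOOKUP.
transactions_by_key = {composite_key(row): row["transactions"] for row in conversions}
blended = [
    {**row, "transactions": transactions_by_key.get(composite_key(row))}
    for row in marketing
]

for row in blended:
    print(composite_key(row), row["transactions"])
```

Rows without a matching key get `None` for transactions, which corresponds to the #N/A you would see in Sheets.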
IF + REGEXMATCH
The function we’re looking at next is a nested function, IF + REGEXMATCH, where REGEXMATCH checks whether a piece of text matches a regular expression and IF returns a value based on the result.
Let’s take a look at the table below. As you can see, it has different naming conventions, for example, ‘Google Data Studio’ and ‘googledatastudio’, or ‘Enterprise’ and ‘enterprise’.
You can put all your Google Data Studio campaigns in one basket and Enterprise campaigns in another using this formula:
=IF(REGEXMATCH(A7,"Data Studio|datastudio"),"Data Studio Campaigns",IF(REGEXMATCH(A7,"Enterprise|enterprise"),"Enterprise campaigns"))
In simpler terms, the function searches cell A7 for ‘Data Studio’ or ‘datastudio’ and returns ‘Data Studio Campaigns’. If there is no such value, it searches for ‘Enterprise’ or ‘enterprise’ and returns ‘Enterprise campaigns’.
In the sheet, the first step is to remap the campaign name to new values with an IF function (columns F and N). The cleaned-up name is then used as a join key to generate the metrics table on the right side of the sheet, where metrics from the two sources are aggregated wherever the remapped campaign name matches.
You can remap campaign names from different sources and use them as your join key.
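The same remapping logic can be sketched in Python with the standard `re` module. The `remap_campaign` helper and the sample campaign names are hypothetical, and returning `None` for names that match neither pattern is an assumption made for this sketch.

```python
import re


def remap_campaign(name):
    """Map a raw campaign name to a cleaned-up group, like nested IF + REGEXMATCH."""
    # re.search matches anywhere in the text, as REGEXMATCH does.
    if re.search(r"Data Studio|datastudio", name):
        return "Data Studio Campaigns"
    if re.search(r"Enterprise|enterprise", name):
        return "Enterprise campaigns"
    return None  # Assumption: names that match neither pattern are left unmapped.


# Hypothetical raw campaign names from two sources with different conventions.
raw_names = [
    "Google Data Studio - Search",
    "googledatastudio_remarketing",
    "Enterprise | EMEA",
    "enterprise_display",
]

for name in raw_names:
    print(name, "->", remap_campaign(name))
```

Once every row carries the cleaned-up name, you can use it as the join key in the same way as any other shared dimension.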
Conditional aggregation
In Google Sheets, you can use different aggregation functions to summarize your data, such as calculating the sum or average, or counting the number of data points. But in reality, you may not want to aggregate all your data. In that case, you can use conditional aggregation to specify which data you want to aggregate.
Conditional aggregation functions tell Google Sheets to aggregate only the data that meets certain criteria. We’ll take a look at some common conditional aggregation functions.
The SUMIF function tells Google to calculate the sum of the data that meets a predefined condition in a range. The syntax for the SUMIF function is:
SUMIF(range, criterion, [sum_range])
- range: the data range you want to apply the condition to.
- criterion: the condition that defines which cells will be summed.
- sum_range: the range to be summed, if different from ‘range’. This is optional.
Take the table below as an example. Let’s say you want to calculate the impressions from the US. With countries in column B and impressions in column D, you can do so by using SUMIF(B3:B12, "US", D3:D12).
The AVERAGEIF function returns the average value of data that meets certain criteria in a range. The syntax for the AVERAGEIF function is:
AVERAGEIF(criteria_range, criterion, [average_range])
- criteria_range: the data range you want to apply the condition to.
- criterion: the condition that defines which cells will be averaged.
- average_range: the range to be averaged, if different from ‘criteria_range’. This is optional.
For example, if you want to calculate the average cost from the US, with countries in column B and cost in column E, you can use AVERAGEIF(B3:B12, "US", E3:E12).
Similarly, the COUNTIF function performs a conditional count over your data. The syntax for COUNTIF is:
COUNTIF(range, criterion)
- range: the range you want to test against the criterion
- criterion: the condition a cell must meet to be counted
For example, say you want to count how many countries have a CPC greater than 1. You can do so by using COUNTIF(H3:H12, ">1").
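For comparison, here is a minimal Python sketch of the three conditional aggregations above. The sample rows and the `sumif`, `averageif`, and `countif` helpers are hypothetical. They take a condition as a function instead of a Sheets criterion string.

```python
# Hypothetical rows standing in for the country table.
rows = [
    {"country": "US", "impressions": 1000, "cost": 40.0, "cpc": 1.25},
    {"country": "FI", "impressions": 300, "cost": 9.0, "cpc": 0.6},
    {"country": "US", "impressions": 500, "cost": 20.0, "cpc": 0.8},
    {"country": "DE", "impressions": 800, "cost": 36.0, "cpc": 1.5},
]


def sumif(rows, criterion, field):
    """Sum `field` over the rows where `criterion(row)` is true."""
    return sum(row[field] for row in rows if criterion(row))


def averageif(rows, criterion, field):
    """Average `field` over the matching rows, or None if nothing matches."""
    values = [row[field] for row in rows if criterion(row)]
    return sum(values) / len(values) if values else None


def countif(rows, criterion):
    """Count the rows where `criterion(row)` is true."""
    return sum(1 for row in rows if criterion(row))


def is_us(row):
    return row["country"] == "US"


us_impressions = sumif(rows, is_us, "impressions")      # 1000 + 500
us_average_cost = averageif(rows, is_us, "cost")        # (40.0 + 20.0) / 2
high_cpc_rows = countif(rows, lambda row: row["cpc"] > 1)

print(us_impressions, us_average_cost, high_cpc_rows)
```

As in Sheets, the condition is applied to one column while the aggregation can run over another.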
The easiest way to blend data in Looker Studio: Using the Supermetrics Marketing Intelligence Cloud
The Supermetrics Marketing Intelligence Cloud is a centralized platform that allows you to consolidate, transform, and move data to any reporting and analytics destination. It’s also the easiest and quickest way to blend your data.
Data blending in Supermetrics
Whether you’re an analyst looking for a more effective way to blend your data or a non-technical marketer who isn’t familiar with all join logic, you can easily blend your data and bring it to Looker Studio. With Supermetrics, you can:
- Blend more than 100 data sources—including paid media, social media, ecommerce, etc. Support for more sources is added on a continuous basis.
- Easily review your blended fields so you can verify if there are any duplicates and make sure your report is accurate.
- Modify the blended sources and configure the field mapping to fit your needs.
- Forget about the join operators since Supermetrics automatically takes care of the blend.
Getting started with data blending in Supermetrics
Here’s how you can use the data blending feature in Supermetrics.
- Log in to Supermetrics.
- Choose ‘Transform’ → ‘Data blending’ on the left side of the screen.
- Choose ‘Create new blend’.
- Select the data sources you want to combine, then click ‘Continue to configuration’ and follow the instructions to configure the data sources.
- After that, name your blended source, then click ‘Create blend’.
You’ll see your blended fields.
After that, your blended source will be available in your analytics and reporting destination. Let’s say you’re using Looker Studio for reporting.
- Click ‘Back to blend list’.
- Select ‘Use your blends in destinations’.
- Once the window opens, choose ‘Looker Studio’ → ‘Go to connector’.
- Follow the instructions to authenticate your data sources.
After that, you can easily build reports with your blended sources. To learn more about how data blending works, check out our support center.
Watch how data blending works in Supermetrics
Over to you
Despite its limitations, Looker Studio is still a great tool for visualizing and sharing your reports. If you’re looking for an easy way to do data blending, contact us.
About the author
Bartosz is a Senior Data Analyst and Lead of our Professional Services team. In his role, he consults our customers on analytics solutions powered by Supermetrics products and implements these projects together with his team of domain experts.
Joy is the Content Strategist at Supermetrics. With internal and external experts, Joy helps businesses eliminate the data chaos and turn marketing data into opportunity.