If you’re a Google Data Studio advanced user, chances are you’ve already used the data blending feature. 

It’s a great feature that allows you to enrich and unlock the potential of your data quickly. Especially if you don’t have the time to pull data from multiple sources and combine them in spreadsheets. 

However, data blending also comes with some limitations that could slow your report down at best and affect your data accuracy at worst.

In this article, we teamed up with two experts from our team, Bartosz Schneider and Evan Kaeding, to discuss the good and the bad of data blending in Data Studio, and how you can avoid all the headaches.

To make sure we’re on the same page, let’s look at the basics first.

The basics of data joining

What is data joining?

Let’s say you’re managing an online store. You’re running paid ads across popular social media platforms. You want to know what channels bring in the most revenue. To do this, you need to combine your paid social data with data from Shopify.

Or you want to see how your ecommerce funnel looks. For example, what pages customers visited and what products they added to their shopping cart before purchasing. In this case, you can connect Google Analytics data with Shopify data.

That’s a rough description of data joining. Whenever you join data from multiple data sources into a single dataset, you’re performing data joining. Data joining works when your joined data sources share at least one common dimension, or a ‘join key’.

Typically, business accumulates data from different sources. Without combining all the data, you’re missing the whole picture of your performance. Data joining helps you:

  • Uncover valuable insights from separate data sources.
  • Discover meaningful relationships among data sets.
  • Make better data-informed decisions.

Different types of joins

  • Inner join means combining data from both sources — matching it where the join keys are the same and dropping the data that doesn’t match.
  • Outer join means taking all the data from both sources — matching it where the join keys are the same. And finally, in the joined table, padding the non-matching columns with empty values.
  • Left join means taking all the data from the left table and the matching data from the right table where the join keys are the same.
  • Right join, similarly, means taking all the data from the right table and the matching data from the left table where the join keys are the same.
Different types of data joining

So, does data joining have anything to do with data blending?


Spoiler alert: It does. Data blending in Data Studio is a left outer join.

Data blending in Google Data Studio

By default, when you create a chart in Google Data Studio, you’re pulling data from a single data source. However, you can connect multiple data sources and visualize them together in a chart or a table with data blending.

Data blending is a left outer join

To blend data, you need to choose:

  • A primary source: the first (or leftmost) data source that you add to the ‘Blend data panel.’
  • At least one secondary source: any data sources added to the right of the primary source
  • A join key

Since data blending in Data Studio is a left outer join, the blended data will include all data from the primary data source and matching data from secondary sources that share the same join key.

Let’s take a look at the example below.

Here, Google Ads has conversion data from five countries: the United States, Germany, Finland, France, and Australia.

Facebook Ads has conversion data from seven countries: the United States, Germany, France, Ireland, India, Singapore, and Spain.

Conversions by country from Google Ads and Facebook Ads

If you pick Google Ads as your primary data source, the blended result will show conversions from the United States, Germany, Finland, France, and Australia only. Since Ireland, India, Singapore, and Spain aren’t included in Google Ads (your primary source), the data will be excluded from the blended table.

You can also see Finland, which has data in the Google Ads table and not in the Facebook Ads table, will stay in the blended table. However, its Facebook Ads conversion value will be ‘null’.

Blended conversions by country from Google Ads and Facebook Ads

Alternatively, if Facebook Ads is your primary source, your results will be conversion data from the United States, Germany, France, Ireland, India, Singapore, and Spain.

Blended conversions by country from Facebook Ads and Google Ads
Bartosz Schneider
“Any data that is not present in your primary table will be lost, and that’s the first thing you have to watch out for. It makes sense to choose the longest table as the primary source (the left table for the blend). Otherwise, you may lose some important data during the process.”
Bartosz Schneider, Lead Analytics Consultant, Supermetrics

In Google Data Studio, the first data source you bring into the ‘Blend data’ view is your primary source. Changing the order of the data source is pretty straightforward. All you have to do is drag and drop the data source to the position you want.

Create a blended source

How to create a blended data source

There are two approaches you can use to blend your data.

The first approach is quite quick and easy. If you have two tables with a common dimension, you can select both tables, right-click, and choose ‘Blend data’. Data Studio will quickly combine two tables into one. Then, automatically generate a blended data view based on the fields provided in the source tables.

How to blend two table in Google Data Studio

The second approach requires more steps, but it gives you a little more control of your data.

To start, click on ‘Resource’ → ‘Manage blended data’.

Manage blended data Google Data Studio

Next, open your ‘Blend data’ view by clicking on ‘Add a data view’.

Then, add the data sources you want to blend. Remember, the first data source you add to the view will become your primary source.

Data blending panel

From here, you can choose the join keys, dimensions, and metrics you want to blend.

Tip: Give your blended data source a name so it’s easy to distinguish from other sources later on.

Give your blended data source a name

After you’re happy with the setting, click ‘Save’.

Start building charts with your blended data source by adding it to the ‘Data source’ field.

Add your blended data source to Google Data Studio

The limitations of data blending in Google Data Studio

Accuracy

Traditionally, when you join data in a spreadsheet, you can use different formulas to tell the computer precisely what data you want to retrieve. This lets you see what’s happening with your data in each step. If an error occurs, you can always go back to the raw data and trace the problem.


However, with Data Studio, the join happens under the hood, so if the blended data has errors, you wouldn’t know what caused the problem.

Evan
“In Data Studio, you really have no way to check your joined data. So you’re kind of flying blind. It’s very hard to inspect. There’s no way to know what Google does, whether Data Studio is producing duplicates, or there’s no way to understand if fields are dropped if they don’t match.”
Evan Kaeding, Senior Sales Engineer, Supermetrics

Supports only left outer join

As mentioned above, data joining in Google Data Studio is always a left outer join. This can be somewhat limiting if you’re used to using different types of joins to enrich your data.

You have to be extra careful when blending data, especially the order in which you join them. One problem with the primary sources can harm the accuracy of your blended results.

Speed

You probably notice Google Data Studio can take its sweet time loading your reports. Things get worse when you bring data blending to the picture.

Whenever you create a blended data source, Google has to go through different APIs to retrieve data. And that process requires quite a bit of computational power.

The more blended data sources you add, the slower your dashboard will be.

A limited number of blended sources

Another frustrating limitation is that you can blend a maximum of five data sources. While this number sounds like a lot, it isn’t. Occasionally, in many advanced and in-depth reports, you need to blend data from more than five sources. You’ll easily cross the limit if you want to create a very detailed table with many columns.

So, should you just save yourself from all the trouble and avoid data blending?

In fairness, Google Data Studio does a splendid job with a simple and light blending. So if you want to blend one to two data sources with a simple join key like date, you can stick with Data Studio.

On the other hand, if you’re looking to gain more control over your data and do more advanced blending, Google Sheets is the way to go.

Data blending in Google Sheets

When data blending in Google Data Studio becomes a bit of a hassle, you can blend your data in Google Sheets and bring it back together in Data Studio for reporting.

This approach gives you more flexibility with your data. You can take advantage of the Google Sheets formulas to enrich your data. Additionally, it’s much faster to load blended data from a Google Sheet than from several sources.

In addition, you can use Supermetrics to pull data into Google Sheets automatically. You’ll have more time to do what you’re good at — analyzing the data and getting meaningful insights.

spreadsheet

Move your data into Google Sheets in minutes

Start a 14-day free Supermetrics trial. Full features. No credit card required.

Let’s take a look at some tips for joining data in Google Sheets.

Manage your data in Google Sheets

It can get messy quickly when you bring data from different sources to Google Sheets for blending. A good way to stay organized with your data is to divide them into separate tabs.

Evan
“It’s a good idea to separate your data into three buckets — raw data, blended data, and reporting data. So basically, you want to be able to make changes to both your raw layer and blended layer. But then you want to keep the final report as clean as possible.”
Even Kaeding, Senior Sales Engineer, Supermetrics

The ‘raw data’ tab is where you store all your unformatted raw data from your data sources. In this example report, we use Supermetrics to pull data from Facebook, Microsoft, and Google Ads into three separate tabs.

Put your unformatted data into the raw data tab

The ‘blended data’ tab is where the magic happens. You can match your data together and perform some calculations to get more insights from your data.

Campaign blended data tab

The ‘reporting data’ tab is where you put the last piece of the puzzle. When you’re done enriching and transforming the data, you can present them in a separate tab where it’s easier to monitor.

Additionally, you can connect the ‘reporting data’ tab to Google Data Studio to bring the final results to your dashboard. You can find the Google Sheets connector in the connector gallery.

Google Sheets Google Data Studio connector

Next, let’s take a look at some functions you need to know when blending data in Google Sheets.

Three useful functions for joining data in Google Sheets

VLOOKUP

VLOOKUP is one of the most used functions for data joining. It lets you search for a value in one table and use it in another table.

The syntax for VLOOKUP is:

VLOOKUP (search_key, range, index, [is_sort])

  • search_key: the value you want to look up.
  • range: the range that contains the value you want to look up. Note that VLOOKUP will search from the first column in your range.
  • index: the column number (within your chosen range) that contains the returning value.
  • is_sort: this parameter is optional. Here, you can specify if you want to receive an exact match (FALSE) or the nearest match value (TRUE). In the case of data joining, you’ll want to set it to an exact match.

You’re telling Google Sheets what value you want to search for, where you want to search for it, the column number in the range that has the value to return, and finally, if you want to receive an exact match (FALSE) or the nearest match (TRUE).

Let’s say you have two tables:

  • A marketing table with data about date, source, medium, campaign, impressions, cost, and clicks
  • A conversion table with data about date, source, medium, transactions, and revenue.
Marketing performance and conversion table

According to Bartosz, there are two steps to connecting the puzzles.

First, you need to create composite keys for two tables using the TEXTJOIN function. Each composite key can be used to uniquely identify each row of the table. Without the composite keys, you’re likely to run into one-to-many relationships. Additionally, you can use them as join keys for VLOOKUP.

Your composite keys will include campaigns’ date, source, medium, and campaign (which means campaign name in this case). It’ll look something like this.

Create composite keys

Next, use VLOOKUP to join two tables. For example, the formula for combining transaction data with the marketing table is:

VLOOKUP($A4,$A$22:$J$33,6,0)

Tip: Using absolute reference makes it easier for Google to search for the value and for you to drag the formula across your spreadsheet.

Simply put, Google searches the first column for the composite keys and returns the corresponding transactions.

Vlookup formula

IF + REGEXPMATCH

The first step is to remap the campaign name to new values with an IF function (columns F and N). That new cleaned-up name is then used as a join-key to generate the metrics table on the right side of the sheet, where metrics from two sources are aggregated together where the previously remapped campaign name matches.

The function we’re looking at next is a nested function — IF + REGEXPMATCH, where

  • IF starts a conditional evaluation.
  • REGEXPMATCH checks the target for a text match

Bartosz finds that this function comes in handy when he needs to remap campaign names from one or many different data sources.

Let’s take a look at the table below. As you can see, it has different naming conventions, for example, ‘Google Data Studio’ and ‘googledatastudio’ or ‘Enterprise’ and ‘enterprise’.

Google Analytics campaign performance data

You can put all your Google Data Studio campaigns in one basket and Enterprise campaigns in one basket using this formula:

=IF(

REGEXMATCH(A7,"Data Studio|datastudio"),"Data Studio Campaigns",

IF(REGEXMATCH(A7,"Enterprise|enterprise"),"Enterprise campaigns"

))

In simpler terms, your function searches in column A7 for ‘Data Studio’ or ‘datastudio’ and returns ‘Data Studio Campaigns’. If there is no such value, search for ‘Enterprise’ or ‘enterprise’ and returns ‘Enterprise campaigns’.

You can remap campaign names from different sources and use them as your join key.

Conditional aggregation

In Google Sheets, you can use different aggregation functions to summarize your data — calculating the sum, average, or counting the number of data points. However, in reality, you may not want to aggregate all the data you have. In that case, you can use conditional aggregation to specify which data you want to aggregate.

Conditional aggregation is a function that tells Google to perform data aggregation over a set of data when it meets certain criteria. We’ll take a look at some common conditional aggregation functions.

The SUMIF function tells Google to calculate the sum of the data that meets a predefined condition in a range. The syntax for the SUMIF function is:

SUMIF (range, criterion, [sum_range])

  • range: you want to specify the data range you want to apply the condition to.
  • criterion: you should specify the condition that defines which cells will be summed.
  • sum_range: you should specify the range to be summed if different from ‘range’. This is optional.

Take the table below as an example. Let’s say you want to calculate the impressions from the US. You can do so by using SUMIF (B3:J12, “US”, D3:D12).

SUMIF function

The AVERAGEIF function returns the average value of data that meets certain criteria in a range. The syntax for the AVERAGEIF function is:

AVERAGEIF (criteria_range, criterion, [average_range])

  • criteria_range: you should choose the data range you want to apply the condition to.
  • criterion: specify the condition that defines which cells will be averaged.
  • average_range: you should specify the range to be averaged if different from ‘criteria_range’. This is optional.

For example, if you want to calculate the average cost from the US, you can use AVERAGEIF(B3:J12, “US”, E3:E12).

AVERAGEIF function

Similarly, the COUNTIF function performs a conditional count over your data. The syntax for COUNTIF is:

COUNTIF (range, criterion)

  • range: the range you want to count.
  • criterion: the condition you want to apply.

For example, you want to count how many countries have CPC greater than 1. You can do so by using COUNTIF(H3:H12, “>1”).

COUNTIF function

Different ways to use data blending

There are many ways you can put data blending into practice. We’ll take a look at some examples in this section. Additionally, you’ll also find some ready-made templates with blended data that you can use right away.

Note that connecting the data sources to the templates will automatically start your 14-day free Supermetrics trial.

Compare your Facebook Ads vs. Google Ads performance

Google Ads and Facebook Ads are among the most popular advertising platforms. Even though it isn’t exactly an exact comparison, combining Facebook Ads data with Google Ads data can tell you which types of campaigns work best on which channels.

For example, in the Google Ads vs. Facebook Ads dashboard below, you can easily see:

  • A side-by-side performance of each channel.
  • The split in your cost, impressions, clicks, and conversions between two channels.
  • Your best-performing campaigns by channels.
Google Ads vs. Facebook Ads template

Swipe the Google Ads vs. Facebook Ads template >>

Organic social media

Managing your company’s social media accounts isn’t a walk in the park. For one thing, you have to manage at least three different accounts, all of which have different algorithms and requirements for content.

Blending data from social media platforms helps you manage your performance easily and stay on top of your social game.

For example, in the dashboard below, we combine data from four popular social media channels — Facebook, Instagram, Twitter, and LinkedIn. This dashboard is great for:

  • Monitoring your performance across channels.
  • Comparing the performance of different channels.
  • Diving deeper into each channel’s performance.
Social media mix template

Swipe the social media mix dashboard >>

Paid channel mix

You probably have performance marketing data in paid channel platforms and sessions and conversion data in Google Analytics. Blending paid ad data with web analytics data helps you understand which campaigns and channels drive high-quality traffic.

For example, in this paid channel mix dashboard, we blend paid data from Google, LinkedIn, Twitter, Facebook, Microsoft with Google Analytics data. With it, you’ll see:

  • A general overview of your paid campaigns.
  • A comparison of your performance with the previous period.
  • A table showing the key metrics of your best-performing channels.
Paid channel mix template

Get the paid channel mix dashboard >>

Organic search vs. paid search analytics

It’s not about organic search versus paid search. To grow your business, you need both. For example, performance marketers can look at the high-ranking search phrase and decide if it makes sense to bid on those keywords.

Similarly, content marketers can also use paid search data to fuel their content strategy.

In this organic search vs. paid search analytics template, our friends at OIKIO agency combine data from Google Ads and Google Search Console. It helps you drive conversions in both channels. With it, you can determine:

  • The Google Ads phrase performance that you haven’t included in your organic search.
  • The organic search phrase performance that you haven’t bid on with Google Ads yet.
  • The PPC/SEO ratio for a particular search phrase.
Organic search vs. Paid search analytics template

Swipe the organic search vs. paid search analytics template >>

Organic traffic and keyword analysis

It was pretty frustrating when Google removed search phrase data from Google Analytics. But worry not because there’s a workaround. By combining data from Google Search Console with Google Analytics data, you’ll figure out which organic keywords bring in traffic to your website.

Let’s take a look at this organic traffic and keyword analysis template by our friends at OIKIO. With this template, you can:

  • Figure out which search phrases bring in the most traffic.
  • Compare the performance of different landing pages.
Organic traffic and keyword analysis template

Get the organic traffic and keyword analysis template >>

Over to you

Congrats, you’ve made it to the end of this article. Now pat yourself on the back.

After all, data blending helps you make the most of your data and uncover more meaningful insights.

If you’re working with a small and manageable amount of data, you can totally take advantage of the data blending feature in Google Data Studio.

On the other hand, if you’re handling a much bigger dataset and want to have better control over your data, Google Sheets is a better solution.

And remember, whenever you need help with moving your data to Google Sheets, you can start your 14-day Supermetrics free trial.

Try Supermetrics for free

Get full access to Supermetrics with a 14-day free trial.
No credit card required.