Jan 27, 2020

The first 4 queries every marketer using BigQuery needs to know

By Supermetrics

BigQuery

[ Updated Mar 7, 2023 ]

6-MINUTE READ · By Andy Kozak

Google BigQuery presents a new way to store and analyze data extracted from different data sources. Utilizing the power of the Google Cloud Platform, larger data volumes and complex queries can get processed in minutes.

Loading data into Google BigQuery is easy: Supermetrics for BigQuery provides native connectors for a variety of sources.

After connecting your data sources to Google BigQuery, you are ready to dive into the details contained within. Multiple sources can be combined to one table. These tables can be used for what is called multi-channel reporting.

To get started with multi-channel reporting, we have compiled four ways to prepare your data to get insights across data sources.

The following four queries every marketer needs to know can be easily performed inside your Google Cloud Console for instant results.

Try out Supermetrics for BigQuery with a free 14-day trial.

1. Partitioning by date

The first query is a simple one. It extracts the campaign data from each data source for a specific month. It then combines them into a single table, giving us a multi-dimensional view.

In our example data, the Facebook ad campaign data is stored daily into a separate table. In order to get monthly insights, we need to perform three operations: convert days to months, aggregate data on a monthly level into one row per campaign, and scan through all days available in the dataset for that particular campaign.

SELECT distinct
campaign_name,
EXTRACT(MONTH FROM date) AS Month,
sum(impressions) as Impressions
FROM `example_Dataset.Example_DWH.FBADS_AD_*` where EXTRACT(MONTH FROM date)=4
GROUP BY campaign_name,
Month

Since we have multiple tables, we use a wildcard to perform a UNION function for consolidating all tables. Using a table suffix model allows us to have a different data structure for each day. Therefore, this allows us to add new metrics and dimensions.

Partitioning-by-date

2. Multi-channel reporting with simple data blending (UNION)

Many marketers operate within different platforms but lack the resources to have a clear comparison between them. Data is contained in separate tables, which can result in having different reports for each platform. With Supermetrics for BigQuery, this data can be utilized in a unified report. The process of utilizing numerous data sources in one report is called data blending. 

In the following example, we are doing a query to see how much we are paying for each campaign by month. We are using Bing Ads, LinkedIn Ads, and Facebook Ads as our data sources to produce a data table with results from each platform. As in the previous query, we narrowed the result set to a single month using the wildcard approach as in our previous query. To succeed in blending data, all data sources should contain the same dimensions and metrics to be comparable with one another.

SELECT
‘Facebook Ads’ as data_source,
campaign_name,
EXTRACT(MONTH FROM date) AS Month,
floor(sum(cost)) as Cost
FROM `supermetrics-sql.Supermetrics_internal_DWH.FBADS_AD_*` where EXTRACT(MONTH FROM date)=4
GROUP BY campaign_name,
Month,
data_source
UNION ALL
SELECT
‘Bing Ads’ as data_source,
campaign_name,
EXTRACT(MONTH FROM date) AS Month,
floor(sum(cost)) as Cost
FROM `supermetrics-sql.Supermetrics_internal_DWH.BINGADS_AD_*` where EXTRACT(MONTH FROM date)=4
GROUP BY campaign_name,
Month,
data_source
UNION ALL
SELECT
‘LinkedIn Ads’ as data_source,
campaign_name,
EXTRACT(MONTH FROM date) AS Month,
floor(sum(cost)) as Cost
FROM `example_data.Example_DWH.LINKEDINADS_AD_*` where EXTRACT(MONTH FROM date)=4
GROUP BY campaign_name,
Month,
data_source
ORDER BY data_source

The monthly cost of our campaigns is returned as a result.  By performing this query, we consolidated multiple sources into one table. We can now compare our campaign data extracted from different sources against one another.

UNION

3. Cross-channel reporting with advanced data blending (JOIN)

Sometimes we need to take the extra step and compare our metrics side-by-side. To achieve that, we need to join our data from multiple tables. Joining the different tables into one table gives us a multi-dimensional view. With a multi-dimensional view, we get a cross-sectional view on all of our operations.

Dimensions with exact match

The easiest way to achieve cross-channel comparison is to connect data dimensions with each other in one table. The following example shows us how to achieve this.

SELECT sub1.Month,sub1.Cost AS Fb_cost,sub2.Cost AS Bing_cost
FROM
(SELECT
EXTRACT(MONTH FROM date) AS Month,
floor(sum(Cost)) as Cost
FROM
`example_dataset.Supermetrics_internal_DWH.FBADS_CAMPAIGN_*`
GROUP BY Month) as sub1
LEFT JOIN
(SELECT
EXTRACT(MONTH FROM date) AS Month,
floor(sum(Cost)) as Cost
FROM
`example_dataset.Example_DWH.BINGADS_CAMPAIGN_*`
GROUP BY Month) as sub2
ON sub1.Month = sub2.Month
ORDER BY Month

This query contains a JOIN statement. A JOIN statement enables us to link two data tables with each other. The tables are grouped by the month extracted from the data tables. Therefore, it gives us a side-by-side view of monthly marketing campaign costs across two separate ad platforms.

BQ-JOIN

What if we don’t have dimensions with exactly matching names? There are two major approaches that we can utilize for linking dimensions that have only a partial match.

Creating manually maintained mapping tables

Manually maintained mapping tables

Manually maintained mapping tables help to remap Google Ads campaign tags to Facebook Campaign names that contain matching keywords. In case we have multiple campaign tools, we can remap these to a set of master data that contains generic campaign names.

Remap multiple campaign tools to a set of master data

“Best effort” matching using regular expressions

Instead of using master data, the matching can be done utilizing regular expressions. Regular expressions help to unify strings found in queries with a preset form. Regular expressions can be inserted directly into queries.

Our objective is to aggregate campaigns where the word “data studio” is present in the name. To achieve this, we will perform the following query. Using regular expressions, this query searches for the specified form and replaces it with “Google Data Studio”.

SELECT distinct regexp_replace(campaign_name,'(?i).data studio.’,’Google Data Studio’)
campaign_name,
EXTRACT(MONTH FROM date) AS Month,
sum(impressions) as Impressions
FROM `example_data.Example_DWH.FBADS_AD_*`
GROUP BY campaign_name,
Month

As a result, we have unified all of our campaign names to the same format.

4. External tables

You don’t always have the luxury of having automated data extractions. Many marketers commonly deal with spreadsheets stored in Google Sheets. Fortunately this data can also be loaded into BigQuery. Manually maintained spreadsheets can be linked to BigQuery as tables that can be used in queries.

To load the spreadsheet, we need to follow the following steps:

  1. Click “create a table” from the BigQuery console.
  2. Select Google Drive as a source.
  3. Paste the google sheet URL to the address field.
  4. Select the file format as “Google Sheets”.
  5. Select the project and dataset name. Enter a table name.
  6. Tick the auto-detect box.
  7. Click Save.

This transforms our manually maintained spreadsheet into a BigQuery table and enables us to run queries on the data contained.

Spreadsheet-to-BQ-table

All changes are updated to BigQuery, so new uploads are not needed whenever the spreadsheet is updated.

In summary

Using the power of BigQuery, a data-driven marketer is able to derive insights from a variety of data. BigQuery helps data-driven marketers blend data across a variety of sources with little to no effort. With multi-channel reporting, you are able to cross-analyze across various sources within minutes.

This guide showcased four ways to query a variety of data loaded into a BigQuery storage. These form a great basis to get into the world of BigQuery and marketing data warehousing.

Sign up for a free 14-day trial for a head start into the world of Supermetrics for BigQuery.

About Andy Kozak

Andy Kozak

Andy is the Head of Data Engineering at Supermetrics. He’s a passionate data engineer who is always curious about how things work. Andy believes in predicting the future (from data) and traveling back in time (on dashboards). Connect with him on LinkedIn.

Stay in the loop with our newsletter

Be the first to hear about product updates and marketing data tips