The marketing analytics roadmap: 3 levels of data architectures
Every organization is on an analytics journey. Some are quite far along, while others are just getting started.
At Supermetrics, we provide solutions to help customers wrangle their marketing data and analytics, no matter what stage they’re at. From freelance marketers and small agencies to Fortune 500 companies, you’re in good company with us.
In this post, we’ll identify some of the architectural patterns that we’ve noticed while helping customers grow and scale their marketing analytics practices. With these patterns in mind, you can make an educated decision about what kind of tools your company needs based on where you’re at in this journey.
Psst! If you prefer the same content in video format, check out this webinar we recently recorded with our friends at Google.
Tier 1: The basic marketing analytics stack
For those just beginning their journey, a basic stack will get you quite far.
Many of our customers use Supermetrics to move data into Google Sheets and Google Data Studio, their primary reporting tools, and eventually run into limitations. These usually show up as slow queries caused by growing account volumes and large amounts of historical data.
Additionally, blending this data across marketing channels is cumbersome (if not impossible) and enriching it with data from outside marketing is challenging.
A Tier 1 stack will solve exactly these issues. Slow queries are mitigated by the durable and scalable storage provided by BigQuery. Blending data across marketing channels can be easily achieved with features in Data Studio or some light SQL.
It’s also easy to incorporate ad hoc data sources into the model using Google Sheets or Google Cloud Storage as inputs to your BigQuery model. Delivery can be fully automated so that the data lands in your data warehouse at the beginning of each workday.
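To make the “light SQL” idea concrete, here’s a minimal sketch of blending two channels and joining them to an ad hoc input. It uses Python’s built-in SQLite as a stand-in for BigQuery so it runs anywhere, and every table name, column name, and figure in it is a hypothetical assumption. The monthly_budget table plays the role of a list you might maintain in Google Sheets.

```python
import sqlite3

# SQLite stands in for BigQuery so the sketch runs locally.
# All table names, column names, and figures are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE google_ads (date TEXT, cost REAL);
    CREATE TABLE facebook_ads (date TEXT, cost REAL);
    -- Plays the role of an ad hoc input maintained in Google Sheets.
    CREATE TABLE monthly_budget (channel TEXT, budget REAL);

    INSERT INTO google_ads VALUES ('2021-03-01', 120.0), ('2021-03-02', 80.0);
    INSERT INTO facebook_ads VALUES ('2021-03-01', 50.0);
    INSERT INTO monthly_budget VALUES ('google_ads', 1000.0), ('facebook_ads', 500.0);
""")

# Stack the channels into one shape, then join the budget per channel.
BLEND_SQL = """
    WITH spend AS (
        SELECT 'google_ads' AS channel, cost FROM google_ads
        UNION ALL
        SELECT 'facebook_ads' AS channel, cost FROM facebook_ads
    )
    SELECT s.channel, SUM(s.cost) AS cost, b.budget
    FROM spend AS s
    JOIN monthly_budget AS b ON b.channel = s.channel
    GROUP BY s.channel, b.budget
    ORDER BY s.channel
"""


def spend_vs_budget(connection):
    """Return one (channel, total cost, budget) row per channel."""
    return connection.execute(BLEND_SQL).fetchall()


rows = spend_vs_budget(conn)
```

The same UNION ALL plus JOIN pattern should carry over to BigQuery with little change, although the exact table references will depend on how your datasets are set up.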
You’ve now achieved comprehensive monitoring of your marketing and ad spend. In addition, you’re leveraging a significant amount of automation to do so. This kind of stack can be assembled by a technically savvy marketer, with additional functionality being unlocked with only a small amount of SQL.
Once everything is running smoothly, you’ll want to get more out of your data, and you’ll eventually find that cross-channel analysis isn’t as easy as you’d like it to be.
Different data sources use different names for the same metrics and dimensions, so blending this data isn’t easy without more advanced SQL skills. Furthermore, as the number of data sources increases, so does the complexity of your data model. You’ll likely reach the point where a dedicated marketing analyst to manage this process is well worth the expense.
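As an illustration of the naming problem, here’s a small sketch that maps each source’s field names onto one shared schema. The source names, field names, and values are hypothetical and are not taken from any real connector.

```python
# Hypothetical per-source field names mapped to one shared reporting schema.
FIELD_MAP = {
    "google_ads": {"campaign": "campaign", "cost": "cost", "clicks": "clicks"},
    "facebook_ads": {
        "campaign_name": "campaign",
        "spend": "cost",
        "link_clicks": "clicks",
    },
}


def normalize(source, record):
    """Rename a source record's fields to the shared schema and tag its channel."""
    mapping = FIELD_MAP[source]
    unknown = set(record) - set(mapping)
    if unknown:
        # Fail loudly rather than silently dropping a field nobody has mapped.
        raise KeyError(f"unmapped fields for {source}: {sorted(unknown)}")
    row = {mapping[name]: value for name, value in record.items()}
    row["channel"] = source
    return row


unified = [
    normalize("google_ads", {"campaign": "brand", "cost": 120.0, "clicks": 300}),
    normalize(
        "facebook_ads",
        {"campaign_name": "prospecting", "spend": 50.0, "link_clicks": 200},
    ),
]
```

Every new data source adds another entry to a mapping like this, which is one way to picture how the data model grows more complex over time.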
At this point, it’s best to start looking at a Tier 2 stack.
Tier 2: Separating reporting data from raw data
The Tier 2 analytics stack is characterized by a core distinction between “raw” data and “reporting” data.
As your data needs grow, you’ll realize that piping data directly from your sources into BigQuery and then straight on into Data Studio leads to suboptimal performance and leaves little opportunity for data enrichment. Separating your raw data tables from your reporting tables allows you to add one (or more) transformation steps to your process.
These transformation steps can help you clean, reshape, and enrich your data before piping it into your Data Studio dashboards. While you might be able to get pretty far doing this directly in Data Studio, its functionality pales in comparison with what is available using SQL in BigQuery.
The steps described above need to be built, configured, managed, and monitored, and BigQuery provides the functionality for all of it. Cron jobs can automate query execution, keeping reporting tables up to date with the most recent data from the raw tables and ensuring that clean, consistent data ends up in your marketing dashboards.
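Here’s a rough sketch of what a raw-to-reporting transformation step can look like. SQLite again stands in for BigQuery, and the table names, the cost_micros column, and the cleaning rules are all assumptions made for illustration. The step drops rows that were loaded twice, tidies campaign names, converts micros to currency units, and rebuilds the reporting table from scratch so that a scheduled rerun is safe.

```python
import sqlite3

# Hypothetical raw table, as it might arrive from a source connector.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_ads (date TEXT, campaign TEXT, cost_micros INTEGER);
    INSERT INTO raw_ads VALUES
        ('2021-03-01', 'Brand', 120000000),
        ('2021-03-01', 'Brand', 120000000),   -- same row loaded twice
        ('2021-03-01', ' brand ', 30000000),
        ('2021-03-02', 'Generic', 80000000);
""")


def rebuild_reporting(connection):
    """Rebuild the reporting table from the raw table; safe to rerun."""
    connection.executescript("""
        DROP TABLE IF EXISTS reporting_daily_spend;
        CREATE TABLE reporting_daily_spend AS
        SELECT
            date,
            LOWER(TRIM(campaign)) AS campaign,
            SUM(cost_micros) / 1000000.0 AS cost
        -- DISTINCT treats exact duplicate rows as repeated loads (an assumption).
        FROM (SELECT DISTINCT date, campaign, cost_micros FROM raw_ads)
        GROUP BY date, LOWER(TRIM(campaign));
    """)


def read_reporting(connection):
    """Return the reporting rows in a stable order."""
    query = (
        "SELECT date, campaign, cost FROM reporting_daily_spend "
        "ORDER BY date, campaign"
    )
    return connection.execute(query).fetchall()


rebuild_reporting(conn)
report = read_reporting(conn)
```

Dashboards then read only from reporting_daily_spend, so the raw table can stay messy without that mess reaching your stakeholders.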
We’re now entering an environment where investing in SQL skills pays off significantly. Whether you’re a marketing analyst looking to up your SQL game or a technical analyst tasked with helping the marketing team, chances are much of your day will be spent in SQL. At this stage, it makes a lot of sense to have at least one or two dedicated analysts for your reporting stack.
So what can we do once we get here? Quite a bit that wasn’t possible before. Now that we have the technical chops to categorize, enrich, and blend our data in BigQuery, we’re able to unlock additional insights about media performance in the context of the overall business. Prescriptive analytics is now possible, which will help you drive suggestions for improvement in departments other than marketing. Furthermore, data centralization has made your team an integral part of the business. Well done!
But alas, there are still challenges here. The good news is that a Tier 2 stack already delivers many of the benefits of a Tier 3 stack. The problem, however, is that those benefits usually rest on shaky ground. In fact, the biggest limitation of a Tier 2 analytics stack is its lack of robustness.
Commonly, we see that Tier 2 stacks have a high degree of data pipeline complexity. Complexity is not a bad thing in itself, but it is often paired with two other factors that can make it perilous: poor documentation and dependency on a single person. If you want your data warehouse to be a strategic asset for your company, shoring up these weaknesses is a worthy investment. These are not easy problems to solve, but there are some tricks we can show you.
Tier 3: Future-proofing a Tier 2 marketing analytics stack
Fundamentally, the Tier 3 stack looks very similar to the Tier 2 stack. It has no more sources, storage components, or destinations than the Tier 2 stack. The big difference lies in four things: testing, orchestration, version control, and documentation.
Let’s dive into each of these four to talk about the problems they solve and why you’ll need them.
Testing
Would you be happy if the water company didn’t test the water before it came into your home? Absolutely not!
Think of your data warehouse the same way. You always want to remain skeptical about data that comes into your warehouse, whether it comes from your internal data sources, an external vendor, or even Supermetrics.
Having a framework that you can use to test for data consistency and data quality will pay huge dividends by ensuring that the data you deliver to your stakeholders is always of the highest quality. Using a tool like dbt or Dataform can help enhance your testing strategy.
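To give a feel for what such tests check, here’s a minimal sketch in plain Python. It isn’t how dbt or Dataform implement their tests; the function names, column names, and sample rows are hypothetical, and the checks are only loosely modeled on the common “not null”, “unique”, and “non-negative” style of assertion.

```python
def check_not_null(rows, column):
    """Return the rows where the column is missing or None."""
    return [row for row in rows if row.get(column) is None]


def check_unique(rows, columns):
    """Return the key tuples that appear more than once."""
    seen, duplicates = set(), []
    for row in rows:
        key = tuple(row.get(column) for column in columns)
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


def check_non_negative(rows, column):
    """Return the rows where the column holds a negative number."""
    return [row for row in rows if row.get(column) is not None and row[column] < 0]


def run_checks(rows):
    """Run every check and return only the failing ones, keyed by name."""
    results = {
        "date_not_null": check_not_null(rows, "date"),
        "unique_date_campaign": check_unique(rows, ["date", "campaign"]),
        "cost_non_negative": check_non_negative(rows, "cost"),
    }
    return {name: offenders for name, offenders in results.items() if offenders}


# Hypothetical reporting rows with three deliberate problems.
sample = [
    {"date": "2021-03-01", "campaign": "brand", "cost": 150.0},
    {"date": "2021-03-01", "campaign": "brand", "cost": 150.0},
    {"date": None, "campaign": "generic", "cost": -5.0},
]
failures = run_checks(sample)
```

An empty result means the data passed. Anything else is a reason to stop the pipeline before the numbers reach a dashboard.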
Orchestration
Remember when we talked about those cron jobs keeping tables in sync? Well, that only works up to a point. As soon as you have a few different sources with a few different transformation jobs, getting all of those cron jobs to execute reliably in sequence can be a real nightmare.
Using a tool like Airflow or Google Cloud Composer can help you dramatically simplify operations by ensuring that report tables don’t get built until all of their upstream dependencies have been met.
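The core idea behind dependency-aware scheduling can be sketched with Python’s standard library alone. This is not Airflow or Cloud Composer code. The job names are hypothetical, and the sketch assumes Python 3.9 or later for the graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs, each mapped to the upstream jobs it depends on.
DEPENDENCIES = {
    "load_google_ads": set(),
    "load_facebook_ads": set(),
    "build_channel_spend": {"load_google_ads", "load_facebook_ads"},
    "build_dashboard_table": {"build_channel_spend"},
}


def run_pipeline(dependencies, run_job):
    """Run each job only after all of its upstream jobs have run."""
    executed = []
    for job in TopologicalSorter(dependencies).static_order():
        run_job(job)
        executed.append(job)
    return executed


# A real run_job would execute a query; this placeholder does nothing.
executed = run_pipeline(DEPENDENCIES, run_job=lambda job: None)
```

Unlike a set of independently timed cron jobs, the order here comes from the declared dependencies, so a slow upstream load can’t be overtaken by the job that needs its output. A full orchestrator adds retries, alerting, and scheduling on top of this.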
Version control
Software engineers use version control to track changes and enforce best practices within their organization. It also brings auditability and traceability, which help reduce your dependency on a single person.
Using version control in your data stack will also give you cover in case you accidentally introduce an error into your processing code. You can always roll back to the previous version before anyone sees incorrect data. GitHub and Google Cloud Source Repositories are our preferred tools for this step.
Documentation
If your team is building a strategic business asset, it had better come with an instruction manual. Encouraging your team to document everything can be tough, but it’s essential as your organization grows and adds more people to the team.
Tools like dbt and Dataform can help with source code documentation, while Google Cloud Data Catalog helps with field-level documentation for your endpoints.
How many team members does it take to support a stack this size? It’s hard to say exactly, but we wouldn’t recommend approaching this with a data team of fewer than five people.
Of course, your business needs to be of a certain size to justify this investment, but some companies reach this break-even threshold faster than others.
Companies in data-intensive industries like gaming or ecommerce can start to move toward a Tier 3 stack while they have fewer than 50 employees, as data is a critical asset for them. For other companies, the ROI only makes sense after they’ve crossed a certain employee or revenue threshold. It’ll be up to you and your executive team to determine how valuable this data is for your business and then make your staffing decisions accordingly.
Over to you
In conclusion, no matter where your business is in its data journey, there is always room to grow. The data landscape is also changing rapidly. What we encourage you to do is get started.
In our view, the Google Cloud Platform has the gentlest learning curve of any major cloud provider, which makes it easy for marketing teams to transition into more technical environments without significant obstacles. The tight integration with Google Sheets and Data Studio makes it that much more approachable and makes it easy to add business value quickly.
And as always, if you’re interested in moving your marketing data to Google Sheets, Google Data Studio, BigQuery, or Google Cloud Storage, simply start your free 14-day trial of Supermetrics today.