Decoding incrementality testing: A deep dive into marketing measurement techniques with Olivia Kory

In today's episode, we're thrilled to have Olivia Kory, Head of Go-To-Market at Haus, join us alongside our host, Evan Kaeding, Lead Solutions Engineer at Supermetrics. Together, they'll delve into the critical topic of measurement and its impact on growth.

You'll learn

  • How to move beyond vanity metrics and traditional attribution models
  • The power of geo-testing for accessible incrementality measurement
  • Strategies for integrating this approach with existing measurement methods
  • Tips for getting started, even with limited data

Subscribe to the Marketing Intelligence Show

Learn from Supermetrics' experts how to use data to fuel growth and maximize the ROI of your marketing spend.

Transcript

Evan:
Welcome to the show Olivia. Nice to have you here.
Olivia Kory:
Thanks Evan. Good to be here.
Evan:
So Olivia, I will have just introduced you to the rest of the audience in the intro here, but would you mind just giving us a quick introduction to yourself?
Olivia Kory:
Sure. My name is Olivia Kory. I am the Head of Go-To-Market at Haus. We are an incrementality and experimentation platform, a couple of years old. Prior to joining Haus, my background was probably similar to a lot of your listeners': performance marketing. I led growth at companies like Sonos, and before that at Quibi and at Netflix as well. So I was exposed to a lot of the types of problems that led us to found Haus.
Evan:
Yeah, thanks for the background and introduction. I think for many of our listeners, incrementality might be a bit of an unfamiliar word, and what I'd like to make sure we do over the course of today's episode is let everybody walk away with an understanding of what it is and how it applies to the business you may be in today. So Olivia, maybe if you could just help us out first: could you give us an understanding of what incrementality is and how that contrasts with what marketers typically think of as marketing measurement?
Olivia Kory:
Of course. I think of incrementality as kind of analogous to causation. You may have heard of the idea of correlation versus causation, and a lot of digital tracking methods, whether that be platform reporting or multi-touch attribution, are built on this idea of correlation. This is really the idea that a user was exposed to an ad or clicked on an ad and then did this thing: they downloaded an app or purchased a product, whatever it is that you and your business care about. Incrementality takes this a step further. What we're doing is trying to establish a counterfactual to understand what that user would have done anyway in the absence of the ad. Would they have purchased, or was it the ad that really drove them to take that action? This is really common in other industries, which makes it a bit surprising that it hasn't quite caught on in advertising yet, or gone as mainstream as you might think.
But if you think about a drug trial, you don't just give the drug to people and see what happens. You establish a group of people who will get the drug and a group that looks just like the first one, statistically indistinguishable, and then you observe both groups and see how much changed between them in order to judge the drug's efficacy. That's exactly what incrementality means in marketing: establishing that counterfactual to understand what's happening anyway, what's happening organically, and what paid marketing or advertising is driving on top.
Evan:
So with incrementality, it's probably fair to say that it's a scientific approach to quantifying the effect of marketing by trying to hold the effect of all external factors equal at least as much as possible. Is that a fair assessment?
Olivia Kory:
Exactly, yep.
Evan:
Okay, great. So you've worked on a number of big brands. I think you mentioned Netflix, Quibi, Sonos. And as a side note, one of the reasons I was introduced to Olivia in the first place is that I was actually at Wieden+Kennedy during the Quibi days, and Olivia and I worked a little bit together on some of the analytics side of the operation there. So when you're at these big brands and working at this large scale, incrementality probably becomes a lot more important. Is that because marketers are taking credit for organic effects that performance marketing wouldn't otherwise be driving? Or is it due to the size of the budgets? Why does this tend to become more important the larger the brand you're working on?
Olivia Kory:
It's a good question, and I also try to frame it up as: when you get to the point in your business where you have an incrementality problem, it's a great thing and something to celebrate. I often say, "Congratulations, this is a great problem to have in your business," because it means there's so much organic demand, so much word of mouth, that it's really starting to get fuzzy in terms of what is actually driving sales. Is it your advertising? Is it word of mouth? Is it your PR efforts, your social efforts? It's just not as clear to see. I often like to give the example that if you are able to turn off Facebook ads tomorrow and see a very clear, measurable impact on your business, you might not need a tool like Haus quite yet. You may be small enough that you can get by with some of these lighter-weight tests.
And so yeah, I would also give the example of a brand like Barbie, where that movie came out over the summer and everybody was really applauding the marketing for the movie, and the team was being hailed as geniuses for all the stunts they were doing. You really have to ask yourself how many people were going to go see Barbie no matter what the marketing looked like, and if anything, maybe they just needed to be reminded the day it was coming out and that it was coming soon. But that's a fantastic example of why it's such a big deal for a brand like Netflix or a brand like Sonos, where it's really hard to untangle what is happening outside of the world of advertising from what your paid ads are driving.
Evan:
Yeah. And I think especially with the Barbie example, you've got a significant portion of the population with an innate, latent understanding of who Barbie is and what she represents. So how can you actually determine the impact of paid ads when in reality you're dealing with what is probably a double-digit share of existing brand awareness? That can be really tough, I'm sure, if you want to quantify the effect of your performance marketing. So is it fair to assume then that with performance marketing, and the way you drive customers to, like you said, a website for conversions, an app download, some kind of product purchase, traditional methods like last-click attribution, or looking at a variety of different orders and their order source, for example, might be over-counting the revenue that is attributable to marketing?
Olivia Kory:
Yes, and we see this all the time, no matter the size. I like to give the example of brand search, just because it's very easy to pick on: if you're searching for a brand and you get served an ad and you click on that ad, were you going to come in anyway? You were expressing intent, you were searching for that brand already, you were likely going to come in and purchase. Did you actually need to pay for that ad? Brand search is one of those channels that looks fantastic in a click-based attribution model; it always has the best reported return on ad spend. And if you were looking only at digital attribution methods, no matter your size, you're probably going to keep investing in brand search.
Another example, a common example is retargeting. I was just shopping for new cookware and I went to some sites of some of these cookware products and now I'm getting slammed with ads on Meta for these companies. I was already in the market, I was likely already going to purchase. And how many of those users really need that level of frequency and that many ads to make that purchase is a really common question and something that again, brands of all sizes should be considering.
So those are examples of tactics or strategies where your click-based attribution model might point you toward investing more when in reality you may not even need to be investing there, and you could cut back and reallocate that spend to more effective channels. Conversely, I would say YouTube is a really good example of a platform that just looks awful in an attribution model. A lot of exposure happens on TV screens, which is really hard to tie back to a click. And so performance marketers just think it can't work, and they shut it off and say, "We're not ready for this. It's more of a brand channel, more top of funnel, and we'll start buying it when we need awareness." But then you can move off of attribution and actually just look at how the channel drove sales in the regions that were treated versus the regions that didn't get YouTube.
And you see it can hang in there right next to search and right next to Meta, which has been a really, really powerful shift for a lot of marketers we've been working with. And that's why it's so important: no one is advocating moving away from attribution models. They provide a lot of value and give you a really good day-to-day look at the business. But you should calibrate those models with experiments when you need to.
Evan:
So it sounds like incrementality can give you a radically different view of your marketing effectiveness and how you spend your money acquiring new customers, and it might exist as a supplement or a complement to traditional click-based methods. What are some of the ways that marketers can get started on actually building out some incrementality understanding? I know there are a lot of different tests that you run at Haus. What are some of those tests, and could you maybe give us some examples of how they work in practice?
Olivia Kory:
Absolutely. I'll start off high level on the types of experiments you can run from a methodological standpoint. There are three types of incrementality tests. The first is an ad platform conversion lift study that you would run directly with Meta or with Google. Because in the world of user acquisition you don't know anything about the users you're acquiring, they're really blind to you, you need to rely on Google and Meta to segment their user base into a treatment and a control group in order to actually set up an incrementality study. So this involves Meta and Google saying, "We're going to divide our user base up, we're going to serve ads to one group and we're not going to serve ads to the other group. And then we're going to tell you at the end of this campaign what that lift was and how the treatment performed versus the holdout."
I think it's a great method. It's been more challenged recently because of the privacy changes where it's not clear anymore that Google can actually allocate these users into treatment and control and see them through the lifetime of that campaign. If you think about it, they can't identify a lot of iOS users. So the privacy changes have really hindered the ability of these platforms to do this in a clean way. And additionally, I'd say that the challenge with platform tests is that the way Google does it is not at all comparable to the way Meta does it, and a lot of vendors don't offer it at all. So you don't really have a standardized measurement approach across channels and it's hard to compare your Meta read to your Google read and you're not getting reads for a lot of the other channels. So in terms of just having an apples to apples methodology there, it's a bit challenging.
The second type of incrementality test, and this is what we focus on at Haus, is geo-testing. You've probably heard of classic matched market testing. This has been around for a long time: we segment populations by geo rather than by user, which is what the ad platforms do. So we'll look at a brand's sales and say, "We're going to segment your high-volume regions evenly between a treatment and a control group, down to medium and low, such that the regions in the treatment group are, again, statistically indistinguishable from the control group. Then you're going to turn on Meta in the treatment group, leave it off in the holdout, and we're going to look at the aggregate sales in those regions versus the control group."
This has great properties because it's privacy safe, we don't need any user level data. And in that way, the setup from an integration standpoint is pretty straightforward. Also great because you can test anything, offline channels included. There's really no difference in methodology across any channel that you test. You can test Meta, Google, but also offline, YouTube, OTT, direct mail in the same way.
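To make that stratified split concrete, here is a minimal sketch of the idea in Python. It is illustrative only: the column names, tier count, and simple alternation rule are assumptions for this example, not Haus's actual assignment logic.

```python
# Minimal sketch of a stratified geo split: bucket regions into volume tiers
# based on historical sales, then alternate assignment within each tier so
# treatment and control look similar before the test starts.
import pandas as pd

def stratified_geo_split(sales: pd.DataFrame, n_tiers: int = 5) -> pd.DataFrame:
    """sales: one row per region with columns ['region', 'total_sales']."""
    df = sales.sort_values("total_sales", ascending=False).reset_index(drop=True)
    df["tier"] = pd.qcut(df["total_sales"], q=n_tiers, labels=False, duplicates="drop")
    # Alternate treatment/control within each tier so high-, medium-, and
    # low-volume regions are represented evenly in both groups.
    df["group"] = (
        df.groupby("tier").cumcount().mod(2).map({0: "treatment", 1: "control"})
    )
    return df

# Example usage with made-up numbers:
regions = pd.DataFrame({
    "region": ["NY", "LA", "CHI", "HOU", "PHX", "PHL", "SA", "SD"],
    "total_sales": [900, 850, 400, 380, 150, 140, 60, 55],
})
print(stratified_geo_split(regions))
```

In practice you would also check that the treatment and control aggregates track each other over the pre-test period before launching, which is the "pre-test fit" Olivia mentions later.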
Historically I would say some complaints about geo-testing are that it's very noisy, it's harder and you need to spend more to actually be able to sufficiently power a read because you're looking at aggregates.
And then the second complaint with geo-testing is that it's very hard to set up. It's very resource intensive and often takes a data scientist to do it in a way where you have a clean comparison between treatment and control, and the analysis requires quite a bit of work as well. This is what we're trying to solve for on both ends. There have been a lot of advancements in methodology over the past few years with synthetic control models that let you get a bit more precision out of these tests and reduce that noise more than you historically would have been able to. And resource-wise, that's again what we're focused on: making this really scalable and highly automated, such that you can set up and launch a test in a few minutes without needing a large data science team, whereas previously this has been a common complaint.
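For readers curious what a synthetic control looks like mechanically, here is a rough sketch of one simple variant: fit non-negative weights on the control geos over the pre-period, use the weighted combination as the counterfactual during the test, and read lift as the difference. This is not Haus's model; the fake data, function name, and the absence of regularization or significance testing are simplifications for illustration.

```python
# Rough sketch of a synthetic-control-style lift read.
import numpy as np
from scipy.optimize import nnls

def synthetic_control_lift(treated_pre, control_pre, treated_test, control_test):
    """
    treated_pre:  (t_pre,)              treated-geo sales before the test
    control_pre:  (t_pre, n_controls)   control-geo sales before the test
    treated_test: (t_test,)             treated-geo sales during the test
    control_test: (t_test, n_controls)  control-geo sales during the test
    """
    weights, _ = nnls(control_pre, treated_pre)   # fit non-negative pre-period weights
    counterfactual = control_test @ weights       # what "no ads" would have looked like
    lift = treated_test - counterfactual          # incremental sales per day
    return lift.sum(), counterfactual

# Fake data: 30 pre-period days, 14 test days, 6 control geos.
rng = np.random.default_rng(0)
control_pre = rng.normal(100, 10, size=(30, 6))
treated_pre = control_pre.mean(axis=1) + rng.normal(0, 2, 30)
control_test = rng.normal(100, 10, size=(14, 6))
treated_test = control_test.mean(axis=1) + 8      # pretend ads add ~8 sales per day
total_lift, _ = synthetic_control_lift(treated_pre, control_pre, treated_test, control_test)
print(round(total_lift, 1))
```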
The third method, and I won't talk too much about this, but we call this observational or natural experiments, where you don't have a holdout group and you're just making a change and you're trying to observe incrementality by looking at kind of before versus after. So an example here is we're rolling out a promotion, we're going to look at sales before this promotion versus the time period where you did run that promotion. There's no holdout group, it's a lot less precise, so it's often a fallback if you can't run a more controlled experiment.
Evan:
Understood. So we've got three different types of tests realistically: the user-based or platform tests, the geo holdout tests, and the observational studies. If I'm a marketer who wants to get started with incrementality, I know there are certain spend thresholds you need to meet to qualify for those platform studies in many cases. Is it something I can just walk right into and design my own test around? For example, if I've got a test I want to run in two different countries, say Colombia and Peru, and I want to run two different promotions for my website, both Spanish-speaking, both South American markets, is that something I can just roll in and start testing, or are there further considerations I need to worry about?
Olivia Kory:
For the geo-testing specifically? Or do you want me to comment on both the user level and geo-testing?
Evan:
I'd say probably for the geo-testing.
Olivia Kory:
For geo-testing. Geo-testing you could set up tomorrow. All you really need is some historical data in order to do the stratified sample I mentioned, where you're assigning regions into treatment and control based on sales. So we need some historical data, but not nearly as much as an MMM would need, just enough to do that segmentation. And then yeah, you could run a geo-test [inaudible 00:17:50] brands. It's actually a great method not just for looking at advertising effects but also for a new product. You could roll out a new product in a certain set of geos and see how those markets behave relative to the holdout group. So there's really a low barrier to entry.
The only thing I'll say is that you want to make sure you're sampling regions in a way where you come out the other end with a result you feel confident in. An example that my founder likes to give is when Disney was launching a test with Google, they kind of accidentally put Anaheim and Orlando in the treatment group and they weren't balanced between treatment and control. And of course Anaheim and Orlando are two very, very large markets for Disney and it skewed the results of the test. So doing that in a way where scientifically you feel confident in the results coming out the other end is important, but that's really the only consideration. It's just establishing that pre-test fit.
Evan:
Understood. And you mentioned one thing about how this can be really beneficial for young brands, for example, to test the effectiveness of their messaging or launch a new product in a couple of different markets and see whether there's a significant lift due to their ads. And in the beginning, we talked about larger organizations that might be dealing with a significant amount of latent brand awareness around whatever it is they're selling or their intellectual property. So what kinds of companies benefit most from incrementality testing as a whole? Is this really within the realm of large, established companies, or is this something that smaller, younger, less established companies should be looking at? Or does it really span the spectrum?
Olivia Kory:
It really spans the spectrum. We work with brands of all sizes. I would say it's hard to give generalized rules of thumb here, but if you're spending less than $5 million a year on paid marketing, you probably don't need to be doing a whole lot of this, just because, as I mentioned, you might be able to see this at the top line. You can just turn off Meta ads for a couple of days and get a read, at least directionally, on how much Meta is driving your business, because you're not at a level where there's a whole lot of PR and word of mouth and organic demand muddying that top line.
Once you start spending north of $5 million, you're getting into $10 million plus, and you're expanding beyond just one single channel. That's also an important note: if you're now spending across both Meta and Google, and perhaps diversifying into YouTube and OTT, and it's starting to get fuzzy in terms of which channel is actually driving that lift, then you'll want to consider incrementality testing and start to experiment, really dip a toe into this.
But at those levels, at let's say $5 million of spend, you likely don't need a tool like this, because you can get most of the way with some of these directional on/off methods, and maybe even platform conversion lift studies through Meta, which they often might not tell you about proactively; you may have to go ask for them, but I would do that. Google has them too, they're just not talking about them actively. You can imagine why, from an incentive standpoint, they might not want to put their revenue at risk, but those tools are there and they can probably get you most of the way in the early days.
Evan:
Sure. And that brings up another good point, which is: as marketers think about adding incrementality into their measurement system, what are the ways incrementality can complement or supplement the methods marketers might already be using? For example, last-click or other click-based attribution, MMM, or other measurement techniques?
Olivia Kory:
I love this question, because we don't claim to be the one single source of truth, and there are papers on how models really can't get close without experiments. So experiments are a critical component, but ultimately these teams need to be able to report day-to-day performance to the business, and experiments won't do that. So what we like to say is that you should be triangulating performance across platform-based attribution, experimentation, and potentially an MMM if you're a certain size and have a more complicated omnichannel business.
So a lot of our customers will run an experiment with us, and let's say they're running it on Google search. They'll see through a test that of the 100 conversions Google reported in platform attribution for the period of the test, 50 were actually incremental: they were caused by the marketing, and we've filtered out the organics. Then they'll take that back and say, "Okay, my incrementality factor, the adjustment I apply to my reported numbers, is 0.5, because 50 out of those 100 conversions were incremental." That's how they're calibrating their platform attribution with experiments to actually report to the business on what happened yesterday. They'll say, "I'll take my platform-reported conversions and cut that in half." So that's where teams are going, and that's my recommendation on how to integrate these systems and methods.
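As a worked version of that arithmetic (the 100 and 50 are from Olivia's example; the spend figure and yesterday's conversion count are hypothetical):

```python
# Calibrating platform attribution with an experiment-derived incrementality factor.
reported_conversions = 100        # what the ad platform claimed during the test period
incremental_conversions = 50      # what the experiment attributed to the ads
incrementality_factor = incremental_conversions / reported_conversions   # 0.5

spend = 10_000                    # hypothetical test-period spend
reported_cpa = spend / reported_conversions         # $100 per reported conversion
incremental_cpa = spend / incremental_conversions   # $200 per truly incremental conversion

# Day-to-day reporting then scales platform numbers by the factor:
yesterday_reported = 42
yesterday_adjusted = yesterday_reported * incrementality_factor   # ~21 incremental conversions
print(incrementality_factor, reported_cpa, incremental_cpa, yesterday_adjusted)
```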
Evan:
That's a great tip. Once you figure out your incrementality factor, you can really apply it to your day-to-day measurement with the platform reported conversions. In fact, that's what I've seen on a variety of different clients specifically in the mobile app and gaming space where you run one of these tests maybe quarterly, maybe every six months, and then you have a dashboard basically that's pulling in all of your daily conversions and it's scaling it by whatever your incrementality factor is. So whether it's 0.5, 0.6, 0.4, whatever it happens to be, that gives you at least a reasonable degree of confidence of how many conversions you might've had over the course of the previous day or two.
That leads me to another great, I think, topic for us to jump into, which is marketers who come to Supermetrics oftentimes want to get involved in more sophisticated technology and more sophisticated marketing measurement techniques. Incrementality obviously can certainly be one of those, but specifically with MMM, I think one of the things that people find challenging is number one, the sheer amount of historical data that you need to bring to the table in order for MMM to be an effective technique.
And number two, and this is the big one, is the quality of that data needs to be very high in order for MMM to be an effective technique. So we've established this, and I've spoken a bit about this on previous episodes of the podcast, but could you talk a little bit about data preparation both in terms of quality and quantity and what brands should be thinking about for their marketing data and how they can best set themselves up for success when it comes to incrementality testing?
Olivia Kory:
Absolutely. Data quality is so important, and even beyond experimentation: when I was leading the team at Quibi, and at Netflix as well, the first tool we bought was one that would help us normalize and ingest spend data, conversion data, and platform data across every platform, so we could see it all in one centralized, consolidated view. I'll get into experimentation, but just for general reporting purposes, being able to see how much you spent on a certain creative across platforms is so hard to do and takes so much manual time. We saw that even before Haus: you should probably have a tool like that in place just from an efficiency standpoint.
In terms of experimentation, geo-testing is a bit unique in that we don't need a whole lot of data, as I mentioned. We need sales by day by geo. Looking back, we typically ask for about 6 to 12 months, but if you don't have that, it's fine. And then we just need spend for any given test in order to calculate an incremental CPA or an incremental return on ad spend. So we just need, again, spend for the period of the test, and then we're [inaudible 00:26:42] the sales not from the ad platform, but likely from a customer's Shopify account or from their data warehouse where they have sales.
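For illustration, those two headline metrics reduce to simple ratios once you have test-period spend and the measured lift in the treated regions; all figures below are hypothetical.

```python
# Incremental CPA and incremental ROAS from a geo test.
test_spend = 50_000              # paid spend in treated geos during the test
incremental_orders = 400         # lift in orders vs. the control/counterfactual
incremental_revenue = 60_000     # lift in revenue over the same period

incremental_cpa = test_spend / incremental_orders       # $125 per incremental order
incremental_roas = incremental_revenue / test_spend     # 1.2x incremental return on ad spend
print(incremental_cpa, incremental_roas)
```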
If we want to take this to the next level, which is where a lot of our customers are, where they're using experiment data to calibrate their platform attribution or their MTA, then you do need spend, you do need conversions, you need a lot more from the ad platforms in order to be able to apply those adjustments on a regular basis. And that's where we see tools like Supermetrics really come in handy where that would be really messy and cumbersome without a tool where you're looking across all of your channels and you can see your spend and your attributed CPAs day over day.
Evan:
Makes a lot of sense. Makes a lot of sense. And when you have customers come to Haus and they say, "Olivia, Haus team, we need some help getting started on building out incrementality tests." What do the first 30, 60, 90 days look like for them? Is it really understanding the business, understanding the strategy? Are there data quality audits or different practices that you need to instill? What are some of the challenges that customers have when they first start getting used to this different way of measuring their marketing?
Olivia Kory:
One of the biggest challenges, and I would say opportunities, is that this is just a very new way of reporting, of looking at things. The idea of an incremental CPA, call it an iCPA, is so new that I think folks don't even have a guess or an assumption on what their incremental CPA is. They're starting from scratch. So it's about resetting and establishing a baseline, a foundational understanding of what incrementality is going to show you, and resetting some benchmarks of what to expect. Because we've seen that for certain tactics, you might think you were getting a $10 CPA, and after running a test you might find it was actually $100, so 10x what you thought. So it really is resetting on the metrics we report on, how those differ from what you're used to seeing, and educating the organization on why this is different and why it's important.
The onboarding piece outside of that education across your org is pretty straightforward. We want to understand where you're spending and your confidence level in these different channels, if you will. We just cut out. Yeah.
Evan:
Yeah. We just cut out.
Olivia Kory:
We're back now.
Evan:
Are we back? Okay.
Olivia Kory:
Okay, we're back.
Evan:
We're back now. So last thing I heard was the onboarding, you said the onboarding is pretty straightforward. You said we want to understand, I think you said your spend, that's where you left off.
Olivia Kory:
Yep, yep. The onboarding is straightforward. We want to understand where you're spending, which channels you're spending in, and which channels you feel confident in versus perhaps the channels you're a little more skeptical of. Then we'll help our customers build a testing roadmap: in the first 30, 60, 90 days, what are the testing priorities? And in parallel, there's a data onboarding where, as I mentioned, we'll ingest historical sales by day, by geo. That's much easier for certain brands than others. If you're an e-comm brand and you have a shipping address, it's simple. If you're on Shopify, it's just a very basic integration and it takes a couple of days.
But if you're a mobile app, for example, that geo data might not be as straightforward. If you're a free mobile app, the way you're collecting geographic data is a little more complicated. So that's one area where, if you want to do more geo-testing, it's worth thinking about whether the action you care about, whether that's installs or signups, is broken down at a level of geo granularity that will enable you to test.
Evan:
That's super helpful. And one thing, Olivia, that I think is going to be interesting for a lot of marketers using Supermetrics: we've talked about a lot of brands in this episode so far, Sonos, Netflix, a couple of gaming examples, a couple of e-commerce examples, all of which are largely B2C. But I'm curious to know, based on your experience, do you also see broad applicability in the B2B space? For example, if we're generating orders or doing lead gen, where maybe the signal is strong but far more sparse in terms of revenue. When I speak to, for example, large commercial automotive manufacturers, they say, "Evan, everything you're talking about sounds great, but fundamentally we only get one order a month, and that order is going to make or break our quarter." So can we still use tools like this to help out when the signal, just from a quantity perspective, isn't quite as strong on the revenue side as it might be for a B2C company?
Olivia Kory:
It's a great question. We get this all the time, and not just for B2B, but to your point, also from B2C brands that have a higher average order value, like a mattress company, for example, where you're looking at a $1,000 purchase. There are a couple of things you can do here, and you mentioned one of them. This is very common for some of our B2B customers: you can test a shallower KPI. You could test leads, for example, and that shallower event is going to give you a view of how your marketing is driving some of those actions higher up in the funnel. I would say that's a really common path.
And then additionally, you could run a longer test. So instead of the two-to-three-week experiments most of our customers run, you could run a six or eight week experiment, and you could see not just how marketing is driving those shallower KPIs, but how it's resulting in revenue and in sales.
So those are the two pieces of advice I have for those types of brands. But in terms of B2B versus B2C, we do work with a number of B2B customers who are spending on paid marketing and they want to understand how paid marketing is driving the business. In that way, it's no different from the B2C use case. It is a matter of whether they have the data that they need to actually understand those outcomes. And if they have leads, it's a really great use case and we can follow those groups of geos through several weeks or months to understand how those leads are actually resulting in sales. So it's a great use case and we're seeing a lot more of it. If you have an enterprise sales motion and you're not spending in paid marketing, you're not buying Meta ads, it's probably not as strong of a use case just because the geo segmentation there gets a little bit messy.
Evan:
Yeah, I like your idea of picking a shallower KPI. So rather than modeling on sales, which is of course what we're trying to drive at the end of the day, you pick something like leads or form fills that happens a bit more frequently and that you can read over a shorter period of time. Or you use a longer time horizon to capture more of that data.
Olivia Kory:
Exactly. And one thing I really like, and some brands will do this, is you run one longer-term experiment and see how those leads translated into sales over time. Then you have an adjustment factor you can apply to future tests, so you don't have to run a very long test every time you want to run an experiment. You can just say, "Okay, based on our learnings from this previous experiment, we expect X% of these leads to actually turn into sales."
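A quick sketch of how that adjustment factor carries forward, with hypothetical numbers throughout:

```python
# One long test establishes a lead-to-sale rate; shorter, lead-based tests reuse it.
long_test_incremental_leads = 1_000
long_test_incremental_sales = 120
lead_to_sale_rate = long_test_incremental_sales / long_test_incremental_leads   # 12%

# Later, a two-week test reads incremental leads only:
short_test_incremental_leads = 300
estimated_incremental_sales = short_test_incremental_leads * lead_to_sale_rate  # ~36 sales
print(lead_to_sale_rate, estimated_incremental_sales)
```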
Evan:
Understood. Makes total sense. Olivia, thanks so much for your time on the podcast today. Anything else that you'd like to leave our audience with as last words?
Olivia Kory:
No, just so excited to be here. Thank you for having me on. And yeah, I'm available on LinkedIn, Twitter, if you want to reach out and have more questions. We love talking about this stuff.
Evan:
Awesome. Thank you very much. I'm sure some of our audience will be in touch.
Olivia Kory:
Thanks, Evan.
Evan:
Of course. See you later.
Olivia Kory:
Bye.

Turn your marketing data into opportunity

We streamline your marketing data so you can focus on the insights.
