Many Big Tech companies have created platforms that offer businesses tips, tools, and services for targeting their customers online. These can be helpful, but anybody relying on them needs to be very careful. That’s because many of these companies’ claims about how to measure advertising effectiveness are wrong.
Several major tech companies have recently built platforms that claim to educate companies about how best to market themselves and their products online. Examples include Meta for Business (formerly Facebook for Business; “Get step-by-step guidance, industry insights and tools to track your progress, all in one place”), Think with Google (“Take your marketing further with Google”), and Twitter for Business (“Grow your business with Twitter ads”).
These sites are very appealing. They provide small and medium-sized companies an abundance of genuinely helpful information about how to do business online, and, of course, they offer a variety of advertising tools and services designed to help those companies boost their performance.
All of these sites have the same basic goal. They want you to see their tools and services as powerful and highly personalized, and they want you to invest your marketing dollars in them.
Not as Simple as It Looks
Facebook is perhaps the most insistent of the three companies cited above. In recent weeks, the company has been broadcasting ads that tell all sorts of inspiring stories about the small businesses that it has helped with its new services. Maybe you’ve seen some of these ads at airports, in magazines, or on websites. My Jolie Candle, a French candlemaker, “find[s] up to 80% of their European customers through Facebook platforms.” Chicatella, a Slovenian cosmetics company, “attributes up to 80% of their sales to Facebook’s apps and services.” Mami Poppins, a German baby-gear supplier, “uses Facebook ads to drive up to half of their revenue.”
That sounds impressive, but should businesses really expect such large effects from advertising? The fact is, when Facebook, Google, Twitter, and other Big Tech companies “educate” small businesses about their services, they are often encouraging those businesses to draw incorrect conclusions about the causal effects of advertising.
Consider the case of a consulting client of ours, a European consumer goods company that for many years has positioned its brand around sustainability. The company wanted to explore whether an online ad that makes a claim about convenience might actually be more effective than one that makes a claim about sustainability. With the help of Facebook for Business, it ran an A/B test of the two ads and then compared the return on advertising spend between the two conditions. The return, the test found, was much higher for the sustainability ad. Which means that’s what the company should invest in, right?
Actually, we don’t know.
There’s a fundamental problem with what Facebook is doing here: The tests it offers under the name “A/B tests” are not actually A/B tests at all. This is poorly understood, even by experienced digital marketers.
So what’s really going on in these tests? Here’s one example:
1) Facebook splits a large audience into two groups, but not everybody in those groups will receive a treatment; many people will never see an ad at all.
2) Facebook starts selecting people from each group, and it provides a different treatment depending on the group a person was sampled from. For example, a person selected from Group 1 will receive a blue ad, and a person selected from Group 2 will receive a red ad.
3) Facebook then uses machine-learning algorithms to refine its selection strategy. The algorithm might learn, say, that younger people are more likely to click on the red ad, so it will then start serving that ad more to young people.
Do you see what’s happening here? The machine-learning algorithm that Facebook uses to optimize ad delivery actually invalidates the design of the A/B test.
Here’s what we mean. A/B tests are built on the idea of random assignment. But are the assignments made in Step 3 above random? No. And that has important implications. If you compare the treated people from Group 1 with the treated people from Group 2, you’ll no longer be able to draw conclusions about the causal effect of the treatment, because the treated people from Group 1 now differ from the treated people from Group 2 on more dimensions than just the treatment. The treated people from Group 2 who were served the red ad, for example, would end up being younger than the treated people from Group 1 who were served the blue ad. Whatever this test is, it’s not an A/B test.
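The problem is easy to demonstrate with a small simulation. Everything below is hypothetical — the audience, the “younger people click the red ad” rule, and the serving probabilities are invented for illustration — but it shows how optimized delivery destroys the comparability that random assignment was supposed to guarantee:

```python
import random

random.seed(0)

# Hypothetical audience: each person has a randomly drawn age.
audience = [{"age": random.randint(18, 65)} for _ in range(100_000)]

# Step 1: split the audience at random into two groups.
group1, group2 = audience[:50_000], audience[50_000:]

# Steps 2-3: the delivery algorithm has "learned" that younger people
# click the red ad, so it preferentially serves the red ad to the
# young. (A crude stand-in for an ad platform's delivery optimization.)
def serve(group, prefers_young):
    treated = []
    for person in group:
        if prefers_young:
            p = 0.8 if person["age"] < 35 else 0.2  # red ad: skews young
        else:
            p = 0.5                                 # blue ad: no skew
        if random.random() < p:
            treated.append(person)
    return treated

treated_blue = serve(group1, prefers_young=False)
treated_red = serve(group2, prefers_young=True)

def avg_age(people):
    return sum(p["age"] for p in people) / len(people)

print(f"avg age, treated blue: {avg_age(treated_blue):.1f}")
print(f"avg age, treated red:  {avg_age(treated_red):.1f}")
```

Even though the initial split was perfectly random, the treated subsets end up differing systematically in age. Any difference in their purchase rates now reflects both the creative and the audience the algorithm steered it toward.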
It’s not just Facebook. The Think with Google site suggests that ROI-like metrics are causal, when in fact they are merely associative.
Imagine that a business wants to learn if an advertising campaign is effective at increasing sales. Answering this question, the site suggests, involves a straightforward combination of basic technology and simple math.
First, you set up conversion tracking for your website. This allows you to track whether customers who clicked on an ad went on to make a purchase. Second, you compute total revenues from these customers and divide by (or subtract from) your advertising expenditures. That’s your return on investment, and according to Google, it’s “the most important measurement for retailers because it shows the real effect that Google Ads has on your business.”
Actually, it’s not. Google’s analysis is flawed because it lacks a point of comparison. To really know whether advertising is making profits for your business, you’d need to know what revenues would have been in the absence of advertising.
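A stylized calculation makes the gap concrete. The figures below are invented; the point is that the click-based recipe can report a handsome return even when the ads lose money once you account for revenue that would have arrived anyway:

```python
# Hypothetical numbers illustrating why click-based ROI needs a baseline.
ad_spend = 10_000.0

# Revenue from customers who clicked an ad and then purchased.
revenue_from_clickers = 50_000.0

# Click-based "ROI" in the spirit of the conversion-tracking recipe:
naive_roi = (revenue_from_clickers - ad_spend) / ad_spend
print(f"naive ROI: {naive_roi:.0%}")  # looks like a 400% return

# The missing comparison: suppose a randomized holdout that saw no ads
# implies 45,000 of that revenue would have occurred regardless.
baseline_revenue = 45_000.0
incremental_revenue = revenue_from_clickers - baseline_revenue
true_roi = (incremental_revenue - ad_spend) / ad_spend
print(f"incremental ROI: {true_roi:.0%}")  # actually a 50% loss
```

The naive number credits the ads with every dollar their clickers spent; the incremental number credits them only with the dollars they caused.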
Twitter for Business offers a somewhat more involved proposition.
First, Twitter works with a data broker to get access to cookies, emails, and other identifying information from a brand’s customers. And then Twitter adds information about how these customers relate to the brand on Twitter — whether they click on the brand’s promoted tweets, for example. This supposedly allows marketing analysts to compare the average revenue from customers who engaged with the brand to the average revenue from customers who did not. If the difference is large enough, the theory goes, then it justifies the advertising expenditure.
This analysis is comparative, but only in the sense of comparing apples and oranges. People who regularly buy cosmetics don’t buy them because they see promoted tweets. They see promoted tweets for cosmetics because they regularly buy cosmetics. Customers who see promoted tweets from a brand, in other words, are very different people from those who don’t.
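This selection effect is easy to reproduce in a toy simulation. All quantities below are hypothetical, and the ads in this simulation have zero causal effect by construction; the revenue gap between “engaged” and “not engaged” customers is driven entirely by underlying brand affinity:

```python
import random

random.seed(1)

# Hypothetical customers with an underlying affinity for the brand.
# Affinity drives BOTH engagement with promoted tweets AND purchases;
# the ads themselves change nothing in this simulation.
customers = [{"affinity": random.random()} for _ in range(100_000)]

for c in customers:
    c["engaged"] = random.random() < c["affinity"]           # clicks promoted tweets
    c["revenue"] = 100 * c["affinity"] + random.gauss(0, 5)  # spend from affinity alone

def avg_revenue(group):
    return sum(c["revenue"] for c in group) / len(group)

engaged = [c for c in customers if c["engaged"]]
not_engaged = [c for c in customers if not c["engaged"]]

print(f"avg revenue, engaged:     {avg_revenue(engaged):.2f}")
print(f"avg revenue, not engaged: {avg_revenue(not_engaged):.2f}")
# The gap is pure selection: high-affinity customers both engage more
# and spend more, even though the ads had no effect on anyone.
```

A Twitter-style engaged-vs-not comparison here would show a large revenue gap and appear to justify the ad spend, despite a true causal effect of exactly zero.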
Companies can answer two types of questions using data: They can answer prediction questions (as in, “Will this customer buy?”) and causal-inference questions (as in, “Will this ad make this customer buy?”). These questions are different but easily conflated. Answering causal inference questions requires making counterfactual comparisons (as in, “Would this customer have bought without this ad?”). The smart algorithms and digital tools created by Big Tech companies often present apples-to-oranges comparisons to support causal inferences.
Big Tech should be well aware of the distinction between prediction and causal inference and how important it is for effective resource allocation — after all, for years they’ve been hiring some of the smartest people on this planet. Targeting likely buyers with ads is a pure prediction problem. It does not require causal inference, and it’s easy to do with today’s data and algorithms. Persuading people to buy is much harder.
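For contrast, here is a sketch of what a genuine causal measurement looks like: a randomized experiment in which assignment is fixed up front and never re-optimized, and the full treatment group is compared with the full control group. The baseline purchase rate and the true lift are invented for illustration:

```python
import random

random.seed(2)

# A genuine randomized experiment: assignment is random and is NOT
# revised by any delivery algorithm, so the groups stay comparable.
def purchase(saw_ad):
    base = 0.050                      # hypothetical baseline purchase rate
    lift = 0.004 if saw_ad else 0.0   # hypothetical true causal effect
    return random.random() < base + lift

n = 200_000
treatment = [purchase(saw_ad=True) for _ in range(n)]
control = [purchase(saw_ad=False) for _ in range(n)]

def rate(group):
    return sum(group) / len(group)

estimated_lift = rate(treatment) - rate(control)
print(f"treatment rate: {rate(treatment):.4f}")
print(f"control rate:   {rate(control):.4f}")
print(f"estimated lift: {estimated_lift:.4f}")
```

Because assignment stays random throughout, the difference in purchase rates is an unbiased estimate of the ad’s causal effect — and note how small a realistic lift is compared with the 80% figures in the ads quoted above.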
Big Tech firms should be commended for the helpful materials and tools they make available to the business community. But small and medium-sized businesses should be aware that advertising platforms are pursuing their own interests when they offer training and information, and that those interests may or may not be aligned with their own.
Editor’s Note (12/16): The headline on this piece has been updated.