What is marketing or media mix modeling?

Terence Einhorn, Sr. Solutions Consultant in Sales

Published 04/13/2022

Introduction to Media Mix Modeling

What is Media Mix Modeling?

Media Mix Modeling, or MMM, is a method of estimating media’s impact on a business’s sales by observing how week-to-week (or day-to-day) variation in media exposure is associated with variation in sales. The association (or correlation) of different media drivers to sales is used to calculate the relative impact of each, typically with respect to baseline sales, i.e., the sales realized without any media support.

MMMs typically require two or more years of complete and stable historical media data, including spend, impressions, and clicks, as well as data on non-media factors such as seasonality, special events, economic conditions, weather, competitive activities, pricing, and operations. While laborious and time-consuming to implement, MMM has a unique advantage: it can measure virtually any marketing tactic for which there is historical data.

What are the differences between “media mix modeling” and “marketing mix modeling”?

These two terms are often used interchangeably but have one subtle difference. “Media Mix Modeling” generally only addresses Paid Advertising efforts directly, while “Marketing Mix Modeling” endeavors to quantify the impact of any and all Marketing efforts on Sales. This could include PR, Sponsorships, Promotional Pricing, Coupons, In-Store Events (e.g., Displays and Product Demos), etc.

How does Media Mix Modeling Work?

Media Mix Modeling processes can be broken down into a few separate workstreams:

Data Collection and Validation

Because MMM aims to isolate the impact of Media on Sales, it must account for the other factors that could also impact Sales. These include Marketing, Pricing, In-store activities, Significant one-time Events, Competitive activity, Macroeconomic factors, and Industry-specific factors, among many others. The team therefore collects historical data, broken out weekly (in some cases daily), for any factor they expect to have a meaningful relationship to Sales, and consolidates it into one cohesive dataset. This often requires extensive processing to align different datasets to the same dimensions, fill in data gaps where needed, and remove erroneous and outlier data. Once processing is complete, the processed datasets are generally returned to their originators for validation (QA) before the data receives final sign-off and modeling begins.
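The alignment step can be illustrated with a minimal Python sketch. It assumes a Monday-based weekly calendar and treats missing weeks as zero, which is only appropriate for series such as spend where "no record" plausibly means "no activity"; the function names and sample figures are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

def week_start(d: date) -> date:
    """Return the Monday of the week containing d."""
    return d - timedelta(days=d.weekday())

def to_weekly(daily_rows):
    """Sum (date, value) rows into {week_start: total}."""
    weekly = defaultdict(float)
    for d, value in daily_rows:
        weekly[week_start(d)] += value
    return dict(weekly)

def align(series_by_name, fill=0.0):
    """Put several weekly series on one shared, gap-free weekly calendar."""
    all_weeks = [w for s in series_by_name.values() for w in s]
    start, end = min(all_weeks), max(all_weeks)
    rows = []
    week = start
    while week <= end:
        row = {"week": week}
        for name, series in series_by_name.items():
            row[name] = series.get(week, fill)  # assumption: a gap means zero
        rows.append(row)
        week += timedelta(weeks=1)
    return rows

# Hypothetical inputs: daily search spend and an already-weekly sales series.
daily_search = [
    (date(2022, 1, 3), 100.0),
    (date(2022, 1, 5), 50.0),
    (date(2022, 1, 18), 80.0),
]
weekly_sales = {
    date(2022, 1, 3): 1000.0,
    date(2022, 1, 10): 1100.0,
    date(2022, 1, 17): 1200.0,
}

dataset = align({"search_spend": to_weekly(daily_search), "sales": weekly_sales})
```

In practice the fill rule is a per-variable decision: a gap in a price or weather series would usually be interpolated or flagged for the data owner rather than set to zero.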

Modeling

Once the dataset is complete, model design can begin. A modeler will generally load the dataset into a modeling platform (such as R, Python, SAS, SPSS, or an in-house platform) and begin by deciding on a model structure. Though there are many variations of MMM, most models follow a similar multiplicative structure:

$$\text{Sales}_w = \beta_0 \prod_{i} x_{i,w}^{\beta_i}$$

where $x_{i,w}$ represents the spend, impressions, clicks, etc. for Channel $i$ in week $w$, $\beta_0$ is a baseline term, and $\beta_i$ is the “coefficient,” the measured quantity of interest representing the degree to which a change in Channel $i$’s Spend/Impressions/Clicks corresponds to a change in Sales in any week $w$.
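One way to read a coefficient in a multiplicative structure is as an elasticity: a 1% change in a channel's exposure changes Sales by roughly that channel's coefficient, in percent. The sketch below assumes the form Sales = base × Π x^β; the coefficient values and exposure levels are hypothetical.

```python
def multiplicative_sales(base, exposures, betas):
    """Sales = base * prod(x_i ** beta_i) for a single week."""
    sales = base
    for channel, x in exposures.items():
        sales *= x ** betas[channel]
    return sales

betas = {"tv": 0.10, "search": 0.05}   # hypothetical coefficients
week = {"tv": 200.0, "search": 50.0}   # hypothetical weekly exposure

before = multiplicative_sales(1000.0, week, betas)
after = multiplicative_sales(1000.0, {**week, "tv": week["tv"] * 1.01}, betas)

# Percentage change in Sales caused by a 1% increase in TV exposure.
pct_change = (after / before - 1) * 100
```

Here `pct_change` comes out close to 0.1, matching the TV coefficient of 0.10.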

There are a few MMM model structure variations and modeling components that are important to understand:

Additive vs. Multiplicative Model: 

An additive model sums up the effects of various factors rather than multiplying them together as in the above. This technique employs an assumption that marketing drivers are exactly additive in nature, independent of each other, and do not interact, and their cumulative effect is equal to the sum of the individual drivers. An example of an additive model is:

$$\text{Sales}_w = \beta_0 + \sum_{i} \beta_i \, x_{i,w}$$

A multiplicative model multiplies the terms together, which employs a different assumption that marketing drivers perform better when supported by other marketing drivers. In other words, your various marketing channels interact, and the total effect is greater than the sum of the individual drivers’ effects.
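The difference between the two assumptions can be seen in a small Python sketch. It measures the Sales lift from doubling TV at two different levels of Search support; all coefficients and exposure values are hypothetical.

```python
def additive_sales(base, x, beta):
    """Sales = base + sum(beta_i * x_i)."""
    return base + sum(beta[c] * x[c] for c in x)

def multiplicative_sales(base, x, beta):
    """Sales = base * prod(x_i ** beta_i)."""
    sales = base
    for c in x:
        sales *= x[c] ** beta[c]
    return sales

def tv_lift(model, base, beta, search_level):
    """Sales gained by raising TV from 100 to 200 at a given Search level."""
    low = model(base, {"tv": 100.0, "search": search_level}, beta)
    high = model(base, {"tv": 200.0, "search": search_level}, beta)
    return high - low

add_beta = {"tv": 2.0, "search": 3.0}     # hypothetical additive coefficients
mul_beta = {"tv": 0.10, "search": 0.05}   # hypothetical multiplicative coefficients

add_lifts = [tv_lift(additive_sales, 1000.0, add_beta, s) for s in (10.0, 100.0)]
mul_lifts = [tv_lift(multiplicative_sales, 1000.0, mul_beta, s) for s in (10.0, 100.0)]
```

In the additive case the TV lift is identical at both Search levels, while in the multiplicative case the same TV increase is worth more when Search support is higher.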

Indirect Models:

MMMs often deploy a “model within a model,” referred to as an “Indirect” or “Intermediate” model. This is useful when you need to control for misleading collinearity between a Media Channel and Demand, a situation where Demand is itself a driver of Media execution.

For example, when more people are interested in your product, you are likely to receive more Paid Search clicks. In such a case, an MMM without an indirect model would likely attribute a vast amount of credit to Paid Search as it will be highly correlated with Sales, but we know that upstream Channels deserve a large share of the credit for driving users to search in the first place.

The solution is to build a multi-level MMM that adds a second model whose output is not Sales but Search clicks, with the same Media Channels used as drivers. This second model estimates the degree to which each Channel drives Search volume, and a corresponding portion of the credit assigned to Search in the main model is allocated back to those drivers.
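The credit reallocation can be sketched as simple arithmetic. The example assumes the indirect model has already produced, for each upstream Channel, the share of Search clicks it drives; the function name, shares, and contribution figures are hypothetical.

```python
def reallocate_search_credit(main_contrib, driver_shares, search_key="paid_search"):
    """Move part of Search's contribution back to the channels that drove the searches.

    main_contrib: {channel: sales contribution} from the main model.
    driver_shares: {channel: share of Search clicks driven by that channel},
        as estimated by the indirect model; shares must sum to at most 1.
    """
    if sum(driver_shares.values()) > 1.0 + 1e-9:
        raise ValueError("driver shares cannot exceed 100% of Search clicks")
    search_credit = main_contrib[search_key]
    adjusted = dict(main_contrib)
    for channel, share in driver_shares.items():
        moved = search_credit * share
        adjusted[channel] = adjusted.get(channel, 0.0) + moved
        adjusted[search_key] -= moved
    return adjusted

main = {"tv": 300.0, "social": 100.0, "paid_search": 600.0}  # hypothetical
shares = {"tv": 0.30, "social": 0.10}                        # hypothetical

adjusted = reallocate_search_credit(main, shares)
```

Total contribution is unchanged by this step; only its split across Channels moves, with Search keeping the portion of its credit that no upstream Channel explains.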

Bayesian Priors:

Some modeling software leverages a hierarchical Bayesian framework. This allows “Priors” to be used in estimating the coefficients of a model. Priors are ranges of historical or known estimates that serve as bounds within which the modeling software can freely estimate model coefficients. Priors are useful for stabilizing model reads where data is sparse but, more importantly, they allow you to leverage causal inference results from real-world experiments to define coefficients for testable Channels, which in turn makes the entire model more credible.
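The stabilizing effect of a prior can be illustrated for a single coefficient. Real hierarchical Bayesian software estimates all coefficients jointly, so the sketch below is only a simplified stand-in: it assumes a normal prior (for example, from an experiment) and a normal estimate from the model data, and combines them by precision weighting. All numbers are hypothetical.

```python
def posterior_normal(prior_mean, prior_sd, data_mean, data_se):
    """Combine a normal prior with a normal estimate from the data.

    Returns (posterior_mean, posterior_sd) using precision weighting.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / data_se ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * data_mean)
    return post_mean, post_var ** 0.5

# Hypothetical: an experiment suggests 2.0 with sd 0.25, while sparse
# model data alone suggests 4.0 with standard error 1.0.
post_mean, post_sd = posterior_normal(2.0, 0.25, 4.0, 1.0)
```

Because the experimental prior is much tighter than the noisy data estimate, the combined estimate stays close to 2.0 rather than jumping to 4.0.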

Response Functions:

Most common MMM structures employ an assumption that Media performance does not scale linearly. In other words, the more you spend on the Channel, the less efficient it is at driving Sales. This non-linear relationship between Spend and Sales, also known as a “response function with diminishing returns” or “diminishing returns curve”, can take different shapes depending on the model structure, Channel, and use case. 

Regardless of the exact shape of the response function, the core assumption remains the same: the ROI you receive on your “next dollar spent” will not equal your current ROI but will likely be lower. The ROI of your “next dollar spent” is referred to as Marginal ROI (mROI), which is an important metric for media plan optimization (see later section).
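The gap between ROI and mROI can be made concrete with a small sketch. It assumes one possible curve shape, an exponential saturation curve, with hypothetical `cap` and `scale` parameters; other shapes would give different numbers but the same pattern.

```python
import math

def response(spend, cap=1000.0, scale=500.0):
    """Hypothetical diminishing-returns curve: sales driven at a given spend."""
    return cap * (1.0 - math.exp(-spend / scale))

def roi(spend, **kw):
    """Average return per dollar across all dollars spent so far."""
    return response(spend, **kw) / spend

def marginal_roi(spend, step=1.0, **kw):
    """Return on the next `step` dollars, via a finite difference."""
    return (response(spend + step, **kw) - response(spend, **kw)) / step

for spend in (100.0, 500.0, 1500.0):
    print(f"spend={spend:>6.0f}  ROI={roi(spend):.2f}  mROI={marginal_roi(spend):.2f}")
```

At every spend level on a curve like this, mROI sits below ROI, and both fall as spend increases.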

Adstocks and Lag:

It is generally assumed that certain upper funnel and branding Channels, such as Print Catalog, have a delayed and/or an extended presence in the lives and minds of consumers after their initial date of execution, and their impact on Sales should not be thought of as “immediate.”

MMMs can build this assumption into the model by “transforming” the media variables using an “Adstock” function. In simple terms, Adstock takes the execution variable of a Channel (for example, Total Sent in the case of Catalog) and “spreads it out” over a period of time after execution so that its potentially extended impact can be measured.

“Lag” is a similar function that shifts a Channel’s media exposure to a later date than its execution date, simulating the delay between media execution and a customer actually “acting on the media.” In the Catalog example, it usually takes a few weeks from the official “send” date for a customer to open, view, and order from the Catalog.
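Both transformations can be sketched in a few lines of Python. The example assumes a geometric Adstock, one common choice in which a fixed fraction of each week's effect carries into the next; the decay rate, lag length, and Catalog figures are hypothetical.

```python
def geometric_adstock(series, decay):
    """Carry a fraction `decay` of each period's effect into the next period."""
    carried, out = 0.0, []
    for x in series:
        carried = x + decay * carried
        out.append(carried)
    return out

def lag(series, periods):
    """Delay a series by `periods` steps, padding the start with zeros."""
    if periods <= 0:
        return list(series)
    return ([0.0] * periods + list(series))[:len(series)]

catalogs_sent = [100.0, 0.0, 0.0, 0.0, 0.0]  # one Catalog drop in week 1 (hypothetical)

lagged = lag(catalogs_sent, 2)                       # customers act two weeks later
transformed = geometric_adstock(lagged, decay=0.5)   # effect then fades by half each week
```

With these settings the single week-1 drop becomes `[0.0, 0.0, 100.0, 50.0, 25.0]`: nothing for two weeks, then an effect that tapers off. The transformed series, not the raw send counts, is what enters the regression.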

Calculation

Once the model structure and Priors are defined and the final dataset is loaded, the modeling software runs a statistical regression algorithm (a form of supervised machine learning).

In simple terms, a regression looks through the dataset and observes how week-to-week (or day-to-day) variations in the independent variables (Media, Non-Media, External, etc.) correlate to week-to-week variations in the dependent variable (Sales). 

The more closely an independent variable, or “driver,” is correlated with Sales, all else equal, the higher the coefficient, or average calculated impact on Sales, will be.
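As a minimal illustration, the sketch below fits a single-driver multiplicative model by ordinary least squares after taking logs, which turns Sales = base × spend^β into a straight line. The data is synthetic and noise-free so the known coefficient is recovered; a real MMM would have many drivers, noise, and usually a dedicated statistics library.

```python
import math

def ols_slope_intercept(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical weekly data generated from Sales = 1000 * spend ** 0.2 (no noise).
spend = [50.0, 80.0, 120.0, 200.0, 90.0, 150.0]
sales = [1000.0 * s ** 0.2 for s in spend]

# log(Sales) = log(base) + beta * log(spend)
beta, log_base = ols_slope_intercept(
    [math.log(s) for s in spend],
    [math.log(y) for y in sales],
)
base = math.exp(log_base)
```

The fit depends entirely on week-to-week variation in `spend`: if spend were the same every week, `sxx` would be zero and no coefficient could be estimated.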

Results and Output

Finally, the model software will leverage the coefficients calculated in the regression process and use them to calculate the historical “Contribution” of each Media driver to the business. 

Contributions, often referred to as “Components,” “Due-tos,” or “Decomps” (Decompositions), serve as the basis for ROI calculations, Optimizations, Insights, and Recommendations.

Though there are several mathematical variations when it comes to calculating Contribution, the general approach is to simulate the expected loss to Total Sales when a given Channel is removed from the mix. This hypothetical loss in sales is attributed to the Channel as its “Contribution” to the business.

Dividing a Channel’s Contribution by its Spend for the corresponding time period yields its ROI (Return on Investment, aka Return on Ad Spend or ROAS) for that time period.
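The removal-based approach can be sketched as follows. The example assumes a multiplicative model that uses (1 + x) so that zero exposure is well defined, which is one of several possible conventions; the coefficients, exposures, and spend figures are hypothetical.

```python
def predict(base, exposures, betas):
    """Hypothetical multiplicative model; (1 + x) keeps zero exposure valid."""
    sales = base
    for channel, x in exposures.items():
        sales *= (1.0 + x) ** betas[channel]
    return sales

def contribution(base, exposures, betas, channel):
    """Expected sales lost if `channel` were removed from the mix."""
    without = {**exposures, channel: 0.0}
    return predict(base, exposures, betas) - predict(base, without, betas)

betas = {"tv": 0.10, "search": 0.05}       # hypothetical coefficients
exposures = {"tv": 200.0, "search": 50.0}  # hypothetical exposure for the period
spend = {"tv": 400.0, "search": 100.0}     # hypothetical spend for the same period

tv_contribution = contribution(1000.0, exposures, betas, "tv")
tv_roi = tv_contribution / spend["tv"]
```

Contribution and Spend must cover the same time period for the resulting ROI to be meaningful.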

Media Plan Optimization

The ultimate goal of MMM (or any Attribution technique) is to inform Media budget allocation with the goal of either: 

A) Maximizing sales for a fixed budget 

B) Minimizing Spend required to reach a specific Sales goal 

C) Maximizing spend while maintaining a profitable ROI. 

Though strategies A and B satisfy certain use cases, the best Optimization strategy, generally speaking, is C, as it maximizes the Profit that can be driven by Marketing, all else being equal.

Strategy A runs the risk of leaving Profit on the table if total ROI is well above “break-even” after every dollar in the fixed budget is invested, or conversely could spend past the point of profitable return. Strategy B could similarly leave Profit on the table if the given Sales goal is too low (a higher goal could be achieved profitably with additional investment), or could spend well beyond the point of profitability to achieve a goal that was likely set too high to begin with.

Whichever strategy is chosen, the optimization principle in MMM is the same: reallocate investment from Channels with lower mROI to Channels with higher mROI until total return is maximized across the entire portfolio.

Note: (mROI = Marginal ROI, or the Return on Investment of the next incremental dollar invested in a given Channel)

As money flows into a Channel, the mROI (return on the next dollar invested) will decrease due to diminishing returns. As such, we generally want to keep spending money on a given tactic until its mROI is no longer profitable. 

(Note: the Channel with the highest ROI may not necessarily have the highest mROI depending on its current position on the response curve).

This allocation is generally performed using an Optimizer, a tool or program that automatically determines the best Channel for each theoretical “next dollar” until either all the dollars within a fixed budget are allocated or, with a fluid budget, “break-even” mROI is reached. The Optimizer typically implements mathematical optimization algorithms that efficiently search the space of possible media allocations and arrive at a mix that best achieves the specified goals while working within certain business constraints.
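A "next-dollar" allocation can be sketched as a simple greedy loop. This is an illustration rather than how any particular Optimizer is implemented: it assumes each Channel follows an exponential saturation curve with hypothetical `cap` and `scale` parameters, allocates in fixed increments, and ignores business constraints.

```python
import math

def marginal_return(channel, spend, step):
    """Sales gained from the next `step` dollars on a saturating curve."""
    cap, scale = channel["cap"], channel["scale"]
    curve = lambda s: cap * (1.0 - math.exp(-s / scale))
    return curve(spend + step) - curve(spend)

def allocate(channels, budget, step=10.0, break_even=1.0):
    """Give each `step` of budget to the channel with the highest mROI.

    Stops when the budget runs out or no channel's mROI clears `break_even`.
    """
    spend = {name: 0.0 for name in channels}
    remaining = budget
    while remaining >= step:
        mroi = {
            name: marginal_return(ch, spend[name], step) / step
            for name, ch in channels.items()
        }
        best = max(mroi, key=mroi.get)
        if mroi[best] < break_even:
            break  # every channel is past the point of profitable return
        spend[best] += step
        remaining -= step
    return spend

# Hypothetical response-curve parameters per channel.
channels = {
    "tv": {"cap": 5000.0, "scale": 2000.0},
    "search": {"cap": 1500.0, "scale": 300.0},
}

fluid_plan = allocate(channels, budget=3000.0)  # stops early at break-even
fixed_plan = allocate(channels, budget=500.0)   # spends the full budget
```

With the larger budget the loop stops before all dollars are spent, because both Channels reach break-even mROI first; with the smaller budget every dollar is allocated while mROI is still above break-even.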

In theory, an optimal allocation would have every Channel showing equal mROI, right at or just above “break-even.” This rarely happens in actuality, as many Channels are constrained by real-world factors that inhibit investing or divesting freely. Examples include:

  • A pre-committed, pre-paid budget for next year's Sponsorship deal with X sports league, which cannot be reallocated
  • Paid search spend, which is demand constrained by how many people search for your keywords
  • Minimum budget requirements for media channels like national Linear TV 

How Do You Use Media Mix Modeling?

MMM is best used as a high-level, strategic guide for the following use cases:

  • Quantifying the overall impact and ROI of Media on Sales and subsequently calculating the optimal Total Media Budget
  • Calculating the optimal allocation of that Media budget across different Channels and Tactics at the monthly or quarterly level
  • Forecasting potential future business impacts of hypothetical scenarios
  • Gleaning insight into how External Factors impact your business

What are the Benefits of MMM?

The main benefit of MMM is its ability to measure all business drivers for which historical data is available, including offline channels and non-addressable media, in addition to non-media factors.

It also excels at capturing the longer-term impacts of media on sales and calculating the diminishing relationship between spend and ROI for a given media tactic.

Another key benefit of MMM is its widespread acceptance as a legitimate measure of Marketing performance not just in Marketing organizations but across Business and Finance organizations as well.

What are the Challenges of MMM?

MMM faces several challenges that can render it an incomplete Measurement solution in isolation.

The first and most obvious is the difficulty of data collection across so many disparate data sources. This is often a time- and labor-intensive process that can take months (even years in some cases) to complete before the first version of a model can be released.

The second challenge is the typical collinearity between media variables. On a time series basis, it is likely that many of your Media and Marketing vehicles are highly correlated (you tend to spend more across all channels at similar times, for example, around new product launches or the holidays). This can make extracting the particular impact of any given Marketing Channel difficult as its “variability” is in lockstep with many other Channels. 
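A quick way to spot this problem before modeling is to compute pairwise correlations between the weekly series of each Channel. The sketch below uses the Pearson correlation coefficient on hypothetical spend data in which TV and Social both ramp up for the holidays.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical weekly spend by channel.
tv = [100.0, 110.0, 105.0, 300.0, 320.0, 120.0]
social = [40.0, 45.0, 42.0, 130.0, 140.0, 50.0]
email = [20.0, 35.0, 15.0, 25.0, 18.0, 30.0]

tv_vs_social = pearson(tv, social)  # moves in lockstep with TV
tv_vs_email = pearson(tv, email)    # varies independently of TV
```

A correlation near 1 between two Channels, as with TV and Social here, signals that the regression will struggle to separate their individual effects.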

Additionally, this makes MMM unsuitable for very granular reads within a Channel (e.g., at the Campaign or Adset level), as Campaigns will generally be intercorrelated and have a lower volume of data from which to measure. MMM granularity is further limited by statistical degrees of freedom: there is a limit to how many variables can be measured in a model based on 2-3 years of historical data.

Similarly, many Media drivers are naturally correlated with Sales, being that their execution itself is driven by Demand (e.g., when people are interested in your Product, you will receive more Search clicks). This can make it tricky to separate the impact of Demand on Spend vs. Spend on Demand (see Indirect Model section above). Another reason for this is the seasonality of Demand and Spend – marketers typically spend more on Marketing when their products are naturally in high demand in order to boost effectiveness.

The third and most important challenge is the time-series nature of MMM. MMM seeks to quantify the average impact of a Media driver on Sales over the entire time frame for which you have data. While this is certainly useful from a strategic standpoint, it also makes it difficult for MMM to measure the impacts of ongoing changes to execution, creative messaging, and targeting within a certain channel (e.g., How did changing my creative impact the performance of my Instagram ad?). It is also difficult for MMM to capture the ever-changing response of users with an increasingly shorter attention span.

What is MMM’s Role in Marketing Strategy?

Given the above challenges, MMM’s ideal role in a Marketing measurement strategy is to provide holistic measurement coverage for a wide variety of media channels that are otherwise difficult to address with similar methods (e.g., Linear TV, Radio, Print, Podcast, Influencer, Digital), and also to level-set the long-term, cumulative impact of Marketing as a whole before digging into the more granular impacts of each Channel or campaign.

MMM should also provide support for Business Forecasting and Planning, given its ability to track External factors in addition to Media. 

MMM works best when supplemented with Causal Inference results from Experimentation, and some form of day-to-day, granular campaign-level attribution to be able to address the needs of the weekly granular decisions made by channel managers.

How do you Compare MMM and MTA?

Unlike MMM, Multi-touch attribution, or MTA for short, is the approach of tracking consumer journeys across digital media touchpoints to ascertain which touchpoints are most present in converting vs. non-converting pathways.

MTA is considered a bottom-up attribution approach (as opposed to MMM, which is top-down) because it collects customer journey data at the user level and then uses the aggregation of those user-level journeys to estimate the impact of each marketing touchpoint. 
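The bottom-up idea can be illustrated with a deliberately simple sketch that aggregates user-level journeys into touchpoint presence rates for converting and non-converting paths. Production MTA systems use more sophisticated attribution logic; the function name and journey data here are hypothetical.

```python
from collections import Counter

def touchpoint_presence(journeys):
    """Share of converting vs. non-converting journeys containing each touchpoint."""
    counts = {True: Counter(), False: Counter()}
    totals = Counter()
    for touchpoints, converted in journeys:
        totals[converted] += 1
        counts[converted].update(set(touchpoints))  # count each channel once per journey
    channels = set(counts[True]) | set(counts[False])
    return {
        ch: {
            "converting": counts[True][ch] / totals[True] if totals[True] else 0.0,
            "non_converting": counts[False][ch] / totals[False] if totals[False] else 0.0,
        }
        for ch in channels
    }

# Hypothetical user-level journeys: (ordered touchpoints, converted?)
journeys = [
    (["social", "search"], True),
    (["display", "search"], True),
    (["display"], False),
    (["social", "display"], False),
]

presence = touchpoint_presence(journeys)
```

In this toy data Search appears in every converting journey and no non-converting one, while Display appears in every non-converting journey, which is the kind of contrast MTA uses to estimate touchpoint impact.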

While the ability to parse out different user pathways is attractive on its face, the data limitations and gaps around many important digital tactics in a privacy-centric digital marketing world pose serious challenges to this technique, let alone accounting for channels like Podcast, TV, Radio or Print where impressions are not trackable at the user level.

How do you choose the Right MMM Tools?

Generally speaking, there are 3 types of MMM solutions:

Open Source

Open Source solutions, such as Facebook’s Robyn, are a simple and easy way to get started with MMM. They are inexpensive to implement and relatively quick to set up.

That said, they do come with a truncated feature set, a lack of customization abilities, and require heavy lifting (data collection, model validation, insights, recommendations, and optimizations) to be done by your team of experts and your team alone. 

Open-source solutions are best for Marketers with tight resource constraints but enough expertise to handle MMM in-house. 

Agile MMM

Agile MMM providers are a great middle ground between Open Source and Enterprise MMM. They generally leverage a semi-automated data Infrastructure, standardized model structures, and a lightweight service model that allows a degree of flexibility and customization at a fraction of the cost of Enterprise MMM. 

While these solutions offer the full feature set required for the basic use cases of MMM, they generally won’t satisfy more sophisticated use cases such as hyper-granular Model dimensions (e.g., breakouts by Product, Region, Campaign Type, etc.), robust business forecasting capabilities, and custom model components. 

They also generally don’t provide the extensive consultative white-glove service level of their Enterprise counterparts.

Agile MMM, due to its formulation, is also particularly susceptible to multicollinearity and must be paired with in-market experimentation to achieve sound results.

Agile MMM is best for Marketers who would prefer dedicated experts to handle the MMM process but don’t have millions of dollars to spend on a full-scale enterprise solution.

Enterprise MMM

Enterprise is, as the name suggests, the most expensive type of MMM solution and comes with all the bells and whistles. 

Enterprise firms, such as Analytic Partners and TransUnion, build custom, flexible MMMs from the ground up for their clients with a high degree of services attached. They also typically break out their MMMs to a high degree of “cross-sections,” or combinations of dimensions like Product, Sales Channel, and Region.

This level of detail comes with a hefty price tag, usually over $1M annually. As such, Enterprise MMM is best for large brands that spend at least $100M annually on Marketing. Like any other MMM solution, it is most accurate when paired with frequent in-market experimentation.
