Practical Data Science: Promotion Effectiveness and Promotion Planning — Part 1

 

6-step end-to-end process to build promotion effectiveness measurement and planning solution


Business Problem | Key Concepts | Why should we care?

Photo by Lukas Blazek on Unsplash


For machine learning applications, it is not easy to find real-world use cases presented with an end-to-end, holistic view. Many academic programs offer great training in machine learning algorithms and modeling techniques. However, the practical aspects beyond predictive modeling tend to be use-case specific: what raw data a given ML use case requires, what feature engineering is needed to create a modeling dataset, and what software applications are needed after model development to ensure usage and adoption by business end users. Thus, they are not taught extensively in typical data science and machine learning curricula.

In a series of articles on “practical data science”, I will take a use-case approach and describe an end-to-end picture beyond predictive modeling. To achieve tangible business impacts, it is important to understand business problems (with the source of the value), data requirements and potential data sources, data pipeline, feature engineering, prescriptive analyses after predictive model building, design of end user facing software solutions, and how to best communicate results. I will start with “promotion effectiveness and promotion planning” as a first use case.

In this article, I will discuss:

(1) Business Problem: Promotion Effectiveness and Promotion Planning

(2) Key Concepts and Background Information

(3) Why Should We Care?

Business Problem: Promotion Effectiveness and Promotion Planning

Promotion is one of the key pillars of the 4Ps of marketing (Product, Pricing, Promotion, and Placement). Both retailers and manufacturers have a keen interest in measuring the ROI of promotion events and in planning future promotion events to make the best use of their promotion dollars. The key business problems in promotion effectiveness and promotion planning fall into two categories: (1) post-mortem (i.e., looking back into the past): measuring and understanding the business impact of past promotions, and (2) forward-looking planning: which product to promote, when to promote, how deep a discount to offer, what promotion tactics to use, and where to promote. These questions are summarized in Figure 1.

Figure 1. Image by Minha Hwang

(1) Promotion Effectiveness: Backward-looking (Post-mortem)

Promotion ROI: The first key business problem is measuring the performance of past promotion events. Promotion requires investment (e.g., lost revenue from price discounts, costs to print coupons and set up end-of-aisle displays). Therefore, it is natural for businesses to ask what benefits they obtain from promotions. With a promotion, there can be “incremental” unit sales (additional unit sales beyond the level expected without the promotion). This sounds simple in concept, but “incremental” unit sales are not directly observable in the data. With a promotion event for a specific product, store, and date, we can see actual unit sales with promotion. However, the counterfactual outcome (baseline unit sales for the same product/store/date, i.e., unit sales without promotion) is not observed. This missing data problem is a typical challenge for causal inference with observational data. If we can somehow measure “incremental” unit sales (I will describe “how” later), it is fairly straightforward to calculate the “incremental” revenue or “incremental” profit from a specific past promotion event. By comparing incremental profits (i.e., return) with the required promotion costs (i.e., investment), we can evaluate the ROI of the promotion event.
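To make the ROI arithmetic concrete, here is a minimal Python sketch. It assumes the baseline unit sales are already known (in practice they have to be estimated), and all names and numbers (`promotion_roi`, prices, costs) are hypothetical. It also assumes one particular convention: the return is the margin on incremental units, the investment is the discount given on baseline units plus execution costs, and ROI is net incremental profit divided by investment. Other conventions exist.

```python
def promotion_roi(actual_units, baseline_units, regular_price, promo_price,
                  unit_cost, execution_cost):
    """Incremental units, revenue, profit, and ROI for one promotion event."""
    incremental_units = actual_units - baseline_units
    incremental_revenue = (actual_units * promo_price
                           - baseline_units * regular_price)
    # Return: margin earned on the incremental units at the promoted price.
    incremental_margin = incremental_units * (promo_price - unit_cost)
    # Investment: discount given away on baseline units plus execution costs.
    # (Assumed to be positive; a zero investment would make ROI undefined.)
    investment = baseline_units * (regular_price - promo_price) + execution_cost
    incremental_profit = incremental_margin - investment
    return {
        "incremental_units": incremental_units,
        "incremental_revenue": incremental_revenue,
        "incremental_profit": incremental_profit,
        "roi": incremental_profit / investment,
    }


# Hypothetical event: 140 units sold on promotion versus a baseline of 100.
result = promotion_roi(actual_units=140, baseline_units=100,
                       regular_price=5.0, promo_price=4.0,
                       unit_cost=2.5, execution_cost=20.0)
print(result)
```

With these made-up inputs the promotion sells 40 extra units but still loses money, because the discount given on the 100 baseline units outweighs the margin on the incremental ones.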

Promotion DNA: Once incremental unit sales and ROIs from past promotion events can be measured, it is natural to ask why certain promotion events did well and why others did not deliver good ROIs. A driver analysis (a.k.a. DNA), which decomposes the incremental unit sales and profit into the contributions of each driver, helps explain why certain types of promotion events are more profitable than others. Key drivers can include the following types.

  • External factors: seasonality, holidays, trends, weather, key external events, location of stores, characteristics of store demographics, competitive intensity around stores
  • Own promotion: types of promotion (e.g., feature, display, coupons), promotion tactics (e.g., Buy One Get One Free, simple price discount, Must Buy 2), degree of price cuts (e.g., 5% discounts, 15% discounts), what other products in the product portfolio are simultaneously promoted
  • Competitor factors: promotion events from competitors
  • Cost factors: raw material costs, promotion execution costs, vendor funding (for retailers)

Pattern Recognition: After accumulating measurements of incremental unit sales, incremental revenues, incremental profits, and ROIs of past promotion events together with data on underlying drivers for 2–3 years, it is also possible to identify common patterns of winning and losing promotion events. Insights derived from such analyses can be useful for future promotion planning.

(2) Promotion Planning: Forward-looking

Counterfactual simulation (i.e., “What-if” analysis / scenario analysis): Performance measurement of past promotion events is a great starting point. Once organizations have embraced the measurement approach and appreciated the value and insights it provides, they want to apply the underlying modeling approach to future promotion planning. Compared to past measurement, where all promotion drivers are known, planning adds the complexity of making assumptions about key drivers (e.g., what we should assume for competitor prices and promotions). There will be multiple “What-if” scenarios to test, which calls for building a “promotion simulator.” This can be a simple interactive app (built with Shiny/R or Dash/Python) or enterprise-grade software, depending on the required scale and sophistication. A promotion simulator produces expected incremental unit sales, revenue, profits, and ROIs for a given promotion scenario. A promotion simulator with an intuitive user interface also helps deploy the modeling results, through a simulation engine, to business end users who are less technical.
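The core of such a simulator can be sketched in a few lines of Python. This is only an illustration under strong assumptions: a log-linear demand curve with hypothetical, hard-coded coefficients (`PRICE_ELASTICITY`, `DISPLAY_LIFT`, and so on) standing in for a fitted demand model, and a hypothetical `Scenario` type for the inputs a planner would choose. ROI is left out because it additionally requires a definition of the investment.

```python
import math
from dataclasses import dataclass

# Hypothetical coefficients; in practice these come from a fitted demand model.
BASE_UNITS = 100.0        # weekly baseline units at the regular price
REGULAR_PRICE = 5.0
UNIT_COST = 2.5
PRICE_ELASTICITY = -2.0   # % change in units per % change in price
DISPLAY_LIFT = 0.20       # additive effect of a display on log(units)


@dataclass
class Scenario:
    discount: float        # e.g. 0.20 for a 20% price cut
    display: bool          # end-of-aisle display on or off
    execution_cost: float  # fixed cost of running the event


def simulate(s: Scenario) -> dict:
    """Expected incremental units, revenue, and profit versus no promotion."""
    price = REGULAR_PRICE * (1.0 - s.discount)
    log_units = (math.log(BASE_UNITS)
                 + PRICE_ELASTICITY * math.log(price / REGULAR_PRICE)
                 + DISPLAY_LIFT * s.display)
    units = math.exp(log_units)
    base_profit = BASE_UNITS * (REGULAR_PRICE - UNIT_COST)
    promo_profit = units * (price - UNIT_COST) - s.execution_cost
    return {
        "incremental_units": units - BASE_UNITS,
        "incremental_revenue": units * price - BASE_UNITS * REGULAR_PRICE,
        "incremental_profit": promo_profit - base_profit,
    }


# Compare two "what-if" scenarios side by side.
for scenario in (Scenario(0.20, False, 0.0), Scenario(0.10, True, 30.0)):
    print(scenario, simulate(scenario))
```

An interactive app would wrap a function like `simulate` behind input controls, so that planners can change the discount or tactic and see the expected outcome immediately.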

Promotion optimization: More sophisticated retailers and manufacturers want to go beyond scenario analyses based on a promotion simulator. In this case, prescriptive data science techniques (e.g., optimization) can be applied to find the best promotion for each product, at a given location and time, across the entire product portfolio. Based on the optimization results, the best-N promotion candidates can be recommended for final review by a promotion planner.
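For a small candidate set, the "best-N" idea can be illustrated with brute-force enumeration rather than a formal optimizer. The sketch below is a simplification for a single product, store, and week: the profit formula, the coefficients, and the tactic lifts and costs are all hypothetical placeholders for what a fitted demand model would supply.

```python
import heapq
import itertools

# Hypothetical inputs for one product/store/week.
BASE_UNITS, REGULAR_PRICE, UNIT_COST = 100.0, 5.0, 2.5
PRICE_ELASTICITY = -2.5
TACTIC_LIFT = {"none": 1.00, "feature": 1.15, "display": 1.25}  # unit multiplier
TACTIC_COST = {"none": 0.0, "feature": 15.0, "display": 30.0}   # execution cost


def expected_incremental_profit(discount, tactic):
    """Toy profit model: constant-elasticity demand times a tactic multiplier."""
    price = REGULAR_PRICE * (1.0 - discount)
    units = BASE_UNITS * (1.0 - discount) ** PRICE_ELASTICITY * TACTIC_LIFT[tactic]
    promo_profit = units * (price - UNIT_COST) - TACTIC_COST[tactic]
    return promo_profit - BASE_UNITS * (REGULAR_PRICE - UNIT_COST)


# Enumerate every (discount, tactic) pair and keep the three most profitable.
discounts = [0.0, 0.05, 0.10, 0.15, 0.20]
candidates = itertools.product(discounts, TACTIC_LIFT)
best_n = heapq.nlargest(3, candidates,
                        key=lambda c: expected_incremental_profit(*c))

for discount, tactic in best_n:
    profit = expected_incremental_profit(discount, tactic)
    print(f"{discount:.0%} off with {tactic}: {profit:.2f}")
```

Enumeration stops being practical once many products, stores, and weeks interact, which is where dedicated optimization techniques come in.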

Promotion calendar: To manage the interdependencies of different products and promotion events across time, more complex optimization techniques that incorporate these interdependencies can be used to support the development of promotion calendars covering multiple products over time. This approach also handles purchase acceleration and stockpiling.

To answer key business questions on promotion effectiveness and planning, the backbone of what we need to build is a causal econometric or machine learning demand model, which maps promotion event characteristics to unit sales (i.e., volume), together with other control variables. I will describe key concepts behind the causal machine learning demand model in the next section.

Key Concepts and Background Information

(1) Key Concepts and Intuition: Missing data problem for counterfactual baseline

If you try to measure promotion ROI, you will soon realize that it is not easily doable unless you have run carefully designed field experiments. The core problem is that we don’t have data on the counterfactual outcome (baseline unit sales without promotion), even though we do observe actual unit sales with promotion for a specific product, store, and date. Therefore, we have to build a causal demand model with promotion as one of the key drivers. This demand model can be used to simulate counterfactual baseline unit sales: you turn off the promotion event for a given product, store, and week, and let the model predict unit sales without it.
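The "turn off the promotion" idea can be sketched with a deliberately simple model. The example below uses synthetic data and an ordinary least squares fit of log unit sales on a promotion flag (via NumPy). All values are made up, and a plain regression like this only recovers a causal effect under strong assumptions (for example, that promotion timing is unrelated to other demand shocks), so treat it as an illustration of the mechanics rather than a recommended model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic weekly data for one product in one store (all values are made up).
n_weeks = 104
promo = (np.arange(n_weeks) % 5 == 0).astype(float)  # promoted every 5th week
true_promo_effect = 0.35                              # effect on log(units)
log_units = (np.log(100.0) + true_promo_effect * promo
             + rng.normal(0.0, 0.05, n_weeks))
units = np.exp(log_units)

# Fit log(units) = b0 + b1 * promo by ordinary least squares.
X = np.column_stack([np.ones(n_weeks), promo])
coef, *_ = np.linalg.lstsq(X, log_units, rcond=None)

# Counterfactual baseline: the same weeks with the promotion switched off.
X_off = X.copy()
X_off[:, 1] = 0.0
baseline_units = np.exp(X_off @ coef)

# Incremental units = actual units minus simulated baseline, on promo weeks.
on_promo = promo == 1.0
incremental_units = units[on_promo] - baseline_units[on_promo]
print("estimated promo effect on log(units):", coef[1])
print("mean incremental units per promo week:", incremental_units.mean())
```

A realistic demand model would add the control variables discussed earlier (seasonality, price, competitor activity, and so on), but the counterfactual step stays the same: predict with the promotion variables set to "off".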

Figure 2. Image by Minha Hwang

To make this idea more concrete, let’s consider a simple toy example of a single product in a store. Figure 2 shows how unit sales change over time for this product. By eyeballing the time series graph, you can reasonably guess that baseline unit sales are around 100. With a promotion event in week 6, unit sales increased to 140. Thus, “incremental” unit sales could be 140 - 100 = 40 in week 6 (i.e., the promotion week). This is called gross lift. However, there can be complications from substitution over time and across products. Consumers may delay or accelerate purchases around promotion events, which shows up as the dips in week 5 and week 7 in Figure 2. Net lift (after correcting for purchase acceleration or deceleration) can therefore be smaller: (140 - 100) - (100 - 90) - (100 - 95) = 25 (instead of 40). With multiple products in a product portfolio, you also need to adjust for the fact that unit sales of other products can fall when a focal product is promoted (i.e., cannibalization).
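The toy arithmetic can be written out in Python as follows. The weekly values for weeks 5 to 7 come from the example above; the values for the other weeks and the variable names are assumptions for illustration.

```python
# Toy weekly unit sales; weeks 4 and 8 are assumed to sit at the baseline.
baseline = 100
units_by_week = {4: 100, 5: 90, 6: 140, 7: 95, 8: 100}
promo_week = 6

# Gross lift: sales in the promotion week above the baseline.
gross_lift = units_by_week[promo_week] - baseline

# Dips in the weeks just before and after the promotion
# (delayed and accelerated purchases).
adjacent_weeks = (promo_week - 1, promo_week + 1)
shifted_units = sum(baseline - units_by_week[w] for w in adjacent_weeks)

# Net lift: gross lift minus the units merely shifted in time.
net_lift = gross_lift - shifted_units

print(gross_lift, shifted_units, net_lift)  # 40 15 25
```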

Depending on the source of volume, retailers and manufacturers have different preferences for a given promotion event. If incremental unit sales come mostly from non-buyers (i.e., category expansion), both retailers and manufacturers would be happy. If incremental unit sales come mostly from other products in their own product portfolio, both manufacturers and retailers would be unhappy. If most incremental unit sales come from competing manufacturers (i.e., competitive draws), manufacturers would be happy, but retailers might not be. A sophisticated causal demand model also allows users to conduct source-of-volume analyses.

Figure 3. Image by Minha Hwang

Figure 3 shows actual market share data for a yogurt product with promotions. The red line shows price over time, and the blue line shows the product's market share over time. You can clearly see promotion bumps (i.e., spikes in sales) with feature, display, price cuts, and coupons. There are also reductions in sales, which might be caused by the impact of other products or by purchase acceleration or deceleration. Finally, there can be trends and seasonality, which are easier to identify if we plot 2–3 years of data.

(2) Background Information: Different Promotion Types

Promotion can include both price discounts and non-price events. Figure 4 shows the most typical promotion types for grocery retailers such as supermarkets. Price discounts are usually communicated with sale signs. They can take the form of the same sale price for everyone (e.g., a 10% discount), a sale price with a purchase requirement (e.g., Buy 10 to get $10 off), or a targeted price discount (e.g., a discount coupon issued only to certain types of consumers). As for non-price promotion events, an end-of-aisle display can be set up to grab consumers' attention. Feature flyers can be distributed with a newspaper. To encourage trials of new products, free samples can be given.

Figure 4. Image by Minha Hwang

Why Should We Care?

There are three main reasons why we should care about data science for promotion effectiveness and planning.

(1) Trade promotion is a big spending item for consumer-packaged-goods (CPG) companies and retailers. CPG companies worldwide invest ~20% of their revenue in trade promotions (the same holds for retailers). However, 59% of promotions lost money (72% in the United States). Best-in-class CPG promotions returned 5 times more than the least efficient ones.

(2) Data and machine learning are now mature enough to have scalable impact. This used to be a niche area because of a shortage of experts, reliance on custom-built models, and limited computing resources. With open-source packages for econometric or causal machine learning models (e.g., PyStan or lme4 for linear mixed-effects modeling, Explainable Boosting Machine) together with cloud computing infrastructure, the method is more accessible nowadays. Indeed, large data science consulting companies are adopting this as a key service line and building SaaS products around it.

(3) There is potential for large impact. Price elasticity modeling can deliver a 1–2% increase in revenue or profit through adjustments of regular price levels when price elasticities have not previously been measured from data. Promotion effectiveness measurement and promotion planning can deliver an additional 1–2% increase in revenue or profit by optimizing promotions. Note that the same underlying causal demand model can be used for both the promotion use case and the pricing use case.

In the subsequent article, I will describe (1) a 6-step end-to-end process to build a promotion effectiveness measurement and planning solution and (2) how to build a data foundation for promotion modeling (raw data requirements and how to prepare a modeling dataset).

Reference: How analytics can drive growth in consumer-packaged-goods trade promotion, McKinsey Quarterly, October 2019 (Minha Hwang, Ryan Murphy, and Abdul Wahab Shaikh)
