Seeing through the CLV lens


Updates regarding several assignments have been posted online, including model answers for Assignments 1 and 2. Additionally, tutorial videos for Assignments 2 and 3 are now available, while the tutorial for Assignment 4 will be posted on Monday. Today's agenda covers key concepts in churn modeling and customer lifetime value, emphasizing logistic regression methods for analyzing customer behavior and retention strategies.

  • CLV
  • churn modeling
  • logistic regression
  • customer retention

Uploaded on Feb 15, 2025



Presentation Transcript


  1. Seeing through the CLV lens

  2. Updates on Assignments: Model answers for Assignments 1 and 2 have been posted online. Tutorial videos for Assignments 2 and 3 have been posted. The tutorial video for Assignment 4 will be posted on Monday.

  3. Today's Agenda: Churn modeling: understand the basic principles of logistic regression; know how to build and interpret churn models. Customer lifetime value: understand the basic principles of CLV across contexts; know how to build basic CLV models for subscription-based businesses.

  4. Churn modeling

  5. Churn modeling. Goal: predict whether a customer will end their relationship with the firm (churn or no churn). Marketing goal: identify marketing activities that increase or decrease the likelihood that a customer will end their relationship with the firm.

  6. Binary variable of interest: retain profitable customers (1); churn unprofitable customers (0).

  7. Relationship between Income and Response

  8. Method of Choice: Logistic regression Similar principles to linear regression, except logistic regression is designed to deal with categorical dependent variables: Did a customer churn? (yes/no) Is a customer retained? (yes/no) Will a customer convert? (yes/no) Linear regression assumes that the dependent variable is continuous (e.g., revenue).

  9. Logistic regression. Suppose p is the probability of event Y occurring, given the values of the independent variables x1, ..., xn: p = P(Y = 1 | x1, x2, ..., xn). The odds of the event are p / (1 - p). E.g., if there is a 25% probability of an event occurring, the odds of the event occurring are 25% / (1 - 25%) = 25% / 75% = 0.33, i.e., 1:3. Formally, the dependent variable of logistic regression is the logarithm of the odds: log(p / (1 - p)) = b0 + b1*x1 + ... + bn*xn + e
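The probability-to-odds-to-log-odds chain above can be sketched in a few lines. The course itself uses R; this is an illustrative Python equivalent:

```python
import math

def odds(p):
    """Convert a probability to odds: p / (1 - p)."""
    return p / (1 - p)

def log_odds(p):
    """Convert a probability to log-odds (the logit), the DV of logistic regression."""
    return math.log(odds(p))

# A 25% probability corresponds to odds of about 0.33, i.e., 1:3.
print(round(odds(0.25), 2))    # 0.33
# p = 0.5 sits at the logit's midpoint of 0.
print(log_odds(0.5))           # 0.0
```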

  10. Odds and Odds Ratios. Question: If someone is a Home student, are the odds higher that they will fail the course?

              Home Student   Exchange Student
     Failed        23               6
     Passed       117             210

     TOTAL = 356; N(Failed) = 29; N(Passed) = 327; N(Home) = 140; N(Exchange) = 216
     Odds(Failing | Home) = 23 / 117 ≈ 0.20
     Odds(Failing | Exchange) = 6 / 210 ≈ 0.03
     Odds Ratio = (23 / 117) / (6 / 210) ≈ 0.20 / 0.03 ≈ 6.88
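The odds ratio from the 2x2 table can be checked directly. An illustrative Python sketch, using the slide's hypothetical course counts:

```python
# Counts from the slide's 2x2 table (hypothetical course data)
failed = {"home": 23, "exchange": 6}
passed = {"home": 117, "exchange": 210}

odds_home = failed["home"] / passed["home"]              # odds of failing, Home students
odds_exchange = failed["exchange"] / passed["exchange"]  # odds of failing, Exchange students
odds_ratio = odds_home / odds_exchange

print(round(odds_ratio, 2))  # 6.88
```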

  11. Odds and Odds Ratios. Question: If someone is a Home student, are the odds higher that they will fail the course?

     Observed        Home Student   Exchange Student
     Failed               23               6
     Passed              117             210

     Expected        Home Student   Exchange Student
     Failed             11.2            17.3
     Passed            128.8           198.7

  12. Predicting course success (NB: data are simulated and not based on actual course completion!)

         student id    courseviews_before  courseviews_early  attendance  marketing  pass
     1   ACJFYDYAZB            11                  7               1          1        1
     2   ACXIIIXBCY             2                  1               0          1        0
     3   AWDVFBVMLM             5                 42               1          0        1
     4   BNDTWGKAIH            23                 35               0          1        1
     5   BOSTSXPWPS             5                 10               1          0        1
     6   BWQKENFVLW             5                  0               1          0        0

  13. Let's deconstruct the Logit model: Pass = B0 + B1(Courseviews_Before)

  14. Let's deconstruct the Logit model: Prob(Pass) = B0 + B1(Courseviews_Before). Probability range = [0, 1] (y-axis); midpoint = 0.5.

  15. Transform the dependent variable into odds: P / (1 - P) = B0 + B1(Courseviews_Before). Odds range = [0, infinity); midpoint = 1 (at P = 0.5) (problem of asymmetry).

  16. Transform the dependent variable into LOG odds: Ln(P / (1 - P)) = B0 + B1(Courseviews_Before). Log-odds range = (-infinity, infinity); midpoint = 0 (since Ln(1) = 0) (problem of asymmetry gone).

  17. Transform the dependent variable into LOG odds (BINOMIAL LOGISTIC REGRESSION): Ln(P / (1 - P)) = B0 + B1(Courseviews_Before). Log-odds range = (-infinity, infinity); midpoint = 0 (since Ln(1) = 0) (problem of asymmetry gone).

  18. We can add more variables to the Logit model (BINOMIAL LOGISTIC REGRESSION): Ln(P / (1 - P)) = B0 + B1(Courseviews_Before) + B2(Courseviews_Early). Log-odds range = (-infinity, infinity); midpoint = 0 (since Ln(1) = 0).

  19. Interpreting the results. glm is the function we need to run a logit model (generalized linear model); family=binomial(logit) specifies that we want a logit model, and as always, at the end we need to include the name of our dataset. The formula notation is similar to linear regression: pass is our dependent variable, and ~ separates our DV from our IVs. We want to include all variables in our dataset except the student id number id. Rather than typing out all the variables, we can write . followed by -id, which tells R to include all variables in the dataset except id.

     > fit <- glm(pass ~ . -id, family=binomial(logit), data=ma2018_data)
     > summary(fit)

     Coefficients:
                          Estimate Std. Error z value Pr(>|z|)
     (Intercept)         -2.261026   0.601571  -3.759 0.000171 ***
     courseviews_before   0.050823   0.024225   2.098 0.035908 *
     courseviews_early    1.677243   0.586075   2.862 0.004212 **
     attendance           2.008767   0.746092   2.692 0.007094 **
     marketing           -0.008964   0.024537  -0.365 0.714855

     Interpretation of the regression coefficients: positive estimates increase the probability of the outcome; negative estimates decrease it.

  20. Interpreting the intercept. If a (hypothetical) person does not view the course's MyCourses pages at all before or in the beginning of the course, does not attend the lectures, and does not major in marketing (i.e., when all variables are zero), the odds of completing the course are e^(-2.261026) = 0.1042.

                  Estimate Std. Error z value Pr(>|z|)
     (Intercept) -2.261026   0.601571  -3.759 0.000171 ***

  21. Translating odds to probability:
     P / (1 - P) = 0.1042
     P = 0.1042 - 0.1042 * P
     1.1042 * P = 0.1042
     P = 0.1042 / 1.1042
     P ≈ 0.094, i.e., approximately 9.4%
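The odds-to-probability conversion is easy to verify numerically. An illustrative Python sketch using the intercept estimate from the slides:

```python
import math

intercept = -2.261026        # intercept estimate from the summary(fit) output

odds = math.exp(intercept)   # ~0.1042
p = odds / (1 + odds)        # equivalent to solving P / (1 - P) = odds for P

print(round(p, 3))  # 0.094
```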

  22. Interpreting the estimates. An increase in an independent variable by one unit changes the log-odds of the outcome by b, where b is the regression coefficient (the estimate). Courseviews_before: for each additional page view in MyCourses before the course starts, the log-odds of completing the course increase by 0.05. In real terms, the change in odds = e^0.05 = 1.05: students with one extra pre-course page view have 1.05 times the odds of finishing the course compared to those with one less.

                          Estimate Std. Error z value Pr(>|z|)
     courseviews_before   0.050823   0.024225   2.098 0.035908 *

  23. Interpreting the estimates. Attendance: by attending all the lectures, the log-odds of completing the course increase by 2. In real terms, the change in odds = e^2 = 7.45: students who attended ALL the lectures have 7.45 times the odds of finishing the course compared to those who did not.

                  Estimate Std. Error z value Pr(>|z|)
     attendance  2.008767   0.746092   2.692 0.007094 **

  24. Interpreting the estimates. A student views various sections of the course's MyCourses page 6 times before the course, as well as 2 times in the early stage of the course. She attends all the lectures and is a marketing student. What is her probability of completing the course?
     Log-odds = -2.26 + 6 × 0.05 + 2 × 1.67 + 1 × 2.00 + 1 × 0 (marketing ≈ 0) ≈ 3.40
     Odds: exp(3.40) ≈ 30
     Probability of completing the course: exp(3.40) / (1 + exp(3.40)) ≈ 97%
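The full worked prediction can be reproduced with the unrounded coefficients. An illustrative Python sketch (variable names follow the slides):

```python
import math

# Coefficient estimates from the summary(fit) output
coefs = {
    "intercept": -2.261026,
    "courseviews_before": 0.050823,
    "courseviews_early": 1.677243,
    "attendance": 2.008767,
    "marketing": -0.008964,
}

# The student from the slide: 6 views before, 2 early views, attends, marketing major
x = {"courseviews_before": 6, "courseviews_early": 2, "attendance": 1, "marketing": 1}

log_odds = coefs["intercept"] + sum(coefs[k] * v for k, v in x.items())
p = math.exp(log_odds) / (1 + math.exp(log_odds))  # inverse logit

print(round(log_odds, 2), round(p, 2))  # 3.4 0.97
```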

  25. Evaluating accuracy of predictions. Out of 92 students in our FICTIONAL dataset, 55 did not pass the course (≈60%). Based on this, we can predict that for every student, it is more likely that they will fail than pass. If we predict that every student in our dataset will fail the course, our prediction accuracy would be 60%:
     Prediction: will not pass  92
     Prediction: will pass       0

  26. Evaluating accuracy of predictions. Out of 92 students in our dataset, 55 did not pass the course (≈60%). If we predict that every student in our dataset will fail the course, then compared to the actual data, our prediction accuracy would be 60%: correct predictions (55) / total cases (92) ≈ 60%.

                                Did not pass   Passed
     Prediction: will not pass       55          37
     Prediction: will pass            0           0

  27. Evaluating accuracy of predictions. Our logistic model fares better. fit$fitted.values contains the values predicted by our model, saved to the object fit (see slide 17); ma2018_data$pass contains the actual data (variable pass in the ma2018_data dataset).

     > confusion_matrix <- table(fit$fitted.values > .5, ma2018_data$pass)
     > colnames(confusion_matrix) <- c("Did not pass", "Passed")
     > rownames(confusion_matrix) <- c("Pred: won't pass", "Pred: will pass")
     > confusion_matrix

                                Did not pass   Passed
     Prediction: will not pass       50          18
     Prediction: will pass            5          19

     Correct predictions (50 + 19) / total cases (92) = 75%

     > accuracy <- sum(diag(confusion_matrix)) / sum(confusion_matrix)
     > accuracy
     [1] 0.75

  28. Evaluating accuracy of predictions. Model's prediction vs. actual data:

                                Did not pass         Passed
     Prediction: will not pass  True negative: 50    False negative: 18
     Prediction: will pass      False positive: 5    True positive: 19

     Model accuracy (% of correctly predicted cases): (50 + 19) / (50 + 19 + 5 + 18) = 0.75
     Model precision (% of positive predictions that were correct): 19 / (19 + 5) = 0.79
     Model recall (% of positive cases in the data that were correctly predicted): 19 / (19 + 18) = 0.51
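All three metrics follow directly from the four confusion-matrix cells. An illustrative Python check:

```python
# Confusion-matrix cells from the slide
tn, fn = 50, 18  # predicted "won't pass": actually failed / actually passed
fp, tp = 5, 19   # predicted "will pass":  actually failed / actually passed

accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all cases predicted correctly
precision = tp / (tp + fp)                  # share of positive predictions that were correct
recall = tp / (tp + fn)                     # share of actual positives that were found

print(round(accuracy, 2), round(precision, 2), round(recall, 2))  # 0.75 0.79 0.51
```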

  29. Quick Refresher about Errors

  30. Validating our model: 1. Split the full dataset into training data and testing data. 2. Analyze the training data to train the model. 3. Validate the model by seeing how well it predicts the testing data.
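Step 1 of this workflow can be sketched in a few lines. An illustrative Python sketch; the 70/30 split ratio and the seed are arbitrary choices, not from the slides:

```python
import random

def train_test_split(rows, test_share=0.3, seed=42):
    """Step 1: randomly split the full dataset into training and testing subsets."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_share))
    return shuffled[:cut], shuffled[cut:]

data = list(range(100))       # stand-in for the full dataset
train, test = train_test_split(data)
print(len(train), len(test))  # 70 30
# Step 2: fit the model on `train`; step 3: check its predictions against `test`.
```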

  31. Applications of logistic regression: churn modeling, modeling conversions, predicting customer behavior. Diagnostic analytics: what are the causes of churn, conversions, and other behavior? Predictive analytics: anticipate behavior (identify users likely to churn or purchase) and take action accordingly. Usually involves machine learning; more on this next week!

  32. Churn prediction in a telecom company. An increase in the customer's tenure by 1 month is associated with a change in the odds of churn by 100% * (exp(-0.06) - 1) = -6%.

     > fit <- glm(Churn ~ . -customerID, family=binomial(logit), data=telco)
     > summary(fit)

     Coefficients:
                      Estimate Std. Error z value Pr(>|z|)
     (Intercept)     3.4834051  1.2166126   2.863 0.004194 **
     genderMale     -0.0338220  0.0962781  -0.351 0.725367
     SeniorCitizen   0.0807436  0.1232729   0.655 0.512468
     PartnerYes      0.0450094  0.1164443   0.387 0.699103
     DependentsYes  -0.1596980  0.1327263  -1.203 0.228894
     Tenure         -0.0594129  0.0098411  -6.037 1.57e-09 ***
     ...
     StreamingTVYes  1.4600912  0.075995   -2.733 0.006275 **
     TotalCharges    0.0003580  0.0001113   3.217 0.001296 **

  33. Churn prediction in a telecom company. Customers with a streaming TV service have 100% * (exp(1.46) - 1) = 329% higher odds of churning compared to customers without streaming TV.

                      Estimate Std. Error z value Pr(>|z|)
     StreamingTVYes  1.4600912  0.075995   -2.733 0.006275 **

  34. Customer lifetime value

  35. Customer lifetime value. Customer A: historically more profitable, but probably churned. Customer B: historically less profitable, but may be more valuable in the long run.

  36. Customer lifetime value.
     Customer profitability (CP): profitability based on historical revenue and cost data. CP = R - C, where R is the revenue associated with the customer relationship and C is the cost of serving the customer.
     Customer lifetime value (CLV): net present value of past and future cash flows associated with a customer relationship. CLV = CP(t=1) / (1 + d) + ... + CP(t=T) / (1 + d)^T, where CP is the profit from the customer, T is the customer relationship length, and d is the discount factor.

  37. Customer lifetime value. CLV = (M × r) / (1 + d - r), where:
     M: profit per time period (e.g., 10 EUR per month) = revenue per time period (100 EUR) × contribution margin (10%)
     r: retention rate (e.g., 90% per month)
     d: discount factor (e.g., 10% per month)
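The formula can be wrapped in a small helper and checked against the example figures on this slide. An illustrative Python sketch:

```python
def clv(m, r, d):
    """Simple subscription CLV: (M * r) / (1 + d - r).

    m: profit per period, r: retention rate per period, d: discount rate per period.
    """
    return (m * r) / (1 + d - r)

# Slide's example inputs: 10 EUR monthly profit, 90% retention, 10% discount rate
print(round(clv(10, 0.90, 0.10), 1))  # 45.0
```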

  38. Customer lifetime value (Spotify). (Sources: Statista, Reuters)
     Average revenue per subscriber: 4.72 EUR per month; contribution margin: ~22%; churn: ~5% per month; discount rate: 1% per month (assumed).
     M: 4.72 × 0.22 ≈ 1.04 EUR per month
     r: 1 - 0.05 = 95% per month
     d: 0.01
     CLV = (1.04 × 0.95) / (1 + 0.01 - 0.95) ≈ 14.5 EUR

  39. Uses for customer lifetime value modeling. Marketing ROI: use CLV in place of short-term sales to estimate ROI. Company valuation: what is the value of a company's customer base (i.e., customer equity)? What-if analyses: how can changes in customer relationships (retention) affect firm performance? Customer prioritization and targeting: who are the most profitable customers that the company should prioritize and focus on?

  40. From CLV to ROI (Spotify). Let's say Spotify spends 100,000 EUR on a promotional campaign, acquiring 16,000 new customers as a result.
     Average customer acquisition cost: 100,000 / 16,000 = 6.25 EUR
     CLV: 14.5 EUR (from the previous example)
     ROI of the campaign: (14.5 - 6.25) / 6.25 ≈ 132%
     Note that we are assuming that the CLV of customers joining as a result of the campaign is the same as for existing customers: newly acquired customers behave the same way as existing customers, and revenue is the same (e.g., no promotional discounts associated with the marketing campaign).
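The ROI arithmetic, as a quick check. An illustrative Python sketch using the figures from the slide:

```python
def campaign_roi(clv, acquisition_cost):
    """ROI = (value gained - cost) / cost."""
    return (clv - acquisition_cost) / acquisition_cost

cac = 100_000 / 16_000                    # 6.25 EUR per acquired customer
roi = campaign_roi(14.5, cac)             # CLV of 14.5 EUR from the previous slide
print(round(roi, 2))  # 1.32, i.e., 132%
```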

  41. From CLV to valuation (Spotify). "The intrinsic value of a company is the discounted value of cash that can be taken out of a business during its remaining life." (Warren Buffett)
     Let's assume that the value of a business is the combined value of its customers. Spotify makes nearly 100% of its profit from Premium subscribers (source: Spotify). Total Premium subscribers as of Q3 2020: 144 million (source: Statista). CLV ≈ 14.5 EUR (see previous slides).
     Total customer equity: 144M × 14.5 ≈ 2.088 billion
     Actual Spotify market cap: 42.33 billion

  42. What-if analysis (Spotify): getting from 2.088 billion to 42.33 billion. Spotify's long-term business model is based on a combination of growth in the user base, improvement in margins, reduction of churn, and increasing prices.

  43. What-if analysis (Spotify): getting from 2.088 billion to 42.33 billion.
     If monthly churn decreases by 3 percentage points: CLV ≈ 33.97; CE ≈ 4.89 billion (insufficient)
     Add a price hike of 5 EUR (costs stay the same): CLV ≈ 69.9; CE ≈ 9.94 billion (still insufficient)
     User base × 4 (market penetration rate 47.9%): CLV ≈ 69.9; CE ≈ 39.7 billion (we're getting there)

  44. What-if analysis. The CLV formula assumes that recurring revenue, contribution margin, and retention rate stay fixed. In reality, these figures are likely to change, leading to biased CLV models.
     Changes in the marketplace: new competitors, new products, and new technology can affect customer retention rates, the price a firm is able to charge for its products and services, and the costs of providing said products/services.
     Changes in the business model: the focal firm may also choose to raise retention rates by investing more in its service quality or lowering the price of its services (or, vice versa, raise the price or lower quality to make higher profits and hope to live with lower retention).
     => We should run multiple CLV calculations with different figures to see how sensitive the results are to fluctuations!

  45. Prioritization and targeting (case: T-Mobile).
     Pre-paid customers: monthly ARPU $37.95; monthly retention 96%; CLV ≈ $728
     Post-paid customers: monthly phone ARPU $46.04; monthly retention 98%; CLV ≈ $1500
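The slide does not state the margin or discount rate behind these figures. Assuming profit per period equals ARPU and a hypothetical 1% monthly discount rate (both assumptions, not from the slide), the same CLV formula reproduces them approximately:

```python
def clv(m, r, d):
    """Subscription CLV: (M * r) / (1 + d - r)."""
    return (m * r) / (1 + d - r)

d = 0.01  # ASSUMPTION: 1% monthly discount rate (not given on the slide)
# ASSUMPTION: monthly profit taken as ARPU itself (no margin given on the slide)
prepaid = clv(37.95, 0.96, d)   # ~729, close to the slide's ~$728
postpaid = clv(46.04, 0.98, d)  # ~1504, close to the slide's ~$1500
print(round(prepaid), round(postpaid))
```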

  46. Prioritization and targeting based on CLV. Customer lifetime value does not accurately measure the value of a customer to the company beyond the direct transactions that result from the relationship. Customers with low CLV may still be valuable! Consider a 2×2 of economic value (low/high) × relationship value (low/high): customers with high relationship value include halo accounts (e.g., the marketing benefit of having Apple, Coke, or Burger King as your client), referral accounts (e.g., customers who bring in other customers, such as immigrants with kids in Finland), and accounts that fuel co-creation and innovation in products and services (e.g., IDBM students at Biz).

  47. Any questions? Next week: Machine learning for marketers Course wrap-up
