ChurnBurner
Customer Behavior · 7 min read

Why Your Customer Health Score Is Lying to You

Most SaaS health scores are just login frequency with a traffic light on top. They tell you who already left, not who is leaving. Here is what to track instead.

The traffic light problem

Open any customer success platform and you will find a health score. It is usually a number from 0 to 100, color coded green, yellow, or red. It looks scientific. It feels actionable. And it is almost certainly misleading you.

Here is what most health scores actually measure: did the customer log in recently, and did their payment go through? That is it. Maybe there is a support ticket count thrown in for good measure. The score is a dressed-up activity check presented as predictive intelligence.

The problem is not that these inputs are wrong. Login frequency and payment status do correlate with retention. The problem is that by the time these metrics turn red, the customer has already decided to leave. You are not looking at a leading indicator. You are looking at a lagging confirmation of a decision that was made weeks ago.

  • 78%: health scores based on login + payment (per industry survey of CS platforms)
  • 3-6 weeks: gap between churn decision and cancellation (for monthly SaaS subscriptions)
  • <15%: save rate on red accounts (when intervention starts at the red stage)

Why login frequency is a terrible health metric

Login frequency is the most common health score input and the most deceptive. A customer logging in every day looks healthy. A customer who stopped logging in looks at risk. Simple.

Except that a customer logging in every day might be doing so because they are frustrated, trying repeatedly to get something to work. And a customer who stopped logging in might have automated their workflow through your API and no longer needs the UI at all.

Login frequency tells you nothing about value realization. It tells you whether someone opened your app, not whether they got anything useful out of it. A customer who logs in once a week but runs three reports, exports data, and shares dashboards with their team is far healthier than a customer who logs in daily but only checks one metric and leaves.

The metric that matters is not how often someone shows up. It is what they do when they get there, and whether the depth and breadth of their usage is expanding or contracting over time.

A customer whose feature usage narrows from 8 features to 2 over a quarter is showing a stronger churn signal than a customer whose login frequency drops from daily to weekly. Narrowing usage means they are consolidating to the minimum viable interaction before leaving.

The behavioral signals your health score should track

If login frequency and payment status are lagging indicators, what are the leading ones? Behavioral signals that capture how a customer's relationship with your product is changing over time.

Engagement decay rate. Not a binary "active or inactive" but a continuous measurement of whether engagement is trending up or down. A customer whose usage is declining at 10% month over month is in early drift, even if their absolute usage still looks healthy. EWMA (exponentially weighted moving average) smoothing catches this gradient better than simple period-over-period comparisons because it weights recent behavior more heavily.
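As a minimal sketch of that idea (not ChurnBurner's actual implementation; the weekly event counts and alpha value are illustrative assumptions), EWMA-based drift detection looks roughly like this:

```python
def ewma(values, alpha=0.3):
    """Exponentially weighted moving average; higher alpha weights
    recent periods more heavily."""
    smoothed = values[0]
    for v in values[1:]:
        smoothed = alpha * v + (1 - alpha) * smoothed
    return smoothed

def decay_rate(weekly_events, alpha=0.3):
    """Compare the EWMA of the recent half of the window against the
    earlier half. A negative result means engagement is decaying."""
    mid = len(weekly_events) // 2
    earlier = ewma(weekly_events[:mid], alpha)
    recent = ewma(weekly_events[mid:], alpha)
    return (recent - earlier) / earlier if earlier else 0.0

# A customer drifting ~5% week over week still looks "active" in
# absolute terms, but the smoothed trend is clearly negative:
trend = decay_rate([100, 95, 90, 85, 80, 75, 70, 68])
```

A simple period-over-period comparison would bounce around with weekly noise; the EWMA smooths that noise while still reacting to recent weeks first.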

Feature adoption breadth. Healthy customers explore and adopt new features over time. At-risk customers consolidate to fewer features. Measuring usage entropy (the distribution of activity across features) tells you whether a customer is becoming more embedded or more fragile. Low entropy means single-feature dependency, which is one missed use case away from cancellation.
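A rough sketch of usage entropy, assuming you have per-feature activity counts (the feature names and counts below are hypothetical):

```python
import math

def usage_entropy(feature_counts):
    """Shannon entropy of the activity distribution across features,
    normalized to [0, 1] by the maximum (perfectly uniform usage).
    Low values mean activity is concentrated in one or two features."""
    total = sum(feature_counts.values())
    if total == 0 or len(feature_counts) < 2:
        return 0.0
    probs = [c / total for c in feature_counts.values() if c > 0]
    h = -sum(p * math.log2(p) for p in probs)
    return h / math.log2(len(feature_counts))

# Broad, embedded usage vs. single-feature dependency:
embedded = {"reports": 40, "exports": 35, "dashboards": 30, "alerts": 25}
fragile = {"reports": 118, "exports": 2, "dashboards": 0, "alerts": 0}
```

The embedded account scores near 1.0; the fragile account scores near 0. Tracking this number quarter over quarter is what surfaces the 8-features-to-2 narrowing described above.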

Time-to-value milestones. Every product has key activation moments: the first report generated, the first integration connected, the first team member invited. Customers who hit these milestones on schedule retain at dramatically higher rates than those who are late. A health score that does not track milestone velocity is ignoring the single strongest predictor of long-term retention.
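In code, milestone velocity can be as simple as the fraction of activation milestones hit on schedule. This is a sketch; the milestone names and target days below are assumptions, not product-specific benchmarks:

```python
from datetime import date

# Days from signup by which each milestone "should" land (assumed targets)
TARGETS = {
    "first_report": 7,
    "first_integration": 14,
    "first_invite": 21,
}

def milestone_velocity(signup, hits):
    """Fraction of activation milestones reached on or before target.
    `hits` maps milestone name -> date reached, or None if never."""
    on_time = 0
    for name, target_days in TARGETS.items():
        reached = hits.get(name)
        if reached is not None and (reached - signup).days <= target_days:
            on_time += 1
    return on_time / len(TARGETS)

signup = date(2024, 3, 1)
hits = {"first_report": date(2024, 3, 5),       # day 4: on time
        "first_integration": date(2024, 3, 20),  # day 19: late
        "first_invite": None}                    # never happened
```

A velocity of 1/3 at the end of onboarding is exactly the kind of early, actionable flag that a login counter can never produce.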

Champion dependency. If 90% of an account's activity comes from a single user, that account's health is entirely dependent on one person's engagement. When that person gets busy, changes roles, or leaves the company, the account is dead. Multi-user adoption is structural health. Single-user concentration is structural risk.
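Champion dependency reduces to a concentration check: what share of account activity comes from the single most active user? A minimal sketch with hypothetical usernames and counts:

```python
def champion_share(events_by_user):
    """Share of account activity attributable to the single most
    active user. Near 1.0 means one person IS the account."""
    total = sum(events_by_user.values())
    return max(events_by_user.values()) / total if total else 0.0

# Spread adoption vs. one-champion concentration:
healthy = {"ana": 120, "ben": 90, "cho": 60, "dev": 30}   # top user: 40%
at_risk = {"ana": 270, "ben": 15, "cho": 10, "dev": 5}    # top user: 90%
```

Anything above roughly 80% concentration is worth an explicit play to seed additional users before the champion's next job change does it for you.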

Typical Health Score Inputs
  • Login frequency (did they show up?)
  • Payment status (did the charge succeed?)
  • Support tickets (are they complaining?)
  • NPS survey response (did they rate us?)
  • Days since last activity (binary threshold)
Behavioral Health Score Inputs
  • Engagement decay rate (are they drifting?)
  • Feature adoption breadth (are they deepening?)
  • Time-to-value velocity (did they activate on schedule?)
  • Champion dependency (is one person the whole account?)
  • Support sentiment trend (is frustration building?)

Sentiment is a signal, not a score

Support ticket volume is a common health score input, but volume alone misses the point. Three tickets in a month could mean a deeply engaged power user requesting features, or it could mean a frustrated customer about to leave. The number tells you nothing without context.

What matters is the direction sentiment is moving and the behavioral pattern that follows. A customer who writes a frustrated ticket and then continues using the product normally is venting. A customer who writes a frustrated ticket and then goes quiet for two weeks is leaving.

This is why sentiment analysis in isolation has limits. The words customers use are one data point. What they do after writing those words is a much stronger signal. Combining text sentiment with behavioral follow-through (did usage change after the interaction?) gives you a read on actual customer health that raw NPS scores and ticket counts never will.
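The venting-versus-leaving distinction is easy to encode once you join sentiment with usage. In this sketch, the sentiment score is an assumed input on a -1 to 1 scale from whatever text-analysis step you already run; the 50% usage-drop threshold is illustrative:

```python
def post_ticket_risk(sentiment, usage_before, usage_after):
    """Classify the pattern after a support interaction. A frustrated
    ticket followed by a usage drop is the danger pattern; a frustrated
    ticket with stable usage afterward is venting."""
    if sentiment >= 0:
        return "low"
    dropped = usage_before > 0 and usage_after / usage_before < 0.5
    return "high" if dropped else "watch"

print(post_ticket_risk(-0.7, usage_before=40, usage_after=38))  # → watch
print(post_ticket_risk(-0.7, usage_before=40, usage_after=5))   # → high
```

Same ticket text, same sentiment score, opposite risk levels. The behavioral follow-through is what separates them.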

Most customers will not reach out until they are already frustrated. The absence of support tickets is not a positive signal. It often means the customer gave up on getting help and is now evaluating alternatives quietly.

Normalizing by customer age changes everything

A customer with 3 logins in the past month could be healthy or at risk. It depends entirely on context. A one-month-old account with 3 logins is showing early engagement. A twelve-month-old account that used to log in 20 times a month and now logs in 3 times is in freefall.

Most health scores treat all customers the same. They apply the same thresholds regardless of tenure, plan size, or acquisition cohort. This produces two types of errors:

False positives on young accounts. New customers naturally have lower usage because they are still onboarding. A one-size-fits-all threshold flags them as at-risk when they are actually on a normal ramp.

False negatives on mature accounts. Long-tenured customers who are declining get scored as healthy because their absolute usage is still above the threshold, even though their trajectory is pointing straight down.

The fix is cohort normalization. Score each customer relative to peers at the same tenure stage (onboarding, activation, growth, mature). A customer at the 25th percentile of their cohort is at risk even if their raw numbers look acceptable. A customer at the 75th percentile of their cohort is healthy even if their raw numbers look low.
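A minimal sketch of that percentile logic, assuming you can pull peer usage for each tenure stage (the stage boundaries and peer numbers here are illustrative):

```python
from bisect import bisect_left

def tenure_cohort(months):
    """Assumed stage boundaries; tune to your own activation curve."""
    if months < 2:
        return "onboarding"
    if months < 4:
        return "activation"
    if months < 12:
        return "growth"
    return "mature"

def cohort_percentile(value, peer_values):
    """Rank a customer's usage against same-cohort peers (0-100)."""
    peers = sorted(peer_values)
    return 100 * bisect_left(peers, value) / len(peers)

# The same raw number, 3 logins/month, scored against two cohorts:
mature_peers = [3, 5, 8, 12, 15, 18, 20, 22, 25, 30]
onboarding_peers = [0, 1, 1, 2, 2, 3, 4, 5, 6, 8]
```

Against mature peers, 3 logins lands at the bottom of the distribution; against onboarding peers, the same 3 logins is median. Identical input, opposite verdicts, which is the whole point of normalizing.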

  • 2.3x: false positive reduction when normalizing by tenure cohort
  • 40%: more at-risk accounts caught that raw thresholds miss entirely
  • 4 stages: tenure normalization segments (onboarding, activation, growth, mature)

Build a health score that actually predicts

The difference between a decorative health score and a predictive one is not sophistication. It is inputs. Swap login frequency for engagement decay rate. Swap payment status for feature adoption entropy. Add milestone velocity and champion dependency. Normalize by cohort. Now you have a score that catches customers in Phase 1 (drift) instead of Phase 3 (detachment).
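Stitched together, the composite looks something like this sketch. The weights are illustrative placeholders, not a calibrated model, and each input is assumed to arrive pre-normalized to a 0 (worst) to 1 (best) scale:

```python
# Illustrative weights; a real score would calibrate these against
# historical churn outcomes.
WEIGHTS = {
    "engagement_decay": 0.30,
    "adoption_entropy": 0.20,
    "milestone_velocity": 0.25,
    "champion_spread": 0.15,
    "sentiment_trend": 0.10,
}

def health_score(signals):
    """Weighted composite on a 0-100 scale, returned alongside
    per-driver contributions so the score stays explainable."""
    drivers = {k: round(100 * WEIGHTS[k] * signals[k], 1) for k in WEIGHTS}
    return round(sum(drivers.values())), drivers

score, drivers = health_score({
    "engagement_decay": 0.9,    # barely declining
    "adoption_entropy": 0.7,    # broad feature usage
    "milestone_velocity": 1.0,  # activated on schedule
    "champion_spread": 0.4,     # somewhat concentrated
    "sentiment_trend": 0.8,     # mildly positive
})
# score → 80, with champion concentration visible as the weak driver
```

Returning the drivers alongside the number is what keeps the score from becoming another opaque traffic light: a CSM can see at a glance that this 80 is an 80 because of champion concentration, not engagement.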

The intervention window for drift-stage customers is 4 to 8 weeks. The save rate is roughly 70%. The intervention window for detachment-stage customers is days. The save rate is under 10%. Every week of earlier detection directly translates to recoverable revenue.

ChurnBurner builds this behavioral layer on top of your existing data. Connect your Stripe account and event stream, and you get a multi-dimensional risk score that combines payment signals, engagement decay, feature adoption, champion dependency, and support sentiment into a single score with transparent drivers showing exactly why each account is at risk.

Stop watching the traffic light. Start watching the behavior.

Start a 14-day free trial at churnburner.com.

ChurnBurner customers see an average 4.3x lift in the top decile of risk predictions, meaning the accounts flagged as highest risk are 4.3 times more likely to actually churn than a random selection. That is the difference between a health score that decorates your dashboard and one that saves revenue.

Ready to see your risk scores?

Connect your Stripe account and get your first churn risk report in under 5 minutes. 14-day free trial, no credit card required.

Start free trial