What Our Wearable Can't Tell Us: The Limits of HRV, Sleep Metrics, and Physical Scores
There's something almost seductive about waking up and checking our numbers before we even check in with ourselves. Our app tells us our sleep was 87% efficient. Our readiness score says 72: good, but not great. HRV is up slightly from yesterday. We process these outputs and begin forming a picture of how we'll perform today, usually before we've had a moment to notice how we actually feel. By the time we look inward, if we do at all, the cold, hard data has already skewed our interpretation.
The appeal is obvious. These devices are remarkable pieces of engineering: they track things our nervous system can't consciously access, surface patterns we'd miss over months of guesswork, and translate complex physiological signals into something actionable. That said, there's a gap between what they actually measure and what we may assume they're measuring. Whether we're an athlete, a student, a professional, or a parent, understanding that gap matters tremendously.

What HRV and Sleep Scores Are Actually Capturing
Heart rate variability (HRV), the natural fluctuation in the time between heartbeats, has become one of the most popular proxies for recovery and readiness. A higher HRV generally signals that the autonomic nervous system, the largely automatic system that governs heart rate, digestion, and stress response, is in a more flexible, recovered state. A lower reading suggests it's under stress.
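To make the idea concrete, here is a rough sketch of one common time-domain HRV statistic, RMSSD (root mean square of successive differences), computed from beat-to-beat intervals. The interval values below are hypothetical, and commercial devices use their own proprietary pipelines; this is only an illustration of the underlying arithmetic.

```python
from math import sqrt

def rmssd(rr_intervals_ms):
    """RMSSD: root mean square of successive differences between
    consecutive RR (beat-to-beat) intervals, in milliseconds.
    Larger values generally indicate more beat-to-beat variability."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical overnight RR intervals (ms); a real recording has thousands.
print(round(rmssd([812, 845, 790, 830, 815, 800]), 1))  # prints 35.1
```

A perfectly metronomic heart (identical intervals) would score 0; more variability pushes the number up, which is why RMSSD and related statistics serve as proxies for autonomic flexibility.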
Sleep scores follow similar logic. They typically pull from movement data, heart rate, breathing patterns, and time estimates for different sleep stages to produce a single number meant to represent how restorative the night was.
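As a toy example of how several inputs might collapse into a single number, consider the sketch below. The weights, targets, and penalty are entirely made up for illustration; no vendor publishes its actual formula, and real scores incorporate far more signals.

```python
def sleep_score(efficiency, deep_frac, rem_frac, awakenings):
    """Toy composite sleep score (0-100) with hypothetical weights.
    efficiency, deep_frac, rem_frac are fractions in [0, 1];
    awakenings is a count of detected wake events."""
    score = (50 * efficiency                       # time asleep vs. time in bed
             + 25 * min(deep_frac / 0.20, 1.0)     # deep sleep vs. a ~20% target
             + 25 * min(rem_frac / 0.25, 1.0))     # REM sleep vs. a ~25% target
    score -= 2 * awakenings                        # small penalty per awakening
    return max(0, min(100, round(score)))

print(sleep_score(efficiency=0.87, deep_frac=0.15, rem_frac=0.22, awakenings=3))
# prints 78
```

The point of the sketch is the structure, not the numbers: many imperfect estimates get weighted and summed, and the resulting single score inevitably discards most of the detail that produced it.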
These are genuinely useful signals that reflect essential physiological processes, but they capture only a narrow slice of a much larger picture. More specifically, they measure what our body is doing at the "hardware" level, meaning what's physically happening in our body. They say very little about what's happening at the "software" level, meaning how we're interpreting our situation and environment, what emotional weights we're carrying, and what story we're telling ourselves about today and who we are. Yes, all of these pieces impact our physical metrics, but much of the nuance is lost in translation.
The Gap Between Signal and State
At one point or another, many of us have likely woken up with excellent HRV, a low resting heart rate, and solid sleep scores, yet felt completely depleted. Another person can post a low score after a restless night and feel surprisingly charged up and ready to move. Neither of these outcomes is a malfunction; they're evidence that physiological readiness and experienced readiness aren't the same thing.
Research consistently shows that perceived exertion, mood, and self-reported wellbeing are among the strongest predictors of performance outcomes, often more predictive than objective physiological markers. Subjective wellbeing measures (e.g. questionnaires, self-report scales, open-ended conversations) have been found to track training load and performance changes better than many physiological markers alone. Multiple studies in team sports have shown similar patterns, highlighting that how athletes said they felt was a reliable signal that objective metrics often missed.
A core reason for this comes down to what drives performance in the first place. Output, whether athletic, cognitive, or creative, isn't generated by hardware alone. It emerges from the interaction between our physical state and how our brain is interpreting and orienting to that state. Two people with identical physical metrics can perform very differently depending on their mental framing of themselves and the environment, their history with a given task, and how they interpret their performance metrics.
What Subjective State Tracks
It's useful to define subjective state here: subjective state is the full internal experience we're actually living from, not "mood" in the everyday sense, but our sense of how capable we are, our relationship to effort, the quality of our attention, how connected or isolated we feel, the narrative we're telling ourselves about our situation, and other factors. These aren't "soft" variables. They're functionally upstream of performance, where how we think largely determines how we perform.
The prefrontal cortex, which is the part of the brain responsible for planning, decision-making, and regulating emotional response, is incredibly sensitive to perceived threat and meaning. When we feel overwhelmed, uncertain, or behind, even slightly, it shifts resources away from higher-order processing. This shows up in how we perform long before it shows up in any wearable metric. Conversely, a genuine sense of engagement or purpose can compensate for physical fatigue in ways that no algorithm currently captures.
This is also why the same objective workload feels completely different depending on context. A hard training week that precedes a competition almost always feels different from an identical training week that follows a disappointing result. The physiology may look similar, but the experience, and often the performance, don't.
Where Wearables Struggle Most
The data gap tends to be widest in three areas. First, during periods of psychological stress that haven't yet translated into measurable physiological disruption. Early-stage burnout, relationship strain, and low-grade anxiety can erode performance significantly before HRV or sleep quality noticeably shift. By the time the wearable signal degrades, we're often already well behind.
The second gap happens during recovery. Whether we’re genuinely recovering or just resting is partly a function of our mental relationship to rest (i.e. whether we’re able to disengage, whether we feel permission to not be productive, and whether rumination continues to run in the background). A night of adequate sleep duration doesn't guarantee restorative sleep if our nervous system stays on low-level alert.
The third gap happens in the context of accumulated narrative load, which is the weight of ongoing stories about our identity, capabilities, and trajectory. A string of poor numbers can become dangerous if we let it. Rather than just reflecting a current state or a slump, it can start to shape how we interpret future data and how we see ourselves. The poor numbers become a self-fulfilling framing.
The Case for Integrated Assessment
None of this is an argument against wearables. As mentioned, the data they surface is massively valuable, particularly for detecting long-term trends, catching early signs of illness or overtraining, and providing a more objective anchor against which subjective experience can be checked. The deeper issue is using them as the primary or sole input for assessing our readiness and capacity to show up.
Complete performance orientation requires both streams: what our body is doing and what we're actually experiencing. Neither is sufficient alone. A low HRV reading means something different when paired with a subjective sense of heaviness and dread than when it follows a night of vivid dreams and disrupted sleep that still left us feeling surprisingly clear.
The most accurate picture of readiness isn't a single score. It's the relationship between the signals and our experience: where they align, where they diverge, and what that pattern tells us about where we actually are and where we're heading.
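One way to picture this pairing is a simple divergence check: compare the device's readiness number against a quick self-rating and flag when the two disagree enough to warrant a closer look. The scales, threshold, and function below are hypothetical, a sketch of the idea rather than a validated protocol.

```python
def readiness_check(device_score, felt_score, divergence_threshold=20):
    """Compare a device readiness score with a self-reported
    'how do I actually feel' rating (both 0-100, hypothetical scales)
    and flag meaningful divergence between the two."""
    gap = felt_score - device_score
    if abs(gap) < divergence_threshold:
        return "aligned: the number and the experience roughly agree"
    if gap > 0:
        return "diverging: feeling better than the data suggests"
    return "diverging: feeling worse than the data suggests"

# A 'good' device score paired with a depleted self-rating gets flagged.
print(readiness_check(device_score=72, felt_score=40))
```

The interesting cases are the divergences: they're exactly the mornings when a single score, taken at face value, would point us in the wrong direction.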
We've built remarkable tools for measuring the body. The next layer of understanding is learning to treat our own internal experience with equal seriousness.
References
Saw, A. E., Main, L. C., & Gastin, P. B. (2016). Monitoring the athlete training response: subjective self-reported measures outperform commonly used objective measures. British Journal of Sports Medicine, 50(5), 281–286. https://doi.org/10.1136/bjsports-2015-094758
McLean, B. D., Coutts, A. J., Kelly, V., McGuigan, M. R., & Cormack, S. J. (2010). Neuromuscular, endocrine, and perceptual fatigue responses during different length between-match microcycles in professional rugby league. International Journal of Sports Physiology and Performance, 5(3), 367–383. https://doi.org/10.1123/ijspp.5.3.367
Meeusen, R., Duclos, M., Foster, C., Fry, A., Gleeson, M., Nieman, D., ... & Urhausen, A. (2013). Prevention, diagnosis, and treatment of the overtraining syndrome. European Journal of Sport Science, 13(1), 1–24. https://doi.org/10.1080/17461391.2012.730061
Kivimäki, M., & Steptoe, A. (2018). Effects of stress on the development and progression of cardiovascular disease. Nature Reviews Cardiology, 15(4), 215–229. https://doi.org/10.1038/nrcardio.2017.189
Plews, D. J., Laursen, P. B., Stanley, J., Buchheit, M., & Kilding, A. E. (2013). Training adaptation and heart rate variability in elite endurance athletes. Sports Medicine, 43(9), 773–791. https://doi.org/10.1007/s40279-013-0071-8
Beedie, C. J., & Foad, A. J. (2009). The placebo effect in sports performance. Sports Medicine, 39(4), 313–329. https://doi.org/10.2165/00007256-200939040-00004


