Methodology

How Peregrine computes the numbers it shows you.

This page documents every analytic on the site: what the metric measures, how it's computed, where the math comes from, and where it is an internal heuristic rather than a published method. The goal is auditability. If a number on a Peregrine page looks wrong, you should be able to come here and figure out why.

Contents

Readiness score
Form, Fitness, Fatigue
Load ratio (ACWR)
Heat- and grade-adjusted pace (HAP)
Personal pace zones
Workout classification
VO2max estimation
Race time prediction
Confidence scoring
Marathon readiness
Fitness categories
"What's New" detectors
Plan and workout recommendations
References

Readiness score

The readiness score combines four signals into a 0–100 number that frames how today's training should feel. Form (TSB) percentile against your 90-day history contributes 40%. Recent training load versus your 28-day average contributes 30%. Days since your last hard run contributes 20%. Recent long-run heart-rate drift against baseline contributes 10%.

The score is binned into three bands: green (≥70) indicates room to push, amber (40–69) suggests moderate effort, rose (<40) signals accumulated fatigue. The headline sentence is deterministic given the score and the dominant component; the same inputs always produce the same headline.
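The weighting and banding above can be sketched as follows. This is a minimal illustration, assuming each of the four signals has already been normalized to a 0–100 sub-score; that normalization isn't specified on this page, and the component and function names are hypothetical.

```python
# Weights come from the text above; the key names are illustrative only.
WEIGHTS = {
    "form_percentile": 0.40,   # Form (TSB) percentile vs. 90-day history
    "load_vs_28d": 0.30,       # recent load vs. 28-day average
    "days_since_hard": 0.20,   # days since last hard run
    "hr_drift": 0.10,          # long-run heart-rate drift vs. baseline
}

def readiness_score(components: dict[str, float]) -> float:
    """Weighted sum of 0-100 sub-scores, clamped to the 0-100 range."""
    total = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return max(0.0, min(100.0, total))

def readiness_band(score: float) -> str:
    """Map a score to the three bands described in the text."""
    if score >= 70:
        return "green"
    if score >= 40:
        return "amber"
    return "rose"
```

Because both functions are pure, the same inputs always yield the same score and band, which is what makes a deterministic headline possible.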

Weather, when available, can adjust the recommended effort phrasing but never changes the band. If forecast temperature exceeds about 22°C the headline mentions heat; the score itself stays driven by training signals.

This is an internal composite, not a published score. The weights were tuned against typical training patterns. Runners with unusual schedules (e.g. ultra-runners doing 20-hour single runs, athletes returning from injury) may find the score biased and should weight the per-component breakdown in the Today card tooltip more heavily than the headline number.

Form, Fitness, Fatigue

The Numbers panel surfaces three values derived from a TRIMP series. TRIMP (Training Impulse) is a single number per run that combines duration and heart-rate intensity. Peregrine uses Banister's exponentially-weighted form^[6]: each run's TRIMP is computed from time-in-zone using your max heart rate, and the result feeds two exponential moving averages.

Fitness (CTL) is a 42-day exponential average of daily TRIMP. It moves slowly. Think of it as the training load your body has adapted to.

Fatigue (ATL) is a 7-day exponential average of the same daily TRIMP series. It moves quickly. Think of it as the load your body is currently carrying.

Form (TSB) is Fitness minus Fatigue. A positive Form value means you're rested relative to your chronic load. A negative Form value means you're carrying more recent load than your baseline. Form's color bands mirror the readiness bands so the colors read consistently.
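A minimal sketch of the three series, assuming the common recursive update in which each day moves the average toward that day's TRIMP by 1/N of the gap (N = 42 or 7). The exact smoothing constant Peregrine uses isn't stated here, so treat the update rule and the function name as assumptions.

```python
def fitness_fatigue_form(daily_trimp, ctl_days=42, atl_days=7):
    """Return one (ctl, atl, tsb) tuple per day.

    daily_trimp: one TRIMP total per calendar day, oldest first,
    with rest days included as 0 so the series stays continuous.
    """
    ctl = atl = 0.0
    series = []
    for trimp in daily_trimp:
        ctl += (trimp - ctl) / ctl_days    # Fitness: slow-moving average
        atl += (trimp - atl) / atl_days    # Fatigue: fast-moving average
        series.append((ctl, atl, ctl - atl))  # Form = Fitness - Fatigue
    return series
```

With this update rule, a steady block of training drives Form negative (Fatigue rises faster than Fitness), and a rest week pulls it back above zero because Fatigue also decays faster.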

If you don't have heart-rate data on a run, TRIMP falls back to a pace-based estimate using your own pace distribution. This is less accurate but keeps the series continuous. The pace-based fallback uses a sigmoid relationship between pace-as-percentile and effort intensity.

Load ratio (ACWR)

Load ratio is the acute-to-chronic workload ratio^[4]: your 4-week average mileage divided by your 12-week average mileage. The ratio describes how quickly your training is ramping up or down. Values between 0.8 and 1.3 sit in the "safe corridor." Ratios above 1.5 correlate with elevated injury risk in the published literature; ratios below 0.8 typically indicate detraining or a deliberate down week.
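As a rough sketch of that calculation, assuming weekly mileage totals are already available, most recent week last. The zone labels other than the "safe corridor" are hypothetical, and the text doesn't name the 1.3–1.5 range, so "caution" is an assumption.

```python
def load_ratio(weekly_miles):
    """4-week average mileage divided by 12-week average mileage.

    Returns None when there is less than 12 weeks of history or the
    chronic average is zero.
    """
    if len(weekly_miles) < 12:
        return None
    acute = sum(weekly_miles[-4:]) / 4
    chronic = sum(weekly_miles[-12:]) / 12
    if chronic == 0:
        return None
    return acute / chronic

def load_zone(ratio):
    """Bucket a ratio using the thresholds quoted in the text."""
    if ratio < 0.8:
        return "low"             # detraining or a deliberate down week
    if ratio <= 1.3:
        return "safe corridor"
    if ratio <= 1.5:
        return "caution"         # assumed label; not named in the text
    return "high"
```

For example, eight weeks at 20 miles followed by four weeks at 30 gives an acute average of 30 and a chronic average of about 23.3, so the ratio lands just inside the corridor.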

The Numbers panel shows the ratio as a colored band with a dot at your current position. The strip spans 0 to 1.9; values past 1.9 peg the dot at the right edge.

The original ACWR research is contested. Some recent papers have questioned how reliably the metric predicts injury at the individual level^[7]. We surface it because it's the standard load-ramp signal in endurance training and remains useful as a directional indicator, not as a forecast. A high ratio is a flag to consider; it isn't a verdict.

Heat- and grade-adjusted pace (HAP)

Pace alone is a noisy indicator of effort. A 7:30/mi run on a 90°F (about 32°C) day at 3% incline isn't the same effort as 7:30/mi flat on a cool morning. HAP normalizes pace to equivalent flat-and-cool conditions so two runs can be compared like-for-like.

The grade adjustment uses Strava's published grade-adjusted-pace coefficients: the metabolic cost of running scales non-linearly with slope, and the corrected pace reflects what you'd have run on the flat for the same metabolic load.

The heat adjustment is based on published correlations between ambient temperature and pace decrement^[8]. Warmer than about 13°C, pace slows by a roughly logarithmic amount as temperature rises. HAP undoes that decrement. Below about 13°C the adjustment is zero, and cool weather isn't penalized.

Activity Log rows display three columns side by side: Pace (raw), GAP (grade-adjusted only), and HAP (grade and heat). When weather data isn't available for the run, HAP equals GAP. When neither weather nor GPS elevation is available, all three columns show the same pace.
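The fallback behavior of the three columns can be sketched like this. The 13°C threshold comes from the text; the logarithmic shape follows the description above, but the coefficient is a hypothetical placeholder rather than Peregrine's fitted value, and the grade-adjusted pace is taken as an input instead of being computed.

```python
import math

HEAT_THRESHOLD_C = 13.0   # from the text
HEAT_COEFF = 0.02         # hypothetical placeholder coefficient

def heat_slowdown(temp_c):
    """Fractional pace decrement; zero at or below the threshold."""
    if temp_c is None or temp_c <= HEAT_THRESHOLD_C:
        return 0.0
    return HEAT_COEFF * math.log1p(temp_c - HEAT_THRESHOLD_C)

def pace_columns(raw_sec, gap_sec=None, temp_c=None):
    """Return (pace, gap, hap) in seconds per mile.

    gap_sec: grade-adjusted pace, or None when no elevation data exists.
    temp_c:  ambient temperature, or None when no weather data exists.
    """
    gap = raw_sec if gap_sec is None else gap_sec
    hap = gap / (1.0 + heat_slowdown(temp_c))  # undo the heat decrement
    return raw_sec, gap, hap
```

With no weather the heat term is zero and HAP equals GAP; with no elevation either, all three values are the raw pace.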

Personal pace zones

Most analytics platforms use formula-based pace zones tied to a single race time or VDOT estimate. Peregrine's zones are derived from your own pace distribution. The Training page runs a kernel density estimate over your historical paces and identifies the modal peaks: clusters where you actually run a lot of miles. Those peaks become your zone centers.

The implementation is a pure-Python Epanechnikov KDE with peak detection, no SciPy dependency. The bandwidth is automatically chosen using a rule that scales with sample size. Edge handling clips to the observed pace range to prevent spurious peaks at the tails.
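A minimal pure-Python sketch of this approach follows. It is not Peregrine's code: the default bandwidth uses a normal-reference rule scaled by n^(-1/5), whose constant is an assumption, and the `min_rel_height` filter and function names are hypothetical.

```python
import statistics

def epanechnikov_kde(samples, bandwidth=None, grid_points=200):
    """Return (grid, density) evaluated over the observed pace range."""
    n = len(samples)
    if bandwidth is None:
        # Assumed rule: shrinks as the sample size grows.
        bandwidth = 2.34 * statistics.stdev(samples) * n ** (-0.2)
    lo, hi = min(samples), max(samples)   # clip the grid to observed paces
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + i * step for i in range(grid_points)]
    density = []
    for x in grid:
        total = 0.0
        for s in samples:
            u = (x - s) / bandwidth
            if abs(u) < 1.0:                  # kernel has finite support
                total += 0.75 * (1.0 - u * u)
        density.append(total / (n * bandwidth))
    return grid, density

def find_peaks(grid, density, min_rel_height=0.1):
    """Interior local maxima at least min_rel_height of the global max."""
    floor = min_rel_height * max(density)
    return [grid[i] for i in range(1, len(grid) - 1)
            if density[i] >= floor
            and density[i] > density[i - 1]
            and density[i] >= density[i + 1]]
```

Restricting peak detection to interior grid points is one simple way to avoid reporting the edges of the observed range as peaks. On strongly bimodal data a normal-reference bandwidth can oversmooth, so passing an explicit `bandwidth` is useful when experimenting.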

KDE zones work well for runners with a mix of easy and quality runs. Runners who do every run at the same effort (e.g. all easy, all the time) will see a single broad peak. The classifier then leans on duration and heart-rate signals rather than zone position. This is by design: a runner without quality variation doesn't have zones to detect.

Workout classification

Each run is tagged with one of eight types: easy, recovery, long, tempo, threshold, workout, race, or steady. The classifier is a deterministic decision tree with the following inputs: pace relative to your personal pace zones, duration, average heart rate relative to max, and recent training context (the prior seven days of classified runs).

The order of decision matters. A run flagged by Strava with workout_type=1 (Race) is trusted as-is. Otherwise the tree checks duration first (long runs are duration-defined, ≥60 minutes by default), then race-distance proximity (a hard effort at a canonical race distance is a race), then pace-zone position with heart-rate corroboration. A short run after a recent hard session is a recovery run even if its pace would otherwise place it as easy.
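The ordering described above can be sketched as a short chain of checks. Only the order and the 60-minute long-run default come from the text; the `Run` fields, the 5% distance tolerance, and the heart-rate and recovery thresholds are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Optional

RACE_DISTANCES_KM = (1.609, 5.0, 10.0, 21.0975, 42.195)

@dataclass
class Run:
    duration_min: float
    distance_km: float
    pace_zone: str                 # zone label from the personal pace zones
    hr_frac_max: Optional[float]   # average HR / max HR, None without HR data
    strava_workout_type: Optional[int] = None
    days_since_hard: Optional[int] = None

def classify(run: Run, long_min: float = 60.0) -> str:
    # 1. A Strava race flag is trusted as-is.
    if run.strava_workout_type == 1:
        return "race"
    # 2. Duration first: long runs are duration-defined.
    if run.duration_min >= long_min:
        return "long"
    # 3. A hard effort near a canonical race distance is a race.
    hard = run.hr_frac_max is not None and run.hr_frac_max >= 0.90  # assumed
    near_race = any(abs(run.distance_km - d) / d <= 0.05            # assumed
                    for d in RACE_DISTANCES_KM)
    if hard and near_race:
        return "race"
    # 4. A short easy-paced run right after a hard session is recovery.
    if (run.pace_zone == "easy" and run.duration_min <= 40          # assumed
            and run.days_since_hard is not None
            and run.days_since_hard <= 1):
        return "recovery"
    # 5. Pace zone, downgraded when heart rate doesn't corroborate it.
    if (run.pace_zone in ("tempo", "threshold")
            and run.hr_frac_max is not None and run.hr_frac_max < 0.75):
        return "steady"
    return run.pace_zone
```

Because the checks run top to bottom and return on the first match, reordering them changes the output, which is why the order is documented.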

Each classification carries a confidence value (high, medium, low) based on how strongly the inputs agreed. The Activity Log Type column shows the classification; low-confidence types render in a muted color so users know not to read too much into them.

The same decision tree exists in both Python (server-side, used by the planner and recommender) and JavaScript (client-side, used by the Activity Log and the "What's New" detectors). Snapshot tests verify the two implementations produce identical output on a shared fixture file. If they ever diverge, the snapshot test fails and deployment is blocked.

VO2max estimation

Each run is converted to a Daniels VDOT value^[1] from its pace and distance. For runs with heart rate data, the raw VDOT is projected upward using the run's heart rate reserve, because a sub-maximal effort uses only part of your aerobic capacity and so understates it. The projection is conservative by design: easy runs get a small upward adjustment, not a large one, since over-aggressive projection can produce implausibly fast race predictions. Each run carries its own confidence weight based on how race-like the effort was.

Your rolling VO2max estimate is a weighted aggregation across all qualifying runs in a rolling window. The window length is automatic, scaling with how much recent data is available. Runs with higher race-likeness are weighted more heavily than easy long runs.
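A sketch of the per-run conversion and the weighted aggregation. The `vdot` function uses the widely cited Daniels–Gilbert regression; whether Peregrine uses exactly these coefficients is an assumption. The heart-rate-reserve projection is omitted, and the weights passed to `rolling_estimate` are hypothetical.

```python
import math

def vdot(distance_m: float, time_min: float) -> float:
    """VDOT for a single effort from distance (m) and time (min)."""
    v = distance_m / time_min                      # metres per minute
    vo2 = -4.60 + 0.182258 * v + 0.000104 * v * v  # oxygen cost of the pace
    # Fraction of VO2max sustainable for an effort of this duration.
    pct_max = (0.8
               + 0.1894393 * math.exp(-0.012778 * time_min)
               + 0.2989558 * math.exp(-0.1932605 * time_min))
    return vo2 / pct_max

def rolling_estimate(runs):
    """Weighted mean over (vdot_value, weight) pairs, or None if empty."""
    runs = list(runs)
    total_weight = sum(w for _, w in runs)
    if total_weight == 0:
        return None
    return sum(value * w for value, w in runs) / total_weight
```

A 20:00 5K works out to a VDOT of roughly 49.8 with these coefficients. In the aggregate, a race-like effort with weight 1.0 dominates an easy long run with weight 0.2, so one slow long run barely moves the estimate.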

Race time prediction

Race times are blended from three signals: a Quality signal from your best recent efforts, a CTL signal from your training volume, and a Trend signal from the direction your fitness is moving. The blend weights shift by distance.

Short races weight Quality heavily because race-like efforts are the best predictor of mile and 5K times. Marathon predictions weight training volume and long-run readiness, since marathons are paced well below VO2max and depend on aerobic adaptations that volume builds. The model ensemble also includes Riegel's power law^[2] at two exponents. The conservative 1.08 is included because empirical work has shown the standard 1.06 underestimates marathon time substantially for recreational runners^[3].
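Riegel's relationship is T2 = T1 × (D2 / D1)^k. A minimal sketch at the two exponents mentioned above; the function name and the example half-marathon time are illustrative.

```python
def riegel(known_time_s: float, known_dist_km: float,
           target_dist_km: float, exponent: float = 1.06) -> float:
    """Predict a time at target_dist_km from one known performance."""
    return known_time_s * (target_dist_km / known_dist_km) ** exponent

# Example: project a 1:45:00 half marathon to the marathon.
half_s = 1 * 3600 + 45 * 60
standard = riegel(half_s, 21.0975, 42.195, exponent=1.06)
conservative = riegel(half_s, 21.0975, 42.195, exponent=1.08)
```

For this input the 1.08 projection comes out about three minutes slower than the 1.06 one, which shows how much the exponent alone moves a marathon prediction.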

The Race Predictor page shows a median prediction plus a 5th–95th percentile band. The band width is the prediction interval; it widens when models disagree (low confidence) and narrows when models agree (high confidence).

Confidence scoring

Confidence is per-distance and reflects how well your data supports a prediction at that distance, not just how wide the prediction interval is. A runner with consistent hard intervals but no long runs can have high mile and 5K confidence alongside low marathon confidence on the same dataset. Four factors feed in: quality of efforts, training volume relative to a distance-specific target, long-run readiness (for half and longer), and how much the three blended signals agree.

Marathon readiness

Six factors are scored independently: peak weekly volume, consistency, longest single run, building trend, aerobic fitness, and injury risk via the acute-to-chronic workload ratio^[4]. Each factor has its own threshold for full credit, calibrated against typical marathon training norms. The composite score and the per-factor breakdown both render on the Race page.

Fitness categories

VO2max categories on the VO2 page (Poor, Fair, Good, Excellent, Superior) follow ACSM age- and sex-adjusted population norms^[5], the same scale used in clinical fitness assessments. A "good" VO2max for a 25-year-old is different from "good" at 55.

"What's New" detectors

The Overview page's "What's New" panel surfaces backward-looking observations from your training. Each detector is a pure function that returns either a finding with a significance score or nothing. Findings below significance 20 are filtered out. The top three by significance render visibly; the rest hide behind a "Show more" expander.
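The filter-and-rank step described above can be sketched as follows. The `Finding` shape and the function name are hypothetical; the significance floor of 20 and the top-three split come from the text.

```python
from typing import Callable, NamedTuple, Optional

class Finding(NamedTuple):
    detector: str
    significance: float
    text: str

# A detector is a pure function: training data in, a Finding or None out.
Detector = Callable[[list], Optional[Finding]]

def whats_new(runs, detectors, min_significance=20, visible=3):
    """Return (shown, hidden) findings, highest significance first."""
    findings = [f for d in detectors if (f := d(runs)) is not None]
    kept = sorted(
        (f for f in findings if f.significance >= min_significance),
        key=lambda f: f.significance,
        reverse=True,
    )
    return kept[:visible], kept[visible:]   # the rest sit behind "Show more"
```

Keeping each detector pure means the panel can be recomputed from the same training data and always produce the same ranking.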

The current detector set, in significance order:

PR. Personal best at a canonical race distance (1 mile, 5K, 10K, half, marathon) within the last 30 days. Distance match is ±5–10% depending on distance. Lifetime PRs out-rank rolling-window PRs.
Goal transition. Fires only when an active goal has just been met or has just expired. Active-goal progress stays on the Today card to avoid duplication.
Pace trend. Median pace of your classifier-tagged easy runs, current 14 days versus the prior 14. When most of your easy runs have heart-rate data, the comparison narrows to a ±3 bpm band around your easy-pace HR. This is the controlled-effort fitness signal.
Sustained block. At least four consecutive weeks above the 70th percentile of your annual weekly volume. Threshold adapts to your training pattern.
Volume milestone. Your current week is a personal best (new-user mode, <52 weeks of history) or sits in the top 10% of your weekly history (power-user mode).
First-time fast. An easy run in the last 7 days was faster than every easy run in the prior 90 days. Catches "your aerobic ceiling moved" events the PR detector misses because they aren't at canonical race distances.
Long-run streak. Four or more consecutive weeks with a classifier-tagged long run, including the current week.
Workout cluster. Three or more quality sessions in the last 14 days, when the prior 14 days had one or fewer.
Workout drought. First quality session of a type (tempo, threshold, workout, race) after a gap of at least 21 days.
Year-over-year. This week's mileage differs by 10% or more from the same calendar week one year ago. Only fires for users with at least 53 weeks of history.
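To make the detector shape concrete, here is a sketch of the Workout cluster rule from the list above. The window boundaries (whether day 14 is inclusive) and the significance value are assumptions; the set of quality types follows the Workout drought entry.

```python
from datetime import date, timedelta

QUALITY = {"tempo", "threshold", "workout", "race"}

def workout_cluster(runs, today):
    """runs: iterable of (run_date, run_type). Returns a finding or None."""
    recent_start = today - timedelta(days=14)
    prior_start = today - timedelta(days=28)
    recent = sum(1 for d, t in runs
                 if t in QUALITY and recent_start < d <= today)
    prior = sum(1 for d, t in runs
                if t in QUALITY and prior_start < d <= recent_start)
    if recent >= 3 and prior <= 1:
        # Significance of 50 is a placeholder, not Peregrine's value.
        return {"detector": "workout_cluster",
                "significance": 50,
                "count": recent}
    return None
```

The function reads only its arguments and returns either a finding or nothing, which is the contract every detector in the list follows.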

Each finding ships with multiple sentence variants. A day-stable seed picks which variant displays. The same date and the same athlete produce the same wording on every page load, and the wording re-rolls at midnight.
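One way to get a day-stable choice is to hash the athlete, the finding, and the calendar date, then index into the variant list. This is a sketch of the idea, not Peregrine's seed scheme; the key format and the function name are assumptions.

```python
import hashlib
from datetime import date

def pick_variant(athlete_id: str, finding_key: str,
                 day: date, variants: list[str]) -> str:
    """Deterministically choose one variant for this athlete and day."""
    seed = f"{athlete_id}:{finding_key}:{day.isoformat()}".encode()
    digest = hashlib.sha256(seed).digest()
    index = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[index]
```

A cryptographic hash is used here instead of Python's built-in `hash()`, which is randomized per process for strings and would break stability across page loads. A new date changes the seed, though the re-roll can still land on the same variant by chance.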

The detectors are internal heuristics. The thresholds (21-day drought, top-10% volume, 5 sec/km pace delta) were tuned for typical training patterns rather than derived from published research, and they may evolve as the panel sees real use.

Plan and workout recommendations

The Plan page shows a rolling 7-day forward plan built by a greedy walk algorithm. Each day's workout is chosen from a ranked list of candidates produced by a rule-based recommender. The recommender considers readiness, recent classifier output, days since hard, goal phase if a goal is set, and day-of-week patterns derived from your own history.

Five candidate workout types are evaluated per day: easy, recovery, long, tempo, and rest. Each candidate gets a score; the top-ranked one becomes the day's default. Users can override any day from the Plan page's customize controls. Overrides persist; they replace the default until cleared.

The planner runs a forward projection of fitness and fatigue assuming the recommended plan is followed. Projected CTL, ATL, and TSB feed back into later days' recommendations. A recommendation for Friday accounts for how Thursday's recommended workout would affect Friday's readiness. The plan is regenerated lazily when training data changes meaningfully.

If a recommended workout would push the projected acute-to-chronic load ratio outside the safe corridor, the planner backtracks and selects a lower-load alternative. The same backtracking handles taper periods near a goal race: four taper rules apply during the final two weeks before a target date.
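The greedy walk with a load check can be sketched as below. This is a simplification: it projects only the mileage-based load ratio (not CTL, ATL, and TSB), replaces backtracking with a fall-through to the next-ranked candidate, and ignores taper rules. The function names and the per-workout mileages are hypothetical.

```python
def projected_ratio(daily_miles):
    """4-week weekly average over 12-week weekly average, from daily miles."""
    acute = sum(daily_miles[-28:]) / 4
    chronic = sum(daily_miles[-84:]) / 12
    return acute / chronic if chronic else 0.0

def plan_days(history, ranked_by_day, safe_max=1.3):
    """Pick one workout per day, best-ranked first, within the corridor.

    history:       past daily miles, oldest first.
    ranked_by_day: for each plan day, a best-first list of
                   (workout_name, miles) candidates.
    """
    projected = list(history)
    plan = []
    for candidates in ranked_by_day:
        # Fallback if nothing fits: the lowest-load candidate.
        choice = min(candidates, key=lambda c: c[1])
        for name, miles in candidates:
            if projected_ratio(projected + [miles]) <= safe_max:
                choice = (name, miles)
                break
        plan.append(choice[0])
        projected.append(choice[1])   # later days see this day's load
    return plan
```

Appending each chosen day to the projection before moving on is what lets Friday's choice account for Thursday's recommended workout.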

The recommender is an internal rule system, not a published training methodology. It's calibrated to produce plans similar to standard recreational-running advice (typically 80/20 easy-to-hard, one long run per week, two quality sessions max). It won't replicate a personalized coach's plan, and it's deliberately conservative: when in doubt between an easy day and a workout day, it picks easy.

References

[1] Daniels, J. (2014). Daniels' Running Formula (3rd ed.). Human Kinetics. Source for VDOT and the pace–VO2 relationship.
[2] Riegel, P. S. (1981). Athletic records and human endurance. American Scientist, 69(3), 285–290.
[3] Vickers, A. J., & Vertosick, E. A. (2016). An empirical study of race times in recreational endurance runners. BMC Sports Science, Medicine and Rehabilitation, 8(1), 26.
[4] Gabbett, T. J. (2016). The training-injury prevention paradox: should athletes be training smarter and harder? British Journal of Sports Medicine, 50(5), 273–280.
[5] American College of Sports Medicine (2021). ACSM's Guidelines for Exercise Testing and Prescription (11th ed.). Wolters Kluwer. Source for VO2max population norms by age and sex.
[6] Banister, E. W., & Calvert, T. W. (1980). Planning for future performance: implications for long term training. Canadian Journal of Applied Sport Sciences, 5(3), 170–176. Origin of the impulse–response model that became TRIMP/CTL/ATL.
[7] Impellizzeri, F. M., et al. (2020). The acute-chronic workload ratio: an inaccurate scaling index. International Journal of Sports Physiology and Performance, 15(2), 268–270. Counterpoint to the original ACWR work, useful context for interpreting Load ratio.
[8] Ely, M. R., et al. (2007). Impact of weather on marathon-running performance. Medicine & Science in Sports & Exercise, 39(3), 487–493. Source for the temperature-pace decrement curve used in HAP.
