Philosophy
ELO vs OPR: Why We Need Both
A Brief History of ELO
The ELO rating system was developed by physicist Arpad Elo in the 1960s for chess. Today it's used across competitive domains: FIFA world rankings, League of Legends matchmaking, FiveThirtyEight's NFL predictions, and professional esports. The system's power lies in its ability to predict outcomes and adapt based on results.
Why Not Just Use OPR?
If OPR estimates a team's scoring contribution, why can't we just add up OPRs to predict winners?
The Problem: Close Matches
Consider these two outcomes:
| Match | Red Score | Blue Score | Result |
|---|---|---|---|
| Match A | 200 | 198 | Red Wins |
| Match B | 50 | 48 | Red Wins |
OPR sees these as completely different matches (200 vs 50 points). But for winning, they're equally valuable - a 2-point victory either way. ELO captures this: both red alliances get similar rating boosts because both achieved the outcome that matters.
When Each Metric Shines
📊 Use OPR For:
- Predicting expected scores
- Evaluating robot hardware capability
- Alliance selection scouting
- Identifying high-scoring partners
🎯 Use ELO For:
- Predicting match winners
- Measuring competitive success rate
- Bracket placement and seeding
- Cross-regional ranking
💡 Key Insight: A team scoring 150 points per match (high OPR) but consistently losing 180-150 will have lower ELO than a team scoring 120 but winning 120-110. OPR says the first team has a better robot; ELO says the second team wins more often. Both are true - they measure different things.
Core Metric
Normalized cELO: The Best of Both Worlds
Normalized Cumulative ELO (cELO) combines competitive success with absolute performance, adjusted for regional strength and meta evolution. It's our most comprehensive single metric for ranking teams globally.
The Three-Level System
- Event ELO: Isolated rating from a single event's matches
- cELO (Cumulative ELO): Running total across all matches, exponentially weighted toward recent performance
- Normalized cELO: cELO adjusted for regional strength and blended with cOPR-based absolute performance
Recency Weighting
Teams improve throughout the season. To reflect current skill rather than historical averages, we apply exponential decay weighting to match importance:
$$ w(t) = e^{-\lambda \cdot \Delta t} $$
Where Δt is days since the match and λ is the decay parameter. Recent matches contribute significantly more than older ones.
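A minimal sketch of this weighting in Python; the decay rate `LAMBDA` is an illustrative placeholder, not the production value:

```python
import numpy as np

# Sketch of exponential recency weighting. LAMBDA (decay per day) is an
# illustrative placeholder, not the production value.
LAMBDA = 0.02

def recency_weights(days_since_match: np.ndarray) -> np.ndarray:
    """w(t) = exp(-lambda * delta_t): newer matches weigh more."""
    return np.exp(-LAMBDA * days_since_match)

# Matches played today, 30 days ago, and 90 days ago:
print(recency_weights(np.array([0.0, 30.0, 90.0])))  # -> [1.00, 0.55, 0.17]
```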
The Regional Normalization Challenge
Consider two teams with identical 1600 cELO ratings:
- Team A: Dominates weak region (15-0 record, avg opponent ELO: 1200)
- Team B: Competes in elite region (8-7 record, avg opponent ELO: 1800)
Which team is truly stronger? Raw ELO can't distinguish between "big fish in small pond" and "contender among elites."
Hybrid Normalization Formula
Our normalization blends two components to create a globally fair rating: a competitive term (cELO adjusted for average opponent strength) and an absolute-performance term derived from cOPR.
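As a sketch of the shape this blend can take, where the blend weight α, opponent-strength coefficient k, and cOPR scale s are illustrative placeholders rather than the production constants:

$$ \text{Normalized cELO} = \alpha \left[ \text{cELO} + k \left( \overline{\text{ELO}}_{\text{opp}} - \overline{\text{ELO}}_{\text{global}} \right) \right] + (1 - \alpha) \cdot s \cdot \text{cOPR} $$

The bracketed term rewards teams whose ratings were earned against stronger-than-average opposition; the cOPR term anchors the rating to absolute scoring output, so an undefeated record against weak fields can't inflate it.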
Evolution Scaling
To prevent artificial ceilings and account for meta evolution, the entire ELO scale adjusts proportionally to global scoring trends:
$$ \text{Evolution Factor} = \frac{\text{Current Season Global Mean cOPR}}{\text{Baseline Season cOPR}} $$
As teams collectively improve and raise the scoring ceiling, the ELO scale naturally inflates to match. A world-class team today might rate 2200, but if the meta doubles scoring capability in future seasons, world-class teams would rate ~4400.
Example: Cross-Regional Comparison
| Team | Region | Record | cOPR | Raw cELO | Normalized cELO |
|---|---|---|---|---|---|
| Elite Team | Strong | 12-3 | 140 | 1750 | 2178 |
| Stat Padder | Weak | 15-0 | 42 | 1800 | 1652 |
Despite the undefeated record, the stat padder's low cOPR reveals they're crushing weak opponents. Normalized cELO properly ranks the elite team higher for global comparison.
Use Cases
- Cross-regional team comparisons and world rankings
- Championship seeding and advancement predictions
- Identifying underrated teams from highly competitive regions
- Multi-season historical comparisons despite meta evolution
Performance
Cumulative Offensive Power Rating (cOPR)
While ELO measures ability to win, cOPR measures ability to score points. It isolates an individual team's contribution to alliance scores, with exponentially higher weight given to recent events.
The Alliance Score Problem
FTC matches are 2v2, but we only observe total alliance scores. If Red Alliance (Teams 123 + 456) scores 180 points, how much did each team contribute individually?
Linear System Solution
We model alliance scores as a linear system across many matches:
$$ \text{cOPR}_{\text{Team}_1} + \text{cOPR}_{\text{Team}_2} \approx \text{Alliance Score} $$
Over an event with N teams and M matches, this yields an overdetermined system \( Ax = b \) with one equation per alliance score (2M equations, N unknowns), solved using Weighted Least Squares regression.
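A minimal sketch of that solve, assuming a hypothetical `matches` input of (alliance teams, alliance score, recency weight) tuples; a production pipeline would also add regularization and ingest both alliances of every match:

```python
import numpy as np

def solve_copr(matches, team_ids):
    idx = {t: i for i, t in enumerate(team_ids)}
    A = np.zeros((len(matches), len(team_ids)))  # design matrix: 1 if team played
    b = np.zeros(len(matches))                   # observed alliance scores
    w = np.zeros(len(matches))                   # recency weights
    for row, (alliance, score, weight) in enumerate(matches):
        for team in alliance:
            A[row, idx[team]] = 1.0
        b[row] = score
        w[row] = weight
    # Weighted least squares: scale each equation by sqrt(weight),
    # then solve the overdetermined system Ax ≈ b.
    sw = np.sqrt(w)
    x, *_ = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)
    return dict(zip(team_ids, x))

matches = [((123, 456), 180, 1.0), ((123, 789), 150, 0.8), ((456, 789), 120, 0.6)]
print(solve_copr(matches, [123, 456, 789]))  # -> {123: 105.0, 456: 75.0, 789: 45.0}
```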
Time-Weighted Recency
Teams improve throughout the season. To emphasize current performance:
- Most recent event: Full weight
- Previous event: Reduced weight (exponential decay)
- Older events: Progressively less influence
This makes cOPR more predictive of current capability than a simple average across all events.
💡 Why Weighted? A team that scored 40 OPR at their first event but now scores 100 OPR should be rated closer to 100, not 70 (the average).
Trend
Momentum
Momentum quantifies the rate of improvement over time. It answers: "Is this team getting better, staying stable, or declining?"
Methodology
We perform Weighted Least Squares regression on match scores over time, with higher weights on recent matches. The slope of the fitted line represents points-per-match improvement rate.
$$ \text{Score}(t) = \beta_0 + \beta_1 \cdot t + \epsilon $$
Where β₁ (the slope) indicates improvement direction:
- Positive slope: Improving performance
- Near-zero slope: Stable performance
- Negative slope: Declining performance
The raw slope is normalized to a 0-100 scale for interpretability.
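A minimal sketch of the slope fit; the decay weights and the example numbers are illustrative:

```python
import numpy as np

# Sketch of the momentum slope: weighted least squares on (match index,
# score) pairs, with heavier weights on recent matches.
def momentum_slope(scores: np.ndarray, weights: np.ndarray) -> float:
    t = np.arange(len(scores), dtype=float)
    # Fit Score(t) = b0 + b1*t by scaling each point by sqrt(weight).
    X = np.column_stack([np.ones_like(t), t])
    sw = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], scores * sw, rcond=None)
    return beta[1]  # b1: points-per-match improvement rate

scores = np.array([60.0, 72.0, 80.0, 95.0])
weights = np.exp(-0.1 * np.arange(len(scores))[::-1])  # newest match weighs most
print(momentum_slope(scores, weights))  # positive slope -> improving
```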
Reliability
Consistency Index
Consistency measures how reliably a team performs near their average. High consistency means few "bad matches," while low consistency indicates volatility.
Mathematical Foundation
Based on the Coefficient of Variation (CV):
$$ CV = \frac{\sigma}{\mu} $$
Where σ is standard deviation and μ is mean score. We invert and scale this to 0-100, where CV = 0 (perfect consistency) maps to 100.
💡 Why It Matters: A team with 100 ± 50 point variance is riskier for eliminations than a team scoring 90 ± 10, even if their averages are similar.
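A minimal sketch of the index, assuming a simple linear inversion of CV; the exact production scaling may differ:

```python
import numpy as np

# Sketch of the Consistency Index: CV = 0 maps to 100, larger CVs decay
# toward 0, clipped so the index stays in range. The linear inversion is
# an assumption about the scaling.
def consistency_index(scores: np.ndarray) -> float:
    cv = np.std(scores) / np.mean(scores)  # coefficient of variation
    return float(np.clip(100.0 * (1.0 - cv), 0.0, 100.0))

print(consistency_index(np.array([90, 95, 85, 92])))    # tight spread -> ~96
print(consistency_index(np.array([50, 150, 100, 40])))  # volatile -> ~48
```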
Penalties
Foul cOPR
Foul cOPR estimates the average penalty points a team gives to opponents per match. Like scoring OPR, penalties are reported per alliance, so we use the same linear system approach to isolate individual responsibility.
Lower Foul cOPR is better. A team with 5.0 Foul cOPR contributes ~5 penalty points to opponents per match on average.
Time-Weighted Evolution
Foul cOPR uses the same recency weighting as scoring cOPR. Teams that clean up their driving or fix problematic mechanisms will see rapid improvement in this metric.
Ranking Points
RP Reliability
Ranking Points (Movement, Goal, Pattern) determine tournament seeding. RP Reliability estimates the probability of earning each RP type in the next match.
Bayesian Inference with Recency
We blend three statistical approaches:
- Historical Success Rate: Long-term track record
- Recency Weighting: Recent matches weighted exponentially higher
- Bayesian Smoothing: Prevents overfitting to small samples (e.g., 100% from 1 match becomes ~75% after smoothing)
$$ P(\text{RP}) = \frac{\sum_{i} w_i \cdot \text{Success}_i + \text{Prior Successes}}{\sum_{i} w_i + \text{Prior Trials}} $$
Where wᵢ are recency weights. This produces robust probabilities that adapt quickly to new strategies without overreacting to outliers.
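A minimal sketch of this estimator; the prior (0.5 pseudo-successes in 1 pseudo-trial) is an assumption chosen to reproduce the ~75% example above, and the decay rate is a placeholder:

```python
import numpy as np

# Sketch of the smoothed RP probability: recency-weighted successes plus a
# Bayesian prior. Prior and decay values are illustrative assumptions.
def rp_probability(successes: np.ndarray, days_ago: np.ndarray,
                   prior_successes: float = 0.5, prior_trials: float = 1.0,
                   lam: float = 0.02) -> float:
    w = np.exp(-lam * days_ago)  # recency weights w_i
    return float((np.sum(w * successes) + prior_successes)
                 / (np.sum(w) + prior_trials))

# One success from a single match: smoothing pulls 100% toward the prior.
print(rp_probability(np.array([1.0]), np.array([0.0])))  # -> 0.75
```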
Predictions
Match Win Probability
Given two alliances, what's the probability each alliance wins?
ELO-Based Probability
The probability Alliance A defeats Alliance B follows a logistic curve:
$$ P(A \text{ wins}) = \frac{1}{1 + 10^{(R_B - R_A) / D}} $$
Where \( R_A \) and \( R_B \) are alliance ratings (sum of both teams' Normalized cELOs) and \( D \) is a scaling constant.
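A minimal sketch of the logistic curve; D = 400, the classic chess Elo constant, is an assumed placeholder for the actual scaling constant:

```python
# Sketch of the logistic win probability. D = 400 (the classic chess Elo
# constant) is an assumed placeholder for the actual scaling constant.
def win_probability(rating_a: float, rating_b: float, d: float = 400.0) -> float:
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / d))

# Alliance rating = sum of both teams' Normalized cELOs (illustrative numbers).
red, blue = 1800 + 1650, 1700 + 1720
print(win_probability(red, blue))  # -> ~0.54: red is a slight favorite
```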
Score Prediction Enhancement
We also estimate expected scores using cOPR and Foul cOPR:
$$ \text{Expected Score}_A = \sum \text{cOPR}_{A} + \sum \text{Foul cOPR}_{B} $$
Alliance A's expected score equals their teams' combined scoring ability plus penalties they'll draw from Alliance B.
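A minimal sketch of the score model with illustrative numbers:

```python
# Sketch of the score model: an alliance's expected score is its teams'
# combined cOPR plus the penalty points the opposing alliance typically
# concedes (their summed Foul cOPR). All numbers are illustrative.
def expected_score(alliance_coprs, opponent_foul_coprs):
    return sum(alliance_coprs) + sum(opponent_foul_coprs)

red_score = expected_score([80.0, 65.0], [4.0, 2.5])   # red cOPRs + blue fouls
blue_score = expected_score([70.0, 72.0], [3.0, 1.0])  # blue cOPRs + red fouls
print(red_score, blue_score)  # 151.5 vs 146.0 -> score model favors red
```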
💡 Two Models, One Prediction: If ELO predicts Red wins but score prediction favors Blue, we flag this as a high-uncertainty match requiring further analysis.