Bayesian Shrinkage
Overview
Bayesian Shrinkage is a statistical technique used to improve the reliability of player scores, particularly for players with limited tournament data. It prevents players with only a few exceptional performances from being ranked disproportionately high compared to players with consistent tournament participation.
The Problem: Low Sample Size Variance
When players participate in very few tournaments, their average scores can be misleading:
- A player who plays 2 tournaments and scores very well might appear elite
- A player who plays 20 tournaments with the same average is clearly more proven
- The 2-tournament player might have gotten lucky or faced weaker competition
Example Scenario
| Player | Tournaments | Average Score | True Skill? |
|---|---|---|---|
| Player A | 2 | 95.0 | Uncertain |
| Player B | 18 | 87.5 | Well-established |
| Player C | 5 | 82.0 | Somewhat reliable |
Without shrinkage, Player A would rank above Player B, even though Player A's score is based on far less data.
The Solution: Bayesian Shrinkage
Bayesian shrinkage pulls low-tournament players toward the regional mean, reducing the impact of small sample sizes while preserving the ranking signal for high-tournament players.
What is the Regional Mean?
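The regional mean is the average performance score of all players in a specific region (e.g., NAE, EU, NAW) for a given season. It's calculated as:
$$\text{Regional Mean} = \frac{\text{Sum of player scores in the region}}{\text{Number of players in the region}}$$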
Why it matters:
- Baseline comparison: Provides a reference point for what's "average" performance in that region
- Regional context: Accounts for differences in skill levels between regions (some regions may be more competitive)
- Stability point: Acts as an anchor that prevents extreme scores from dominating rankings
Example:
- NAE regional mean might be 75.0 points (highly competitive)
- EU regional mean might be 68.5 points (different competitive environment)
- A player with 2 tournaments scoring 95 in NAE gets shrunk toward 75.0
- The same player scoring 95 in EU gets shrunk toward 68.5
The regional mean ensures that a player who plays few tournaments is evaluated relative to their specific competitive environment, not against some arbitrary standard.
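As a minimal sketch of the per-region average described above, the snippet below groups scores by region and takes the arithmetic mean of each group. The `regional_means` helper and the four-player `season` list are hypothetical and chosen only so the results match the example means of 75.0 and 68.5.

```python
from collections import defaultdict
from statistics import mean


def regional_means(players):
    """Return {region: mean score} for an iterable of (region, score) pairs."""
    scores_by_region = defaultdict(list)
    for region, score in players:
        scores_by_region[region].append(score)
    return {region: mean(scores) for region, scores in scores_by_region.items()}


# Hypothetical season data: one (region, performance score) pair per player.
season = [("NAE", 80.0), ("NAE", 70.0), ("EU", 65.0), ("EU", 72.0)]
means = regional_means(season)
print(means)  # {'NAE': 75.0, 'EU': 68.5}
```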
Mathematical Formula
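$$\text{Shrunk Score} = \frac{N}{N + P} \times \text{Raw Score} + \frac{P}{N + P} \times \text{Regional Mean}$$
Where:
- $N$ = Number of tournaments played by the player
- $P$ = Bayesian prior (number of default tournaments assumed at the regional mean, typically 6-18 based on season length)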
How It Works
Step 1: Calculate Shrinkage Weight
The weight, $N / (N + P)$, determines how much we trust the player's raw score versus the regional mean:
| Tournaments (N) | Prior (P=6) | Shrinkage Weight | Interpretation |
|---|---|---|---|
| 3 | 6 | 0.33 | Strong shrinkage toward mean |
| 6 | 6 | 0.50 | Balanced between raw and mean |
| 12 | 6 | 0.67 | Moderate trust in raw score |
| 18 | 6 | 0.75 | Strong trust in raw score |
| 30 | 6 | 0.83 | Very high trust in raw score |
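The weights in the table can be reproduced with a few lines of Python. This is a sketch that assumes the weight is N / (N + P) with P = 6, as the table values imply; the function name `shrinkage_weight` is illustrative.

```python
def shrinkage_weight(tournaments: int, prior: float = 6) -> float:
    """Fraction of trust placed in the raw score: N / (N + P)."""
    if tournaments < 0 or prior <= 0:
        raise ValueError("tournaments must be >= 0 and prior must be > 0")
    return tournaments / (tournaments + prior)


# Tournament counts taken from the table above.
for n in (3, 6, 12, 18, 30):
    print(f"N={n:>2}  weight={shrinkage_weight(n):.2f}")
```

The weight rises quickly at first and then flattens: going from 3 to 6 tournaments adds about 0.17, while going from 18 to 30 adds only about 0.08.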
Step 2: Apply Regional Mean
The formula blends the player's raw score with the regional mean:
Shrunk Score = (Weight × Raw Score) + ((1 - Weight) × Regional Mean)
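The blend can be sketched as a single function. The name `shrink_score` and its parameter names are illustrative, and the default prior of 6 is an assumption taken from the weight table. The two calls reuse the earlier scenario of a 2-tournament player scoring 95 in NAE (mean 75.0) and in EU (mean 68.5).

```python
def shrink_score(raw_score: float, tournaments: int,
                 regional_mean: float, prior: float = 6) -> float:
    """Blend a raw score with the regional mean using weight N / (N + P)."""
    weight = tournaments / (tournaments + prior)
    return weight * raw_score + (1 - weight) * regional_mean


# Same raw score and tournament count, different regional baselines.
nae = shrink_score(95.0, tournaments=2, regional_mean=75.0)
eu = shrink_score(95.0, tournaments=2, regional_mean=68.5)
print(nae, eu)  # 80.0 75.125
```

With only 2 tournaments the weight is 0.25, so three quarters of each result comes from the region's mean rather than from the player's own score.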
Step 3: Dynamic Prior Calculation
The Bayesian prior is calculated from the season's total number of windows and stays between 6 and 18. This means:
- Short seasons (fewer windows): Prior = 6
- Long seasons (many windows): Prior = up to 18
- Adaptive to competitive season length
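One way such a bounded prior could be computed is sketched below. Only the bounds of 6 and 18 come from the text. The linear scaling and the `prior_per_window` factor are hypothetical placeholders, not the actual rule, which is not given here.

```python
MIN_PRIOR = 6   # from the text: prior used for short seasons
MAX_PRIOR = 18  # from the text: upper bound for long seasons


def dynamic_prior(total_windows: int, prior_per_window: float = 0.5) -> float:
    """Scale the prior with season length, clamped to [MIN_PRIOR, MAX_PRIOR].

    prior_per_window is a hypothetical scaling factor used for illustration.
    """
    scaled = total_windows * prior_per_window
    return max(MIN_PRIOR, min(MAX_PRIOR, scaled))


print(dynamic_prior(4))    # short season, clamped up to 6
print(dynamic_prior(20))   # mid-length season under the assumed scaling
print(dynamic_prior(100))  # long season, clamped down to 18
```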
Visual Example
Consider a regional mean of 75.0 points:
| Player | Raw Score | Tournaments | Shrinkage Weight | Shrunk Score | Change |
|---|---|---|---|---|---|
| Elite A | 120.0 | 25 | 0.81 | 111.3 | -7.3% |
| Pro B | 95.0 | 12 | 0.67 | 88.3 | -7.0% |
| Casual C | 110.0 | 3 | 0.33 | 86.7 | -21.2% |
| Lucky D | 130.0 | 2 | 0.25 | 88.8 | -31.7% |
Notice how:
- Elite A (25 tournaments): Minimal shrinkage, score remains elite
- Lucky D (2 tournaments): Heavy shrinkage, suspiciously high score normalized
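The effect on ordering is easiest to see by ranking the four example players twice, once by raw score and once by shrunk score. The sketch below assumes a prior of 6 and the regional mean of 75.0 used in this example; the `shrink` helper is illustrative.

```python
REGIONAL_MEAN = 75.0
PRIOR = 6  # assumed prior, matching the weights shown in the table

# name -> (raw score, tournaments played), taken from the table above
players = {
    "Elite A": (120.0, 25),
    "Pro B": (95.0, 12),
    "Casual C": (110.0, 3),
    "Lucky D": (130.0, 2),
}


def shrink(raw: float, n: int) -> float:
    """Blend the raw score with the regional mean using weight n / (n + PRIOR)."""
    weight = n / (n + PRIOR)
    return weight * raw + (1 - weight) * REGIONAL_MEAN


shrunk = {name: shrink(raw, n) for name, (raw, n) in players.items()}
by_raw = sorted(players, key=lambda name: players[name][0], reverse=True)
by_shrunk = sorted(shrunk, key=shrunk.get, reverse=True)

print(by_raw)     # ['Lucky D', 'Elite A', 'Casual C', 'Pro B']
print(by_shrunk)  # ['Elite A', 'Lucky D', 'Pro B', 'Casual C']
```

Lucky D drops from first to second, and Pro B overtakes Casual C once tournament count is taken into account.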
Benefits
1. Statistical Reliability
- Reduces variance from small sample sizes
- Produces more stable rankings week-to-week
2. Fairness
- Rewards consistent participation
- Prevents "lucky streak" players from dominating
3. Predictive Accuracy
- Shrunk scores better predict future performance
- Less overfitting to outlier results
4. Regional Normalization
- All players are measured against their region's baseline
- Accounts for regional skill distribution differences
Mathematical Properties
Convergence
As tournament count increases, shrunk score converges to raw score:
$$\lim_{N \to \infty} \text{Shrunk Score} = \text{Raw Score}$$
Bounded Effect
The shrinkage effect is always bounded:
- Maximum shrinkage: When $N = 0$, the score equals the regional mean
- Minimum shrinkage: As $N \to \infty$, the score approaches the raw score
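Both bounds follow from rewriting the shrunk score as an offset from the raw score. In this sketch, $S$, $R$, and $\mu$ are shorthand introduced here for the shrunk score, raw score, and regional mean; $N$ is the tournament count and $P$ the prior.

$$S - R = \frac{N}{N + P} R + \frac{P}{N + P} \mu - R = -\frac{P}{N + P} (R - \mu)$$

The factor $P / (N + P)$ equals 1 at $N = 0$ and falls toward 0 as $N$ grows. Assuming $P > 0$, the shrunk score therefore always lies between the regional mean and the raw score.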
Preservation of Rankings
For players in the same region with the same tournament count, relative rankings are preserved (both scores are shrunk toward the same mean by the same weight).
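A short derivation makes this concrete. Here $S_a, S_b$ and $R_a, R_b$ are shorthand introduced for the shrunk and raw scores of two players $a$ and $b$ who share the same region, the same tournament count $N$, and therefore the same weight $w$.

$$S_a - S_b = w (R_a - R_b), \qquad w = \frac{N}{N + P}$$

The regional mean cancels in the subtraction. Since $w > 0$ whenever $N \geq 1$, the difference in shrunk scores has the same sign as the difference in raw scores. Players with different tournament counts have different weights, so their order can change.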
Relationship to Log Volume Confidence
Bayesian shrinkage works in tandem with Log Volume Confidence:
- Shrinkage adjusts the score magnitude based on reliability
- Confidence further scales the final score based on participation volume
This two-step process ensures both statistical reliability (through shrinkage) and participation rewards (through confidence).
Real-World Impact
In practice, Bayesian shrinkage:
- Prevents new players with 1-2 good tournaments from appearing in S-Tier
- Stabilizes tier assignments for players near tier boundaries
- Makes rankings more defensible and statistically sound
- Improves the correlation between tier assignment and true skill level