The Effective Sample Size
5 days ago
- #Covariate Shift
- #Effective Sample Size
- #Reweighting
- Reweighting data to correct covariate shift reduces bias but increases variance, especially when weights are uneven, making estimates sensitive to errors in heavily weighted points.
- Effective sample size (n_eff), defined as 1 / (sum of squared weights) once the weights are normalized to sum to 1, estimates how many observations contribute meaningfully after reweighting; equal weights yield n_eff = n, while weight concentrated on a single point drives n_eff toward 1.
- The effective sample size arises from variance calculations and from concentration inequalities like Hoeffding's: a weighted average of independent variables with common variance σ² (e.g., Normal variables) and normalized weights has variance σ² / n_eff, so reweighted sums behave like unweighted averages over n_eff samples.
- In applications like off-policy reinforcement learning (e.g., replay buffers), n_eff acts as a diagnostic for data usability, indicating when stored data becomes too stale relative to the current policy.
- Effective sample size is used in methods like Sequential Monte Carlo to trigger resampling, ensuring particles remain evenly weighted for accurate filtering.
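
The definition above is short enough to sketch directly. The following is a minimal illustration in plain Python, not code from the post: the function names `effective_sample_size` and `needs_resampling`, and the default threshold of half the sample count, are assumptions chosen for this example. The threshold rule stands in for the kind of resampling trigger used in Sequential Monte Carlo; the right cutoff depends on the application.

```python
def effective_sample_size(weights):
    """Return n_eff = 1 / sum(w_i^2), after normalizing weights to sum to 1.

    `weights` may be unnormalized (e.g. raw importance ratios), but must be
    non-negative with a positive total.
    """
    weights = [float(w) for w in weights]
    if not weights:
        raise ValueError("weights must be non-empty")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # Normalize, then invert the sum of squares.
    return 1.0 / sum((w / total) ** 2 for w in weights)


def needs_resampling(weights, threshold=0.5):
    """Hypothetical trigger: flag when n_eff drops below threshold * n.

    The 0.5 default is an assumption for illustration, not a value taken
    from the post.
    """
    return effective_sample_size(weights) < threshold * len(weights)


# Equal weights: every observation counts, so n_eff equals n.
print(effective_sample_size([1, 1, 1, 1]))  # 4.0

# All weight on one point: only one observation effectively remains.
print(effective_sample_size([1, 0, 0, 0]))  # 1.0

# One dominant weight among four points pushes n_eff close to 1.
print(needs_resampling([10, 0.1, 0.1, 0.1]))  # True
```

The same check could serve as the replay-buffer diagnostic mentioned above, with the weights taken to be importance ratios between the current policy and the policy that generated the stored data.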