Each point = one EXP. X = local accuracy (binary mean from this EDA's scoring).
Y = leaderboard accuracy (manually entered below or in artifacts/eda/lb_scores.json).
Diagonal = perfect agreement; below = LB worse than local; above = LB better.
Hover for run_id. Click to open that EXP.
Color: v3 paired runs · Copilot Public-50 EXPs · grey = LB missing
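As a rough illustration of how to read a point against the diagonal, the sketch below labels each run by the sign of LB minus local accuracy. The run IDs, values, and the `classify_point` helper are hypothetical, made up for this example, and are not taken from the dashboard's own code.

```python
def classify_point(local_acc, lb_acc, tol=1e-9):
    """Return where a point falls relative to the y = x diagonal."""
    if lb_acc is None:
        return "missing"  # plotted grey: no LB score entered yet
    gap = lb_acc - local_acc
    if abs(gap) <= tol:
        return "on diagonal"  # perfect agreement
    # Positive gap: LB better than local; negative gap: LB worse.
    return "above" if gap > 0 else "below"


# Hypothetical runs, for illustration only.
runs = [
    {"run_id": "exp_a", "local_acc": 0.80, "lb_acc": 0.74},
    {"run_id": "exp_b", "local_acc": 0.62, "lb_acc": 0.66},
    {"run_id": "exp_c", "local_acc": 0.70, "lb_acc": None},
]

positions = {r["run_id"]: classify_point(r["local_acc"], r["lb_acc"]) for r in runs}
print(positions)
```

Points labelled "below" are the ones where local scoring is optimistic relative to the leaderboard.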
Edit values inline. Edits are saved to the browser (localStorage) and reflected in the plot above.
Click Download lb_scores.json to get the merged file, then commit it to artifacts/eda/lb_scores.json so the values persist beyond this browser.
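The merge behind the download can be pictured roughly as follows. This is a minimal sketch under two loud assumptions: that lb_scores.json is a flat mapping of run_id to LB accuracy, and that inline edits override committed values. Neither the schema nor the precedence rule is confirmed by this page, and `merge_lb_scores` is a hypothetical name.

```python
import json


def merge_lb_scores(committed, edits):
    """Overlay inline edits on committed scores (assumed rule: edits win)."""
    merged = dict(committed)
    for run_id, value in edits.items():
        if value is None:
            continue  # blank cell: keep whatever was committed
        merged[run_id] = value
    return merged


# Hypothetical contents: a flat {run_id: lb_acc} mapping is an assumption.
committed = {"exp_a": 0.74}
edits = {"exp_a": 0.75, "exp_b": 0.66, "exp_c": None}

merged = merge_lb_scores(committed, edits)
merged_json = json.dumps(merged, indent=2, sort_keys=True)
print(merged_json)
```

Sorting keys on output keeps the committed file's diffs small when only a few scores change.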
| run_id | label | dataset | local acc | LB acc | note |
|---|---|---|---|---|---|