country (string) | date (string) | anomaly_rate (float64) | measurement_count (int64) | spike_magnitude (float64) | label (int64) | event (string) | confidence (float64) |
|---|---|---|---|---|---|---|---|
AE | 2018-01-03 | 0.358209 | 67 | 2.259626 | 0 | null | 0.7 |
AE | 2018-11-09 | 0.303797 | 79 | 1.724287 | 0 | null | 0.7 |
AE | 2019-04-21 | 0.294118 | 51 | 1.62905 | 0 | null | 0.7 |
AE | 2019-05-23 | 0.315789 | 57 | 1.842273 | 0 | null | 0.7 |
AE | 2019-09-14 | 0.353846 | 65 | 2.216702 | 0 | null | 0.7 |
AE | 2021-01-25 | 0.31746 | 126 | 1.858712 | 0 | null | 0.7 |
AE | 2021-03-17 | 0.293103 | 58 | 1.619071 | 0 | null | 0.7 |
AE | 2021-10-06 | 0.352941 | 51 | 2.207798 | 0 | null | 0.7 |
AE | 2022-04-18 | 0.285714 | 224 | 1.546371 | 0 | null | 0.7 |
AE | 2022-05-01 | 0.373333 | 75 | 2.408431 | 0 | null | 0.7 |
AE | 2022-08-08 | 0.373984 | 123 | 2.41483 | 0 | null | 0.7 |
AE | 2022-09-26 | 0.412261 | 2,610 | 2.791424 | 0 | null | 0.7 |
AE | 2022-10-26 | 0.294927 | 1,163 | 1.637012 | 0 | null | 0.7 |
AE | 2023-01-11 | 0.351542 | 1,135 | 2.19403 | 0 | null | 0.7 |
AE | 2023-01-25 | 0.322388 | 335 | 1.907194 | 0 | null | 0.7 |
AE | 2023-02-05 | 0.346405 | 612 | 2.143493 | 0 | null | 0.7 |
AE | 2023-03-03 | 0.296296 | 648 | 1.650485 | 0 | null | 0.7 |
AE | 2023-05-30 | 0.412 | 500 | 2.788861 | 0 | null | 0.7 |
AE | 2023-05-31 | 0.37375 | 800 | 2.41253 | 0 | null | 0.7 |
AE | 2023-06-02 | 0.31625 | 800 | 1.846804 | 0 | null | 0.7 |
AE | 2023-06-14 | 0.285714 | 700 | 1.546371 | 0 | null | 0.7 |
AE | 2023-06-15 | 0.33125 | 800 | 1.994385 | 0 | null | 0.7 |
AE | 2023-07-31 | 0.94 | 50 | 7.983704 | 0 | null | 0.7 |
AE | 2023-08-01 | 0.945455 | 55 | 8.03737 | 0 | null | 0.7 |
AE | 2023-08-02 | 0.933333 | 60 | 7.918113 | 0 | null | 0.7 |
AE | 2023-08-09 | 0.342039 | 2,795 | 2.100538 | 0 | null | 0.7 |
AE | 2023-08-21 | 0.307619 | 2,100 | 1.761886 | 0 | null | 0.7 |
AE | 2023-08-27 | 0.320423 | 3,873 | 1.887865 | 0 | null | 0.7 |
AE | 2023-08-30 | 0.938462 | 65 | 7.968568 | 0 | null | 0.7 |
AE | 2023-08-31 | 0.86 | 50 | 7.196607 | 0 | null | 0.7 |
AE | 2023-09-04 | 0.8 | 60 | 6.606284 | 0 | null | 0.7 |
AE | 2023-09-16 | 0.317446 | 2,224 | 1.858571 | 0 | null | 0.7 |
AE | 2023-09-26 | 0.367656 | 4,749 | 2.352576 | 0 | null | 0.7 |
AE | 2023-09-27 | 0.92 | 50 | 7.78693 | 0 | null | 0.7 |
AE | 2023-10-11 | 0.289716 | 2,820 | 1.585746 | 0 | null | 0.7 |
AE | 2023-10-20 | 0.419355 | 62 | 2.861223 | 0 | null | 0.7 |
AE | 2023-10-21 | 0.381818 | 55 | 2.491911 | 0 | null | 0.7 |
AE | 2023-10-21 | 0.375 | 56 | 2.424828 | 0 | null | 0.7 |
AE | 2023-10-22 | 0.522727 | 88 | 3.878275 | 0 | null | 0.7 |
AE | 2023-10-22 | 0.318182 | 88 | 1.86581 | 0 | null | 0.7 |
AE | 2023-10-24 | 0.352941 | 51 | 2.207798 | 0 | null | 0.7 |
AE | 2023-10-24 | 0.384615 | 52 | 2.519432 | 0 | null | 0.7 |
AE | 2023-10-25 | 0.462687 | 67 | 3.287552 | 0 | null | 0.7 |
AE | 2023-10-25 | 0.470588 | 68 | 3.365294 | 0 | null | 0.7 |
AE | 2023-10-26 | 0.433333 | 60 | 2.998754 | 0 | null | 0.7 |
AE | 2023-11-15 | 0.365306 | 1,470 | 2.329453 | 0 | null | 0.7 |
AE | 2023-11-17 | 0.304682 | 1,303 | 1.732985 | 0 | null | 0.7 |
AE | 2023-11-18 | 0.361111 | 1,152 | 2.28818 | 0 | null | 0.7 |
AE | 2023-11-29 | 0.325548 | 3,511 | 1.938287 | 0 | null | 0.7 |
AE | 2023-11-29 | 0.90566 | 53 | 7.645846 | 0 | null | 0.7 |
AE | 2023-12-05 | 0.2861 | 741 | 1.550165 | 0 | null | 0.7 |
AE | 2023-12-15 | 0.901961 | 51 | 7.609447 | 0 | null | 0.7 |
AE | 2023-12-15 | 0.444444 | 54 | 3.108073 | 0 | null | 0.7 |
AE | 2023-12-16 | 0.281637 | 3,469 | 1.50626 | 0 | null | 0.7 |
AE | 2023-12-16 | 0.56 | 50 | 4.244991 | 0 | null | 0.7 |
AE | 2023-12-18 | 0.846154 | 52 | 7.060378 | 0 | null | 0.7 |
AE | 2023-12-18 | 0.37037 | 54 | 2.379279 | 0 | null | 0.7 |
AE | 2023-12-19 | 0.888889 | 54 | 7.480836 | 0 | null | 0.7 |
AE | 2023-12-19 | 0.574074 | 54 | 4.383462 | 0 | null | 0.7 |
AE | 2023-12-20 | 0.779661 | 59 | 6.406174 | 0 | null | 0.7 |
AE | 2023-12-20 | 0.807018 | 57 | 6.675327 | 0 | null | 0.7 |
AE | 2024-02-10 | 0.328687 | 2,224 | 1.969168 | 0 | null | 0.7 |
AE | 2024-02-27 | 0.282566 | 2,042 | 1.515397 | 0 | null | 0.7 |
AE | 2024-03-10 | 0.335851 | 2,516 | 2.039648 | 0 | null | 0.7 |
AE | 2024-03-15 | 0.30732 | 1,653 | 1.758944 | 0 | null | 0.7 |
AE | 2024-03-25 | 0.314286 | 1,750 | 1.827478 | 0 | null | 0.7 |
AE | 2024-03-30 | 0.288644 | 1,902 | 1.575192 | 0 | null | 0.7 |
AE | 2024-04-02 | 0.440781 | 819 | 3.072034 | 0 | null | 0.7 |
AE | 2024-04-18 | 0.33625 | 1,600 | 2.043578 | 0 | null | 0.7 |
AE | 2024-04-19 | 0.351268 | 1,301 | 2.191339 | 0 | null | 0.7 |
AE | 2024-04-27 | 0.320723 | 2,987 | 1.890814 | 0 | null | 0.7 |
AE | 2024-04-28 | 0.349125 | 1,372 | 2.170255 | 0 | null | 0.7 |
AE | 2024-05-01 | 0.360075 | 2,655 | 2.277989 | 0 | null | 0.7 |
AE | 2024-05-02 | 0.368273 | 3,549 | 2.358641 | 0 | null | 0.7 |
AE | 2024-05-07 | 0.334405 | 1,244 | 2.025427 | 0 | null | 0.7 |
AE | 2024-05-18 | 0.471698 | 53 | 3.376214 | 0 | null | 0.7 |
AE | 2024-05-18 | 0.377358 | 53 | 2.448033 | 0 | null | 0.7 |
AE | 2024-05-20 | 0.313502 | 1,933 | 1.81977 | 0 | null | 0.7 |
AE | 2024-05-21 | 0.288889 | 2,385 | 1.577606 | 0 | null | 0.7 |
AE | 2024-05-22 | 0.290245 | 1,671 | 1.590951 | 0 | null | 0.7 |
AE | 2024-05-29 | 0.381776 | 1,734 | 2.491498 | 0 | null | 0.7 |
AE | 2024-06-01 | 0.294831 | 1,896 | 1.63607 | 0 | null | 0.7 |
AE | 2024-06-04 | 0.34 | 1,400 | 2.080473 | 0 | null | 0.7 |
AE | 2024-06-12 | 0.851852 | 54 | 7.116439 | 0 | null | 0.7 |
AE | 2024-06-24 | 0.355956 | 3,551 | 2.237461 | 0 | null | 0.7 |
AE | 2024-06-25 | 0.320204 | 2,158 | 1.885705 | 0 | null | 0.7 |
AE | 2024-07-05 | 0.285375 | 4,205 | 1.543029 | 0 | null | 0.7 |
AE | 2024-07-08 | 0.627451 | 51 | 4.908623 | 0 | null | 0.7 |
AE | 2024-07-18 | 0.839286 | 56 | 6.992805 | 0 | null | 0.7 |
AE | 2024-07-31 | 0.304622 | 3,808 | 1.732398 | 0 | null | 0.7 |
AE | 2024-08-03 | 0.317988 | 2,107 | 1.8639 | 0 | null | 0.7 |
AE | 2024-08-11 | 0.404605 | 608 | 2.716106 | 0 | null | 0.7 |
AE | 2024-08-12 | 0.290105 | 1,334 | 1.58957 | 0 | null | 0.7 |
AE | 2024-08-20 | 0.287583 | 4,816 | 1.564758 | 0 | null | 0.7 |
AE | 2024-09-05 | 0.457627 | 59 | 3.237773 | 0 | null | 0.7 |
AE | 2024-09-05 | 0.5 | 62 | 3.654668 | 0 | null | 0.7 |
AE | 2025-07-19 | 0.335476 | 778 | 2.035959 | 0 | null | 0.7 |
AE | 2025-10-14 | 0.792453 | 53 | 6.532029 | 0 | null | 0.7 |
AE | 2025-11-18 | 0.538462 | 52 | 4.03308 | 0 | null | 0.7 |
AE | 2025-12-21 | 0.288462 | 52 | 1.573401 | 0 | null | 0.7 |
Voidly OONI Censorship Historical
A 10-year open archive for internet censorship research and ML.
Dataset Description
This dataset contains 10 years of global internet censorship measurements from 120+ countries:
- 1.6M+ daily measurements (2017-2026)
- 37K detected anomaly spikes
- 4.5K confirmed censorship events with labels
- 25+ known major incidents (Mahsa Amini protests, Myanmar coup, etc.)
Data Sources
- Primary: OONI (Open Observatory of Network Interference)
- Secondary: Voidly Research analysis and labeling
Files
| File | Description | Rows |
|---|---|---|
| data/ooni-historical.parquet | Daily measurements by country/test | 1.6M |
| data/censorship-incidents.parquet | Labeled anomaly spikes | 37K |
Usage
```python
from datasets import load_dataset

# Load historical measurements
ds = load_dataset(
    "emperor-mew/ooni-censorship-historical",
    data_files="data/ooni-historical.parquet",
)

# Load labeled incidents (for ML training)
incidents = load_dataset(
    "emperor-mew/ooni-censorship-historical",
    data_files="data/censorship-incidents.parquet",
)
```
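
Once the incidents are loaded, a common next step is to narrow them by label, confidence, and sample size. The sketch below is one possible way to do that, not part of the dataset tooling: `select_rows` is a hypothetical helper that accepts any iterable of row dictionaries (iterating `incidents["train"]` yields such dictionaries). The three sample rows are copied from the AE rows in the dataset preview. The minimum-measurement cutoff of 100 is an arbitrary illustration, not a recommended threshold.

```python
from typing import Any, Iterable, Iterator, Mapping

Row = Mapping[str, Any]


def select_rows(
    rows: Iterable[Row],
    label: int,
    min_confidence: float = 0.0,
    min_measurements: int = 0,
) -> Iterator[Row]:
    """Yield rows with the given label that meet both minimums."""
    for row in rows:
        if (
            row["label"] == label
            and row["confidence"] >= min_confidence
            and row["measurement_count"] >= min_measurements
        ):
            yield row


# Three rows copied from the dataset preview (AE, all labeled 0).
sample = [
    {"country": "AE", "date": "2018-01-03", "anomaly_rate": 0.358209,
     "measurement_count": 67, "label": 0, "confidence": 0.7},
    {"country": "AE", "date": "2022-09-26", "anomaly_rate": 0.412261,
     "measurement_count": 2610, "label": 0, "confidence": 0.7},
    {"country": "AE", "date": "2023-07-31", "anomaly_rate": 0.94,
     "measurement_count": 50, "label": 0, "confidence": 0.7},
]

# Keep only label-0 rows backed by at least 100 measurements.
well_sampled = list(select_rows(sample, label=0, min_measurements=100))
```

In the preview, the rows with the highest anomaly rates tend to have only 50 to 65 measurements, so filtering on `measurement_count` may be worth considering before training.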
Schema
ooni-historical
| Column | Type | Description |
|---|---|---|
| country | string | ISO 3166-1 alpha-2 country code |
| test_name | string | OONI test type (web_connectivity, telegram, whatsapp) |
| date | date | Measurement date |
| measurement_count | int | Total measurements |
| anomaly_count | int | Measurements showing anomalies |
| confirmed_count | int | Confirmed blocked |
| anomaly_rate | float | Fraction showing anomalies (0-1) |
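
The column descriptions suggest that `anomaly_rate` is `anomaly_count` divided by `measurement_count`. The card does not state this explicitly, so treat it as an assumption. A minimal sketch of a consistency check under that assumption follows. Both function names are hypothetical, and the `anomaly_count` of 24 in the example row is inferred from a preview row (rate 0.358209 over 67 measurements) rather than read from the data.

```python
from typing import Any, Mapping


def anomaly_rate(anomaly_count: int, measurement_count: int) -> float:
    """Fraction of measurements showing anomalies (assumed definition)."""
    if measurement_count <= 0:
        raise ValueError("measurement_count must be positive")
    return anomaly_count / measurement_count


def rate_is_consistent(row: Mapping[str, Any], tol: float = 1e-6) -> bool:
    """Check that the stored rate matches the rate recomputed from counts."""
    expected = anomaly_rate(row["anomaly_count"], row["measurement_count"])
    return abs(row["anomaly_rate"] - expected) <= tol


# Example row: the count of 24 is inferred, not taken from the dataset.
row = {"anomaly_count": 24, "measurement_count": 67, "anomaly_rate": 0.358209}
consistent = rate_is_consistent(row)
```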
censorship-incidents
| Column | Type | Description |
|---|---|---|
| country | string | ISO 3166-1 alpha-2 country code |
| date | string | Incident date (YYYY-MM-DD) |
| anomaly_rate | float | Measured anomaly rate |
| measurement_count | int | Sample size |
| spike_magnitude | float | Z-score above baseline |
| label | int | 1=confirmed censorship, 0=not |
| event | string | Matched known event (if any) |
| confidence | float | Label confidence (0-1) |
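
The schema describes `spike_magnitude` as a z-score above baseline but does not say how the baseline is computed. The sketch below assumes the simplest form, `(anomaly_rate - baseline_mean) / baseline_std`, with a fixed baseline. The `fit_baseline` helper is hypothetical: it fits a least-squares line through (anomaly_rate, spike_magnitude) pairs and reads off the baseline mean and standard deviation that line implies. The four pairs are copied from AE rows in the dataset preview. Whether the real baseline is per-country, global, or windowed is not stated in the card.

```python
from typing import Sequence, Tuple


def spike_magnitude(rate: float, baseline_mean: float, baseline_std: float) -> float:
    """Z-score of an anomaly rate against an assumed fixed baseline."""
    return (rate - baseline_mean) / baseline_std


def fit_baseline(pairs: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Fit z = slope * rate + intercept and return the implied (mean, std)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # z = (rate - mean) / std  =>  slope = 1 / std, intercept = -mean / std
    return -intercept / slope, 1.0 / slope


# (anomaly_rate, spike_magnitude) pairs copied from AE rows in the preview.
PREVIEW_PAIRS = [
    (0.358209, 2.259626),
    (0.303797, 1.724287),
    (0.412, 2.788861),
    (0.94, 7.983704),
]

implied_mean, implied_std = fit_baseline(PREVIEW_PAIRS)
max_error = max(
    abs(spike_magnitude(rate, implied_mean, implied_std) - z)
    for rate, z in PREVIEW_PAIRS
)
```

A small `max_error` would indicate that these rows are consistent with a single linear baseline. A large one would suggest the baseline varies over time or is computed differently.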
Known Events Covered
- Iran Mahsa Amini protests (2022)
- Myanmar military coup (2021)
- Belarus election shutdown (2020)
- Russia Ukraine invasion blocks (2022+)
- Kazakhstan January protests (2022)
- Sudan military coup (2021)
- Cuba July protests (2021)
- Uganda election shutdown (2021)
- And 17+ more
Voidly Atlas ML Stack (2026-05-21)
This historical archive is the long-horizon training substrate for the
Voidly Atlas ML stack. The production stack is documented in dedicated
HuggingFace model cards under emperor-mew:
- Classifier v3.3 (emperor-mew/voidly-classifier-v3.3): country-day censorship classifier, GradientBoosting, regime-similarity-weighted contagion features. Honest cross-country generalization: leave-one-country-out median F1 0.87, mean F1 0.71. The fitted .pkl and per-country thresholds ship in that repo.
- Multi-horizon forecast (emperor-mew/voidly-forecast-v1-multi-horizon): 1d/7d/30d XGBoost + isotonic, LOCO AUC 0.91 / 0.88 / 0.84.
- Unsupervised anomaly (emperor-mew/voidly-anomaly-dbscan-v1): CenDTect-style DBSCAN second-opinion signal.
- 12 more model cards: search emperor-mew/voidly- on the Hub.
Note on the older "F1 99.8% / AUC 1.000" claim: that figure was a stratified-random-split number on a now-superseded v2 model. It does not reflect cross-country generalization. The current honest metric is the LOCO (leave-one-country-out) F1 reported above — random splits inflate apparent accuracy because the model learns per-country base rates.
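
To make the difference from a random split concrete, here is a minimal sketch of leave-one-country-out evaluation. It is an illustration of the general technique, not the Voidly evaluation code: `loco_splits`, `f1_score`, and `summarize` are hypothetical helpers, and the toy rows use placeholder country codes (XA, XB, XC) with synthetic labels. Each split holds out every row from one country, so the model is never tested on a country it saw during training.

```python
from collections import defaultdict
from statistics import mean, median
from typing import Any, Dict, Iterable, Iterator, List, Mapping, Sequence, Tuple

Row = Mapping[str, Any]


def loco_splits(rows: Iterable[Row]) -> Iterator[Tuple[str, List[Row], List[Row]]]:
    """Yield (held_out_country, train_rows, test_rows), one split per country."""
    by_country: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        by_country[row["country"]].append(row)
    for held_out in sorted(by_country):
        train = [r for c, group in by_country.items() if c != held_out for r in group]
        yield held_out, train, by_country[held_out]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for the positive class (label 1); 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def summarize(per_country_f1: Mapping[str, float]) -> Tuple[float, float]:
    """Return (median, mean) of the per-country F1 scores."""
    scores = list(per_country_f1.values())
    return median(scores), mean(scores)


# Synthetic rows with placeholder country codes, for illustration only.
toy_rows = [
    {"country": "XA", "label": 1},
    {"country": "XA", "label": 0},
    {"country": "XB", "label": 0},
    {"country": "XC", "label": 1},
]
splits = list(loco_splits(toy_rows))
```

Reporting both the median and the mean of the per-country scores is useful because a few countries with poor scores pull the mean down while leaving the median largely unchanged.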
For a clean held-out evaluation task, use the companion benchmark
emperor-mew/voidly-bench-v1.
Citation
@dataset{voidly_ooni_historical_2026,
author = {Voidly Research},
title = {Voidly OONI Censorship Historical: 10 Years of Internet Measurement Data},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/datasets/emperor-mew/ooni-censorship-historical}
}
License
CC BY 4.0 - Attribution required
