https://github.com/vdutts7/speedtests



speedtests


crawl public speedtest result pages → ISP intelligence map


https://vdutts7.github.io/speedtests

---

map preview
screen recording

## Issue

result IDs opaque; no bulk export API

❌ one-off lookups: can't build ISP/regional aggregates from manual page visits

❌ third-party datasets: stale, licensed, or missing server/latency fields you need

❌ naive crawl: 403/rate limits on sequential IDs; gaps without checkpoint/resume

## Realization

public [speedtest.net](https://www.speedtest.net) result pages embed `window.OOKLA.INIT_DATA` JSON with **no auth**
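
a minimal sketch of pulling that blob out of a page, assuming the HTML contains an assignment like `window.OOKLA.INIT_DATA = {...};`. the exact markup is an assumption, and `extract_init_data` is a hypothetical helper, not necessarily what `sweep.py` does:

```python
import json

MARKER = "window.OOKLA.INIT_DATA"  # name from the page; the surrounding syntax is assumed


def extract_init_data(html):
    """Return the JSON object assigned to MARKER, or None if it is absent."""
    pos = html.find(MARKER)
    if pos == -1:
        return None
    brace = html.find("{", pos)  # first "{" after the marker starts the object
    if brace == -1:
        return None
    try:
        # raw_decode parses one JSON value and ignores whatever trails it
        obj, _end = json.JSONDecoder().raw_decode(html, brace)
    except json.JSONDecodeError:
        return None
    return obj


# made-up sample page; the field names are placeholders
sample = '<script>window.OOKLA.INIT_DATA = {"result": {"id": 1, "latency": 12}};</script>'
print(extract_init_data(sample))
```

`raw_decode` avoids a brittle regex: it stops at the end of the first complete JSON value, so trailing `;` or script content does not matter.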

## How it works

anchor from your own run:

1. open [speedtest.net](https://www.speedtest.net), run a test
2. result URL in the address bar: trailing digits are the ID (`/result/19360616699`)
3. pass that ID to `sweep.py --start`
4. sweep decrements by 1 each fetch (`19360616699`, `19360616698`, …)
5. each `/result/{id}` hit parses into `data/*.jsonl`
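
steps 3-5 as a minimal sketch. `sweep` and `fetch` here are hypothetical stand-ins, not the real `sweep.py` internals; `fetch` is injected so the loop runs without network access:

```python
import json
import os
import tempfile


def sweep(start, count, fetch, out_path):
    """Walk IDs downward from start, appending each hit as one JSON line."""
    hits = 0
    with open(out_path, "a", encoding="utf-8") as out:
        for result_id in range(start, start - count, -1):
            record = fetch(result_id)  # dict on a hit, None when the ID has no page
            if record is None:
                continue
            out.write(json.dumps(record) + "\n")
            hits += 1
    return hits


def fake_fetch(result_id):
    # stand-in for an HTTP fetch + parse: pretend only even IDs exist
    return {"id": result_id} if result_id % 2 == 0 else None


path = os.path.join(tempfile.mkdtemp(), "results.jsonl")
print(sweep(19360616699, 4, fake_fetch, path))  # → 2
```

opening the file in append mode means a later run adds to the same JSONL instead of overwriting it.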

```text
+-------------+     +-----------+     +--------+     +----------+
| your result | --> | sweep.py  | --> | data/  | --> | query.py |
| ID          |     | decrement |     | jsonl  |     |          |
+-------------+     +-----------+     +--------+     +----------+
```

## Setup

```bash
python3 --version # stdlib only; sweep shells out to /usr/bin/curl
```
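
a sketch of what shelling out to curl from the stdlib could look like. the flags and the `fetch_html` helper are assumptions, not the actual `sweep.py` invocation; the URL follows the `/result/{id}` paths on speedtest.net:

```python
import subprocess

CURL = "/usr/bin/curl"


def curl_command(result_id, timeout_s=15):
    """Build the argv for one result-page fetch."""
    url = f"https://www.speedtest.net/result/{result_id}"
    # -s: silent, -L: follow redirects, --max-time: overall timeout in seconds
    return [CURL, "-s", "-L", "--max-time", str(timeout_s), url]


def fetch_html(result_id):
    """Return the page body, or None if curl exits non-zero."""
    proc = subprocess.run(curl_command(result_id), capture_output=True, text=True)
    return proc.stdout if proc.returncode == 0 else None


print(curl_command(19360616699))
```

passing argv as a list (no `shell=True`) keeps the ID from ever being interpreted by a shell.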

## Run

```bash
# --start = ID from your own speedtest.net result URL
OUTPUT=data/ookla_results.jsonl python3 sweep.py --start 19360616699 --count 50000
```

```bash
# --resume = pick up an interrupted sweep instead of starting over
OUTPUT=data/ookla_results.jsonl python3 sweep.py --start 19360616699 --count 50000 --resume
```
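
one way a resume could be implemented, as a sketch: scan the existing output for the lowest ID already written and continue just below it. this assumes each JSONL record carries an `id` field and is not confirmed against `sweep.py`, which may keep a separate checkpoint file instead:

```python
import json
import os
import tempfile


def resume_start(out_path, default_start):
    """Return the ID to continue from: one below the lowest ID already saved."""
    if not os.path.exists(out_path):
        return default_start
    lowest = None
    with open(out_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                rid = json.loads(line)["id"]  # "id" is an assumed field name
            except (json.JSONDecodeError, KeyError, TypeError):
                continue  # skip a truncated last line from an interrupted run
            if lowest is None or rid < lowest:
                lowest = rid
    return default_start if lowest is None else lowest - 1


path = os.path.join(tempfile.mkdtemp(), "results.jsonl")
with open(path, "w", encoding="utf-8") as f:
    f.write('{"id": 19360616699}\n{"id": 19360616650}\n{"id": 1936')  # last line cut off
print(resume_start(path, 19360616699))  # → 19360616649
```

misses below the lowest saved hit get retried under this scheme, which costs a few extra requests but never skips an ID.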

```bash
python3 query.py --input data/ookla_results.jsonl --isp Airtel
python3 query.py --input data/ookla_results.jsonl --top 20
python3 query.py --input data/ookla_results.jsonl --csv > airtel_export.csv
```
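
the kind of per-ISP aggregate such queries produce can be sketched as below. the record fields `isp` and `latency` are hypothetical, and `top_isps` is an illustration rather than `query.py` itself:

```python
import json
from collections import defaultdict
from statistics import median


def top_isps(lines, n):
    """Group JSONL records by ISP; return (isp, count, median latency) by count."""
    latencies = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue
        rec = json.loads(line)
        latencies[rec["isp"]].append(rec["latency"])
    rows = [(isp, len(v), median(v)) for isp, v in latencies.items()]
    rows.sort(key=lambda r: (-r[1], r[0]))  # most results first, name breaks ties
    return rows[:n]


# made-up records for illustration only
sample = [
    '{"isp": "Airtel", "latency": 20}',
    '{"isp": "Airtel", "latency": 30}',
    '{"isp": "ExampleNet", "latency": 10}',
]
print(top_isps(sample, 2))
```

median is used instead of mean so a handful of extreme latency readings does not skew an ISP's row.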

## Gotchas

| symptom | fix | stability | why |
|---|---|---|---|
| 403 bursts | slow the request rate via `RATE_S` or `--rate` | intermittent | speedtest.net edge throttle |
| sparse hit rate | re-run your own test; use fresher `--start` ID | stable | not every decremented ID exists |
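
a common way to ride out 403 bursts is exponential backoff. the sketch below is an assumption about how a client might do it, not a description of `sweep.py`; `fetch_status` and the delay values are hypothetical:

```python
import time


def fetch_with_backoff(fetch_status, result_id, base_delay=1.0, max_tries=5, sleep=time.sleep):
    """Retry on 403, doubling the wait each time; return the final status code."""
    delay = base_delay
    status = None
    for attempt in range(max_tries):
        status = fetch_status(result_id)
        if status != 403:
            return status
        if attempt < max_tries - 1:
            sleep(delay)  # wait before the next attempt
            delay *= 2
    return status  # still 403 after max_tries


# simulated edge: throttles twice, then serves the page
responses = iter([403, 403, 200])
waits = []
print(fetch_with_backoff(lambda _id: next(responses), 19360616699, sleep=waits.append))  # → 200
print(waits)  # → [1.0, 2.0]
```

`sleep` is injectable so the retry schedule can be checked without actually waiting.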

## Tools Used

Python

## Contact

vd7.io  
/vdutts7