{"id":44729618,"url":"https://github.com/joenano/rpscrape","last_synced_at":"2026-02-15T18:11:12.467Z","repository":{"id":44057211,"uuid":"159706895","full_name":"joenano/rpscrape","owner":"joenano","description":"Scrape horse racing results data and racecards.","archived":false,"fork":false,"pushed_at":"2026-02-12T09:30:50.000Z","size":2666,"stargazers_count":203,"open_issues_count":3,"forks_count":79,"subscribers_count":40,"default_branch":"master","last_synced_at":"2026-02-12T12:09:46.156Z","etag":null,"topics":["horse-racing","python","scraper"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/joenano.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2018-11-29T18:01:00.000Z","updated_at":"2026-02-12T09:30:59.000Z","dependencies_parsed_at":"2025-12-06T11:06:12.185Z","dependency_job_id":null,"html_url":"https://github.com/joenano/rpscrape","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/joenano/rpscrape","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joenano%2Frpscrape","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joenano%2Frpscrape/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joenano%2Frpscrape/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joenano%2Frpscrape/manifests","owner_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/joenano","download_url":"https://codeload.github.com/joenano/rpscrape/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joenano%2Frpscrape/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29486081,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-15T15:33:17.885Z","status":"ssl_error","status_checked_at":"2026-02-15T15:32:53.698Z","response_time":118,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["horse-racing","python","scraper"],"created_at":"2026-02-15T18:11:12.034Z","updated_at":"2026-02-15T18:11:12.462Z","avatar_url":"https://github.com/joenano.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# rpscrape\n\nHorse racing data has been hoarded by a few companies, enabling them to effectively extort the public for access to any worthwhile historical amount. 
Compared to other sports where historical data is easily and freely available to use and query as you please, racing data in most countries is far harder to come by and is often only available with subscriptions to expensive software.\n\nThe aim of this tool is to provide a way of gathering large amounts of historical data at no cost.\n\n#### Table of Contents\n\n- [Requirements](#requirements)\n- [Install](#install)\n- [Examples](#examples)\n- [Scrape Racecards](#scrape-racecards)\n- [Settings](#settings)\n- [Authentication](#authentication)\n\n### Requirements\n\nYou must have Python 3.13 or greater and Git installed. You can download the latest Python release [here](https://www.python.org/downloads/). You can download Git [here](https://git-scm.com/downloads).\n\n- [curl_cffi](https://pypi.org/project/curl-cffi/)\n- [jarowinkler](https://pypi.org/project/jarowinkler/)\n- [LXML](https://lxml.de/)\n- [orjson](https://pypi.org/project/orjson/1.3.0/)\n- [python-dotenv](https://pypi.org/project/python-dotenv/)\n- [tomli](https://pypi.org/project/tomli/)\n- [TQDM](https://pypi.org/project/tqdm/)\n\nThe Python modules above are required; they can be installed using pip (_included with Python_):\n\n```\npip3 install curl_cffi jarowinkler lxml orjson python-dotenv tomli tqdm\n```\n\n### Install\n\n```\ngit clone https://github.com/joenano/rpscrape.git\n```\n\n#### Command-Line Options\n\n```\n-d, --date      Single date or date range YYYY/MM/DD-YYYY/MM/DD.\n-y, --year      Year or year range (YYYY or YYYY-YYYY).\n-r, --region    Region code (e.g., gb, ire).\n-c, --course    Numeric course code.\n-t, --type      Race type: flat or jumps.\n\n--date-file     File containing dates (one per line, YYYY/MM/DD).\n\n--clean         Clear cache and data before running request.\n\n--regions       List or search regions.\n--courses       List/search courses or list courses in a region.\n```\n\n##### Notes\n\n--date and --year are mutually exclusive.\n\nYou cannot specify both 
--region and --course at the same time.\n\nWhen scraping jumps data, the year refers to the season start. For example, the 2019 Cheltenham Festival is in the 2018-2019 season: use 2018.\n\n### Examples\n\nAll races on a specific date:\n\n```\n./rpscrape.py -d 2020/10/01\n```\n\nOnly races from Great Britain:\n\n```\n./rpscrape.py -d 2020/10/01 -r gb\n```\n\nDate range:\n\n```\n./rpscrape.py -d 2019/12/15-2019/12/18\n```\n\nFlat races in Ireland (2019):\n\n```\n./rpscrape.py -r ire -y 2019 -t flat\n```\n\nJump races at Ascot (1999–2018):\n\n```\n./rpscrape.py -c 2 -y 1999-2018 -t jumps\n```\n\n##### Date File Mode\n\nScrape using a file with dates:\n\n```\n./rpscrape.py --date-file dates.txt\n```\n\nOne date per line, format: YYYY/MM/DD.\n\n```\n2020/10/01\n2020/11/02\n2020/12/03\n```\n\n##### Searching\n\nList all regions:\n\n```\n./rpscrape.py --regions\n```\n\nSearch regions:\n\n```\n./rpscrape.py --regions gb\n```\n\nList all courses:\n\n```\n./rpscrape.py --courses\n```\n\nSearch courses:\n\n```\n./rpscrape.py --courses Ascot\n```\n\nList courses in a region:\n\n```\n./rpscrape.py --courses gb\n```\n\n##### Settings\n\nThe [user_settings.toml](https://github.com/joenano/rpscrape/blob/master/user_settings.toml) file contains the data fields that can be scraped. You can turn fields on and off by setting them to true or false. The order of fields in that file will be maintained in the output CSV. 
The [default_settings.toml](https://github.com/joenano/rpscrape/blob/master/default_settings.toml) file should not be edited; it's there as a backup and to introduce any new fields without changing user settings.\n\n## Scrape Racecards\n\nYou can scrape racecards using racecards.py, which saves a file containing a JSON object of racecard information.\n\nThere are only three parameter options: --day N and --days N, where N is a number 1-2, and --region N, where N is a region (gb, ire, etc.).\n\n##### Examples\n\nScrape today's racecards.\n\n```\n./racecards.py --day 1\n```\n\nScrape tomorrow's racecards.\n\n```\n./racecards.py --day 2\n```\n\nScrape today's and tomorrow's racecards.\n\n```\n./racecards.py --days 2\n```\n\nScrape today's and tomorrow's racecards by region.\n\n```\n./racecards.py --days 2 --region gb\n```\n\n##### Settings\n\nYou can customize which data is included in racecards using the settings file. The scraper uses `settings/user_racecard_settings.toml` if it exists, otherwise falls back to `settings/default_racecard_settings.toml`.\n\nTo customize:\n\n1. Copy `default_racecard_settings.toml` to `user_racecard_settings.toml`\n2. Edit the settings to enable/disable field groups and data collection options\n\nThe settings file lets you control:\n\n- **Data Collection**: Whether to fetch stats and profiles\n- **Field Groups**: Which groups of runner fields to include (core, basic_info, performance, jockey, trainer, etc.)\n\n#### Authentication\n\nCredentials are stored in a .env file in the root directory. 
Make sure .env is added to .gitignore.\n\n```\nEMAIL=your@email.com\nAUTH_STATE=your_auth_state\nACCESS_TOKEN=your_access_token\n```\n\nTo find your tokens, log in to the site and open the cookies section in the storage tab of your browser's developer tools.\n\nYou need the values for auth_state and the Cognito access token (not to be confused with the AccessToken cookie).\n\nThere will be multiple keys beginning with `CognitoIdentityServiceProvider`; you want the value for the one that ends with `.accessToken`. It should be directly under email if keys are sorted by name.\n\n![alt text](https://i.postimg.cc/FK41xJ3W/20260103-113009.png)\n![alt text](https://i.postimg.cc/nLJM1QBg/20260103-113046.png)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoenano%2Frpscrape","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjoenano%2Frpscrape","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoenano%2Frpscrape/lists"}