{"id":38490055,"url":"https://github.com/wasabipesto/themis","last_synced_at":"2026-01-17T05:41:33.516Z","repository":{"id":212369139,"uuid":"730360869","full_name":"wasabipesto/themis","owner":"wasabipesto","description":"How accurate are prediction markets?","archived":false,"fork":false,"pushed_at":"2025-12-14T16:51:52.000Z","size":4425,"stargazers_count":26,"open_issues_count":56,"forks_count":5,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-12-17T02:25:27.747Z","etag":null,"topics":["astro","data-visualization","prediction-markets"],"latest_commit_sha":null,"homepage":"https://brier.fyi","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wasabipesto.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"wasabipesto"}},"created_at":"2023-12-11T18:52:48.000Z","updated_at":"2025-12-02T03:28:18.000Z","dependencies_parsed_at":"2025-11-30T03:04:19.321Z","dependency_job_id":null,"html_url":"https://github.com/wasabipesto/themis","commit_stats":null,"previous_names":["wasabipesto/themis"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/wasabipesto/themis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wasabipesto%2Fthemis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wasabipesto%2Fthemis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wasabipesto%2Fthemis/releases","ma
nifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wasabipesto%2Fthemis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wasabipesto","download_url":"https://codeload.github.com/wasabipesto/themis/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wasabipesto%2Fthemis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28500797,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T04:31:57.058Z","status":"ssl_error","status_checked_at":"2026-01-17T04:31:45.816Z","response_time":85,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["astro","data-visualization","prediction-markets"],"created_at":"2026-01-17T05:41:32.862Z","updated_at":"2026-01-17T05:41:33.506Z","avatar_url":"https://github.com/wasabipesto.png","language":"Rust","funding_links":["https://github.com/sponsors/wasabipesto"],"categories":[],"sub_categories":[],"readme":"# What is this\n\nThis is Project Themis, a suite of tools which powers [Brier.fyi](https://brier.fyi/) and previously [Calibration City](https://calibration.city/). The purpose of this project is to perform useful analysis of prediction market calibration and accuracy with data from each platform's public API.\n\n# How to run this yourself\n\n## Step 0. 
Install dependencies\n\nClone this repository and enter it:\n\n```bash\ngit clone git@github.com:wasabipesto/themis.git\ncd themis\n```\n\nInstall any other dependencies:\n\n- The downloader and extractor are written in rust. To install the rust toolchain, follow the instructions [here](https://www.rust-lang.org/tools/install). You could run these utilities in Docker but that is not officially supported.\n- The website is written with Astro, which uses `node` and `npm`. You can find the official node/npm installation instructions [here](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm), run everything in Docker, or use whatever version Debian stable is shipping.\n- [Docker](https://docs.docker.com/engine/install/) and the [docker compose](https://docs.docker.com/compose/install/linux) plugin are used to run the database and its connectors. It's possible to run these without docker by installing [Postgres](https://www.postgresql.org/download/) and [PostgREST](https://docs.postgrest.org/en/stable/tutorials/tut0.html) manually.\n- For running tasks I have provided a `justfile`, which requires `just` to run. You can install that by following the instructions [here](https://just.systems/man/en/packages.html). The `justfile` is very simple, and you can just run the commands by hand if you don't want to install it.\n- The script for site deployment uses `rclone` and thus can be deployed to any target supported by that utility. You can install rclone by following the instructions [here](https://rclone.org/install/), or deploy the site some other way.\n- Some other optional utilities:\n  - There are a few Python scripts I use for development in the `scripts` folder. If you want to use these, ensure you have `python` and `uv` [installed](https://docs.astral.sh/uv/getting-started/installation/).\n  - When testing API responses I use `jq` for filtering and general formatting. 
You can get that [here](https://jqlang.org/download/).\n  - A couple scripts for debugging are written with `rust-script`. Installation instructions are [here](https://rust-script.org/#installation).\n  - Some admin tools lean on an `ollama` API endpoint for extracting keywords, generating slugs, and more. You can find installation instructions [here](https://ollama.com/download). By default it expects that the service will be started and available on localhost.\n\n## Step 1. Downloading API data to disk\n\nIn previous versions of this program, we deserialized all API responses immediately upon receiving them in order to work in a type-safe rust environment. This works great if APIs never change. Since external APIs can change unexpectedly, we have broken the download flow into two programs: a downloader and an extractor. The downloader will grab all relevant data from the platform APIs, then the extractor will deserialize that data into something we can use.\n\nBefore downloading, make sure you have enough disk space, memory, and time:\n\n- By default the download program will download from all platforms in parallel to avoid getting bottle-necked by any one platform's API rate limit. In order to do this we first download the platform's bulk list as an index and load it into memory. If you are running in the default mode, expect to use around 6 GB of memory. If you run out of memory, you can run the platforms one at a time with the `--platform` option.\n- This program will download all relevant data from each platform's API to disk. We try to avoid reading or writing any more than necessary by buffering writes and appending data where possible. Still, a large amount of disk space will be required for this data. As of February 2025 it uses around 20 GB, but this will increase over time.\n- When run the first time, this utility takes a day or so to complete. It will first download each platform's index and make a download plan. 
Then it will queue up batches of downloads that run asynchronously. If you interrupt the program or it runs into an error, simply restart it. It will look for an existing index file and attempt to resume the downloads automatically.\n\nTo run the downloader:\n\n```bash\njust download --help # for options\njust download # run with default settings\n```\n\nThe download utility is designed to be robust so you can set it and forget it. Errors are much more likely in later steps. If the downloader crashes and resuming a few minutes later does not solve the problem, please [submit an issue](https://github.com/wasabipesto/themis/issues/new). This could be caused by a major shift in a platform's API structure or rate limits.\n\nNote: Do not run multiple instances of the download program to try and make it go faster! Site-specific rate limits are baked in to stay under the rate limits and prevent overloading the servers. The data downloader queues items sequentially, so you will end up with duplicate markets while also getting yourself IP-banned.\n\n## Step 2. Setting up the database\n\nWhile the downloader is running, set up the database.\n\nFirst, we'll create our environment file and update the connection passwords.\n\n```bash\ncp template.env .env\nsed -i \"s/^POSTGRES_PASSWORD=.*/POSTGRES_PASSWORD=$(openssl rand -base64 32 | tr -d '/+=' | cut -c1-32)/\" .env\nsed -i \"s/^PGRST_JWT_SECRET=.*/PGRST_JWT_SECRET=$(openssl rand -base64 32 | tr -d '/+=' | cut -c1-32)/\" .env\n```\n\nOnce the `.env` file has been created, you can go in and edit any settings you'd like.\n\nNext, we'll generate our JWT key to authenticate to PostgREST. 
You can do this with many services, but we'll generate it with this script:\n\n```bash\nsed -i \"s/^PGRST_APIKEY=.*/PGRST_APIKEY=$(python3 scripts/generate-db-jwt.py)/\" .env\n```\n\nThat key will be valid for 30 days, to refresh it just run that line again.\n\nTo actually start our database and associated services:\n\n```bash\njust db-up\n```\n\nThis command will start the database, the REST adapter, and the backup utility. These services need to be running during the `extract` process, group building process, and site building process. When the site is deployed it reaches out to the database for a few non-critical functions.\n\nThe database will run in Postgres, which will persist data in the `postgres_data` folder. You should never need to access or edit the contents of this folder. Another container handles backups, which will be placed in the `postgres_backups` folder daily.\n\nIf you ever change a setting in the `.env` file, you can re-run `just db-up` to reload the configuration and restart containers if necessary.\n\nTo manually run a backup or get the database schema:\n\n```bash\njust db-backup # run a backup and save in the postgres_backups folder\njust db-get-schema # extract the current schema and output to stdout\n```\n\nImport the schema, roles, and some basic data with the `db-load-schema` task:\n\n```bash\njust db-load-schema # run all provided SQL files\n```\n\nReload PostgREST for it to see the new schema:\n\n```bash\ndocker kill -s SIGUSR1 postgrest # trigger a reload\ndocker restart postgrest # or restart the whole container\n```\n\nThen you can test that everything is working with curl:\n\n```bash\njust db-curl platforms\n```\n\nYou should see data for each platform, formatted for readability with `jq`.\n\nWe don't need to do this yet, but to completely stop the database and services:\n\n```bash\njust db-down\n```\n\n## Step 3. 
Importing data from the cache into the database\n\nOnce everything has been downloaded from the platform APIs, we can extract and import that data into the database.\n\nThis utility will read the data files you just downloaded and make sure every item matches our known API schemas. If anything changes on the API end, this is where you will see the errors. Please [submit an issue](https://github.com/wasabipesto/themis/issues/new) if you encounter any fatal errors in this step.\n\nRunning this program in full on default settings will take about 5 minutes and probably produce a few dozen non-fatal errors. Every platform has a couple items that are \"invalid\" in some way, and we've taken those into account when setting up our error handling.\n\nAfter a few thousand items are ready to upload, the program will send them to the database through the PostgREST endpoints. It should fail quickly if it's unable to connect to the database.\n\nEnsure the database services are running and then run:\n\n```bash\njust extract --help # for options\njust extract --schema-only # check that schemas pass\njust extract # run with default settings\n```\n\nIf you get an error like `Connection refused (os error 111)`, make sure you imported all schemas and reloaded the PostgREST configuration from the previous section.\n\nThen you can test that everything is working with curl:\n\n```bash\njust db-curl \"markets?limit=10\"\njust db-curl \"markets?select=count\"\njust db-curl \"daily_probabilities?limit=10\"\njust db-curl \"daily_probabilities?select=count\"\n```\n\nYou should see a few sample markets and data points, with total counts for each.\n\nThe extract tool is designed to be safe to run multiple times. It will only overwrite items in the market table, and it will update items if they already exist. You can even run it while the download is in progress to extract what's available.\n\n## Step 4. 
Creating and processing question groups\n\nThe heart of the site is its \"questions\", which are small groups assembled from markets that are equivalent.\n\nIdeally every platform would predict every important event with the same method and resolve with the same criteria, but they don't. Some platforms are legally unable, some have differing market mechanisms, and some just don't like predicting certain things. Our goal is to find a few markets from different platforms that are similar enough to be comparable, link them together under a \"question\", and do this as many times as possible.\n\nRight now this is done manually in order to ensure that linked markets are actually similar enough to be comparable. For my purposes, two markets are similar enough if the differences in their resolution criteria would affect their probabilities by 1% or less. For instance, two markets with a duration over 6 months with close dates differing by 1 day are usually still similar enough to compare equitably. This requires a fair amount of human judgment, though I am experimenting with ways to automate it.\n\nIn the meantime, I have made a secondary Astro site with the tools you will need to search and view markets, link markets into groups, edit the question groups, and edit most other items in the database. To run it in the basic mode, run:\n\n```bash\njust group\n```\n\nThis will launch the site in Astro's dev mode, which will be enough for anything you need to do. The site can also be compiled statically and served in the same way as the main site, but I recommend against doing this since it will have your database admin credentials baked in.\n\nIf you want to use embeddings to find similar markets, you can generate them with the following script:\n\n```bash\nuv run scripts/update-embeddings.py\n```\n\nFor now I am intentionally not documenting specific features of the admin tools since they are not user-facing and I am constantly changing them to suit my needs better. 
The method I have found that works best for me is:\n\n- Sort all markets by volume, number of traders, or duration. Find one that seems interesting.\n- Find markets from other platforms that have equivalent or nearly-equivalent resolution criteria.\n- Sort those markets by volume, number of traders, or duration to find the one \"authoritative\" market per platform.\n- Create a question group with a representative title and slug consistent with your conventions.\n- Add all selected markets to the question by copying in their IDs.\n- Check that the probabilities overlap and set start/end date overrides if necessary.\n- Check that the resolutions match and invert questions if necessary.\n- While you have those searches open, look for other possible question groups in the same topic.\n- Once you have exhausted the markets in that topic, return to the top-level search and find another topic.\n\n## Step 5. Site preprocessing\n\nWhen you have finished grouping markets, you can calculate all market scores by running the grader tool:\n\n```bash\n# optional, fix criterion probabilities to be more intuitive for linked questions\nuv run scripts/fix-criterion-probs.py\n\n# calculate all scores and grades\njust grade\n```\n\nThis tool will run through basically everything in the database, calculate some scores that are a little too compute-intensive to do at build time, and refresh all the database views. This tool is non-destructive just like the others: you can run it over and over again and lose nothing but your time. Just make sure you re-run it every time you finish grouping markets before generating the site.\n\nYou will also need to generate embeddings for related questions. You can generate those with the following script:\n\n```bash\n# just generate embeddings for questions\nuv run scripts/update-embeddings.py --questions-only\n\n# regenerate embeddings for all items\nuv run scripts/update-embeddings.py --all\n```\n\n\n## Step 6. 
Generating site\n\nThe site is static and designed to be deployed behind any standard web server such as `nginx`. It could also be deployed to GitHub Pages, an AWS S3 bucket, or any other static site host.\n\nYou can view a preview of the site or build it like so:\n\n```bash\njust site-dev # live preview the site in a browser\njust site-build # build the site to the site/dist directory\n```\n\nThe first site page load (in preview mode) or build will take a while as items are downloaded from the server. Subsequent loads/builds will be much faster but will not reflect the database's current state. In order to clear the cache, run the task:\n\n```bash\njust site-cache-reset # invalidate site data cache\n```\n\nWe use `rclone` to deploy the site to your provider of choice. First, configure your `rclone` target and add the details to the `.env` file:\n\n```bash\nrclone config # set up a new target\nnano .env # add your rclone target path\n```\n\nThen, you can deploy the site at any time with this command:\n\n```bash\njust deploy # build and deploy site to rclone target\n```\n\n### I just want to develop on the site\n\nIf you're just developing on the site you don't actually need to use the download and extract tools!\n\nYou can build the site against my public database that the main site builds from by doing either of these:\n\n- Change the `PGRST_URL` variable in the `.env` environment file to `https://data.brier.fyi`.\n- Run the site development server with `PGRST_URL=\"https://data.brier.fyi\" just site-dev`.\n\nFirst load of the dev site will be slow while it caches some of the Big Data™. Other than that the Astro project should be pretty straightforward.\n\n## Step 7. Downloading new markets\n\nOver time, new markets will be added and other markets will be updated. 
In order to update the database with the freshest data, you can re-run the download and extract programs to load the new data.\n\nThe download program has two different arguments for resetting:\n\n- `--reset-index` will re-download the platform index and then follow any rules set for what to download. This is good for catching markets that have been added since the last download but will not refresh markets that already existed but were resolved since the last download. This is usually not what you want for updating a database.\n- `--reset-cache` used by itself will re-download _everything,_ updating the database with 100% fresh data. Unfortunately this will take several days unless used with one of the filters below.\n- `--resolved-since` will filter the market download queue to just those resolved since the given date. Must be in the form of an ISO 8601 string.\n- `--resolved-since-days-ago` will do the exact same as the previous option, but with a duration supplied instead of a date. This is usually the best option for a scripted refresh.\n\nAll `reset` options make a backup of the previous data files in case you want to look at past data.\n\n```bash\n# run a full refresh and add to the database\njust download --reset-cache \u0026\u0026 just extract\n\n# only download markets resolved recently and add to the database\njust download --reset-cache --resolved-since-days-ago 10 \u0026\u0026 just extract\n```\n\nAfter the data is downloaded, you can add groups and edit data in the database as before. Then, build the site again and see the results.\n\n### Wiping the Markets Table\n\nEventually you may want to wipe the markets table in the database, either because you are changing the database schema or because you want to start fresh. In order to do this without losing data you will need to first export your questions and market-question links. 
I've provided a script to do this:\n\n```bash\n# back up your database, just in case\njust db-backup\n\n# export the questions and market links\nuv run scripts/migrate.py --mode export\n\n# either drop all tables\njust db-run-sql schema/00-drop-all.sql\n# or wipe the data folder\njust db-down\nsudo rm -r postgres_data\njust db-up\n\n# load the schema\njust db-load-schema\n\n# reload the schema cache\ndocker kill -s SIGUSR1 postgrest\n\n# import the questions and market links\nuv run scripts/migrate.py --mode import\n\n# calculate stats and refresh everything else\njust grade\n\n# check and build the site\njust site-dev\njust site-build\n\n# check everything is in place\njust db-curl \"market_details?limit=10\u0026question_slug=not.is.null\"\n```\n\nNote that this is not necessary if you want to edit table views. To reload the database view schema, just run:\n\n```bash\njust db-run-sql schema/03-views.sql\n```\n\n# I just want the data\n\nThe production database is publicly readable via PostgREST at [https://data.brier.fyi](https://data.brier.fyi/). 
This will lead you to a full OpenAPI spec, which you could plug in to Swagger or your client generator of choice.\n\nFor example, to get items from various tables:\n\n```bash\ncurl -sf https://data.brier.fyi/question_details?limit=100\ncurl -sf https://data.brier.fyi/market_details?limit=100\ncurl -sf https://data.brier.fyi/daily_probability_details?limit=100\n```\n\nYou can find PostgREST documentation here:\n\n- https://docs.postgrest.org/en/stable/references/api/tables_views.html\n- https://docs.postgrest.org/en/stable/references/api/pagination_count.html\n\n# Notes, news, and disclaimers\n\nThis project has been awarded the following grants:\n\n- $3,500 as part of the [Manifold Community Fund](https://manifund.org/projects/wasabipestos-umbrella-project), an impact certificate-based funding round.\n- $8,864 as part of the [EA Community Choice](https://manifund.org/projects/calibration-city), a donation matching pool.\n\nThese grants have been used for furthering development but have not influenced the contents of this site towards or away from any viewpoint.\n\nThis project has been featured in the following instances:\n\n- [Leveraging Log Probabilities in Language Models to Forecast Future Events](https://arxiv.org/abs/2501.04880v1)\n- [Tangle News: Lessons from the election you could bet on](https://www.readtangle.com/otherposts/lessons-from-the-election-you-could-bet-on/)\n- [Forecasting Newsletter: June 2024](https://forecasting.substack.com/p/forecasting-newsletter-june-2024)\n- [Calibrations Blog: Should we let ourselves see the future?](https://www.calibrations.blog/p/should-we-let-ourselves-see-the-future)\n- [Lightcone News: Accuracy and Trust](https://lightcone.news/about)\n- [Valis Research: Unanswered Questions Surrounding Prediction Markets](https://valisresearch.xyz/work/unanswered-questions-surrounding-prediction-markets/index.html)\n- [Human Invariant: The Future of Play Money Prediction Markets](https://www.humaninvariant.com/blog/pm-play)\n\nI 
use prediction markets, mainly Manifold and Metaculus, as a personal exercise in calibration. This project grew out of an effort to see how useful they can be as information-gathering tools.\n\nAs with any statistics, this data can be used to tell many stories. I do my best to present this data in a way that is fair, accurate, and with sufficient context.\n\n## License\n\nThe code for this project as presented in this repository is copyright under the MIT License, attached.\n\nThe contents of the live published website and database, including the explanatory descriptions, market/question links, categorizations, graphics, and visualizations are copyright under CC BY-NC-SA 4.0 Deed, attached.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwasabipesto%2Fthemis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwasabipesto%2Fthemis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwasabipesto%2Fthemis/lists"}