{"id":14036960,"url":"https://github.com/allenai/winogrande","last_synced_at":"2025-10-13T15:57:11.778Z","repository":{"id":75564652,"uuid":"222851495","full_name":"allenai/winogrande","owner":"allenai","description":"WinoGrande: An Adversarial Winograd Schema Challenge at Scale","archived":false,"fork":false,"pushed_at":"2020-03-13T16:01:48.000Z","size":30,"stargazers_count":85,"open_issues_count":2,"forks_count":9,"subscribers_count":8,"default_branch":"master","last_synced_at":"2024-08-12T03:06:10.851Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://winogrande.allenai.org","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/allenai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-20T04:35:18.000Z","updated_at":"2024-08-12T03:06:11.622Z","dependencies_parsed_at":"2023-06-06T22:15:40.468Z","dependency_job_id":null,"html_url":"https://github.com/allenai/winogrande","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allenai%2Fwinogrande","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allenai%2Fwinogrande/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allenai%2Fwinogrande/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allenai%2Fwinogrande/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/allenai","download_url":"https://codeload.github.com
/allenai/winogrande/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227762337,"owners_count":17816011,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-12T03:02:21.746Z","updated_at":"2025-10-13T15:57:11.721Z","avatar_url":"https://github.com/allenai.png","language":"Python","readme":"# WinoGrande\n\nVersion 1.1\n\n- - -\n\n## Data\n\nDownload the dataset with `download_winogrande.sh`:\n\n    ./data/\n    ├── train_[xs,s,m,l,xl].jsonl          # training sets of different sizes\n    ├── train_[xs,s,m,l,xl]-labels.lst     # answer labels for training sets\n    ├── dev.jsonl                          # development set\n    ├── dev-labels.lst                     # answer labels for development set\n    ├── test.jsonl                         # test set\n    ├── sample-submissions-labels.lst      # example submission file for leaderboard\n    └── eval.py                            # evaluation script\n\nYou can use `train_*.jsonl` for training models and `dev` for validation.\nPlease note that labels are not included in `test.jsonl`. To evaluate your models on the `test` set, make a submission to our [leaderboard](https://leaderboard.allenai.org/winogrande/submissions/public).\n\n\n## Run experiments\n\n### Setup\n\n1. Download the dataset with `download_winogrande.sh`\n1. `pip install -r requirements.txt`\n\n### Training (fine-tuning)\n\n1. 
You can train your model with `./scripts/run_experiment.py` (see `sample_training.sh`):\n\n        e.g.,\n        export PYTHONPATH=$PYTHONPATH:$(pwd)\n\n        python scripts/run_experiment.py \\\n        --model_type roberta_mc \\\n        --model_name_or_path roberta-large \\\n        --task_name winogrande \\\n        --do_eval \\\n        --do_lower_case \\\n        --data_dir ./data \\\n        --max_seq_length 80 \\\n        --per_gpu_eval_batch_size 4 \\\n        --per_gpu_train_batch_size 16 \\\n        --learning_rate 1e-5 \\\n        --num_train_epochs 3 \\\n        --output_dir ./output/models/ \\\n        --do_train \\\n        --logging_steps 4752 \\\n        --save_steps 4750 \\\n        --seed 42 \\\n        --data_cache_dir ./output/cache/ \\\n        --warmup_pct 0.1 \\\n        --evaluate_during_training\n\n1. If you have access to [Beaker](https://beaker.org/), you can run your experiments with `sh ./train_winogrande_on_bkr.sh`.\n\n1. Results will be stored under `./output/models/`.\n\n### Prediction (on the test set)\n\n1. You can make predictions with `./scripts/run_experiment.py` directly (see `sample_prediction.sh`):\n\n        e.g.,\n        export PYTHONPATH=$PYTHONPATH:$(pwd)\n\n        python scripts/run_experiment.py \\\n        --model_type roberta_mc \\\n        --model_name_or_path ./output/models \\\n        --task_name winogrande \\\n        --do_predict \\\n        --do_lower_case \\\n        --data_dir ./data \\\n        --max_seq_length 80 \\\n        --per_gpu_eval_batch_size 4 \\\n        --output_dir ./output/models/ \\\n        --data_cache_dir ./output/cache/\n\n1. If you have access to [Beaker](https://beaker.org/), you can run your experiments with `sh ./predict_winogrande_on_bkr.sh`.\n\n1. Predictions are stored in `./output/models/predictions_test.lst`.\n\n\n## Evaluation\n\nYou can use `eval.py` for evaluation on the dev split, which yields `metrics.json`. 
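For intuition, the core of such an evaluation can be sketched as below. This is a minimal illustration, not the actual `eval.py` (the real script takes `--preds_file`/`--labels_file` arguments and handles five prediction columns per line).

```python
# Hypothetical sketch of the accuracy metric computed by an eval script
# like eval.py; names and the single-column format here are illustrative.
import json


def accuracy(preds, labels):
    """Fraction of predictions ("1" or "2") that match the gold labels."""
    assert len(preds) == len(labels), "prediction/label count mismatch"
    correct = sum(p == l for p, l in zip(preds, labels))
    return correct / len(labels)


# Toy example: 3 of 4 predictions match the labels.
preds = ["1", "2", "2", "1"]
labels = ["1", "2", "1", "1"]
metrics = {"accuracy": accuracy(preds, labels)}
print(json.dumps(metrics))  # {"accuracy": 0.75}
```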
\n\n    e.g., python eval.py --preds_file ./YOUR_PREDICTIONS.lst --labels_file ./dev-labels.lst\n\nIn the prediction file, each line contains the predictions (1 or 2) of the five models trained on the five training set sizes (ordered `xs`, `s`, `m`, `l`, `xl`, comma-separated), one line per evaluation set question:\n\n     2,1,1,1,1\n     1,1,2,2,2\n     1,1,1,1,1\n     .........\n     .........\n\nThat is, the first column contains the predictions of a model fine-tuned on `train_xs.jsonl`, the second those of a model trained on `train_s.jsonl`, and so on; the last (fifth) column contains the predictions of a model trained on `train_xl.jsonl`.\nPlease check out the sample submission file (`sample-submissions-labels.lst`) for reference.\n\n\n## Submission to Leaderboard\n\nYou can submit your predictions on the `test` set to the [leaderboard](https://leaderboard.allenai.org/winogrande/submissions/public).\nThe submission file must be named `predictions.lst`. The format is the same as above.\n\n## Reference\n\nIf you use this dataset, please cite the following paper:\n\n\t@article{sakaguchi2019winogrande,\n\t    title={WinoGrande: An Adversarial Winograd Schema Challenge at Scale},\n\t    author={Sakaguchi, Keisuke and Bras, Ronan Le and Bhagavatula, Chandra and Choi, Yejin},\n\t    journal={arXiv preprint arXiv:1907.10641},\n\t    year={2019}\n\t}\n\n\n## License\n\nWinoGrande (codebase) is licensed under the Apache License 2.0. The dataset is licensed under CC-BY.\n\n\n## Questions?\n\nPlease file GitHub issues with your questions/suggestions. 
You may also ask us questions at our [Google Group](https://groups.google.com/a/allenai.org/forum/#!forum/winogrande).\n\n\n## Contact\n\nEmail: keisukes[at]allenai.org\n","funding_links":[],"categories":["Python","Anthropomorphic-Taxonomy","Benchmark"],"sub_categories":["Typical Intelligence Quotient (IQ)-General Intelligence evaluation benchmarks","English"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fallenai%2Fwinogrande","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fallenai%2Fwinogrande","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fallenai%2Fwinogrande/lists"}