{"id":14958822,"url":"https://github.com/obryanlouis/qa","last_synced_at":"2025-05-02T12:31:49.431Z","repository":{"id":70290731,"uuid":"103498926","full_name":"obryanlouis/qa","owner":"obryanlouis","description":"TensorFlow Models for the Stanford Question Answering Dataset","archived":false,"fork":false,"pushed_at":"2018-12-14T06:24:47.000Z","size":223,"stargazers_count":72,"open_issues_count":1,"forks_count":30,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-04-07T02:06:17.724Z","etag":null,"topics":["python3","question-answering","squad","stanford-nlp","tensorflow-tutorials"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/obryanlouis.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-09-14T07:12:26.000Z","updated_at":"2025-01-21T03:56:36.000Z","dependencies_parsed_at":"2023-03-22T16:17:42.142Z","dependency_job_id":null,"html_url":"https://github.com/obryanlouis/qa","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obryanlouis%2Fqa","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obryanlouis%2Fqa/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obryanlouis%2Fqa/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obryanlouis%2Fqa/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/obryanlouis","download_url":"https://codeload.github.com/obryanlouis/qa/tar.gz/refs/heads/master","host":{"name"
:"GitHub","url":"https://github.com","kind":"github","repositories_count":252038201,"owners_count":21684649,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["python3","question-answering","squad","stanford-nlp","tensorflow-tutorials"],"created_at":"2024-09-24T13:18:21.349Z","updated_at":"2025-05-02T12:31:47.114Z","avatar_url":"https://github.com/obryanlouis.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Question Answering on SQuAD\n===========================\nThis project implements models that train on the\n[Stanford Question Answering Dataset](https://rajpurkar.github.io/SQuAD-explorer/)\n(SQuAD). The SQuAD dataset is comprised of pairs of passages and questions\ngiven in English text where the answer to the question is a span of text in the\npassage. The goal of a model that trains on SQuAD is to predict the answer to\na given passage/question pair. The project's main site has examples of some of\nthe passages, questions, and answers, as well as a ranking for the\nexisting models.\n\nSpecifically, this project implements:\n* [Match-LSTM](https://arxiv.org/abs/1608.07905)\n* [Rnet](https://www.microsoft.com/en-us/research/publication/mrc/)\n* [Mnemonic Reader](https://arxiv.org/abs/1705.02798)\n* [Fusion Net](https://arxiv.org/abs/1711.07341)\n\nI primarily made this for my own education, but the code could be used as a\nstarting point for another project. 
The models are written in TensorFlow and \nthe project uses (optional) AWS S3 storage for model checkpointing and\ndata storage.\n\n\nResults\n------------\n|Model                    | Dev Em            | Dev F1   | Details |\n| ------------------------|:-----------------:| -------- |:------: |\n|Fusion Net               | 73.5%             | 82.0%    | Checkout [82feaa3f78a51eaeb66c5578c5d5a9f125711312](https://github.com/obryanlouis/qa/commit/82feaa3f78a51eaeb66c5578c5d5a9f125711312) `python3 train_local.py --model_type=fusion_net --rnn_size=128 --batch_size=16 --input_dropout=0.4 --rnn_dropout=0.3 --dropout=0.4` training time ~11 hours over 2 1080 Ti GPUs, ~31 min/epoch        |\n|Mnemonic reader          | 71.2%               | 80.1%    | Checkout [82feaa3f78a51eaeb66c5578c5d5a9f125711312](https://github.com/obryanlouis/qa/commit/82feaa3f78a51eaeb66c5578c5d5a9f125711312) `python3 train_local.py --model_type=mnemonic_reader --rnn_size=40 --batch_size=65 --input_dropout=0.3 --rnn_dropout=0.3 --dropout=0.3` training time ~6 hours over 2 1080 Ti GPUs, ~8 min/epoch     |\n|Rnet                     | ~60%             | ~70%    |         |\n|Match LSTM               | ~58%             | ~68%    |         |\n\nAll results are for a single model rather than an ensemble.\nI didn't train all models for the same duration and there may be bugs or\nunoptimized hyperparameters in my implementation.\n\nThanks to [@Bearsuny](https://github.com/Bearsuny) for identifying an issue\nin the evaluation. 
It now uses the official/correct scoring mechanism.\n\nRequirements\n-------------\n* [Python 3](https://www.python.org/downloads/)\n* [spaCy](https://spacy.io/) and the \"en\" model\n* [CoVe vectors](https://github.com/salesforce/cove) - You can skip this part\n  but will probably need to manually remove any CoVe references in the setup.\n  This also requires [PyTorch](http://pytorch.org/).\n* TensorFlow 1.4\n* cuDNN 7 recommended, GPUs required\n\nUsing AWS S3\n--------------\nIn order to use AWS S3 for model checkpointing and data storage, you must set\nup AWS credentials.\n[This page](http://docs.aws.amazon.com/cli/latest/userguide/cli-config-files.html)\nshows how to do it.\n\nAfter your credentials are set up, you can enable S3 in the project by setting\nthe `use_s3` flag to `True` and setting `s3_bucket_name` to the name of your\nS3 bucket.\n\n```\nf.DEFINE_boolean(\"use_s3\", True, ...)\n...\nf.DEFINE_string(\"s3_bucket_name\", \"\u003cYOUR S3 BUCKET HERE\u003e\",...)\n```\n\nHow to run it\n-------------\n### Setup\n```\npython3 setup.py\n```\n\n### Training\nThe following command will start model training and create or restore the\ncurrent model parameters from the last checkpoint (if it exists). After each\nepoch, the Dev F1/Em scores are calculated, and if the F1 score is a new high score,\nthen the model parameters are saved. 
There is no mechanism to automatically\nstop training; you must stop it manually.\n```\npython3 train_local.py --num_gpus=\u003cNUMBER OF GPUS\u003e\n```\n\n### Evaluation\nThe following command will evaluate the model\non the Dev dataset and print out the exact match and F1 scores.\nTo make it easier to use the compatible SQuAD-formatted model outputs, the\npredicted strings for each question will be written to the `evaluation_dir`\nin a file called `predictions.json`.\nIn addition, if the `visualize_evaluated_results` flag is `true`, then\nthe passages, questions, and ground truth spans will be written to output\nfiles in the directory specified by the `evaluation_dir` flag.\n\n```\npython3 evaluate_local.py --num_gpus=\u003cNUMBER OF GPUS\u003e\n```\n\n### Visualizing training\nYou can visualize the model loss, gradients, exact match, and F1 scores as the\nmodel trains by using TensorBoard at the top level directory of this\nrepository.\n```\ntensorboard --logdir=log\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fobryanlouis%2Fqa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fobryanlouis%2Fqa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fobryanlouis%2Fqa/lists"}