{"id":16387151,"url":"https://github.com/artitw/bert_qa","last_synced_at":"2025-03-21T02:31:27.869Z","repository":{"id":57414693,"uuid":"230797422","full_name":"artitw/BERT_QA","owner":"artitw","description":"Accelerating the development of question-answering systems based on BERT and TF 2.0","archived":false,"fork":false,"pushed_at":"2020-02-01T03:19:19.000Z","size":122,"stargazers_count":19,"open_issues_count":7,"forks_count":4,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-17T19:52:19.050Z","etag":null,"topics":["artificial-intelligence","bert","machine-learning","natural-language-processing","natural-language-understanding","nlp"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/1909.05017","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/artitw.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-12-29T20:02:35.000Z","updated_at":"2023-06-22T11:02:43.000Z","dependencies_parsed_at":"2022-09-10T04:03:45.835Z","dependency_job_id":null,"html_url":"https://github.com/artitw/BERT_QA","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/artitw%2FBERT_QA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/artitw%2FBERT_QA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/artitw%2FBERT_QA/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/artitw%2FBERT_QA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/artitw","downl
oad_url":"https://codeload.github.com/artitw/BERT_QA/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244725553,"owners_count":20499628,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","bert","machine-learning","natural-language-processing","natural-language-understanding","nlp"],"created_at":"2024-10-11T04:25:21.132Z","updated_at":"2025-03-21T02:31:27.526Z","avatar_url":"https://github.com/artitw.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# BERT-QA\nBuild question-answering systems using state-of-the-art pre-trained contextualized language models, e.g. BERT. 
We are working to accelerate the development of question-answering systems based on BERT and TF 2.0!\n\n## Background\n\nThis project is based on our study: [Question Generation by Transformers](https://arxiv.org/abs/1909.05017).\n\n### Citation\n\nTo cite this work, use the following BibTeX citation.\n\n```\n@article{question-generation-transformers@2019,\n  title={Question Generation by Transformers},\n  author={Kriangchaivech, Kettip and Wangperawong, Artit},\n  journal={arXiv preprint arXiv:1909.05017},\n  year={2019}\n}\n```\n\n## Requirements\nTensorFlow 2.0 will be installed automatically if it is not already on your system.\n\n## Installation\n```\npip install bert_qa\n```\n\n## Example usage\nRun the Colab demo notebook [here](https://colab.research.google.com/drive/1-tLvxSuI0ik2BaruaY_Ivoh_4eobWzEW).\n\n### download pre-trained models\n```\nwget -q https://storage.googleapis.com/cloud-tpu-checkpoints/bert/keras_bert/uncased_L-12_H-768_A-12.tar.gz\ntar -xvzf uncased_L-12_H-768_A-12.tar.gz\nmv -f home/hongkuny/public/pretrained_models/keras_bert/uncased_L-12_H-768_A-12 .\n```\n\n### download SQuAD data\n```\nwget -q https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json\nwget -q https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json\n```\n\n### import, initialize, pre-process data, finetune, and predict!\n```\nfrom bert_qa import squad\nqa = squad.SQuAD()\nqa.preprocess_training_data()\nqa.fit()\npredictions = qa.predict()\n```\n\n### evaluate\n```\nimport json\n# Load the predictions file and the SQuAD dev set, closing each file after reading\nwith open('model/predictions.json') as f:\n    pred_data = json.load(f)\nwith open('dev-v1.1.json') as f:\n    dev_data = json.load(f)['data']\nqa.evaluate(dev_data, pred_data)\n```\n\n## Advanced usage\n\n### Model type\nThe default model is an uncased Bidirectional Encoder Representations from Transformers (BERT) consisting of 12 transformer layers, 12 self-attention heads per layer, and a hidden size of 768. Below are all models currently supported that you can specify with `hub_module_handle`. 
We expect that more will be added in the future. For more information, see [TensorFlow's BERT GitHub](https://github.com/tensorflow/models/blob/master/official/nlp/bert/README.md).\n\n*   **[`BERT-Large, Uncased (Whole Word Masking)`](https://storage.googleapis.com/cloud-tpu-checkpoints/bert/keras_bert/wwm_uncased_L-24_H-1024_A-16.tar.gz)**:\n    24-layer, 1024-hidden, 16-heads, 340M parameters\n*   **[`BERT-Large, Cased (Whole Word Masking)`](https://storage.googleapis.com/cloud-tpu-checkpoints/bert/keras_bert/wwm_cased_L-24_H-1024_A-16.tar.gz)**:\n    24-layer, 1024-hidden, 16-heads, 340M parameters\n*   **[`BERT-Base, Uncased`](https://storage.googleapis.com/cloud-tpu-checkpoints/bert/keras_bert/uncased_L-12_H-768_A-12.tar.gz)**:\n    12-layer, 768-hidden, 12-heads, 110M parameters\n*   **[`BERT-Large, Uncased`](https://storage.googleapis.com/cloud-tpu-checkpoints/bert/keras_bert/uncased_L-24_H-1024_A-16.tar.gz)**:\n    24-layer, 1024-hidden, 16-heads, 340M parameters\n*   **[`BERT-Base, Cased`](https://storage.googleapis.com/cloud-tpu-checkpoints/bert/keras_bert/cased_L-12_H-768_A-12.tar.gz)**:\n    12-layer, 768-hidden, 12-heads, 110M parameters\n*   **[`BERT-Large, Cased`](https://storage.googleapis.com/cloud-tpu-checkpoints/bert/keras_bert/cased_L-24_H-1024_A-16.tar.gz)**:\n    24-layer, 1024-hidden, 16-heads, 340M parameters\n\n\n## Contributing\nBERT-QA is an open-source project founded and maintained to better serve the machine learning and data science community. Please feel free to submit pull requests to contribute to the project. 
By participating, you are expected to adhere to BERT-QA's [code of conduct](CODE_OF_CONDUCT.md).\n\n## Questions?\nFor questions or help using BERT-QA, please submit a GitHub issue.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fartitw%2Fbert_qa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fartitw%2Fbert_qa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fartitw%2Fbert_qa/lists"}