{"id":23725985,"url":"https://github.com/voidful/gsqa","last_synced_at":"2025-10-05T16:10:39.898Z","repository":{"id":169068585,"uuid":"623309450","full_name":"voidful/GSQA","owner":"voidful","description":"Generative Spoken Question Answering","archived":false,"fork":false,"pushed_at":"2024-01-31T10:28:39.000Z","size":487,"stargazers_count":4,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-05T00:25:05.981Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://voidful.github.io/GSQA/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/voidful.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-04T05:46:04.000Z","updated_at":"2024-08-13T16:37:17.000Z","dependencies_parsed_at":"2024-06-16T00:01:49.754Z","dependency_job_id":null,"html_url":"https://github.com/voidful/GSQA","commit_stats":null,"previous_names":["voidful/gsqa-generativespokenquestionanswering","voidful/gsqa"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/voidful/GSQA","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2FGSQA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2FGSQA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2FGSQA/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2FGSQA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/voidful","download_url":"https://cod
eload.github.com/voidful/GSQA/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidful%2FGSQA/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273541898,"owners_count":25124056,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-04T02:00:08.968Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-31T00:18:06.701Z","updated_at":"2025-10-05T16:10:34.849Z","avatar_url":"https://github.com/voidful.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GSQA\n\n## Environment Settings\n```\npip3 install -r requirements.txt\n# pip3 install -r requirements_2.txt # Oscar's local env settings\n```\n\n\n## Fine-tuned LM List\nHuBERT Unit:[long-t5-base-SQA-hubert-100](https://huggingface.co/Oscarshih/long-t5-base-SQA)  \nmHuBERT Unit:[long-t5-base-SQA-mhubert-1000](https://huggingface.co/voidful/long-t5-base-SQA-mhubert-1000)  \n\n\n## Training\nDatasets: [NMSQA](https://huggingface.co/datasets/voidful/NMSQA-CODE)\n\nT5-series Model:[long-T5](https://huggingface.co/voidful/long-t5-encodec-tglobal-base/tree/main)\n\n\u003c!-- LLaMA Model:[LLaMA v2]() --\u003e\n\nTraining Script:\n```bash=\npython3 main.py\n```\n\n\n\n\n---\n\n\n## Multi-Task Training\nDatasets\n\u003e Unit Datasets: 
[GSQA/speech-alpaca-gpt4-unit](https://huggingface.co/datasets/GSQA/speech-alpaca-gpt4-unit)\n\u003e Speech Datasets: [GSQA/spoken-alpaca-gpt4](https://huggingface.co/datasets/GSQA/spoken-alpaca-gpt4)\n\n[Models Hub](https://huggingface.co/GSQA)\n\u003e T5-series Model:[long-T5](https://huggingface.co/voidful/long-t5-encodec-tglobal-base/tree/main)\n\u003e alpaca-TQA-init T5-series Model: [LongT5-alpaca-TQA](https://huggingface.co/GSQA/LongT5-alpaca-TQA)\n\n### 1. setting\nLog in with a Hugging Face account that is authorized for GSQA\n```\n$ huggingface-cli login\n```\nLog in to your wandb account to record training figures\n```\n$ wandb login --relogin\n```\n\n### 2. training script\n\n\n\n```bash=\n# pass one of the choices listed below to --aux_task\n$ python3 main_multiTask.py --aux_task qt,at,qu\n# choices: 'qt,qu', 'qt,at,qu', 'qu,at', 'at'\n```\n\n\u003c!-- ### step3\nEvaluating Script:\n\n```\npython3 whisper_evaluate.py \npython3 BertScore_eval.py # Remember to check the name of output files.\n``` --\u003e\n### 3. 
After training finishes, push the model to https://huggingface.co/GSQA\n\n\n---\n\n## Unit-to-unit Evaluation\nASR Model:[Whisper]() --\u003e TBD\n\n\u003c!-- Language Model:[Long-T5-HuBERT-Unit](https://huggingface.co/Oscarshih/long-t5-base-SQA), [Long-T5-mHuBERT-Unit](https://huggingface.co/voidful/long-t5-base-SQA-mhubert-1000) --\u003e\n\nEvaluating Script:\n```\n# step 1: run\npython3 whisper_evaluate.py --model /path/to/the/huggingface/model --auto_split_dataset\n# (for more optional arguments, check whisper_evaluate.py)\n\n# step 2: for BERTScore on the alpaca dataset, run\npython3 BertScore_eval.py\n# (remember to change the evaluation file path first)\n\n# step 2: for datasets with context, run\npython3 eval_score.py # Remember to check the name of output files.\n# Note: Please put the best reported score in the Overleaf table.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvoidful%2Fgsqa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvoidful%2Fgsqa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvoidful%2Fgsqa/lists"}