{"id":13958537,"url":"https://github.com/yufan-aslp/AliMeeting","last_synced_at":"2025-07-21T00:31:10.820Z","repository":{"id":54526007,"uuid":"419168564","full_name":"yufan-aslp/AliMeeting","owner":"yufan-aslp","description":"The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenario.","archived":false,"fork":false,"pushed_at":"2022-06-10T02:51:32.000Z","size":504,"stargazers_count":114,"open_issues_count":7,"forks_count":17,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-11-28T02:34:47.742Z","etag":null,"topics":["aishell-4","alimeeting","asr","challenge","m2met","multi-speaker-asr","speaker-diarization"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yufan-aslp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-10-20T03:10:37.000Z","updated_at":"2024-11-21T03:31:07.000Z","dependencies_parsed_at":"2022-08-13T18:40:34.547Z","dependency_job_id":null,"html_url":"https://github.com/yufan-aslp/AliMeeting","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/yufan-aslp/AliMeeting","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yufan-aslp%2FAliMeeting","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yufan-aslp%2FAliMeeting/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yufan-aslp%2FAliMeeting/releases","manifests_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yufan-aslp%2FAliMeeting/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yufan-aslp","download_url":"https://codeload.github.com/yufan-aslp/AliMeeting/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yufan-aslp%2FAliMeeting/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266221259,"owners_count":23894965,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aishell-4","alimeeting","asr","challenge","m2met","multi-speaker-asr","speaker-diarization"],"created_at":"2024-08-08T13:01:42.623Z","updated_at":"2025-07-21T00:31:10.087Z","avatar_url":"https://github.com/yufan-aslp.png","language":"Python","funding_links":[],"categories":["Datasets","语音识别"],"sub_categories":["Diarization datasets","网络服务_其他"],"readme":"# M2MeT challenge baseline -- AliMeeting\n\n\nThis project provides the baseline system recipes for the ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT). The challenge consists of two tracks: ***Automatic Speech Recognition (ASR)*** and ***Speaker Diarization***. A detailed description of each track can be found in its corresponding directory. The goal of this project is to simplify the training and evaluation procedures and make it easy for participants to reproduce the baseline experiments and develop novel methods.  
\n\n\n## Setup\n\n```shell\ngit clone https://github.com/yufan-aslp/AliMeeting.git\n```\n\n## Introduction\n\n* [Speech Recognition Track](asr): Follow the detailed steps in `./asr`. \n* [Speaker Diarization Track](speaker): Follow the detailed steps in `./speaker`. \n  \n\n## General steps\n\n1. Prepare the training data for the speaker diarization and ASR models, respectively.\n2. Follow the running steps of the speaker diarization experiment to obtain the `rttm` file. The `rttm` file contains the voice activity detection (VAD) and speaker diarization results, which are used to compute the final Diarization Error Rate (DER) scores.\n3. For the ASR track, train either single-speaker or multi-speaker ASR models. The evaluation metric for ASR systems is Character Error Rate (CER).\n\n\n\n\n## Citation\n\nIf you use the challenge dataset or our baseline systems, please consider citing the following:\n\n    @inproceedings{Yu2022M2MeT,\n      title={M2{M}e{T}: The {ICASSP} 2022 Multi-Channel Multi-Party Meeting Transcription Challenge},\n      author={Yu, Fan and Zhang, Shiliang and Fu, Yihui and Xie, Lei and Zheng, Siqi and Du, Zhihao and Huang, Weilong and Guo, Pengcheng and Yan, Zhijie and Ma, Bin and Xu, Xin and Bu, Hui},\n      booktitle={Proc. ICASSP},\n      year={2022},\n      organization={IEEE}\n    }\n\n    @inproceedings{Yu2022Summary,\n      title={Summary On The {ICASSP} 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge},\n      author={Yu, Fan and Zhang, Shiliang and Guo, Pengcheng and Fu, Yihui and Du, Zhihao and Zheng, Siqi and Huang, Weilong and Xie, Lei and Tan, Zheng-Hua and Wang, DeLiang and Qian, Yanmin and Lee, Kong Aik and Yan, Zhijie and Ma, Bin and Xu, Xin and Bu, Hui},\n      booktitle={Proc. 
ICASSP},\n      year={2022},\n      organization={IEEE}\n    }\n\nChallenge introduction paper: M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge (https://arxiv.org/abs/2110.07393?spm=a3c0i.25445127.6257982940.1.111654811kxLMY\u0026file=2110.07393)\n\n\nChallenge summary paper: Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge (https://arxiv.org/abs/2202.03647?spm=a3c0i.25445127.6257982940.2.111654811kxLMY\u0026file=2202.03647)\n\n\nThe AliMeeting data can be downloaded from https://www.openslr.org/119\n\n\nThe room configuration of the AliMeeting Train set can be downloaded from https://speech-lab-share-data.oss-cn-shanghai.aliyuncs.com/AliMeeting/AliMeeting_Trainset_Room.xlsx\n\n\nM2MeT challenge CodaLab (open evaluation platform for the Eval and Test sets of both tracks): https://codalab.lisn.upsaclay.fr/competitions/?q=M2MeT\n\n\n## Organizing Committee \n* Lei Xie, AISHELL Foundation, China, xielei21st@gmail.com\n* Bin Ma, Principal Engineer at Alibaba, Singapore, b.ma@alibaba-inc.com\n* DeLiang Wang, Professor, Ohio State University, USA, dwang@cse.ohio-state.edu\n* Zheng-Hua Tan, Professor, Aalborg University, Denmark, zt@es.aau.dk\n* Kong Aik Lee, Senior Scientist, Institute for Infocomm Research, A*STAR, Singapore, kongaik.lee@ieee.org\n* Zhijie Yan, Director of Speech Lab at Alibaba, China, zhijie.yzj@alibaba-inc.com\n* Yanmin Qian, Associate Professor, Shanghai Jiao Tong University, China, yanminqian@sjtu.edu.cn\n* Hui Bu, CEO, AIShell Inc., China, buhui@aishelldata.com\n\n## Contributors\n\n[\u003cimg width=\"300\" height=\"100\" src=\"https://github.com/qq379840315/AliMeeting/blob/main/alibaba.png\"/\u003e](https://damo.alibaba.com/labs/speech/?lang=zh)[\u003cimg width=\"300\" height=\"100\" src=\"https://github.com/qq379840315/AliMeeting/blob/main/fig_aishell.jpg\"/\u003e](http://www.aishelltech.com/sy)[\u003cimg width=\"300\" height=\"100\" 
src=\"https://github.com/qq379840315/AliMeeting/blob/main/ISCA.png\"/\u003e](https://isca-speech.org/iscaweb/)\n\n## Code license \n\n[Apache 2.0](./LICENSE)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyufan-aslp%2FAliMeeting","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyufan-aslp%2FAliMeeting","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyufan-aslp%2FAliMeeting/lists"}