{"id":21602369,"url":"https://github.com/sugarcane-mk/finetuning_wav2vec2","last_synced_at":"2026-05-09T01:36:32.458Z","repository":{"id":261549384,"uuid":"872319354","full_name":"sugarcane-mk/finetuning_wav2vec2","owner":"sugarcane-mk","description":"This repo provides step by step process from sctatch to fine tune facebook's wav2vec2-large model using transformers","archived":false,"fork":false,"pushed_at":"2024-11-07T05:41:53.000Z","size":43,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-18T13:25:15.531Z","etag":null,"topics":["asr","asr-model","cuda","facebook","fairseq","fine-tuning","finetuning","huggingface","librosa","python","torch","transformers","wav2vec2","wav2vec2-large-960h"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sugarcane-mk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-14T08:29:12.000Z","updated_at":"2024-11-12T03:13:30.000Z","dependencies_parsed_at":"2024-11-07T06:30:48.595Z","dependency_job_id":"a4b87ead-bcdf-435a-9400-88282ff786cf","html_url":"https://github.com/sugarcane-mk/finetuning_wav2vec2","commit_stats":null,"previous_names":["sugarcane-mk/finetuning_wav2vec2"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sugarcane-mk/finetuning_wav2vec2","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sugarcane-mk%2Ffinetuning_wav2vec2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/sugarcane-mk%2Ffinetuning_wav2vec2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sugarcane-mk%2Ffinetuning_wav2vec2/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sugarcane-mk%2Ffinetuning_wav2vec2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sugarcane-mk","download_url":"https://codeload.github.com/sugarcane-mk/finetuning_wav2vec2/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sugarcane-mk%2Ffinetuning_wav2vec2/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32804252,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-08T08:22:46.396Z","status":"ssl_error","status_checked_at":"2026-05-08T08:22:45.650Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asr","asr-model","cuda","facebook","fairseq","fine-tuning","finetuning","huggingface","librosa","python","torch","transformers","wav2vec2","wav2vec2-large-960h"],"created_at":"2024-11-24T19:13:06.126Z","updated_at":"2026-05-09T01:36:32.443Z","avatar_url":"https://github.com/sugarcane-mk.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Fine-tuning Wav2Vec2 for Tamil Speech Recognition\n\nThis repository 
contains the Jupyter Notebook and resources for fine-tuning the Wav2Vec2 model for Tamil speech recognition using the Hugging Face Transformers library.\n\n## Table of Contents\n\n- [Introduction](#introduction)\n- [Requirements](#requirements)\n- [Dataset](#dataset)\n- [Training](#training)\n- [Inference](#inference)\n- [Results](#results)\n- [Acknowledgments](#acknowledgments)\n\n## Introduction\n\nWav2Vec2 is a state-of-the-art model for automatic speech recognition (ASR). This project adapts Wav2Vec2 to the Tamil language, using available datasets to improve recognition of spoken Tamil.\n\n## Requirements\n\nTo run this project, ensure you have the following installed:\n\n- Python 3.7 or higher\n- Jupyter Notebook\n- PyTorch\n- Transformers\n- Datasets\n- Librosa\n- Soundfile\n- [CUDA](https://developer.nvidia.com/cuda-downloads)\n\nYou can install the required packages using the following command:\n```bash\npip install -r requirements.txt\n```\n\n## Dataset\nWe use a Tamil speech dataset for fine-tuning the model. The dataset consists of audio files in Tamil along with their transcriptions. Download the dataset and place it in an accessible directory.\nRefer to `datapreprocessing.py` for data preprocessing.\n\n## Training\nTo fine-tune the Wav2Vec2 model, open the [Jupyter Notebook](https://github.com/sugarcane-mk/finetuning_wav2vec2/blob/main/Finetune_wav2vec2_xlsr_tamil.ipynb) and follow the instructions in the notebook to run the training process.\n\n## Inference\nAfter training, you can perform inference using the code snippets provided in the Jupyter Notebook. Be sure to replace the paths with those of your own audio files.\n\n## Results\nThe performance of the model can be evaluated using standard metrics such as Word Error Rate (WER). 
The notebook contains sections on evaluating the model's performance. To compute WER yourself, install `jiwer`:\n```bash\npip install jiwer\n```\n```python\nimport jiwer\n\n# Ground-truth transcript; replace with your own transcription\nreference = \"God is great\"\n# Transcript produced by the model; replace with your model's output\nhypothesis = \"good is great\"\n\n# Compute WER: word-level edit distance divided by the number of reference words\nwer = jiwer.wer(reference, hypothesis)\nprint(f\"Word Error Rate (WER): {wer:.2f}\")\n```\n## Acknowledgments\nFor further reference, see [Fairseq Wav2Vec2](https://huggingface.co/facebook/wav2vec2-large-xlsr-53).\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsugarcane-mk%2Ffinetuning_wav2vec2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsugarcane-mk%2Ffinetuning_wav2vec2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsugarcane-mk%2Ffinetuning_wav2vec2/lists"}