{"id":25339882,"url":"https://github.com/eroydev/speech2text-api","last_synced_at":"2025-07-28T14:42:25.461Z","repository":{"id":277417677,"uuid":"932369503","full_name":"ERoydev/Speech2Text-API","owner":"ERoydev","description":null,"archived":false,"fork":false,"pushed_at":"2025-02-13T20:06:48.000Z","size":0,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-13T20:37:36.225Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ERoydev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-13T19:58:06.000Z","updated_at":"2025-02-13T20:06:51.000Z","dependencies_parsed_at":"2025-02-13T20:37:42.932Z","dependency_job_id":null,"html_url":"https://github.com/ERoydev/Speech2Text-API","commit_stats":null,"previous_names":["eroydev/speech2text-api"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ERoydev%2FSpeech2Text-API","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ERoydev%2FSpeech2Text-API/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ERoydev%2FSpeech2Text-API/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ERoydev%2FSpeech2Text-API/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ERoydev","download_url":"https://codeload.github.com/ERoydev/Speech2Text-API/tar.gz/refs/heads/
main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247833360,"owners_count":21003750,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-14T07:43:21.375Z","updated_at":"2025-04-08T11:30:53.715Z","avatar_url":"https://github.com/ERoydev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Speech2Text-API\r\n\r\nWhisperTranscriber is a lightweight microservice that leverages [faster_whisper](https://github.com/guillaumekln/faster-whisper) to convert speech to text. Built with FastAPI and pydub, it provides a simple REST API endpoint for uploading audio files and receiving transcriptions.\r\n\r\n## Features\r\n\r\n- **Speech-to-Text Transcription:** Uses faster_whisper to transcribe `.wav` audio files.\r\n- **REST API:** A POST endpoint to upload audio files and get transcriptions.\r\n- **Temporary File Handling:** Uses Python's `tempfile` to manage audio files during processing.\r\n- **Asynchronous Processing:** Built on FastAPI to handle concurrent requests.\r\n\r\n## Requirements\r\n\r\n- Python 3.8+\r\n- [FastAPI](https://fastapi.tiangolo.com/)\r\n- [Uvicorn](https://www.uvicorn.org/)\r\n- [faster_whisper](https://github.com/guillaumekln/faster-whisper)\r\n- [pydub](https://github.com/jiaaro/pydub)\r\n- [ffmpeg](https://ffmpeg.org/) (required by pydub)\r\n\r\n## Installation\r\n\r\n1. **Clone the repository:**\r\n\r\n   ```bash\r\n   git clone https://github.com/yourusername/whisper-transcriber.git\r\n   cd whisper-transcriber\r\n   ```\r\n\r\n2. 
**Examples:**\r\n\r\n- Send a POST request to the `/audio_transcription` endpoint with any HTTP client.\r\n- Upload your `.wav` file (for example `audio.wav`) as `multipart/form-data` in a field named `file`.\r\n\r\n```py\r\nimport requests\r\n\r\n# audio_file is assumed to be an uploaded file object exposing .name and .content_type.\r\nwith requests.Session() as session:  # Request to the microservice to get the transcription. TODO: implement asynchronous behaviour.\r\n    files = {'file': (audio_file.name, audio_file, audio_file.content_type)}  # Build the multipart/form-data payload.\r\n    response = session.post('http://127.0.0.1:9000/audio_transcription', files=files)\r\n```\r\n\r\nReplace `http://127.0.0.1:9000` with your own server URL; the request path stays `/audio_transcription`.\r\n\r\nThe microservice response is:\r\n```json\r\n{\r\n    \"transcription_text\": \" Hello, I want to test the functionality of my backend. I don't know why I am getting these errors.\",\r\n    \"transcribed_audio\": {\r\n        \"text\": \" Hello, I want to test the functionality of my backend. I don't know why I am getting these errors.\",\r\n        \"segments\": [],\r\n        \"language\": \"en\"\r\n    },\r\n    \"audio_duration\": 7.32\r\n}\r\n```\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feroydev%2Fspeech2text-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feroydev%2Fspeech2text-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feroydev%2Fspeech2text-api/lists"}