https://github.com/jiahaoxiang2000/cvoice

Cvoice is a tool for voice recognition and synthesis. It can change the audio of a video clip from one person's voice to another's.


Host: GitHub
URL: https://github.com/jiahaoxiang2000/cvoice
Owner: jiahaoxiang2000
Created: 2025-01-28T12:46:40.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-02-20T03:04:25.000Z (over 1 year ago)
Last Synced: 2025-02-20T03:27:20.465Z (over 1 year ago)
Topics: audio-processing, llm-application
Language: Python
Homepage:
Size: 32.2 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: readme.md


# cvoice

Cvoice is a tool for voice recognition and synthesis. It can change the audio of a video clip from one person's voice to another's.

## Dependencies

- ffmpeg

- Python packages: `pip install -r requirements.txt`
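
Because ffmpeg is an external program rather than a Python package, `pip` will not install it. The following is a minimal sketch of a pre-flight check; the helper name `ffmpeg_available` is hypothetical and not part of the cvoice code.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on PATH."""
    # shutil.which returns the full path of the executable, or None.
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg missing: install it before running cvoice")
```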

## Issues

- The DeepSeek platform API website is closed, so we have stopped model development (as of 2025-01-30).

## TODO

- [x] Replace `print` calls with the logging system.

- [ ] Replace the offline model with an online model.

  - [x] Use the DeepSeek R1 model to improve the **accuracy** of the .srt file.

- [x] Let the CLI and its arguments run a single small function, such as text to audio or audio to text.

- [ ] Optimize the running logic so that the resulting video is more accurate.

## Usage

Voice recognition example. By default, it writes the .srt file to the `data` folder.

```shell
python cli.py transcribe --input data/extracted_audio.wav
```
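
To inspect the generated subtitles from Python, a small parser for the standard SubRip (.srt) layout can help. This is a sketch only: `Cue` and `parse_srt` are hypothetical names, not part of cvoice, and the sample text is made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: int
    start: float  # seconds
    end: float    # seconds
    text: str


# SRT timestamps look like 00:01:02,345 (hours:minutes:seconds,milliseconds).
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp: str) -> float:
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(content: str) -> list[Cue]:
    """Parse SRT text into a list of cues."""
    cues = []
    # Drop a possible byte-order mark and normalize line endings.
    content = content.lstrip("\ufeff").replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content):
        lines = block.split("\n")
        if len(lines) < 2:
            continue  # skip empty or malformed blocks
        start, end = lines[1].split("-->")
        cues.append(
            Cue(int(lines[0]), _to_seconds(start), _to_seconds(end), "\n".join(lines[2:]))
        )
    return cues


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:02,500\nHello\n"
    for cue in parse_srt(sample):
        print(cue.index, cue.start, cue.end, cue.text)
```

To use it on real output, read the .srt file from the `data` folder and pass its text to `parse_srt`.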

## How it works

- First, extract the audio track from the video.

- Then convert the audio to text.

- Next, refine the text to make it more accurate.

- Then convert the refined text back to audio.

- Finally, merge the new audio into the video.
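
The first and last steps above can be sketched with ffmpeg, which is already listed as a dependency. This is an illustrative sketch, not the project's actual implementation: the function names, file paths, and the 16 kHz mono WAV settings are assumptions. The middle steps (speech to text, text refinement, text to speech) depend on the chosen models and are left out.

```python
import subprocess
from pathlib import Path


def extract_audio_cmd(video: Path, wav: Path) -> list[str]:
    """Build an ffmpeg command that writes the video's audio as mono 16 kHz WAV."""
    return [
        "ffmpeg", "-y",          # overwrite the output if it exists
        "-i", str(video),
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # 16-bit PCM audio
        "-ar", "16000",          # sample rate (assumed, common for speech models)
        "-ac", "1",              # mono
        str(wav),
    ]


def merge_audio_cmd(video: Path, new_audio: Path, out: Path) -> list[str]:
    """Build an ffmpeg command that replaces the video's audio track."""
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        "-i", str(new_audio),
        "-map", "0:v:0",         # video from the first input
        "-map", "1:a:0",         # audio from the second input
        "-c:v", "copy",          # keep the video stream without re-encoding
        "-shortest",             # stop at the shorter of the two streams
        str(out),
    ]


def run(cmd: list[str]) -> None:
    """Run a command and raise CalledProcessError if it fails."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths; printing the commands avoids requiring real files.
    print(extract_audio_cmd(Path("data/input.mp4"), Path("data/extracted_audio.wav")))
    print(merge_audio_cmd(Path("data/input.mp4"), Path("data/new_audio.wav"), Path("data/output.mp4")))
```

Building the commands as lists and passing them to `subprocess.run` avoids shell quoting problems with file names that contain spaces.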
