https://github.com/akspa0/ComfyUI-FapMixPlus


Host: GitHub
URL: https://github.com/akspa0/ComfyUI-FapMixPlus
Owner: akspa0
Created: 2024-11-11T02:45:18.000Z (8 months ago)
Default Branch: main
Last Pushed: 2024-11-11T02:59:10.000Z (8 months ago)
Last Synced: 2024-11-11T03:29:09.365Z (8 months ago)
Language: Python
Size: 0 Bytes
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

awesome-comfyui - **ComfyUI-FapMixPlus** - mixed outputs with optional audio appended to the end. The script organizes processed files into structured folders with sanitized filenames and retains original timestamps for continuity. (All Workflows Sorted by GitHub Stars)

README

# preFapMix.py

**preFapMix.py** is an audio processing script that applies soft limiting, optional loudness normalization, and optional slicing for transcription. It can also produce stereo-mixed outputs with optional audio appended to the end. The script organizes processed files into structured folders with sanitized filenames and retains original timestamps for continuity.

## Features

1. **Soft Limiting**: Reduces loud peaks in audio to prevent clipping.
2. **Optional Loudness Normalization**: Adjusts audio levels to achieve consistent loudness.
3. **Conditional Slicing and Transcription**: Options to slice and transcribe files in the left or right channels separately, or both channels together.
4. **Stereo Mixing with Optional Tone Appending**: Optionally appends a custom tone (`tones.wav`) to the end of stereo-mixed audio.
5. **Organized Output Structure**: Outputs are saved in structured folders with sanitized filenames.
6. **Timestamp Preservation**: Maintains the original timestamps for all output files.
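
As a rough illustration of timestamp preservation, the sketch below re-applies a source file's access and modification times to an already-written output file. It assumes "timestamps" means filesystem times; the helper name `apply_source_timestamps` and the file names are hypothetical and not taken from the script.

```python
import os
import tempfile
from pathlib import Path


def apply_source_timestamps(src: Path, dst: Path) -> None:
    """Give dst the same access and modification times as src."""
    stat = os.stat(src)
    # ns= keeps nanosecond precision where the filesystem supports it.
    os.utime(dst, ns=(stat.st_atime_ns, stat.st_mtime_ns))


# Usage: a throwaway source file with an artificially old timestamp.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "input.wav"        # hypothetical file name
    source.write_bytes(b"\x00" * 16)
    os.utime(source, (1_600_000_000, 1_600_000_000))

    target = Path(tmp) / "processed.wav"    # stands in for a processed output
    target.write_bytes(b"\x00" * 16)        # freshly written, so mtime is "now"
    apply_source_timestamps(source, target)
    preserved_mtime = int(os.stat(target).st_mtime)

print(preserved_mtime)
```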

## Installation Requirements

- **Python 3.x**
- **Pydub** for audio processing
```bash
pip install pydub
```
- **FFmpeg**: Required by Pydub for handling audio files
```bash
sudo apt-get install ffmpeg
```
- **fap**: The transcription tool, assumed to be installed and accessible via the command line.

## Usage

### Command Line

Run the script from the command line with the following arguments:

```bash
python preFapMix.py --input-dir INPUT_DIR --output-dir OUTPUT_DIR [options]
```

### Options

- **`--input-dir`**: Directory containing input audio files (required).
- **`--output-dir`**: Directory where processed files will be saved (required).
- **`--transcribe`**: Enables transcription for both left and right channels. Implies both `--transcribe_left` and `--transcribe_right`.
- **`--transcribe_left`**: Enables transcription only for the left channel.
- **`--transcribe_right`**: Enables transcription only for the right channel.
- **`--normalize`**: Enables loudness normalization on the audio.
- **`--tones`**: Appends the contents of `tones.wav` to the end of each stereo output file.
- **`--num-workers`**: Specifies the number of workers to use for transcription (default is 2).
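
The options above could be declared with `argparse` roughly as follows. This is a minimal sketch built only from the option list, not the script's actual parser; the function names `build_parser` and `parse_args` are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="preFapMix.py")
    parser.add_argument("--input-dir", required=True,
                        help="Directory containing input audio files")
    parser.add_argument("--output-dir", required=True,
                        help="Directory where processed files will be saved")
    parser.add_argument("--transcribe", action="store_true",
                        help="Transcribe both left and right channels")
    parser.add_argument("--transcribe_left", action="store_true",
                        help="Transcribe only the left channel")
    parser.add_argument("--transcribe_right", action="store_true",
                        help="Transcribe only the right channel")
    parser.add_argument("--normalize", action="store_true",
                        help="Enable loudness normalization")
    parser.add_argument("--tones", action="store_true",
                        help="Append tones.wav to each stereo output")
    parser.add_argument("--num-workers", type=int, default=2,
                        help="Number of transcription workers")
    return parser


def parse_args(argv):
    args = build_parser().parse_args(argv)
    # --transcribe implies both per-channel flags.
    if args.transcribe:
        args.transcribe_left = True
        args.transcribe_right = True
    return args


# Usage: parse an explicit argument list instead of sys.argv.
args = parse_args(["--input-dir", "in", "--output-dir", "out", "--transcribe"])
print(args.transcribe_left, args.transcribe_right, args.num_workers)
```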

### Workflow

1. **Pre-Processing**:
   - Applies a soft limiter at -6 dB to control peaks.
   - If `--normalize` is enabled, normalizes loudness to -23 LUFS for consistency.

2. **Conditional Slicing and Transcription**:
   - If `--transcribe` is enabled, slices audio files into smaller segments and transcribes each segment, generating `.lab` files.
   - With `--transcribe_left` or `--transcribe_right`, transcribes only files in the left or right folders, respectively.

3. **Stereo Mixing with Optional Tone Appending**:
   - Combines left and right channels into a stereo file.
   - If `--tones` is enabled, appends `tones.wav` to the end of each stereo file.

4. **File Naming and Organization**:
   - Names each sliced audio file with its original numeric name, followed by the first 12 words (or fewer) from its `.lab` file.
   - All filenames are sanitized for UTF-8 compliance.
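
To make the soft-limiting step concrete, here is one possible soft limiter in plain Python. It is a sketch under assumptions, not the script's implementation: samples are floats in the range -1.0 to 1.0, -6 dB is treated as the knee threshold relative to full scale, and a `tanh` curve is used for the region above the knee. The names `db_to_linear` and `soft_limit` are hypothetical.

```python
import math


def db_to_linear(db: float) -> float:
    """Convert a level in dB (relative to full scale) to a linear amplitude."""
    return 10.0 ** (db / 20.0)


def soft_limit(samples, threshold_db: float = -6.0):
    """Leave samples below the threshold untouched and smoothly compress the rest.

    Assumes float samples where 1.0 is full scale. Above the threshold, the
    overshoot is squeezed into the remaining headroom with tanh, so the output
    magnitude stays below 1.0.
    """
    threshold = db_to_linear(threshold_db)
    headroom = 1.0 - threshold
    limited = []
    for sample in samples:
        magnitude = abs(sample)
        if magnitude <= threshold:
            limited.append(sample)
        else:
            squeezed = threshold + headroom * math.tanh((magnitude - threshold) / headroom)
            limited.append(math.copysign(squeezed, sample))
    return limited


# Usage: a quiet sample passes through, loud peaks are pulled below full scale.
print(soft_limit([0.25, 1.0, -1.0]))
```

The curve is continuous at the threshold, which avoids the hard corner that plain clipping would introduce.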

### Output Structure

The output structure is organized within `/run_` as follows:

- **`normalized/`**: Contains normalized versions of the input audio files.
- **`left/`** and **`right/`**: Contain sliced (and optionally transcribed) audio files for the respective channel.
- **`stereo/`**: Contains stereo-mixed files, with the optional tone appended to the end.
- **`transcribed-and-sliced/`**:
  - Root: Contains combined `.lab` files for each original input.
  - **`left/`** and **`right/`**: Contain subfolders of sliced audio files and their corresponding `.lab` files.

### Example Command

```bash
python preFapMix.py --input-dir ./my_audio_files --output-dir ./processed_audio --transcribe --normalize --tones --num-workers 3
```

This command will:
1. Process the audio files in `./my_audio_files` with soft limiting and loudness normalization.
2. Slice and transcribe each file in the left and right channels.
3. Mix each pair of left and right channels into a stereo file and append `tones.wav` to the end of each stereo output.
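
The stereo mixing and tone appending in step 3 can be pictured with a small sample-level sketch. It assumes two mono sample sequences, pads the shorter channel with silence, and appends a mono tone to both channels; the script itself may handle length mismatches or the tone differently. The function name `mix_to_stereo` is hypothetical.

```python
def mix_to_stereo(left, right, tone=None):
    """Pair mono samples into (left, right) frames and optionally append a tone.

    The shorter channel is padded with silence (0). The tone is assumed to be
    mono and is written to both channels.
    """
    length = max(len(left), len(right))
    frames = []
    for i in range(length):
        left_sample = left[i] if i < len(left) else 0
        right_sample = right[i] if i < len(right) else 0
        frames.append((left_sample, right_sample))
    if tone:
        frames.extend((sample, sample) for sample in tone)
    return frames


# Usage: the right channel is one sample shorter, and a one-sample tone is appended.
print(mix_to_stereo([1, 2, 3], [4, 5], tone=[9]))
```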

# fapMixPlus

This project provides an end-to-end audio processing pipeline to automate the extraction, separation, slicing, transcription, and renaming of audio files. The resulting files are saved in a structured output directory with cleaned filenames and optional ZIP archives for easier distribution or storage.

## Features

- **Download Audio**: Fetches audio files from a URL or uses local input files.
- **Convert to WAV**: Converts audio files to WAV format.
- **Separate Vocals**: Isolates vocal tracks from the WAV files.
- **Slice Audio**: Segments the separated vocal track for transcription.
- **Transcribe**: Generates transcriptions from audio slices.
- **Sanitize and Rename Files**: Creates sanitized filenames with a numerical prefix, limited to 128 characters.
- **Generate ZIP Files**: Compresses processed files into ZIP archives for easy storage and distribution.

## Prerequisites

- **Python 3.x**
- Install required Python packages:
```bash
pip install yt-dlp
```
- **Fish Audio Preprocessor (`fap`)** should be installed and available in the PATH.

### Installing the Fish Audio Preprocessor (`fap`)

1. Clone the [Fish Audio Preprocessor repository](https://github.com/fishaudio/audio-preprocess):
```bash
git clone https://github.com/fishaudio/audio-preprocess.git
```

2. Navigate to the repository directory:
```bash
cd audio-preprocess
```

3. Install the package from the cloned repository:
```bash
pip install -e .
```

This step installs `fap` and makes it accessible as a command-line tool, which is essential for `fapMixPlus.py` to function correctly.

4. Verify the installation by checking the version:
```bash
fap --version
```
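
Availability of the command-line tools can also be checked from Python before starting a run. The sketch below only looks the commands up on the PATH with `shutil.which`; the helper name `missing_tools` is illustrative and not part of `fapMixPlus.py`.

```python
import shutil


def missing_tools(names):
    """Return the command names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


# Usage: check the tools listed under Prerequisites.
missing = missing_tools(["fap", "yt-dlp"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required tools found.")
```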

## Usage

### Command-line Arguments

| Argument | Description |
|-----------------|----------------------------------------------------------------------|
| `--url` | URL of the audio source (YouTube or other supported link). |
| `--output_dir` | Directory for saving all outputs. Default is `output/`. |
| `input_dir` | Path to a local directory of input files (optional if `--url` used). |

### Example Command

```bash
python fapMixPlus.py --url https://youtu.be/example_video --output_dir my_output
```

This command will download the audio from the URL, process it, and save the results in the `my_output` folder.
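
For orientation, a download step that yields `.m4a` audio could be built on the `yt-dlp` command line as sketched below. This is an assumption about how such a step might look, not a description of what `fapMixPlus.py` runs: the script may use different options or the `yt-dlp` Python API. The function name `build_download_command` and the output template are hypothetical.

```python
import os


def build_download_command(url, download_dir):
    """Build a yt-dlp command that extracts audio and converts it to m4a."""
    # yt-dlp output template: video title plus the final file extension.
    output_template = os.path.join(download_dir, "%(title)s.%(ext)s")
    return [
        "yt-dlp",
        "-x",                      # extract audio only
        "--audio-format", "m4a",   # convert the extracted audio to m4a
        "-o", output_template,
        url,
    ]


# Usage: the command list can be passed to subprocess.run(command, check=True).
command = build_download_command("https://youtu.be/example_video", "my_output")
print(command)
```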

### Output Structure

The output directory will contain a timestamped folder with the following structure:

```
output_/
├── wav_conversion/ # WAV-converted audio files
├── separation_output/ # Separated vocal track files
├── slicing_output/ # Sliced segments from separated audio
├── final_output/ # Final, sanitized, and renamed .wav and .lab files
└── zip_files/ # Compressed ZIP archives of processed files
```
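
A layout like the one above can be created with `pathlib`. In this sketch the run folder is supplied by the caller, because the exact naming of the timestamped folder is not shown here; `create_run_dirs` and `STAGE_DIRS` are illustrative names.

```python
import tempfile
from pathlib import Path

# Stage folders as listed in the output structure.
STAGE_DIRS = [
    "wav_conversion",
    "separation_output",
    "slicing_output",
    "final_output",
    "zip_files",
]


def create_run_dirs(run_dir):
    """Create one subfolder per processing stage and return them by name."""
    run_dir = Path(run_dir)
    paths = {}
    for name in STAGE_DIRS:
        path = run_dir / name
        path.mkdir(parents=True, exist_ok=True)
        paths[name] = path
    return paths


# Usage: build the layout inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    stage_paths = create_run_dirs(Path(tmp) / "example_run")  # hypothetical folder name
    created = sorted(p.name for p in (Path(tmp) / "example_run").iterdir())

print(created)
```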

### ZIP File Details

In addition to organizing output files by processing stages, `fapMixPlus` can generate ZIP archives for convenience. Each ZIP file in the `zip_files/` directory will contain a set of processed audio and transcription files, with names based on their content and timestamp. The ZIP filenames will follow this format:

```
output_.zip
```

Each ZIP file includes the `.wav` and `.lab` files from `final_output/`, with their sanitized filenames, so an archive can be transferred or stored as a single unit.
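
The archiving step can be approximated with the standard `zipfile` module. This is a sketch of the idea rather than the project's code: it collects the `.wav` and `.lab` files from a folder and writes them to a single archive. The function name `zip_final_output` and the archive name `session.zip` are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_final_output(final_dir, zip_path):
    """Write every .wav and .lab file in final_dir into one ZIP archive."""
    final_dir = Path(final_dir)
    zip_path = Path(zip_path)
    zip_path.parent.mkdir(parents=True, exist_ok=True)
    members = sorted(
        p for p in final_dir.iterdir() if p.suffix.lower() in {".wav", ".lab"}
    )
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in members:
            # arcname stores only the file name, not the full directory path.
            archive.write(path, arcname=path.name)
    return [p.name for p in members]


# Usage: unrelated files such as notes.txt are left out of the archive.
with tempfile.TemporaryDirectory() as tmp:
    final_dir = Path(tmp) / "final_output"
    final_dir.mkdir()
    for name in ("0001_Hello.wav", "0001_Hello.lab", "notes.txt"):
        (final_dir / name).write_bytes(b"demo")
    zip_path = Path(tmp) / "zip_files" / "session.zip"
    zip_final_output(final_dir, zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        archived_names = archive.namelist()

print(archived_names)
```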

## Functionality Details

1. **Download Audio**: Downloads audio from a URL, saving it in `.m4a` format.
2. **WAV Conversion**: Converts audio to WAV using `fap to-wav`.
3. **Separation**: Separates vocals from the WAV files using `fap separate`.
4. **Slicing**: Segments the separated vocal track into smaller audio slices.
5. **Transcription**: Uses `fap transcribe` to transcribe each slice.
6. **Sanitization and Renaming**:
   - Extracts the first 10 words from each `.lab` file.
   - Replaces spaces with underscores, removes special characters, and limits the name to 128 characters.
   - Falls back to the numerical prefix alone if the `.lab` file has no usable content.
7. **ZIP File Creation**:
   - After processing, compresses the final `.wav` and `.lab` files into ZIP archives in `zip_files/` for each session, making the output easy to organize or share.

## Example File Names in Final Output

Final output files in `final_output` will be structured like:
- `0001_Hello_this_is_a_sample_transcription.wav`
- `0001_Hello_this_is_a_sample_transcription.lab`

Files without usable `.lab` content will retain the numerical prefix, e.g., `0002.wav` and `0002.lab`.
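
The naming rules can be sketched as a small function. Several details are assumptions: "special characters" is interpreted as anything other than letters, digits, and underscores, the 128-character limit is applied to the name without its extension, and the prefix is zero-padded to four digits as in the examples. The function name `make_final_stem` is hypothetical.

```python
import re


def make_final_stem(index, transcript, max_words=10, max_len=128):
    """Build a file name (without extension) from a slice number and its transcript."""
    prefix = f"{index:04d}"
    words = transcript.split()[:max_words]
    text = "_".join(words)
    text = re.sub(r"[^\w]", "", text)           # drop everything except letters, digits, _
    text = re.sub(r"_+", "_", text).strip("_")  # collapse and trim leftover underscores
    if not text:
        return prefix                           # no usable transcript: prefix only
    return f"{prefix}_{text}"[:max_len].rstrip("_")


# Usage: one slice with a transcript and one with an empty .lab file.
print(make_final_stem(1, "Hello this is a sample transcription"))
print(make_final_stem(2, ""))
```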
