https://github.com/starkdmi/mossformer_se_mlx
MossFormer2 speech enhancement in MLX (Python and Swift)
audio mlx speech-enhancement speech-separation
Last synced: 8 months ago
- Host: GitHub
- URL: https://github.com/starkdmi/mossformer_se_mlx
- Owner: starkdmi
- License: apache-2.0
- Created: 2025-10-01T15:52:10.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-10-22T11:46:27.000Z (8 months ago)
- Last Synced: 2025-10-22T13:28:44.988Z (8 months ago)
- Topics: audio, mlx, speech-enhancement, speech-separation
- Language: Swift
- Homepage:
- Size: 92.8 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# MossFormer2 Speech Enhancement
Speech enhancement models for extracting clean speech from noisy audio, running on MLX.
## Usage
### Python
```bash
cd python
pip install -r requirements.txt
python generate.py --input noisy.wav --output clean.wav --precision fp32
```
### Swift
```bash
cd swift
xcodebuild build -scheme generate -configuration Release -destination 'platform=macOS' -derivedDataPath .build/DerivedData -quiet
.build/DerivedData/Build/Products/Release/generate noisy.wav --precision fp32
```
## Performance
| Framework | Speed (× faster than real time) |
| ---------- | --------------------------- |
| Python MLX | **25×** |
| Swift MLX | **30×** |
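
The speed figures above read as a ratio of audio duration to processing time. As a rough sketch of how such a number could be computed (the helper names `wav_duration_seconds` and `speed_factor` are hypothetical and not part of this repository), assuming a PCM WAV input:

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def speed_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many times faster than real time the processing ran.

    A value of 25.0 means one second of audio took 1/25 s to process.
    """
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds
```

Under this reading, a 60-second clip enhanced in about 2.4 seconds corresponds to roughly 25×. The measured factor will likely vary with hardware, clip length, and precision.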
## Models
| Precision | Model Size |
| --------- | ---------- |
| [FP32](https://huggingface.co/starkdmi/MossFormer2_SE_48K_MLX/resolve/main/model_fp32.safetensors) | **221 MB** |
| [FP16](https://huggingface.co/starkdmi/MossFormer2_SE_48K_MLX/resolve/main/model_fp16.safetensors) | **111 MB** |
HuggingFace: [starkdmi/MossFormer2_SE_48K_MLX](https://huggingface.co/starkdmi/MossFormer2_SE_48K_MLX)
Source: [ClearerVoice-Studio](https://github.com/modelscope/ClearerVoice-Studio)
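
The two file sizes in the table are consistent with the same weights stored at 4 and 2 bytes per parameter. A back-of-the-envelope sketch, assuming 1 MB = 10^6 bytes and ignoring the safetensors header and any non-weight data (the function name `approx_param_count` is hypothetical):

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def approx_param_count(size_mb: float, precision: str) -> float:
    """Estimate the parameter count from a weight file size.

    Assumes 1 MB = 1_000_000 bytes and that the file holds only weights.
    """
    return size_mb * 1_000_000 / BYTES_PER_PARAM[precision]


fp32_estimate = approx_param_count(221, "fp32")
fp16_estimate = approx_param_count(111, "fp16")
print(f"FP32 estimate: {fp32_estimate / 1e6:.2f}M parameters")
print(f"FP16 estimate: {fp16_estimate / 1e6:.2f}M parameters")
```

Both estimates come out at roughly 55 million parameters, which would explain why the FP16 file is about half the size of the FP32 one. This is an estimate only; the exact count is not stated here.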
## License
See [LICENSE](LICENSE).