https://github.com/intothevoid/rss2podcast
Parse, summarise and convert rss feeds into an audio podcast
https://github.com/intothevoid/rss2podcast
ai coqui golang ollama podcast rss tts
Last synced: 8 months ago
JSON representation
Parse, summarise and convert rss feeds into an audio podcast
- Host: GitHub
- URL: https://github.com/intothevoid/rss2podcast
- Owner: intothevoid
- License: mit
- Created: 2024-03-07T11:40:17.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-07T12:17:00.000Z (about 2 years ago)
- Last Synced: 2025-01-31T20:22:35.595Z (over 1 year ago)
- Topics: ai, coqui, golang, ollama, podcast, rss, tts
- Language: Go
- Homepage:
- Size: 1.13 MB
- Stars: 9
- Watchers: 2
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# rss2podcast

A locally hosted, AI generated podcast from an rss feed.





Powered by -

## Features
- RSS feed parsing and article extraction
- Article summarization using Ollama
- Text-to-speech conversion using multiple engines:
- Kokoro TTS (Recommended)
- MLX Audio TTS
- Coqui TTS
- Podcast generation with customizable settings
- Web interface for configuration and control
## Requirements
- Go 1.21 or later
- Ollama (for article summarization)
- One of the following TTS engines:
- Kokoro TTS (recommended)
- MLX Audio TTS
- Coqui TTS
## Installation
1. Clone the repository:
```bash
git clone https://github.com/intothevoid/rss2podcast.git
cd rss2podcast
```
2. Install dependencies:
```bash
go mod download
```
3. Configure the application by editing `config.yaml` or using the web interface.
## Configuration
The application can be configured using the web interface or by editing the `config.yaml` file. The following settings are available:
### RSS Settings
- `url`: The RSS feed URL to parse
- `max_articles`: Maximum number of articles to process
- `filters`: List of filters to apply to articles
### Ollama Settings
- `end_point`: The Ollama API endpoint
- `model`: The Ollama model to use for summarization
### Podcast Settings
- `subject`: The podcast subject
- `podcaster`: The podcaster name
### TTS Settings
- `engine`: The TTS engine to use ("kokoro", "mlx", or "coqui")
- `kokoro`: Kokoro TTS settings
- `url`: The Kokoro TTS API endpoint
- `voice`: The voice to use
- `speed`: The speech speed (0.25 to 4.0)
- `format`: The audio format (mp3, opus, flac, wav, pcm)
- `mlx`: MLX Audio TTS settings
- `url`: The MLX Audio TTS API endpoint
- `voice`: The voice to use
- `speed`: The speech speed (0.5 to 2.0)
- `format`: The audio format (mp3, wav)
- `coqui`: Coqui TTS settings
- `url`: The Coqui TTS API endpoint
## Usage
1. Start the application:
```bash
go run cmd/rss2podcast/main.go
```
2. Access the web interface at `http://localhost:8080`
3. Configure the application using the web interface or edit `config.yaml`
4. The application will:
- Parse the RSS feed
- Extract and summarize articles
- Convert the summary to audio using the selected TTS engine
- Generate a podcast file
## TTS Engines
### Kokoro TTS (Recommended)
Kokoro TTS offers OpenAI-compatible speech synthesis with support for multiple voices and formats. It provides excellent quality with low latency.
### MLX Audio TTS
MLX Audio TTS is a powerful text-to-speech engine that provides high-quality speech synthesis with support for multiple voices and formats. It offers additional features like direct audio playback and output folder management.
### Coqui TTS
Coqui TTS provides high-quality speech synthesis with support for multiple voices and formats.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- [Ollama](https://ollama.ai/) for the LLM API
- [Kokoro TTS](https://github.com/kokoro-tts/kokoro) for the TTS engine
- [MLX Audio TTS](https://github.com/mlx-audio/mlx-tts) for the TTS engine
- [Coqui TTS](https://github.com/coqui-ai/TTS) for the TTS engine
## How it works
The application reads an rss feed, extracts the articles and summarises them.
RSS + Ollama + TTS = Podcast
### RSS
The application reads an rss feed and extracts the articles. Each of these articles are then processed by scraping the article content.
### Ollama
The application uses a locally hosted version of Ollama. The Ollama API is used to summarise the article content. Default model used is mistral:7b
### TTS
The summarised article content is then converted into an audio podcast using the Coqui TTS API.
## Dependencies
This project requires the following dependencies to be installed on your system.
### Ollama
You can install the Ollama server by following the instructions on the [official website](https://ollama.com).
Ollama needs to be running on your local machine for the application to work. The application is configured to use the default Ollama server URL `http://localhost:11434/api/generate`. This can be changed via the config.yaml file.
### ffmpeg
`ffmpeg` is a command-line tool for handling multimedia files. It is used to convert the generated audio files to the MP3 format.
#### macOS
You can use Homebrew to install `ffmpeg` on macOS:
```bash
brew install ffmpeg
```
#### Windows
1. Download the `ffmpeg` build for Windows from the [official website](https://ffmpeg.org/download.html).
2. Extract the downloaded ZIP file.
3. Add the `bin` directory from the extracted folder to your system's PATH.
#### Linux
The installation command depends on your Linux distribution.
##### Ubuntu/Debian
```bash
sudo apt update
sudo apt install ffmpeg
```
### Kokoro TTS (Recommended)
Kokoro TTS is a text-to-speech synthesis system that uses deep learning to create human-like speech from text. You can install the Kokoro TTS server by following the instructions on the [official website](https://github.com/nazdridoy/kokoro-tts).
#### Docker:
Create a docker-compose.yml file and add the following:
```yaml
services:
kokoro-fastapi-cpu:
ports:
- 8880:8880
image: ghcr.io/remsky/kokoro-fastapi-cpu:latest # or v0.2.3 for last stable version
```
Start the server by running the following command:
```bash
docker compose up -d
```
This will start the Kokoro TTS server on port 8880. The server provides a REST API for text-to-speech conversion.
### Coqui TTS
Coqui TTS is a text-to-speech synthesis system that uses deep learning to create human-like speech from text. You can install the Coqui TTS server by following the instructions on the [official website](https://coqui.ai/tts).
#### Docker
Start the container by using the following command:
```bash
docker run -d -p 5002:5002 --platform linux/amd64 --entrypoint /usr/local/bin/tts-server ghcr.io/coqui-ai/tts-cpu --model_name tts_models/en/ljspeech/vits
```
### MLX Audio TTS
MLX Audio TTS is a text-to-speech synthesis system that uses deep learning to create human-like speech from text. You can install the MLX Audio TTS server by following the instructions on the [official website](https://github.com/mlx-audio/mlx-tts).
#### Docker
As of this writing, MLX Audio TTS needs to be run locally as Docker does not allow GPU access on Apple Silicon.
```bash
# Install the package
pip install mlx-audio
# Create a virtual environment
python -m venv venv
# Activate the virtual environment
source venv/bin/activate
# Install the dependencies
pip install -r requirements.txt
# Run the server
mlx_audio.server
```
rss2podcast will automatically request the MLX Audio TTS server to generate the audio file.
## Testing
To run the tests, use the following command:
```bash
go test ./...
```