https://github.com/hay/wikiaudify

Generate audio summaries of Wikipedia articles using OpenAI and ElevenLabs
https://github.com/hay/wikiaudify

Last synced: 4 months ago
JSON representation

Generate audio summaries of Wikipedia articles using OpenAI and ElevenLabs

Host: GitHub
URL: https://github.com/hay/wikiaudify
Owner: hay
License: mit
Created: 2024-11-03T21:23:05.000Z (8 months ago)
Default Branch: main
Last Pushed: 2024-11-03T21:35:18.000Z (8 months ago)
Last Synced: 2025-01-13T11:38:21.927Z (6 months ago)
Language: Python
Size: 9.82 MB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

# wikiaudify
> Generate audio summaries of Wikipedia articles using OpenAI and ElevenLabs

By [Hay Kranen](http://www.haykranen.nl)

## Introduction
This was a hackathon project made during the [Wikimedia NL Mini Hackathon 2024](https://nl.wikimedia.org/wiki/Mini_Hackathon_November_2024) to generate audio summaries like the ones from [NotebookLM](https://blog.google/technology/ai/notebooklm-audio-overviews/). Obviously it's not as good as that one, but it makes quite enjoyable fun short audio conversations.

Here's an example of a generated audio summary about the article on [Grilled cheese](https://en.wikipedia.org/wiki/Grilled_cheese).

https://github.com/user-attachments/assets/59149473-146d-4c11-8b6d-c97e1c0eb01d

And you can find a transcription here.

## Install
What you'll need:
* An [OpenAI API key](https://openai.com/api/)
* An [ElevenLabs API key](https://elevenlabs.io/api)
* Python 3.13+ (it probably works with older versions too, but no guarantees)

There is an option to use a local LLM ([like Ollama](https://ollama.com/)) but i didn't get very good results, but you could try to make it work!

To use this script:
1. Clone this repo
```bash
git clone https://github.com/hay/wikiaudify.git
```

2. Make a virtual environment and install the `requirements.txt`
```bash
python -m venv env
source env/bin/activate
pip install -r requirements.txt
```

3. Copy the `example-config.toml` to a new file (e.g. `test.toml`) and fill in your API keys and other details

4. Try running it from the command line like this:
```bash
python generate.py -a "Grilled_Cheese" -q "At what temperature will my cheese melt?" -c test.toml
```

Note that the Wikipedia article you give with the `-a` option should have underscores, e.g. the path in the URL of the article.

By default this will generate two files in the root of this project: a `summary.mp3` containing the summary and a `summary.txt` with a transcription.

## Troubleshooting
If you add the `-v` (verbose) flag `audio2text` will give much more debug information.

## All options
You'll get this when doing `python generate.py -h`

```bash
usage: generate.py [-h] [-a ARTICLE] [-c CONFIG] [-na] [-nt] [-o OUT]
[-ot OUT_TRANSCRIPT] [-q QUESTION] [-v]

Generate an audio summary of a Wikipedia article

options:
-h, --help show this help message and exit
-a, --article ARTICLE
Article you want the audio summary to be about
-c, --config CONFIG Path to a TOML file with configuration
-na, --no-audio Don't generate audio output
-nt, --no-transcript Don't generate an audio transcript
-o, --out OUT Path of output MP3 file
-ot, --out-transcript OUT_TRANSCRIPT
Path of output transcript
-q, --question QUESTION
User question that will be included in the summary
-v, --verbose Show debug information
```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/hay/wikiaudify

Awesome Lists containing this project

README