https://github.com/ym496/idioms-bcs
Collection of all the idioms used in the TV series Better Call Saul, generated using Gemini 1.5 Flash.
- Host: GitHub
- URL: https://github.com/ym496/idioms-bcs
- Owner: ym496
- License: mit
- Created: 2024-10-20T01:38:39.000Z (12 months ago)
- Default Branch: master
- Last Pushed: 2024-10-21T11:35:12.000Z (12 months ago)
- Last Synced: 2025-02-09T02:17:40.313Z (8 months ago)
- Topics: bash-script, beautifulsoup4, gemini-ai, gemini-api, gemini-flash, generative-ai, python
- Language: Python
- Homepage: https://github.com/ym496/idioms-bcs/tree/master/Idioms
- Size: 813 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# idioms-bcs
A comprehensive collection of idioms used in the TV series *Better Call Saul*, generated with Google's *Gemini 1.5 Flash* LLM. You can start browsing the collection [here](https://github.com/ym496/idioms-bcs/tree/master/Idioms).
# Running
I've provided the scripts used to generate the files, so you can tweak the prompts or make other adjustments as needed.
## File Descriptions
- **`scrape.py`**: This script scrapes the transcripts from *Better Call Saul* and saves them as individual text files for each episode.
- **`query_model.py`**: This file sends the scraped text files to the Gemini model for processing. It takes the filename as an argument and retrieves idioms from the provided transcript.
- **`gen-idioms.sh`**: This shell script iterates through the text files generated by `scrape.py`, passing each one to `query-model.py`. It saves the output idioms in Markdown format in the `Idioms` directory.
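
The three scripts form a simple pipeline: scrape, query, write. As a rough illustration of the loop that `gen-idioms.sh` performs, here is a minimal Python sketch. The `transcripts` directory, the `.txt` extension, the `SxepNN` file stems, and the names `output_path`, `generate_all`, and `extract` are all assumptions made for illustration and are not taken from the repository's code. Only the `Idioms/S<season>/<episode>.md` layout follows the file listing shown in this README. The `extract` callable stands in for the per-file Gemini query.

```python
from pathlib import Path
from typing import Callable


def output_path(transcript: Path, out_root: Path) -> Path:
    """Map e.g. transcripts/S1ep06.txt to Idioms/S1/S1ep06.md (assumed naming)."""
    stem = transcript.stem            # "S1ep06"
    season = stem.split("ep")[0]      # "S1"
    return out_root / season / f"{stem}.md"


def generate_all(src: Path, out_root: Path,
                 extract: Callable[[str], str]) -> list[Path]:
    """Run `extract` on every transcript and write one Markdown file per episode."""
    written = []
    for transcript in sorted(src.glob("*.txt")):
        target = output_path(transcript, out_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        text = transcript.read_text(encoding="utf-8")
        # `extract` is a placeholder for the model call; it returns Markdown.
        target.write_text(extract(text), encoding="utf-8")
        written.append(target)
    return written
```

Passing the model call in as a function keeps the file handling testable without an API key; a stub such as `lambda text: "- placeholder\n"` is enough to check the directory layout.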
## Running the Scripts
* **Clone the repository:**
```bash
git clone git@github.com:ym496/idioms-bcs.git
cd idioms-bcs
```
* **Set up a virtual environment:**
```bash
python3 -m venv .venv
source .venv/bin/activate
```
* **Install required Python packages:**
```bash
pip install -r requirements.txt
```
* **Give executable permissions and run:**
```bash
chmod +x gen-idioms.sh
./gen-idioms.sh
```
Please wait for the script to finish running. You can open another terminal tab to browse the files being created in your `Idioms` directory.
## Empty Files
The files for some episodes are empty because Gemini kept returning an error whenever I passed it those transcripts.
You can check which files are empty by running:
```bash
find ./Idioms -name "*.md" -type f -empty
```
The latest output of this command was:
```
./Idioms/S6/S6ep03.md
./Idioms/S6/S6ep13.md
./Idioms/S6/S6ep08.md
./Idioms/S1/S1ep06.md
./Idioms/S1/S1ep08.md
```
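
If you would rather do the same check from Python, for example to build a list of episodes to retry, the `find` command above can be approximated as follows. This is a minimal sketch; the function name `empty_idiom_files` is hypothetical and not part of the repository.

```python
from pathlib import Path


def empty_idiom_files(root: Path) -> list[Path]:
    """Return every zero-byte .md file under `root`, sorted by path."""
    return sorted(
        p for p in root.rglob("*.md")
        if p.is_file() and p.stat().st_size == 0
    )
```

Calling `empty_idiom_files(Path("Idioms"))` from the repository root should list the same files as the shell command, though in sorted order rather than in directory-traversal order.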