https://github.com/jfdev001/lingua-loop

Full stack, single-page web app for learning languages via transcription.
https://github.com/jfdev001/lingua-loop

fastapi language-learning youtube-api

Last synced: 14 days ago
JSON representation

Full stack, single-page web app for learning languages via transcription.

Host: GitHub
URL: https://github.com/jfdev001/lingua-loop
Owner: jfdev001
License: mit
Created: 2026-03-06T18:46:05.000Z (4 months ago)
Default Branch: main
Last Pushed: 2026-06-07T10:44:12.000Z (16 days ago)
Last Synced: 2026-06-07T12:14:18.042Z (16 days ago)
Topics: fastapi, language-learning, youtube-api
Language: Python
Homepage: https://jfdev001.github.io/lingua-loop/
Size: 5.45 MB
Stars: 3
Watchers: 0
Forks: 0
Open Issues: 10
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

# lingua-loop

> Train your listening skills by transcribing real speech from YouTube videos.

`lingua-loop` is a single-page web application designed to improve language
listening skills through active transcription. It is based on a
transcription-first learning approach where users repeatedly listen to short
audio segments and attempt to write down what they've heard.

## Installation

To run the web application on your computer, you need Python version 3.12 or
higher installed. It is recommended to create a dedicated virtual environment
first before installing `lingua-loop`.

Using conda:

```
conda create --name lingua-loop python=3.12
conda activate lingua-loop
```

Using venv on Windows, if you know your python version is 3.12 or higher,
open `Command Prompt`:

```
python -m venv .venv_lingua_loop
.\.venv_lingua_loop\Scripts\activate
```

Using venv on Linux/macOS, if you know your python version is 3.12 or higher,
open a terminal:

```
python -m venv .venv_lingua_loop
. .venv_lingua_loop/bin/activate
```

Then you can install the package:

```
pip install git+https://github.com/jfdev001/lingua-loop.git
```

For a faster install for users more familiar with Python packages and virtual
environments, you can use the [uv](https://github.com/astral-sh/uv) package
manager, go to section [Installation with uv](#installation-with-uv) for details.

## Usage

You can launch the app, which will automatically open a web browser, with the
following command in a terminal:

```
lingua-loop
```

Here's how you use the app:

1. Select the target language you want to study.
2. Copy and paste a YouTube video URL corresponding to that target language.
Ideally, the video should have **official** transcripts since those will be
used as the reference for scoring your transcription attempt.
3. Listen to short segments of the video (e.g., 10 seconds) and type what you
hear. You may replay each segment as many times as needed.
4. Once you have made an honest attempt, you can submit your transcription
attempt and will receive a score from 0 to 1. A score of 1 indicates a
perfect transcription and a score of 0 indicates a complete mismatch.
5. Repeat as long as desired.

The scoring algorithm uses [Gestalt pattern
matching](https://en.wikipedia.org/wiki/Gestalt_pattern_matching), which
compares your input to the reference transcript by finding the longest common
subsequences of characters. For example, if you type

$$
\text{wikim}\textcolor{red}{an}\text{ia}
$$

but the official transcript has

$$
\text{wikim}\textcolor{red}{ed}\text{ia}
$$

you would receive a score of 0.70 because, despite the error, there are common
characters that you correctly typed.

Note that this package is an alpha state and subject to breaking changes and
bug fixes. Please open a GitHub issue if you encounter problems!

# Future Work

Some possible extension ideas:

1. Significantly improve the styling/UX.
2. Integrate Anki-style flash cards for selected words. A word should be added
to a flash card deck in addition to the context in which it occurred (e.g.,
sentence or phrase). The user *should* write additional context to keep them
actively engaged in the learning process.
3. Support more languages. Currently only Indo-European languages are
supported, though support for languages such as Mandarin would be extremely
valuable. For Mandarin, I think text normalization would require
[pinyin](https://en.wikipedia.org/wiki/Pinyin).
4. More sophisticated scoring algorithms.
5. Integrate an AI chat bot that is given the full transcript as context and
subsequently asks the user comprehension questions or engages in
conversation about the video topic. Could be executed using local models
or combined with API keys.

# Developers

The directory structure is inspired by a number of projects, so it is
intended to be as standard and intuitive as possible:

```
lingua-loop
├── src
│   └── lingua_loop
│   ├── api
│   │   └── routers
│   │   └── transcript.py
│   ├── constants.py
│   ├── db
│   │   ├── models.py
│   │   ├── session.py
│   │   └── transcript.py
│   ├── exceptions.py
│   ├── integrations
│   │   └── youtube
│   │   ├── types.py
│   │   └── wrapper.py
│   ├── main.py
│   ├── schemas
│   │   └── transcript.py
│   ├── scripts
│   │   └── run.py
│   ├── services
│   │   ├── text_normalization.py
│   │   └── transcript.py
│   ├── static
│   │   ├── app.js
│   │   └── style.css
│   └── templates
│   └── index.html
└── tests
├── boundary
│   ├── conftest.py
│   └── test_youtube_transcript_api.py
├── conftest.py
├── constants.py
├── integration
│   └── test_integration.py
└── unit
├── api
│   └── test_api.py
├── db
│   └── test_db.py
└── services
└── test_services.py
```

The fast package manger `uv` was used in the development of this project.
You can install `lingua-loop` in development mode by doing the following:

```shell
git clone git@github.com:jfdev001/lingua-loop.git
cd lingua-loop
uv sync
```

As a sanity check, you should verify the FastAPI backend tests pass:

```shell
pytest -v tests/
```

There are tests to verify the outputs of the `youtube_transcript_api`
dependency, however, these make requests to YouTube, and are therefore *not*
run by default. You can run those tests if you want by including the `--slow`
flag like:

```shell
pytest -v --slow tests/
```

You should also make sure that your files are formatted according to
the conventions defined in `.pre-commit-config.yaml`. To that end, use
the fast git hook runner framework [prek](https://github.com/j178/prek):

```
prek install
```

If you'd like to contribute, consider opening an issue first and/or addressing
one of the issues already available. AI contributions will **not** be accepted.
You should fork the project first before contributing. Pull requests should
come from branches with the following structure:

```
/-
```

The frontend is vanilla JS/HTML/CSS, so using `npm` is not strictly necessary,
but if you'd like type hints for the Youtube iframe API, you can install
the frontend dependencies with:

```
npm install
```

# Extras

## Installation with uv

If you do not already have the `uv` package manager installed, you can go to
Astral's official documentation and follow the [installation
instructions](https://docs.astral.sh/uv/getting-started/installation/).

If you already have a conda base environment and are okay installing
`lingua-loop` there with `uv`, do the following:

```
uv pip install git+https://github.com/jfdev001/lingua-loop.git --system
```

Otherwise, you can follow whichever `uv` workflow you're comfortable with, but
for those familiar with `pip`, the following is the easiest (on Linux/macOS):

```
python -m venv .venv_lingua_loop
. .venv_lingua_loop/bin/activate
uv pip install git+https://github.com/jfdev001/lingua-loop.git
```

For Windows users, you could execute the above commands using `Command Prompt`:

```
python -m venv .venv_lingua_loop
.\.venv_lingua_loop\Scripts\activate
uv pip install git+https://github.com/jfdev001/lingua-loop.git
```

### Docker

You can also run `lingua-loop` with Docker. Build the image:

```
git clone --depth 1 https://github.com/jfdev001/lingua-loop.git
cd lingua-loop
docker build -t lingua-loop .
```

Then run the container:

```
docker run -p 49152:49152 lingua-loop
```

> **Note:** The browser auto-open feature does not work inside a container. Manually open http://localhost:49152 in your browser.

# References

Below are a series of references, in no particular order, that I used to gain
the necessary background knowledge for the project:

[1] [Python FastAPI Tutorial (Part 1): Getting Started - Web App + REST
API](https://www.youtube.com/watch?v=7AMjmCTumuo)

[2] [Learn Fast API With This ONE
Project](https://www.youtube.com/watch?v=SR5NYCdzKkc)

[3] [Anatomy of a Scalable Python Project
(FastAPI)](https://www.youtube.com/watch?v=Af6Zr0tNNdE)

[4] [Developing and Testing an Asynchronous API with FastAPI and
Pytest](https://testdriven.io/blog/fastapi-crud/) &
[github.com/fastapi-crud-async](https://github.com/testdrivenio/fastapi-crud-async)

[5] [Netflix/dispatch](https://github.com/Netflix/dispatch)

[6] [polarsource/polar](https://github.com/polarsource/polar)

[7] [FastAPI: Modern Web Development (Lubanovic
2024)](https://www.oreilly.com/library/view/fastapi/9781098135492/)

[8] [ecmwf/forecast-in-a-box](https://github.com/ecmwf/forecast-in-a-box)

[9] [fastapi-docs: Testing
Dependencies](https://fastapi.tiangolo.com/advanced/testing-dependencies/)

[10] [Build Advanced Youtube Player UI --
Video](https://www.youtube.com/watch?v=lsu-g-_6i_A&list=PLzKme01IAXkLBmWBihSwIhLk8Qluli0Jl&index=2)
and [Build Advanced Youtube Player UI -- Source
Code](https://codepen.io/PixelPerfectLabs/pen/PwwrJge)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/jfdev001/lingua-loop

Awesome Lists containing this project

README