Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/nedlir/languagepod101-scraper

Python scraper for Language Pods such as Japanesepod101.com :japanese_ogre: :japan: :sushi: Compatible with Japanese, Chinese, French, German, Italian, Korean, Portuguese, Russian, Spanish and many more! ✨
https://github.com/nedlir/languagepod101-scraper

beautifulsoup chinese-language course download japanese japanese-language japanesepod japanesepod101 jpod101 language language-learning learn learn-chinese learn-japanese learn-spanish podcast python requests scraping spanish-language

Last synced: 23 days ago
JSON representation

Python scraper for Language Pods such as Japanesepod101.com :japanese_ogre: :japan: :sushi: Compatible with Japanese, Chinese, French, German, Italian, Korean, Portuguese, Russian, Spanish and many more! ✨

Lists

README

        

:zap: languagepod101-scraper :zap:



Language selection


languagepod101-scraper is a resource for dozen of language learning courses and study material for FREE.



## :mortar_board: About

languagepod101-scraper helps you download full language courses and save them to a local directory.
The courses are produced and distributed by [Innovative Language](https://www.innovativelanguage.com/online-language-courses),
who provides language learning courses from a selection of dozens of languages. Each lesson is usually 10-20 minutes long.

To get started, [choose one of the languages courses](https://www.innovativelanguage.com/online-language-courses)
offered by Innovative Language and create a free account.

## :pushpin: Usage

To use the script, fulfill the requirements and follow the example as demonstrated below.

### :electric_plug: Requirements

- Download and install [Python 3.9+](https://www.python.org/).
- Install required packages from [requirements.txt](requirements.txt) file using
[pip](https://packaging.python.org/tutorials/installing-packages/).

```sh
pip install -r requirements.txt
```

### :bookmark_tabs: Example

For the sake of example, the process of downloading of a course from
[Japanese Pod 101](https://www.japanesepod101.com/) will be demonstrated.

Japanese Pod 101 and all other sites have a similar structure which looks as following:

```
Japanesepod101
├─ Level 1 - Absolute Beginner
│ ├─ Newbie Season 1
│ │ ├─ lesson 01
│ │ ├─ lesson 02
│ │ ├─ lesson 03
│ │ ├─ ...
│ ├─ Newbie Season 2
│ ├─ ...
├─ Level 2 - Beginner
│ ├─ Lower Beginner Season 1
│ │ ├─ lesson 01
│ │ ├─ lesson 02
│ │ ├─ lesson 03
│ │ ├─ ...
│ ├─ ...
├─ Level 3 - Intermediate
│ ├─ ...
│ │ ├─ ...
│ │ ├─ ...
│ ├─ ...
│ ├─ ...
├─ Level 4 - Upper Intermediate
│ ├─ ...
├─ Level 5 - Advanced
│ ├─ ...
```

- To download *Lower Beginner Season 1* we will have to use our web browser to navigate
to `lesson 1` of this course (any other lesson url from the **same course** is ok too...).

Navigation would look like this: `Japanesepod101` → `Level 2 - Beginner` → `Lower Beginner Season 1` → `lesson 01`.

Save the URL for `lesson 01` from the address bar, as you will have to provide it to the script later on.

- Create a directory in your PC for this course, and enter into it.

- Run the [language101_scraper.py](language101_scraper.py) script, and follow the instructions.
You will have to provide:

- the email you used to sign up for the course
- your password for the course
- the course's lesson URL you have navigated through earlier
(in our example: `lesson 01` of the `Lower Beginner Season 1` course).

- Alternatively, you can pass the data as parameters when invoking the script:

```sh
./language101_scraper.py -u $USERNAME -p $PASSWORD --url YOUR_LESSON_URL
```

- The script will start downloading the MP3/MP4/M4V files into the local navigated folder.
Any possible errors would be printed out.

- Output inside folder should look like this:

```
├─01 - A Formal Japanese Introduction - JapanesePod101 - Dialogue.mp3
├─01 - A Formal Japanese Introduction - JapanesePod101 - Review.mp3
├─01 - A Formal Japanese Introduction - JapanesePod101 - Main Lesson.mp3
├─02 - Which Famous Tokyo Tower is That - JapanesePod101 - Dialogue.mp3
├─02 - Which Famous Tokyo Tower is That - JapanesePod101 - Main Lesson.mp3
├─02 - Which Famous Tokyo Tower is That - JapanesePod101 - Review.mp3
├─03 - Networking in Japan - JapanesePod101 - Dialogue.mp3
├─03 - Networking in Japan - JapanesePod101 - Main Lesson.mp3
├─03 - Networking in Japan - JapanesePod101 - Review.mp3
├─...
```

## :clipboard: Disclaimer and known issues

- Any usage of the script is under user's responsibility only. Users of the script must act according to site's terms.

- As of today, Innovative Language's terms of use does not forbid usage of crawlers or scrapers on any of their sites.
This may change in the future, so be aware.

- If you like the services Innovative Language provides you should consider a monthly subscription. Basic programs start at around $5 per month and include support from native speaker teachers.

- As with all websites, the site's structure may change in the future and thus, as often happens with scraping scripts, deprecate it. It is not really a question of *if* the site's source code will change but rather **when** (so enjoy it while it's still working :grin:).

## :lock: License

All of the content presented in the websites belongs to the original creators (Innovative Language) and I have nothing to do with it.

The license below refers only to the script and not to the downloaded content.

[License - MIT](LICENSE.md)

## :speech_balloon: Status and changelog

- **23.03.2022**:
Added support for basic video downloading (nothing fancy, just m4v and mp4 files)
Added error handling for when a lesson library/lesson contents URL is used instead of the first lesson (user is now warned)
- **11.05.2021**:
Headers and waiting time added, script is alive again.