https://github.com/enzet/emmio
Tool box for learning
- Host: GitHub
- URL: https://github.com/enzet/emmio
- Owner: enzet
- License: mit
- Created: 2018-01-17T20:51:50.000Z (over 7 years ago)
- Default Branch: main
- Last Pushed: 2023-08-01T23:03:51.000Z (almost 2 years ago)
- Last Synced: 2023-08-02T00:24:07.697Z (almost 2 years ago)
- Topics: flashcards, language-learning, learning, lexicon, vocabulary, vocabulary-test
- Language: Python
- Homepage:
- Size: 715 KB
- Stars: 5
- Watchers: 2
- Forks: 0
- Open Issues: 18
Metadata Files:
- Readme: README.md
- License: LICENSE
README
__Emmio__ is an experimental project on languages and learning. It provides
learning and testing algorithms:
1. [learning](#learning) system based on spaced repetition,
2. [lexicon](#lexicon) (vocabulary) level checking,

and manages four kinds of artifacts:
- [dictionaries](#dictionary),
- [sentence translations](#sentences),
- [frequency and word lists](#frequency-and-word-lists),
- [audio for words and sentences](#audio).

## Installation
Requires __Python 3.10__.
```shell
pip install git+https://github.com/enzet/emmio
```

## Get started
To start Emmio, run:
```shell
emmio
```

You may specify the data directory with the `--data` option and the username
with the `--user` option. If they are not specified, the data directory defaults
to `~/.emmio` and the username defaults to the current system username.

## Lexicon
```
> lexicon
```

The algorithm will offer you words of the target language, chosen randomly
based on their frequency. For each word you have to decide:

1. either you know at least one meaning of this word (press `y` or `Enter`),
2. or you don't know any meaning of this word (press `n`),
3. or the word is often used as a proper name or doesn't exist at all (press
   `-`).

To finish, press `q`.
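
One way to read "randomly (based on frequency)" is sampling with probability proportional to each word's occurrence count. The sketch below illustrates that reading only. It assumes the frequency list is a plain mapping from word to count, and the function name `pick_word` and the sample data are hypothetical, not taken from Emmio's code.

```python
import random


def pick_word(frequencies: dict[str, int], rng: random.Random) -> str:
    """Pick one word with probability proportional to its occurrence count."""
    words = list(frequencies)
    weights = [frequencies[word] for word in words]
    # random.choices performs weighted sampling with replacement.
    return rng.choices(words, weights=weights, k=1)[0]


# Hypothetical frequency data: word -> number of occurrences.
sample = {"the": 500, "house": 40, "quixotic": 1}
rng = random.Random(0)  # fixed seed only to make the run repeatable
picked = [pick_word(sample, rng) for _ in range(1000)]

# Frequent words should be offered far more often than rare ones.
print(picked.count("the"), picked.count("house"), picked.count("quixotic"))
```

With this kind of weighting, common words dominate the questions, so a real checker would probably need extra logic to reach rare words. That logic is outside the scope of this sketch.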
Once you finish, the algorithm will give you a non-negative number called the
_rate_, which roughly describes the size of your vocabulary. A rate of 0 means
you don't know a single word of the language, and infinity means you know every
word in the frequency list. The rate is best used to track your language
learning progress and to compare the vocabularies of different people using the
same frequency list.

| Rate | Level |
|-------------|----------------------------------|
| near 3 | Beginner, elementary |
| near 5 | Intermediate, upper intermediate |
| near 7 | Advanced, proficient |
| more than 9 | Native |

Lexicon configuration:
```
"lexicon": {
    "<language>": "<frequency list id>",
    ...
}
```

* `language` is a two-letter ISO 639-1 language code (e.g. `en` for
  English and `ru` for Russian).
* `frequency list id` is an identifier of a [full frequency file](#frequency).
  __Important__: for Lexicon you can use only a full (not stripped) frequency
  list.

### Wiktionary
The Wiktionary project contains
[frequency lists](https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists) for
different languages.

## Data directory structure
The Emmio data directory is located in `~/.emmio` by default and contains all
downloaded artifacts, their configuration files, and collected user data.

- `dictionaries`: single word translations.
- `sentences`: whole sentence translations.
- `lists`: frequency lists and simple word lists.
- `users`: user data.
  - `<username>`
    - `config.json`: user configuration file.
    - `learn`: user learning process data.
    - `lexicon`: user lexicon checking data.

### Dictionaries
Dictionaries are entities that provide definitions and translations for single
words. These artifacts are controlled by the configuration file
`dictionaries/config.json`.

Emmio supports:
- dictionaries stored in JSON files,
- English Wiktionary (through
  [WiktionaryParser](https://github.com/Suyash458/WiktionaryParser)).

### Frequency lists and word lists
A frequency list is a relation between unique words and the number of their
occurrences in some text or a corpus of texts. Some frequency lists are
stripped (e.g. a 6,500-lemma list based on the New Corpus for Ireland).

#### FrequencyWords (Opensubtitles)
[Hermit Dave](https://github.com/hermitdave)'s project
[FrequencyWords](https://github.com/hermitdave/FrequencyWords) contains
full and stripped frequency lists extracted from subtitles in the
[Opensubtitles](https://www.opensubtitles.org) project.
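
To illustrate a frequency list as a relation between words and occurrence counts, the following sketch parses text in which each line holds a word and its count separated by a space. That line layout is an assumption made for this example, so check the actual files before relying on it. `parse_frequency_list` is a hypothetical helper, not part of Emmio.

```python
def parse_frequency_list(text: str) -> dict[str, int]:
    """Parse lines of the assumed form "<word> <count>" into a mapping."""
    frequencies: dict[str, int] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last space so the count is always the final field.
        word, separator, count = line.rpartition(" ")
        if not separator:
            continue  # no count on this line, ignore it
        frequencies[word] = int(count)
    return frequencies


# Made-up sample data, not taken from any real frequency list.
sample_text = "the 500\nhouse 40\nquixotic 1\n"
frequencies = parse_frequency_list(sample_text)
total = sum(frequencies.values())

# Relative frequency of one word within the whole list.
print(frequencies["house"] / total)
```

Keeping the raw counts rather than only the word order makes it possible to compute relative frequencies, as in the last line of the sketch.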