Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/chriskempson/japanese-subtitles-word-kanji-frequency-lists
A word frequency list derived from subtitles from Japanese drama, anime and films.
- Host: GitHub
- URL: https://github.com/chriskempson/japanese-subtitles-word-kanji-frequency-lists
- Owner: chriskempson
- License: MIT
- Created: 2019-06-15T06:05:22.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-12-08T23:15:09.000Z (about 1 year ago)
- Last Synced: 2023-12-09T00:52:14.000Z (about 1 year ago)
- Size: 1.49 MB
- Stars: 25
- Watchers: 4
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Japanese Subtitles Word & Kanji Frequency Lists
A word frequency list and a kanji frequency list derived from subtitles from Japanese drama, anime and films. The data set consisted of 12,277 subtitle files taken from https://github.com/Matchoo95/JP-Subtitles. The frequency lists were generated with JParser and cb's Japanese Text Analysis Tool.
## Format of Word Frequency Report:
- Field 1: Number of times word was encountered
- Field 2: Word
- Field 3: Frequency Group
- Field 4: Frequency Rank
- Field 5: Percentage (Field 1 / Total number of words)
- Field 6: Cumulative percentage
- Field 7: Part-of-speech
## Format of Kanji Frequency Report:
- Field 1: Number of times kanji was encountered
- Field 2: Kanji
- Field 3: Frequency Group
- Field 4: Frequency Rank
- Field 5: Percentage (Field 1 / Total number of kanji)
- Field 6: Cumulative percentage
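## Example: Reading a Report (sketch)
As a rough illustration of the two layouts above, the following Python sketch reads report lines into records. It assumes the fields are tab-separated and that Fields 5 and 6 are on a 0 to 100 scale; neither is stated in this README, so check against the actual files first. It also assumes the cumulative percentage is the running total of Field 5 in rank order. The names `FrequencyRow`, `parse_report` and `recompute_percentages` are hypothetical helpers for this example, not part of JParser or cb's Japanese Text Analysis Tool.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class FrequencyRow:
    count: int                            # Field 1: times encountered
    item: str                             # Field 2: word or kanji
    group: str                            # Field 3: frequency group (kept as text)
    rank: int                             # Field 4: frequency rank
    percentage: float                     # Field 5: count / total
    cumulative: float                     # Field 6: cumulative percentage
    part_of_speech: Optional[str] = None  # Field 7: word report only


def parse_report(lines: Iterable[str], delimiter: str = "\t") -> List[FrequencyRow]:
    """Parse report lines; the tab delimiter is an assumption."""
    rows = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(delimiter)
        if len(fields) < 6:
            raise ValueError(f"expected at least 6 fields, got {len(fields)}: {line!r}")
        rows.append(
            FrequencyRow(
                count=int(fields[0]),
                item=fields[1],
                group=fields[2],
                rank=int(fields[3]),
                percentage=float(fields[4]),
                cumulative=float(fields[5]),
                # The kanji report has no seventh field.
                part_of_speech=fields[6] if len(fields) > 6 else None,
            )
        )
    return rows


def recompute_percentages(counts: List[int]) -> List[Tuple[float, float]]:
    """Return (percentage, cumulative percentage) per count, in the given order.

    Percentage follows Field 5 (count / total); the 0 to 100 scale and the
    running-total reading of Field 6 are assumptions.
    """
    total = sum(counts)
    running = 0
    result = []
    for count in counts:
        running += count
        result.append((100 * count / total, 100 * running / total))
    return result
```

To use it, open a report with `open(path, encoding="utf-8")` and pass the file object to `parse_report`. The encoding is also an assumption, so adjust it if the files decode incorrectly. Comparing `recompute_percentages([row.count for row in rows])` with the stored Fields 5 and 6 is a quick way to check whether the assumptions above hold for a given file.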