https://github.com/skywind3000/lemma.en
English Lemma Database - Compiled by Referencing British National Corpus
Last synced: 11 months ago
- Host: GitHub
- URL: https://github.com/skywind3000/lemma.en
- Owner: skywind3000
- License: MIT
- Created: 2017-03-28T10:30:43.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2024-09-23T08:53:41.000Z (almost 2 years ago)
- Last Synced: 2025-07-07T07:43:04.933Z (12 months ago)
- Size: 771 KB
- Stars: 31
- Watchers: 4
- Forks: 4
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## Preface
English Lemma Database - Compiled by Referencing British National Corpus
Compiled by Lin Wei (https://github.com/skywind3000) on Mar 28, 2017, by referencing the 100M+ words in the British National Corpus (BNC), NodeBox Linguistics, and Yasumasa Someya's lemma list.
This lemma list is provided "as is" and is free to use for any research and/or educational purposes. The list currently contains 186,523 words (tokens) in 84,487 lemma groups.
## Data Format
Definition:
```text
word/bnc-frequency -> form1 (, form2 (, form3...))
```
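As a rough illustration of the definition above, one line can be split into its lemma, BNC frequency, and inflected forms as sketched below. The names `LemmaEntry` and `parse_line` are hypothetical and not part of this repository, and the sketch assumes every entry follows the `word/frequency -> forms` shape exactly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LemmaEntry:
    """One lemma group (hypothetical helper type, not from the repository)."""
    lemma: str
    frequency: int
    forms: List[str]


def parse_line(line: str) -> LemmaEntry:
    """Parse a single 'word/frequency -> form1,form2,...' line."""
    head, arrow, tail = line.strip().partition("->")
    if not arrow:
        raise ValueError(f"missing '->' in line: {line!r}")
    # Split on the last '/' so the frequency is always the final field.
    lemma, slash, freq = head.strip().rpartition("/")
    if not slash:
        raise ValueError(f"missing '/frequency' in line: {line!r}")
    # Forms are comma-separated; drop empty pieces and stray whitespace.
    forms = [form.strip() for form in tail.split(",") if form.strip()]
    return LemmaEntry(lemma=lemma, frequency=int(freq), forms=forms)


entry = parse_line("go/227247 -> going,went,gone,goes,goin'")
print(entry.lemma, entry.frequency, entry.forms)
```

Apostrophes are kept as part of a form (for example `goin'`), so the forms should not be stripped of punctuation after splitting.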
Data Sample:
```text
be/4109826 -> is,was,are,were,'s,been,being,'re,'m,am,m
have/1315648 -> had,has,'ve,having,'s,'d,d,ve
it/1213224 -> its,they
he/1196022 -> his,him,they
i/1133697 -> my,me,we,is
they/841960 -> their,them,'em
you/804279 -> your,ya,ye
not/767330 -> n't
she/653505 -> her
do/535646 -> did,does,done,doing,du,d'
we/503360 -> our,us
will/334612 -> 'll,wo,ll
say/317317 -> said,says,saying
would/278414 -> 'd
can/263138 -> ca,cans,can,could
go/227247 -> going,went,gone,goes,goin'
get/212569 -> got,getting,gets,gotten
make/209818 -> made,making,makes
up/206976 -> ups,upping,upped
see/184969 -> seen,saw,seeing,sees
other/181277 -> others
time/181080 -> times,timed,timing
know/177717 -> knew,known,knows,knowing
take/172773 -> took,taken,taking,takes
year/161649 -> years
```
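A common use of such a list is the reverse lookup: given an inflected form, find its lemma. The sample shows that a form can belong to more than one group (`'s` appears under both `be` and `have`, and `'d` under both `have` and `would`), so the sketch below maps each form to a list of candidate lemmas. The function name `build_form_index` and the embedded `SAMPLE` text are illustrative assumptions, using a few lines copied from the sample above.

```python
from collections import defaultdict

# A few lines copied from the data sample, used here as stand-in input.
SAMPLE = """\
be/4109826 -> is,was,are,were,'s,been,being,'re,'m,am,m
have/1315648 -> had,has,'ve,having,'s,'d,d,ve
would/278414 -> 'd
go/227247 -> going,went,gone,goes,goin'
"""


def build_form_index(text):
    """Map each inflected form to the lemmas that list it (hypothetical helper)."""
    index = defaultdict(list)
    for line in text.splitlines():
        if "->" not in line:
            continue  # skip blank lines and anything that is not an entry
        head, _, tail = line.partition("->")
        lemma = head.strip().rpartition("/")[0]
        for form in tail.split(","):
            form = form.strip()
            if form and lemma not in index[form]:
                index[form].append(lemma)
    return dict(index)


index = build_form_index(SAMPLE)
print(index["went"])  # ['go']
print(index["'s"])    # ['be', 'have']
print(index["'d"])    # ['have', 'would']
```

Candidate lemmas are kept in file order, which in the sample runs from higher to lower BNC frequency, so taking the first candidate is one simple (and imperfect) way to break ties.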
## About
If you have any questions or comments about this lemma list, feel free to contact me (skywind3000@163.com) at any time.