https://github.com/thomasthiebaud/spacy-fastlang
Language detection using Spacy and Fasttext
fasttext fasttext-python language-detection spacy spacy-extensions
- Host: GitHub
- URL: https://github.com/thomasthiebaud/spacy-fastlang
- Owner: thomasthiebaud
- License: mit
- Created: 2020-03-26T16:31:30.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2023-12-17T10:52:56.000Z (about 1 year ago)
- Last Synced: 2024-12-07T05:20:46.337Z (about 2 months ago)
- Topics: fasttext, fasttext-python, language-detection, spacy, spacy-extensions
- Language: Python
- Homepage:
- Size: 863 KB
- Stars: 54
- Watchers: 4
- Forks: 7
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# spacy_fastlang
## Install
Assuming you have a working Python environment, you can install it with

```
pip install spacy_fastlang
```

## Usage
The library exports a pipeline component called `language_detector` that sets two spaCy extensions:

- `doc._.language`: ISO code of the detected language, or `xx` as a fallback
- `doc._.language_score`: confidence of the detection

```
import spacy
import spacy_fastlang  # noqa: F401 # pylint: disable=unused-import

nlp = spacy.load("...")
nlp.add_pipe("language_detector")
doc = nlp(en_text)

doc._.language == "..."
doc._.language_score >= ...
```

## Options
[Check the tests](./tests/test_spacy_fastlang.py) for more examples and the available options.
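
Independently of the component's own options, downstream code may want its own check on `doc._.language` and `doc._.language_score` before trusting a detection. The following is a minimal sketch of that idea, not part of the library: the helper name `trusted_language`, the `min_score` parameter, and the `0.8` default are hypothetical choices made for illustration. So that it runs without a spaCy model, it uses a `SimpleNamespace` stand-in that only mimics the two extension attributes; a real `Doc` produced by a pipeline containing `language_detector` could be passed in the same way.

```python
from types import SimpleNamespace

# "xx" is the fallback code the README mentions for doc._.language.
FALLBACK_LANGUAGE = "xx"


def trusted_language(doc, min_score=0.8):
    """Return doc._.language, or None if the detection should not be trusted.

    `min_score` is a hypothetical, caller-chosen cutoff (0.8 is arbitrary).
    """
    language = doc._.language
    score = doc._.language_score
    # Reject the fallback code and any detection below the cutoff.
    if language == FALLBACK_LANGUAGE or score < min_score:
        return None
    return language


# Stand-in object exposing the same two attributes as a processed Doc.
# The values below are made up for the example, not real model output.
fake_doc = SimpleNamespace(_=SimpleNamespace(language="en", language_score=0.93))
print(trusted_language(fake_doc))
```

Returning `None` rather than `xx` keeps "no usable detection" distinct from a language code, which may be convenient when the result is used as a dictionary key or routing label.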
## License
Everything is under the `MIT` license except the default model, which is distributed by Facebook under the [Creative Commons Attribution-Share-Alike License 3.0](https://creativecommons.org/licenses/by-sa/3.0/); see the [fastText language identification page](https://fasttext.cc/docs/en/language-identification.html).