Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/chartbeat-labs/textacy
NLP, before and after spaCy
natural-language-processing nlp python spacy
Last synced: 3 days ago
- Host: GitHub
- URL: https://github.com/chartbeat-labs/textacy
- Owner: chartbeat-labs
- License: other
- Created: 2016-02-03T16:52:45.000Z (almost 9 years ago)
- Default Branch: main
- Last Pushed: 2023-09-22T23:38:28.000Z (over 1 year ago)
- Last Synced: 2025-01-16T05:05:21.139Z (10 days ago)
- Topics: natural-language-processing, nlp, python, spacy
- Language: Python
- Homepage: https://textacy.readthedocs.io
- Size: 31.4 MB
- Stars: 2,214
- Watchers: 87
- Forks: 249
- Open Issues: 35
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- awesome-nlp - textacy - Higher-level natural language processing built on spaCy. (Libraries / Books)
- awesome-python-machine-learning-resources - GitHub - 11% open · ⏱️ 06.03.2022): (Text Data and NLP)
- awesome-nlp-note - textacy - Higher level NLP built on spaCy (Libraries / Videos and Online Courses)
- awesome-list - textacy - a Python library for performing a variety of natural language processing tasks, based on spaCy. (Natural Language Processing / General Purpose NLP)
README
## textacy: NLP, before and after spaCy
`textacy` is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals --- tokenization, part-of-speech tagging, dependency parsing, etc. --- delegated to another library, `textacy` focuses primarily on the tasks that come before and follow after.
[![build status](https://img.shields.io/travis/chartbeat-labs/textacy/master.svg?style=flat-square)](https://travis-ci.org/chartbeat-labs/textacy)
[![current release version](https://img.shields.io/github/release/chartbeat-labs/textacy.svg?style=flat-square)](https://github.com/chartbeat-labs/textacy/releases)
[![pypi version](https://img.shields.io/pypi/v/textacy.svg?style=flat-square)](https://pypi.python.org/pypi/textacy)
[![conda version](https://anaconda.org/conda-forge/textacy/badges/version.svg)](https://anaconda.org/conda-forge/textacy)
### features
- Access and extend spaCy's core functionality for working with one or many documents through convenient methods and custom extensions
- Load prepared datasets with both text content and metadata, from Congressional speeches to historical literature to Reddit comments
- Clean, normalize, and explore raw text before processing it with spaCy
- Extract structured information from processed documents, including n-grams, entities, acronyms, keyterms, and SVO triples
- Compare strings and sequences using a variety of similarity metrics
- Tokenize and vectorize documents then train, interpret, and visualize topic models
- Compute text readability and lexical diversity statistics, including Flesch-Kincaid grade level, multilingual Flesch Reading Ease, and Type-Token Ratio

... *and much more!*
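
To make two of the ideas above concrete (n-gram extraction and the Type-Token Ratio), here is a minimal standard-library sketch. It is not `textacy`'s implementation or API: the helper names `tokenize`, `ngrams`, and `type_token_ratio` are hypothetical, and the regex tokenizer is a naive stand-in for spaCy's. See the documentation for the functions `textacy` actually provides.

```python
import re
from collections import Counter


def tokenize(text):
    """Naive lowercase word tokenizer (assumed stand-in for spaCy's tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def type_token_ratio(tokens):
    """Number of unique tokens divided by total tokens; 0.0 for empty input."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


text = "The quick brown fox jumps over the lazy dog. The dog sleeps."
tokens = tokenize(text)

# Count how often each bigram occurs in the text.
bigram_counts = Counter(ngrams(tokens, 2))

# Lexical diversity: closer to 1.0 means fewer repeated words.
ttr = type_token_ratio(tokens)
print(f"{len(tokens)} tokens, {len(bigram_counts)} distinct bigrams, TTR={ttr:.2f}")
```

A real pipeline would typically replace `tokenize` with a spaCy-processed document, so that stop-word filtering, lemmas, and part-of-speech tags are available when extracting terms.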
### links
- Download: https://pypi.org/project/textacy
- Documentation: https://textacy.readthedocs.io
- Source code: https://github.com/chartbeat-labs/textacy
- Bug Tracker: https://github.com/chartbeat-labs/textacy/issues
### maintainer
Howdy, y'all. 👋
- Burton DeWilde