https://github.com/chartbeat-labs/textacy

NLP, before and after spaCy

Host: GitHub
URL: https://github.com/chartbeat-labs/textacy
Owner: chartbeat-labs
License: other
Created: 2016-02-03T16:52:45.000Z (over 10 years ago)
Default Branch: main
Last Pushed: 2023-09-22T23:38:28.000Z (almost 3 years ago)
Last Synced: 2025-04-11T02:51:42.869Z (over 1 year ago)
Topics: natural-language-processing, nlp, python, spacy
Language: Python
Homepage: https://textacy.readthedocs.io
Size: 31.4 MB
Stars: 2,219
Watchers: 86
Forks: 248
Open Issues: 35
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md

Awesome Lists containing this project

awesome-nlprojects - link
awesome-python-machine-learning-resources - GitHub - 11% open · ⏱️ 06.03.2022 (Text Data and NLP)
awesome-nlp-note - textacy - Higher level NLP built on spaCy (Libraries / Videos and Online Courses)
awesome-list - textacy - a Python library for performing a variety of natural language processing tasks, based on spaCy. (Natural Language Processing / General Purpose NLP)
awesome-python-data-science - textacy - NLP, before and after spaCy. (Feature Extraction / Text/NLP)
awesome-ai-content-pipelines - Textacy
awesome-machine-learning - textacy - higher-level NLP built on Spacy. (Python / General-Purpose Machine Learning)
awesome-nlp - textacy - Higher-level natural language processing built on spaCy. (Libraries / Books)
fucking-awesome-machine-learning - textacy - higher-level NLP built on Spacy. (Python / General-Purpose Machine Learning)
awesome-chinese-nlp - textacy
awesome-advanced-metering-infrastructure - textacy - higher-level NLP built on Spacy. (Python / General-Purpose Machine Learning)
https-github.com-keon-awesome-nlp - textacy - Higher level NLP built on spaCy (Packages / Libraries)

README

## textacy: NLP, before and after spaCy

`textacy` is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals --- tokenization, part-of-speech tagging, dependency parsing, etc. --- delegated to another library, `textacy` focuses primarily on the tasks that come before and follow after.

[![build status](https://img.shields.io/travis/chartbeat-labs/textacy/master.svg?style=flat-square)](https://travis-ci.org/chartbeat-labs/textacy)
[![current release version](https://img.shields.io/github/release/chartbeat-labs/textacy.svg?style=flat-square)](https://github.com/chartbeat-labs/textacy/releases)
[![pypi version](https://img.shields.io/pypi/v/textacy.svg?style=flat-square)](https://pypi.python.org/pypi/textacy)
[![conda version](https://anaconda.org/conda-forge/textacy/badges/version.svg)](https://anaconda.org/conda-forge/textacy)

### features

- Access and extend spaCy's core functionality for working with one or many documents through convenient methods and custom extensions
- Load prepared datasets with both text content and metadata, from Congressional speeches to historical literature to Reddit comments
- Clean, normalize, and explore raw text before processing it with spaCy
- Extract structured information from processed documents, including n-grams, entities, acronyms, keyterms, and SVO triples
- Compare strings and sequences using a variety of similarity metrics
- Tokenize and vectorize documents, then train, interpret, and visualize topic models
- Compute text readability and lexical diversity statistics, including Flesch-Kincaid grade level, multilingual Flesch Reading Ease, and Type-Token Ratio

... *and much more!*
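
To make a few of the listed ideas concrete, here is a minimal sketch in plain Python of n-gram extraction, Type-Token Ratio, and a token-set similarity metric (Jaccard). It does not use `textacy` or spaCy: the regex tokenizer is a crude stand-in for spaCy's, and the helper names (`tokenize`, `ngrams`, `type_token_ratio`, `jaccard_similarity`) are hypothetical, chosen for illustration only. See the documentation for the library's actual API.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens.

    Assumption: a regex is a rough substitute for a real tokenizer such as spaCy's.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def type_token_ratio(tokens):
    """Number of distinct tokens divided by the total number of tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def jaccard_similarity(tokens_a, tokens_b):
    """Size of the intersection of two token sets divided by the size of their union."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


text = "The quick brown fox jumps over the lazy dog. The dog sleeps."
tokens = tokenize(text)

# Count bigrams across the whole text.
bigram_counts = Counter(ngrams(tokens, 2))

print(bigram_counts.most_common(3))
print(type_token_ratio(tokens))
print(jaccard_similarity(tokenize("the lazy dog"), tokenize("the quick dog")))
```

The sketch works on bare strings for brevity. In practice these statistics are usually computed on documents that have already been tokenized and annotated, which is the step this README describes as delegated to spaCy.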

### links

- Download: https://pypi.org/project/textacy
- Documentation: https://textacy.readthedocs.io
- Source code: https://github.com/chartbeat-labs/textacy
- Bug Tracker: https://github.com/chartbeat-labs/textacy/issues

### maintainer

Howdy, y'all. 👋

- Burton DeWilde
