https://github.com/tristan-mcinnis/spacy-models-setup-and-testing
A Python utility for downloading, storing, and testing Spacy language models for English and Chinese NLP tasks.
- Host: GitHub
- URL: https://github.com/tristan-mcinnis/spacy-models-setup-and-testing
- Owner: tristan-mcinnis
- License: mit
- Created: 2025-02-08T23:07:17.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-02-08T23:15:11.000Z (8 months ago)
- Last Synced: 2025-02-09T00:19:13.096Z (8 months ago)
- Topics: chinese, english, nlp, python, simple-project, spacy, testing
- Language: Python
- Homepage:
- Size: 9.77 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Spacy-Models-Setup-and-Testing
A Python utility for downloading, storing, and testing spaCy language models for English and Chinese NLP tasks.
## Overview
This repository contains scripts that automate the setup and testing of spaCy language models, specifically:
- English: `en_core_web_sm`
- Chinese: `zh_core_web_sm`, `zh_core_web_md`, `zh_core_web_lg`

The models are downloaded and stored in a custom directory (`./data/models/`) for easy access and management.
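
The download-and-store step described above could look roughly like the sketch below. It is an assumption about the approach, not the actual contents of `setup_spacy_models.py`: the helper names `model_path` and `download_and_store` are hypothetical, and the sketch relies on `spacy.cli.download`, `spacy.load`, and `Language.to_disk`.

```python
import importlib
from pathlib import Path

# Model names taken from the list above; BASE_DIR mirrors ./data/models/.
MODELS = ["en_core_web_sm", "zh_core_web_sm", "zh_core_web_md", "zh_core_web_lg"]
BASE_DIR = Path("data/models")


def model_path(name, base_dir=BASE_DIR):
    """Return the directory where a model is (or will be) stored."""
    return Path(base_dir) / name


def download_and_store(name, base_dir=BASE_DIR):
    """Download a spaCy model package and save a copy under base_dir.

    Hypothetical helper: the real script may work differently.
    """
    # Imported lazily so model_path() works even without spaCy installed.
    import spacy
    from spacy.cli import download

    target = model_path(name, base_dir)
    if target.exists():
        return target  # already stored, nothing to do

    download(name)  # installs the model as a Python package via pip
    importlib.invalidate_caches()  # help Python see the new package
    nlp = spacy.load(name)

    # to_disk() creates the final directory, but not its parents.
    target.parent.mkdir(parents=True, exist_ok=True)
    nlp.to_disk(target)
    return target
```

Calling `download_and_store("zh_core_web_sm")` would then return `data/models/zh_core_web_sm`, which can be passed to `spacy.load`. Depending on the environment, a freshly installed model package may only become loadable after restarting the interpreter.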
## Model Sizes
- zh_core_web_sm: ~12MB
- zh_core_web_md: ~23MB
- zh_core_web_lg: ~46MB
- en_core_web_sm: ~12MB
## Requirements
```bash
pip install spacy
```
## Directory Structure
    .
    ├── README.md
    ├── setup_spacy_models.py
    └── data/
        └── models/
            ├── en_core_web_sm/
            ├── zh_core_web_sm/
            ├── zh_core_web_md/
            └── zh_core_web_lg/

## Usage

    import spacy

    # Load English model
    nlp_en = spacy.load('./data/models/en_core_web_sm')

    # Load Chinese model (large)
    nlp_zh = spacy.load('./data/models/zh_core_web_lg')

    # Process text
    doc_en = nlp_en("This is a test sentence.")
    doc_zh = nlp_zh("这是一个测试句子。")

## Features
- Automated download and installation of spaCy models
- Custom directory storage
- Model testing with sample texts
- Token and POS tag demonstration
- Named Entity Recognition testing
- Error handling and validation
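
As an illustration of the testing items above (tokens, POS tags, named entities, and error handling), here is a minimal sketch. The function names `summarize` and `check_model` are hypothetical and not taken from the repository; the sketch only assumes that `spacy.load` accepts a directory path and raises `OSError` when the model cannot be loaded.

```python
from pathlib import Path


def summarize(doc):
    """Collect (text, POS) pairs and (text, label) entity pairs from a Doc."""
    return {
        "tokens": [(token.text, token.pos_) for token in doc],
        "entities": [(ent.text, ent.label_) for ent in doc.ents],
    }


def check_model(model_dir, sample_text):
    """Load a stored model, run it on sample_text, and return a summary.

    Hypothetical helper: raises FileNotFoundError if the directory is
    missing and RuntimeError if spaCy cannot load it.
    """
    path = Path(model_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"Model directory not found: {path}")

    import spacy  # imported lazily, after the cheap path check

    try:
        nlp = spacy.load(path)
    except OSError as err:
        raise RuntimeError(f"Could not load model from {path}") from err

    return summarize(nlp(sample_text))
```

For example, `check_model("./data/models/en_core_web_sm", "Apple is based in California.")` would return a dictionary with one `(text, POS)` pair per token and one `(text, label)` pair per detected entity; the exact tags depend on the model version.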