spaCy

spaCy is a free library for advanced Natural Language Processing (NLP) in Python. It’s designed specifically for production use and helps you build applications that process and “understand” large volumes of text. It can be used to build information extraction or natural language understanding systems.
- GitHub: https://github.com/topics/spacy
- Wikipedia: https://en.wikipedia.org/wiki/SpaCy
- Repo: https://github.com/explosion/spaCy
- Created by: Explosion
- Related Topics: machine-learning, natural-language-processing, text-classification, named-entity-recognition, tokenization, entity-linking, dependency-parsing, relation-extraction, part-of-speech-tagging, lemmatization
- Last updated: 2025-05-12 00:26:57 UTC
https://github.com/farahibrar/programming-in-python
Explore a comprehensive collection of Python programming for diverse data analysis and data science projects. This repository covers data exploration, visualization, statistical analysis, machine learning, NLP, and model deployment. Perfect for enthusiasts looking to delve into practical examples and advanced techniques.
beautifulsoup dataanalysis docker flask folium jupyter-notebook machine-learning matplotlib nltk numpy pandas python pytorch scikit-learn scikitlearn scipy seaborn spacy statsmodels tensorflow
Last synced: 01 Feb 2025
https://github.com/nluninja/text-mining-dataviz
Data Visualization and Text Mining course repository (UNICATT): provides notebook implementations of data analysis and machine learning applied to text content.
embeddings llm lstm nlp spacy text-mining transformers
Last synced: 21 Apr 2025
https://github.com/f1uctus/p4a-recipes
📱 🐍 A collection of recipes for p4a (Python for Android).
android blis docker numpy p4a python python-for-android spacy
Last synced: 16 Jan 2025
https://github.com/o19s/bad-libs
:memo: Automatically converts any book into a Mad-Libs style game of silliness using spaCy. Free Charles Dickens included!
Last synced: 20 Nov 2024
https://github.com/kasakee/spacy-nlp-node
A library that will expose the parse method of SpaCy to Node.js
natural-language-processing nlp node node-js nodejs spacy spacy-nlp spacy-nlp-node spacy-node
Last synced: 15 Feb 2025
https://github.com/cornerstone-ondemand/modelkit-imdb
NLP sample project leveraging modelkit and the imdb reviews dataset
fastapi imdb-dataset machine-learning mlops modelkit natural-language-processing nlp python rest-api spacy tensorflow
Last synced: 13 Mar 2025
https://github.com/senisioi/rolegal
A Spacy Package for Romanian Legal Document Processing
floret legal-documents ner romanian-language spacy
Last synced: 15 Feb 2025
https://github.com/neurotech-hq/swahili-ner-spacy
Swahili NER model trained using spacy
Last synced: 20 Feb 2025
https://github.com/sloev/sentimental-onix
sentiment analysis for spacy pipeline in python
onnx sentiment-analysis spacy spacy-pipeline
Last synced: 12 Apr 2025
https://github.com/eliask93/debertav3-for-aspect-based-sentiment-analysis
Application for training the pretrained transformer model DeBERTaV3 on an Aspect Based Sentiment Analysis task
amazon-reviews aspect-based-sentiment-analysis deberta deberta-v3 nlp simpletransformers spacy
Last synced: 06 Apr 2025
https://github.com/diyclassics/la_senter
Repository for training spaCy-compatible sentence segmenter for Latin
Last synced: 28 Mar 2025
https://github.com/bemxio/julia-robotczyk
A Facebook Messenger chatbot based on my classmate's messages
facebook markov-chain markovify messenger nlp python spacy
Last synced: 05 Mar 2025
https://github.com/metalcorebear/spacy-affect-model
A Spacy model for measuring emotional affect.
affect affect-analysis model nltk-python nrclex sentiment-analysis sentiment-classification spacy spacy-nlp vader-sentiment-analysis
Last synced: 26 Dec 2024
https://github.com/conflictingtheories/spacy_ws
Websocket example with Spacy.io
nlp spacy spacy-models spacy-ner websocket
Last synced: 11 Apr 2025
https://github.com/bees4ever/seaqube
Semantic Quality Benchmark for Word Embeddings, i.e. Natural Language Models in Python. Acronym `SeaQuBe` or `seaqube`.
augmentation benchmark fasttext gensim nlp spacy spacy-nlp wordembeddings
Last synced: 21 Apr 2025
https://github.com/amrrs/intro_to_nlp_with_spacy
Introduction to NLP with Spacy - Bangpypers October Talk
Last synced: 12 Apr 2025
https://github.com/gaving/zorya
:grapes: Build NER graphs from YouTube transcripts
neo4j ner spacy youtube-transcripts
Last synced: 07 Apr 2025
https://github.com/anushadatta/natural-language-processing
📑 NLP applications with NLTK, spaCy and PyTorch.
natural-language-processing nltk pytorch spacy
Last synced: 30 Mar 2025
https://github.com/jtlicardo/spacy-ner
A demo app that extracts process tasks from text
named-entity-recognition spacy streamlit
Last synced: 09 May 2025
https://github.com/spacexnu/job_finder
Automate Job Search & Analysis Using AI
ai automation django job nlp openai python search spacy
Last synced: 11 Apr 2025
https://github.com/jbahire/semantic-similarity
This project provides implementations of semantic similarity using various text embeddings, and you can easily compare results using the provided API. Go ahead and build your own API for integration into your use case.
bert elmo machine-learning natural-language-processing semantic-similarity spacy word2vec
Last synced: 06 Apr 2025
https://github.com/chanind/reddit-words
What have Spacy's sense2vec 2019 word vectors learned from Reddit?
sense2vec spacy spacy-nlp word2vec
Last synced: 26 Mar 2025
https://github.com/andrehaguiar/jones_granatyr
IA Expert Academy courses - Udemy
nlp nlp-machine-learning nltk nltk-python opencv python spacy tensorflow udemy
Last synced: 28 Dec 2024
https://github.com/direct-phonology/spacy-och
The Old Chinese language for spaCy
Last synced: 05 Apr 2025
https://github.com/nanxstats/pdf-word-extraction
Extract meaningful words from a collection of PDF documents and count their frequencies
ftfy natural-language-processing pypdf research-paper spacy wordcloud
Last synced: 22 Apr 2025
https://github.com/surajiyer/spacybert
BERT inference (with similar function to hanxiao/bert-as-service) for spaCy with custom extension attributes
bert huggingface huggingface-transformers language-model machine-learning natural-language-processing nlp pytorch pytorch-model spacy spacy-extension spacy-pipeline
Last synced: 22 Mar 2025
https://github.com/lykmapipo/us-inaugural-addresses
Python scripts to download, process, and analyze US Inaugural Addresses
beautifulsoup4 gensim joblib lykmapipo natural-language-processing nlp nltk python python-scripts requests spacy text-analysis text-analytics text-extraction text-processing web-scraping
Last synced: 08 Apr 2025
https://github.com/5hirish/django_adam_qas
ADAM - QA -- Front-end using Django and Material Design.
django natural-language-processing python3 question-answering spacy
Last synced: 26 Feb 2025
https://github.com/populated/compare
A simple Python-based code to compare texts for similarities.
comparsion nlp numpy spacy text
Last synced: 29 Mar 2025
https://github.com/shubhamjai9/emotion-based-counsellor-bot
An artificial-intelligence-based chatbot built with Python tools such as NumPy, Pandas, and spaCy. Counsellor Bot mimics human characteristics and emotion-interpretation skills and generates responses based on the emotion of the person engaging with it.
chatbot gradient-boosting-classifier machine-learning naive-bayes-classifier nlp numpy pandas python-2 spacy
Last synced: 13 Mar 2025
https://github.com/woctezuma/steam-descriptions
Retrieve semantically similar Steam games.
discovery game games gensim glove glove-embeddings glove-vectors spacy steam steam-api steam-descriptions steam-game steam-games steam-store-descriptions word2vec
Last synced: 06 Dec 2024
https://github.com/samestrin/llm-services-api
A FastAPI-powered REST API offering a comprehensive suite of natural language processing services using machine learning models with PyTorch and Transformers, packaged in a Docker container to run efficiently.
api docker fastapi hugging-face hugging-face-transformers huggingface-transformers keybert llm openai-compatible-api python python3 pytorch rest rest-api spacy torch transformers uvicorn
Last synced: 05 Apr 2025
https://github.com/kr1shnasomani/webscrub
Python code which extracts the html content, converts it to clean text and pre-processes the text
beautifulsoup html2text natural-language-processing pypi scikit-learn selenium spacy
Last synced: 07 Apr 2025
https://github.com/inanyan/spacy_pat_match_dsl
A simple DSL for creating spaCy pattern matchers
Last synced: 22 Mar 2025
https://github.com/izuna385/scispacy-candidate-generator
Generating Candidate Entities with ScispaCy
allennlp entity-linking natural-language-processing scispacy spacy
Last synced: 28 Mar 2025
https://github.com/etdds/redditquotebot
A Reddit comment bot for detecting and replying to famous quotes.
bot chatbot natural-language-processing nlp praw python reddit spacy
Last synced: 17 Mar 2025
https://github.com/yarosj/prestige-of-districts
:mag_right: This application parses sites and retrieves data associated with failures of public services to display districts' prestige
amqp apollo-client apollo-server docker-compose graphql mapbox-gl ner neural-network nlp nodejs parsing pika python3 rabbitmq react scraping semantic-ui-react spacy taskscheduler webpack
Last synced: 12 Mar 2025
https://github.com/umactually/papanatas
Papanatas Autómata Multiparadigma IV. The official bot of my Discord server, Sociedad de Patanes.
discord discord-bot discord-py ffmpeg pillow pycord python spacy
Last synced: 24 Mar 2025
https://github.com/ljvmiranda921/ud-tagalog-spacy
Training a POS Tagger and Dependency Parser for a Low-Resource Language (Tagalog)
low-resource-languages machine-learning nlp spacy tagalog
Last synced: 28 Mar 2025
https://github.com/herambvd/spoken2written
Source for a Python package that converts spoken-language styles into their equivalent written form.
artificial-intelligence entity machine-learning named-entity-recognition natural-language-processing spacy speech-recognition token-matcher
Last synced: 12 Apr 2025
https://github.com/tritonix711/ai-content-verifier
AI Content Verifier is a tool that finds out if text is written by AI or humans. It uses machine learning and natural language processing to give clear results and confidence scores. With an easy-to-use interface, it helps everyone from researchers to content creators check if the content is real or not.
git machine-learning nlp nltk numpy pandas python scikit-learn spacy tkinter
Last synced: 09 Jan 2025
https://github.com/muneeb1030/finetune-tiny-llama
Fine-tuning the Tiny Llama model to mimic my professor's writing style using the Llama Factory. The project involves data collection, preprocessing, preparation, fine-tuning, and evaluation.
data data-preparation data-preprocessing finetuning llama-factory llm pymupdf selenium-python spacy tinyllama webscraping
Last synced: 13 Mar 2025
https://github.com/darkrockmountain/spacy-ewc
A spaCy library for Named Entity Recognition with Elastic Weight Consolidation.
catastrophic-forgetting clasificacion entity-recognition ewc labeling machine-learning machine-learning-algorithms model ner nlp spacy spacy-nlp thinc
Last synced: 02 Jan 2025
https://github.com/ucrel/pymusas-models
PyMUSAS Models
models natural-language-processing nlp spacy spacy-models
Last synced: 22 Nov 2024
https://github.com/gtoffoli/commons-textanalysis
Text-analysis support for Django clients, talking through HTTP API to an extended spaCy deployment.
django nlp python spacy text-analysis
Last synced: 07 May 2025
https://github.com/yash22222/terrorist-activity-forecasting-and-risk-assessment-system
In an era marked by global security challenges, the "TAFRAS" emerges as a cutting-edge solution to tackle the ever-evolving threat of terrorism. The project is grounded in the urgent need for predictive systems that can anticipate, assess, and mitigate potential terrorist activities.
corpora data-vizualisation folium-maps gensim global-terrorism-database lda machine-learning matplotlib networkx nltk nmf numpy pandas python random-forest-classifier seaborn sklearn spacy textblob vader-sentiment-analysis
Last synced: 24 Feb 2025
https://github.com/miliar/app-name-matching
Matching apps from Google Play and iOS
fuzzywuzzy googleplay ios-app jupyter-notebook machine-learning python random-forest similar similarity similarity-score similarity-search spacy
Last synced: 09 Apr 2025
https://github.com/teakulo/eventime-app
Eventime App is an event management platform using Angular, Spring Boot, Flask, and PostgreSQL. It offers AI-powered event recommendations, social features, and secure authentication. Users can manage events, chat with a chatbot, and view their calendar.
ai angular authentication calendar chatbot event flask lemmatization nlp nltk postgresql spacy springboot
Last synced: 09 Apr 2025
https://github.com/acdh-oeaw/acdh-prodigy-utils
custom loaders for spaCy's prodigy
Last synced: 16 Mar 2025
https://github.com/timuroeztuerk/data-science-lecture-s24
This is the webpage of the Data Science course offered by VWL 7 for the summer semester 2024.
economics natural-language-processing nltk spacy text-classification
Last synced: 24 Feb 2025
https://github.com/marmg/moviener
Code for the NER demo. Prepare data, train and extract entities from movie reviews.
extract-entities movie-reviews ner spacy
Last synced: 13 Mar 2025
https://github.com/pyladiesams/nlp-beginner-nov2020
Intro to NLP with NLTK, spaCy, and gensim
gensim nlp nlp-machine-learning nltk python spacy
Last synced: 22 Feb 2025
https://github.com/mfkimbell/reviews-nlp-sentiment-analysis
This project investigates various NLP tools, compares them, and then uses the NLP tool to add a sentiment field to a PostgreSQL database in an efficient batch format.
asyncio asyncpg docker nltk pdm postgresql spacy tensorflow toml transformers yml
Last synced: 16 Mar 2025
https://github.com/bram-code/llm-anonymization
This repository provides utilities for anonymizing, pseudonymizing, and simplifying Dutch text using various NLP techniques.
anonymization dutch large-language-models llm named-entity-recognition ner pseudonymisation simplification spacy
Last synced: 01 Feb 2025
https://github.com/innerdoc/spacy-for-datashare
Let spaCy do the parsing of Named Entities for documents in the Datashare platform
datashare elasticsearch named-entity-recognition natural-language-processing spacy
Last synced: 14 Mar 2025
https://github.com/randika00/ism-web-automation-y23cp-web
Web scraping refers to the extraction of data from a website. Be it a spreadsheet or an API.
2captcha-api beautifulsoup regex scrapy selenium spacy webdriver
Last synced: 28 Mar 2025
https://github.com/sloev/spacy_onnx_sentiment_english
english sentiment model for spacy
onnx-models sentiment-analysis spacy spacy-pipeline
Last synced: 28 Mar 2025
https://github.com/Keshabkjha/ClimaSense
ClimaSense is a web application that provides real-time weather information based on the user's location or any searched city. It features automatic location detection, manual search, and a chatbot, built using Python (Streamlit & spaCy), that responds to weather-related queries.
html-css-javascript niet-codetantra niet-training python python3 spacy spacy-nlp streamlit weather-api weather-app
Last synced: 13 Mar 2025
https://github.com/parthapray/noaa_nlp_weather_forecast_summarizer
This repo contains code for summarizing NOAA API JSON data into human-readable forecast text
gradio nlp noaa openai spacy summarization transformers weather-app weather-forecast
Last synced: 27 Feb 2025
https://github.com/whatevery1says/preprocessing
WE1S Preprocessing -- workflow preparing documents for import as WE1S data
digital-humanities humanities news nltk preprocessing spacy topic-modeling
Last synced: 04 Mar 2025
https://github.com/csfelix/nlp-0-spacy-course
💬 Advanced NLP with Spacy Course
natural-language-processing nlp python spacy
Last synced: 26 Mar 2025
https://github.com/somenath203/named-entity-recognizer
Click below to check out the website
huggingface huggingface-spaces named-entity-recognition ner spacy streamlit torch transformers
Last synced: 04 Mar 2025
https://github.com/izuna385/arxiv-checker
Single Page Application and its deployment for GCE.
docker docker-compose fastapi nginx react react-bootstrap spacy tdd
Last synced: 28 Mar 2025
https://github.com/sivkri/nlp-llm-textmining_genesearch
Text Mining (PubMed Search) with NLP & LLM
ensemble-machine-learning llm minilm named-entity-recognition ner nlp nlpmodels nltk pubmed pubmed-abstracts spacy textmining
Last synced: 12 Mar 2025
https://github.com/neuledge/spacy-api
An spaCy API service
docker machine-learning microservice nlp python spacy
Last synced: 13 Apr 2025
https://github.com/bjam24/agh-natural-language-processing
This repository contains projects made for the NLP course at the AGH UST in 2024 / 2025. They received the maximum grade of 5.0.
agh elasticsearch language-modeling language-modelling levenshtein llm ner neural-search nlp prompt-enginering question-answering rag regex spacy text-classificaiton text-classification
Last synced: 17 Mar 2025
https://github.com/raju-2003/indiaai-cyberguard-ai-hackathon
An NLP-powered system to simplify cybercrime reporting by analyzing descriptions, categorizing incidents, and providing actionable insights.
matplotlib nltk numpy pandas python random-forest-classifier re scikit-learn seaborn shap spacy wordcloud
Last synced: 17 Mar 2025
https://github.com/robgc/sento-processing
A Natural Language Processing tool designed to perform sentiment analysis on tweets and store the results obtained.
async asyncpg nlp python sentiment-analysis spacy spacy2
Last synced: 12 Apr 2025
https://github.com/wesslen/spacy-ecfr-ner
spaCy-Prodigy workflow for NER Citation model on eCFR Banking Regulation
Last synced: 06 Apr 2025
https://github.com/pyladiesams/nlp-projects-with-spacy-may2024
NLP projects with spaCy
Last synced: 05 Apr 2025
https://github.com/tomhalloin/Springboard-Berkshire
Topic model analysis of Berkshire Hathaway annual letters (Completed Capstone Project #2)
gensim nlp spacy springboard textacy topic-modeling
Last synced: 10 May 2025
https://github.com/aitechhero/nonullsense-nlp
Natural Language Processing (NLP) with libraries like spaCy, Transformers, and NLTK.
ai artificial-intelligence huggingface natural-language-processing nlp nltk python spacy text-analysis transformers
Last synced: 26 Feb 2025
https://github.com/surajiyer/topic-analysis
Python library to perform topic detection on textual data that are generated over time.
agglomerative-clustering gaussian-mixture-models nlp spacy spectral-clustering textual-data topic-analysis topic-modeling
Last synced: 29 Mar 2025
https://github.com/bghorvath/TextMiningTheBechdelTest
Text mining movie scripts to explore long-term trend of female representation in movies according to the Bechdel test
bechdel bechdel-test coreference-resolution neuralcoref spacy
Last synced: 09 May 2025
https://github.com/inshh04/codealpha_chatbotforfaqs_inshanadeem
The FAQ Chatbot is a Python-based conversational agent designed to interact with users and respond to frequently asked questions. It offers a simple and engaging way to provide automated responses, handle polite interactions like thanking the user, and end conversations gracefully. This project serves as a basic template for building more advanced.
chatbot faqbot faqchatbot faqs keyword-extraction nlp nlp-machine-learning progressive-web-app project python python3 pythonprojects spacy spacy-nlp
Last synced: 05 Apr 2025
https://github.com/den1ksk/nlp-with-disastertweets
Kaggle competition
bert data-science deeplearning kaggle machine-learning nlp nltk pytorch spacy transformers xgboost
Last synced: 15 Mar 2025
https://github.com/giuliosmall/twitter-trending-topics-pipeline
This project demonstrates trending topic detection using Apache Spark and MinIO. It processes Twitter JSON data with PySpark, leveraging distributed data processing and cloud storage. The entire project is containerized with Docker for easy deployment across architectures.
docker minio nlp pyspark pytest spacy spark streamlit
Last synced: 30 Mar 2025
https://github.com/geetisha/advanced-eda-and-text-mining
Advanced EDA and Text Mining
jupyter-notebook matplotlib nltk numpy pandas python spacy textblob wordcloud
Last synced: 18 Mar 2025
https://github.com/isabelleysseric/sentiment-analysis
Sentiment analysis with dependency tree.
bag-of-words-model corpus dependency-analysis dependency-parsing dependency-tree dependency-trees displacy embedding multigram nlp nltk parsing pos-tagging scope-of-negation sentiment-analysis sentimental-analysis sentiwordnet spacy text-classification
Last synced: 20 Feb 2025
https://github.com/navaneethelite/ner_streamlit
A general-purpose Named Entity Recognition model using spaCy v3. This web app was built using Streamlit and deployed to Heroku.
Last synced: 22 Mar 2025
https://github.com/debugger404/multilanguage-pos
Named Entity Recognition with SpaCy - 🌐📝 Repository for NER using SpaCy's MultiLanguage module. Supports multiple languages.
multilanguage named-entity-recognition ner python3 spacy
Last synced: 08 Apr 2025
https://github.com/kailejie/ner
This repository implements Named Entity Recognition (NER) using spaCy, NLTK, and BERT (from the Hugging Face Transformers library). The project runs on a Streamlit web application, allowing users to upload a CSV file containing subject lines to perform NER and visualize the results. It can be run locally or on Google Colab.
Last synced: 05 Apr 2025
https://github.com/2pa4ul2/mcq-quiz-maker-nlp
Quizzable: a quiz generator for short reviews with spaCy and NLTK
flask nlp nltk python question-generation quizapp spacy
Last synced: 05 Apr 2025
https://github.com/sukanyadutta52/sentiment-analysis
An Analysis of How Machine Perceives Women and How Women Feel about Themselves As a Result of This Perception: Sentiment Analysis
flair matplotlib nltk-library pandas regular-expression sentiment-analysis spacy textblob vader-sentiment-analysis women-beauty-standard
Last synced: 28 Mar 2025
https://github.com/sudip-13/nlp
Tutorial repo for an NLP Dialogflow chatbot with a configured back end
dialogflow fastapi fasttext mogodb ner regex spacy tf-idf
Last synced: 29 Mar 2025
https://github.com/ivan-kleshnin/spacy-benchmarks
Comparison of Spacy performance with different architectures, corpuses, hyperparams...
clearnlp nlp penn-treebank spacy universal-dependencies universaldependencies
Last synced: 07 Mar 2025
https://github.com/datarohit/nlp-course-files
The files in this Repo are files for the online NLP-Course from Udemy.com which I completed.
nlp nlp-machine-learning nltk numpy panda python sklearn spacy
Last synced: 09 Apr 2025
https://github.com/karimosman89/legal-document-nlp
Create a tool that uses NLP to extract key information from legal documents, contracts, or agreements. Use NLP techniques for named entity recognition and text classification. Streamline the review process for legal teams by automating information extraction.
nltk python scikit-learn spacy
Last synced: 19 Feb 2025
https://github.com/izuna385/pubtator-multiprocess-parser
Specifically for Entity Linking. Quick demo with MedMentions and NCBI datasets is also included.
allennlp bioinformatics entity-disambiguation entity-linking natural-language-processing pubtator spacy
Last synced: 28 Mar 2025
https://github.com/thyripian/core
This repository contains the Centralized Operational Reporting Engine (CORE), designed for processing diverse datasets and integrating with Elasticsearch, PostgreSQL, and SQLite. It features a React-based UI for interacting with the backend, offering data extraction, processing, and search functionalities.
api csv data-science elasticsearch flask fullstack-development javascript pandas postgresql python react spacy sqlite
Last synced: 01 Apr 2025