Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process, analyze, and generate human language. In 1950, Alan Turing published an article proposing a test of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/amazon-science/refined

ReFinED is an efficient and accurate entity linking (EL) system.

entity-extraction entity-linking entity-resolution nlp pytorch

Last synced: 08 Jan 2025

https://github.com/explosion/displacy-ent

💥 displaCy-ent.js: An open-source named entity visualiser for the modern web

css javascript named-entities natural-language-processing nlp spacy visualization

Last synced: 25 Sep 2024

https://github.com/milaan9/Python_Natural_Language_Processing

This repository consists of a complete guide on natural language processing (NLP) in Python where we'll learn various techniques for implementing NLP including parsing & text processing and understand how to use NLP for text feature engineering.

bag-of-words inversedocumentfrequency ipython-notebook lemmatization named-entity-recognition nlp partofspeech-tagger python4datascience python4everybody sentence-segmentation stemming stopwords termfrequency tf-idf tokenization tutor-milaan9 vocabulary-matching

Last synced: 25 Dec 2024

https://github.com/dkpro/dkpro-core

Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.

dkpro java natural-language-processing nlp uima uima-components

Last synced: 03 Jan 2025

https://github.com/MaartenGr/Concept

Concept Modeling: Topic Modeling on Images and Text

computer-vision image-processing nlp topic-modeling

Last synced: 05 Nov 2024

https://github.com/fanhuaandluomu/parselawdocuments

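Performs a series of analyses on collected legal documents, including automatic rule-based segmentation, case similarity computation, case clustering, and legal provision recommendation (experiments are currently based on marriage-related cases and can be extended to other domains).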

law nlp text-classification

Last synced: 12 Nov 2024

https://github.com/iPieter/RobBERT

A Dutch RoBERTa-based language model

bert bert-model language-model nlp nlp-resources roberta transformers

Last synced: 17 Nov 2024

https://github.com/stanford-oval/genie-toolkit

The Genie open source kit for voice assistant (formerly known as Almond)

hacktoberfest natural-language nlp semantic-parsers voice-assistant

Last synced: 06 Jan 2025

https://github.com/textvec/textvec

Text vectorization tool to outperform TFIDF for classification tasks

machine-learning natural-language-processing nlp python text-analysis text-classification text-processing tf-idf

Last synced: 05 Jan 2025

https://github.com/OpenNewsLabs/guri-vr

https://gurivr.com

nlp virtual-reality vr webvr

Last synced: 25 Nov 2024

https://github.com/dusty-nv/jetson-voice

ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT

deep-learning jetson jetson-nano nlp pytorch speech-recognition tensorrt text-to-speech

Last synced: 09 Jan 2025

https://github.com/WZBSocialScienceCenter/tmtoolkit

Text Mining and Topic Modeling Toolkit for Python with parallel processing power

evaluation nlp parallel-processing python socialscience text-processing topic-modeling

Last synced: 13 Nov 2024

https://github.com/yanndubs/hash-embeddings

PyTorch implementation of Hash Embeddings (NIPS 2017). Submission to the NIPS Implementation Challenge.

embeddings hashing nips nips-challenge nlp pytorch reproducible-research word-embeddings

Last synced: 27 Oct 2024

https://github.com/franck-dernoncourt/pubmed-rct

PubMed 200k RCT dataset: a large dataset for sequential sentence classification.

corpus machine-learning medical nlp randomized-controlled-trials sentence-classification

Last synced: 01 Dec 2024

https://github.com/soumyadip007/microsoft-student-partner-workshop-learning-materials-ai-nlp

This repository contains all code and materials for the current session, including the required code for natural language processing and artificial intelligence.

ai cloud distributed-networking microsoft nlp peer-to-peer workshop

Last synced: 10 Jan 2025

https://github.com/intelligo-mn/neuro

🔮 Neuro.js is a machine learning library for building AI assistants and chatbots.

ai ai-assistants bot chat-bot chat-bots chatbot machine-learning natural-language-processing nlp nodejs

Last synced: 04 Jan 2025

https://github.com/guotong1988/NL2SQL-RULE

Content Enhanced BERT-based Text-to-SQL Generation https://arxiv.org/abs/1910.07179

bert deep-learning knowledge knowledge-representation nl2sql nlp pytorch rule-inject-to-model semantic-parsing text2sql

Last synced: 11 Nov 2024

https://github.com/thammegowda/nllb-serve

Meta's "No Language Left Behind" models served as web app and REST API

machine-translation multilingual nlp transformers translation

Last synced: 05 Jan 2025

https://github.com/tomasonjo/neogpt-explorer

Knowledge-graph based chatbot using GPT3 and Neo4j

chatbot gpt-3 graph neo4j nlp streamlit

Last synced: 09 Jan 2025

https://github.com/dair-ai/emotion_dataset

😄 Dataset for Emotion Recognition Research

dataset machine-learning nlp pytorch

Last synced: 27 Dec 2024

https://github.com/ines/spacy-js

🎀 JavaScript API for spaCy with Python REST API

javascript natural-language-processing nlp python rest-api spacy

Last synced: 06 Jan 2025

https://github.com/ropensci/tokenizers

Fast, Consistent Tokenization of Natural Language Text

nlp peer-reviewed r r-package rstats text-mining tokenizer

Last synced: 22 Nov 2024

https://github.com/microsoft/presidio-research

This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.

deep-learning flair machine-learning named-entity-recognition natural-language-processing ner nlp pii privacy spacy transformers

Last synced: 04 Jan 2025

https://github.com/ShawnyXiao/2017-CCF-BDCI-AIJudge

2017-CCF-BDCI "Let AI Be the Judge" (preliminary round): 7th/415 (Top 1.68%)

2017 bdci ccf data-mining multiclass-classification nlp

Last synced: 01 Nov 2024

https://github.com/Attempto/APE

Parser for Attempto Controlled English (ACE)

ace attempto cnl nlp swi-prolog

Last synced: 14 Nov 2024

https://github.com/sorgerlab/indra

INDRA (Integrated Network and Dynamical Reasoning Assembler) is an automated model assembly system interfacing with NLP systems and databases to collect knowledge, and through a process of assembly, produce causal graphs and dynamical models.

bioinformatics biology computational-biology indra modeling nlp pysb sbml systems-biology

Last synced: 04 Jan 2025

https://github.com/explosion/spacymoji

💙 Emoji handling and meta data for spaCy with custom extension attributes

emoji emoji-unicode emojis natural-language-processing nlp spacy spacy-extension spacy-pipeline

Last synced: 05 Jan 2025

https://github.com/beader/ruijin_round2

Ruijin Hospital MMC AI-Assisted Knowledge Graph Construction Competition, second round

nlp relation-extraction tianchi

Last synced: 12 Nov 2024

https://github.com/hrwhisper/SpamMessage

Chinese spam SMS detection (hand-written classifiers)

machine-learning nlp python

Last synced: 19 Nov 2024

https://github.com/princeton-nlp/trime

[EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674

language-model nlp

Last synced: 11 Nov 2024

https://github.com/mannefedov/compling_nlp_hse_course

Materials for the computational linguistics course at the School of Linguistics, HSE University

computational-linguistics course hse machine-learning natural-language-processing nlp python

Last synced: 13 Nov 2024

https://github.com/martinomensio/spacy-universal-sentence-encoder

Google USE (Universal Sentence Encoder) for spaCy

models nlp spacy tensorflow-hub use

Last synced: 06 Jan 2025

https://github.com/daspartho/prompt-extend

extending stable diffusion prompts with suitable style cues using text generation

deep-learning gpt-2 huggingface-spaces huggingface-transformers machine-learning nlp prompt stable-diffusion text-generation

Last synced: 15 Dec 2024

https://github.com/simongray/clojure-dsl-resources

A curated list of Clojure resources for dealing with domain-specific languages.

data-transformation domain-specific-language dsl nlp parsing

Last synced: 10 Dec 2024

https://github.com/opensemanticsearch/open-semantic-entity-search-api

Open Source REST API for named entity extraction, named entity linking, named entity disambiguation, recommendation & reconciliation of entities like persons, organizations and places for (semi)automatic semantic tagging & analysis of documents by linked data knowledge graph like SKOS thesaurus, RDF ontology, database(s) or list(s) of names

api disambiguation entity-extraction knowledge-graph knowledgebase linked-data linked-data-api linkeddata named-entities named-entity-recognition natural-language-processing nlp python reconciliation reconciliation-service rest-api semantic semantic-analysis semantic-annotation thesaurus

Last synced: 27 Oct 2024

https://github.com/ownthink/semantic

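Semantic understanding / spoken language understanding. The project includes lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition) and spoken language understanding (domain classification, slot filling, intent recognition).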

nlp nlu slu

Last synced: 20 Dec 2024

https://github.com/PaddlePaddle/PALM

A fast, flexible, extensible, and easy-to-use framework for large-scale NLP pretraining and multi-task learning.

baidu multi-task-learning nlp paddlepaddle pretrain-model transformers

Last synced: 27 Nov 2024

https://github.com/xatkit-bot-platform/xatkit

The simplest way to build all types of smart chatbots and digital assistants

bot chatbot-framework chatbots conversational-ai digital-assistant dsl low-code nlp no-code

Last synced: 07 Nov 2024

https://github.com/coastalcph/lex-glue

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

benchmark lawtech legal legaltech nlp

Last synced: 02 Nov 2024

https://github.com/dengbocong/text-similarity

Text similarity (matching) computation, providing baselines, training, inference, metric analysis... The code comes in both TensorFlow and PyTorch versions.

bert deep-learning mechine-learing model nlp pytorch similarity text-classification transformer

Last synced: 09 Jan 2025

https://github.com/dccuchile/wefe

WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!

bias-detection bias-reduction fairness-ai fairness-ml library nlp nlp-library python3 word-embedding-evaluation word-embedding-fairness word-embeddings

Last synced: 22 Nov 2024

https://github.com/fiddler-labs/fiddler-auditor

Fiddler Auditor is a tool to evaluate language models.

ai-observability evaluation generative-ai langchain llms nlp robustness

Last synced: 05 Jan 2025

https://github.com/yohasebe/wp2txt

A command-line toolkit to extract text content and category data from Wikipedia dump files

corpus machine-learning nlp ruby wikipedia wikipedia-dump

Last synced: 04 Jan 2025

https://github.com/shjwudp/shu

A curated collection of Chinese books

books dataset nlp

Last synced: 02 Dec 2024

https://github.com/princeton-nlp/cofipruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

bert model-compression nlp pruning

Last synced: 11 Nov 2024

https://github.com/uzay-g/espial

Espial is an engine for automated organization and discovery of personal knowledge

knowledge knowledge-graph nlp python

Last synced: 27 Oct 2024

https://github.com/j2kao/fcc_nn_research

(somewhat) cleaned-up notebooks used in researching public comments for FCC Proceeding 17-108 (Net Neutrality Repeal)

fcc net-neutrality nlp

Last synced: 29 Nov 2024

https://github.com/cyberzhg/keras-xlnet

Implementation of XLNet that can load pretrained checkpoints

glue keras language-model nlp xlnet

Last synced: 27 Sep 2024

https://github.com/HKUSTDial/NL2SQL_Handbook

This is a continuously updated handbook for readers to easily track the latest NL2SQL techniques in the literature and provide practical guidance for researchers and practitioners.

awesome finetuning llms nl-to-code nl-to-sql nl2sql nlp nlp-resources survey text-to-sql text2sql tutorial

Last synced: 02 Nov 2024

https://github.com/hscspring/all4nlp

All For NLP, especially Chinese.

ai deeplearning machinelearning nlp

Last synced: 03 Jan 2025

https://github.com/avidale/compress-fasttext

Tools for shrinking fastText models (in gensim format)

fasttext fasttext-embeddings nlp python word-embeddings

Last synced: 13 Nov 2024

https://github.com/irudnyts/openai

An R package-wrapper around OpenAI API

api ml nlp openai package r

Last synced: 04 Dec 2024

https://github.com/ethancaballero/improved-dynamic-memory-networks-dmn-plus

Theano Implementation of DMN+ (Improved Dynamic Memory Networks) from the paper by Xiong, Merity, & Socher at MetaMind, http://arxiv.org/abs/1603.01417 (Dynamic Memory Networks for Visual and Textual Question Answering)

babi-tasks deep-learning deep-neural-networks neural-network nlp question-answering

Last synced: 15 Dec 2024

https://github.com/akshaynagpal/w2n

Convert number words (e.g., twenty one) to numeric digits (21)

nlp numeric-digits python word-to-number

Last synced: 22 Nov 2024

https://github.com/IlyaGusev/summarus

Models for automatic abstractive summarization

deep-learning machine-learning nlp pytorch summarization

Last synced: 04 Nov 2024

https://github.com/rylans/getlang

Natural language detection package in pure Go

language-model natural-language nlp

Last synced: 26 Oct 2024

https://github.com/prrao87/fine-grained-sentiment

A comparison and discussion of different NLP methods for 5-class sentiment classification on the SST-5 dataset.

fasttext flair nlp python pytorch sentiment-analysis text-classification transformers

Last synced: 08 Jan 2025

https://github.com/algolisted-org/algolisted

Algolisted is an AI-powered platform dedicated to assisting computer science students in preparing for placements and internships. Our services include tracking and analytics across various platforms and topics.

ai css firebase hacktoberfest-2023 javascript mern-stack ml nlp python3 react-js web-scraping

Last synced: 05 Jan 2025

https://github.com/doc-analysis/xfund

XFUND: A Multilingual Form Understanding Benchmark

dataset natural-language-processing nlp

Last synced: 01 Dec 2024

https://github.com/princeton-nlp/optiprompt

[NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240

nlp probing prompt

Last synced: 11 Nov 2024

https://github.com/natasha/navec

Compact high quality word embeddings for Russian language

embeddings glove nlp python quantization russian word2vec

Last synced: 05 Jan 2025

https://github.com/indix/whatthelang

Lightning Fast Language Prediction 🚀

fasttext language-detection languages nlp python

Last synced: 09 Jan 2025

https://github.com/kuutsav/information-retrieval

Neural information retrieval / Semantic search / Bi-encoders

information-retrieval machine-learning nlp semantic-search

Last synced: 15 Nov 2024

https://github.com/crazyofapple/Reading_groups

A paper and resource list for large language models, including courses, papers, demos, and figures

chatgpt gpt-3 gpt-4 large-language-models llm llms natural-language-processing nlp

Last synced: 10 Nov 2024

https://github.com/geekjr/quickai

QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.

ai artificial-intelligence bert deep-learning dl easy-to-use fast gpt gpt-neo huggingface-transformers ml neural-network nlp object-detection python pytorch quickai research tensorflow2 yolo

Last synced: 09 Nov 2024

https://github.com/platisd/duplicate-code-detection-tool

A simple Python3 tool to detect similarities between files within a repository

code-duplication gensim nlp

Last synced: 07 Jan 2025

https://github.com/NPCai/Open-IE-Papers

Open Information Extraction (OpenIE) and Open Relation Extraction (ORE) papers and data.

information-extraction literature-review nlp openie papers relation-extraction tuples

Last synced: 10 Nov 2024