Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article, "Computing Machinery and Intelligence," that proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.

https://github.com/vrasneur/pyfasttext

Yet another Python binding for fastText

fasttext machine-learning nlp numpy python python-bindings word-vectors

Last synced: 07 Nov 2024

https://github.com/BLLIP/bllip-parser

BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser). See http://pypi.python.org/pypi/bllipparser/ for the Python module.

ai artificial-intelligence computational-linguistics machine-learning natural-language-processing nlp nlp-library parsing

Last synced: 30 Oct 2024

https://github.com/swabhs/open-sesame

A frame-semantic parsing system based on a softmax-margin SegRNN.

crf deep-learning dynet frame-semantic-parsing natural-language-processing nlp python27

Last synced: 12 Oct 2024

https://github.com/maxim5/cs224n-2017-winter

All lecture notes, slides and assignments from CS224n: Natural Language Processing with Deep Learning class by Stanford

cs224n deep-learning machine-learning nlp stanford-nlp

Last synced: 05 Nov 2024

https://github.com/hppRC/bert-classification-tutorial

[2023 edition] Text classification with BERT

bert deep-learning japanese nlp python pytorch transformers

Last synced: 06 Nov 2024

https://github.com/daac-tools/vaporetto

🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer

analyzer japanese morphological-analysis nlp rust segmentation tokenization tokenizer

Last synced: 07 Nov 2024

https://github.com/hpprc/bert-classification-tutorial

[2023 edition] Text classification with BERT

bert deep-learning japanese nlp python pytorch transformers

Last synced: 01 Nov 2024

https://github.com/houbb/pinyin

The high-performance pinyin tool for Java. (A high-performance Chinese-to-pinyin conversion tool for Java. Supports homophones.)

dfa high-performance nlp pinyin pinyin-analysis pinyin-data pinyin-segmentation pinyin4j segment tiny tiny-pinyin tongyinzi

Last synced: 07 Nov 2024

https://github.com/fedml-ai/fednlp

FedNLP: An Industry and Research Integrated Platform for Federated Learning in Natural Language Processing, Backed by FedML, Inc. The Previous Research Version is Accepted to NAACL 2022

federated-learning machine-learning natural-language-processing nlp

Last synced: 08 Nov 2024

https://github.com/FedML-AI/FedNLP

FedNLP: An Industry and Research Integrated Platform for Federated Learning in Natural Language Processing, Backed by FedML, Inc. The Previous Research Version is Accepted to NAACL 2022

federated-learning machine-learning natural-language-processing nlp

Last synced: 11 Nov 2024

https://github.com/vzhong/embeddings

Fast, DB Backed pretrained word embeddings for natural language processing.

deep-learning neural-network nlp

Last synced: 13 Nov 2024

https://github.com/natasha/slovnet

Deep Learning based NLP modeling for Russian language

bert deep-learning machine-learning morphology ner nlp python pytorch russian syntax

Last synced: 11 Oct 2024

https://github.com/mindflowai/mindflow

🧠 AI-powered CLI git wrapper, boilerplate code generator, chat history manager, and code search engine to streamline your dev workflow 🌊

chat-gpt cli code-generation command-line-interface dev-tools git git-wrapper information-retrieval large-language-models llm machine-learning modern-dev-tools nlp openai openai-api python search search-engine

Last synced: 29 Oct 2024

https://github.com/sunyilgdx/NSP-BERT

The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

bert correference-resolution entity-linking entity-typing natural-language-inference nlp prompt-learning sentence-classification sentiment-analysis tensorflow text-classification zero-shot

Last synced: 03 Aug 2024

https://github.com/openvenues/node-postal

NodeJS bindings to libpostal for fast international address parsing/normalization

address address-parser binding international native nlp

Last synced: 09 Nov 2024

https://github.com/soskek/bert-chainer

Chainer implementation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

bert chainer google natural-language-processi natural-language-understanding nlp transformer

Last synced: 02 Nov 2024

https://github.com/mmxgn/spacy-clausie

Implementation of the ClausIE information extraction system for python+spacy

clausie information-extraction nlp problog python-spacy spacy

Last synced: 30 Sep 2024

https://github.com/IngestAI/embedditor

⚡ GUI for editing LLM vector embeddings. No more blind chunking. Upload content in any file extension, join and split chunks, edit metadata and embedding tokens + remove stop-words and punctuation with one click, add images, and download in .veml to share it with your team.

datapreprocessing datascience embedding-vectors embeddings genai laravel llm markup-language ml nlp nltk php vector-database vector-search vectorization veml

Last synced: 31 Oct 2024

https://github.com/naver/claf

CLaF: Open-Source Clova Language Framework

clova framework language natural-language-processing nlp pytorch

Last synced: 08 Nov 2024

https://github.com/bnosac/udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit

conll dependency-parser lemmatization natural-language-processing nlp pos-tagging r r-package r-pkg rcpp text-mining tokenizer udpipe

Last synced: 11 Nov 2024

https://github.com/cohere-ai/sandbox-topically

Topic modeling helpers using managed language models from Cohere. Name text clusters using large GPT models.

machine-learning nlp python topic-modeling

Last synced: 07 Oct 2024

https://github.com/jaidevd/numerizer

A Python module to convert natural language numerics into ints and floats.

information-extraction nlp regular-expressions spacy spacy-extension

Last synced: 14 Oct 2024

https://github.com/brutalcoding/aub.ai

AubAI brings you on-device gen-AI capabilities, including offline text generation and more, directly within your app.

android dart flutter gemini gemini-nano gen-ai genai indiedev ios ipados linux llamacpp localllama macos mistral-7b native-apps nlp on-device on-device-ai pubdev

Last synced: 10 Oct 2024

https://github.com/ticki/eudex

A blazingly fast phonetic reduction/hashing algorithm.

nlp

Last synced: 02 Nov 2024

https://github.com/akaza-im/akaza

Yet another Japanese IME for IBus/Linux

ibus ime nlp rust

Last synced: 07 Nov 2024

https://github.com/akoksal/Turkish-Word2Vec

Pre-trained Word2Vec Model for Turkish

gensim nlp turkish word2vec

Last synced: 12 Nov 2024

https://github.com/vipul-sharma20/sharingan

Tool to extract news articles from newspapers and give context about the news

context-extraction news-extraction nlp opencv

Last synced: 10 Nov 2024

https://github.com/Fixy-TR/fixy

Our goal is to build an open source writing assistant/spell checker that tackles many different problems in the Turkish NLP literature together, proposes unique approaches, and fills gaps in existing work. It aims to correct spelling mistakes in users' texts with a deep learning approach while also performing semantic analysis on the texts, so that errors arising in that context can be detected and corrected as well.

acikhack2 ai artificial-intelligence bert data-science deep-learning deeplearning keras natural-language-processing neural-network neural-networks nlp python

Last synced: 12 Nov 2024

https://github.com/davidberenstein1957/classy-classification

This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Huggingface.

few-shot-classifcation hacktoberfest machine-learning natural-language-processing nlp nlu sentence-transformers spacy text-classification

Last synced: 14 Oct 2024

https://github.com/ayaka14732/llama-2-jax

JAX implementation of the Llama 2 model

jax llama llama2 natural-language-processing nlp

Last synced: 26 Oct 2024

https://github.com/nisaaragharia/advanced_rag

Advanced Retrieval-Augmented Generation (RAG) through practical notebooks, using LangChain, OpenAI GPTs, Meta Llama 3, and agents.

agent agents ai chatgpt genai langchain llama3 llm machine-learning nlp openai rag retrival-augmented vectordb

Last synced: 10 Oct 2024

https://github.com/thunlp/thuctc

An Efficient Chinese Text Classifier

chinese-nlp nlp

Last synced: 10 Nov 2024

https://github.com/neuml/rag

🚀 Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data.

large-language-models llm machine-learning nlp python rag retrieval-augmented-generation search txtai

Last synced: 20 Oct 2024

https://github.com/coteries/cedille-ai

✒️ Cedille is a large French language model (6B), released under an open-source license

machine-learning nlg nlp

Last synced: 04 Nov 2024

https://github.com/thunlp/THUCTC

An Efficient Chinese Text Classifier

chinese-nlp nlp

Last synced: 08 Nov 2024

https://github.com/erfanzar/easydel

Accelerate and optimize performance with streamlined training and serving options in JAX.

easydel flax gpt jax machine-learning mojo nlp optax transformers

Last synced: 07 Nov 2024

https://github.com/explosion/displacy-ent

💥 displaCy-ent.js: An open-source named entity visualiser for the modern web

css javascript named-entities natural-language-processing nlp spacy visualization

Last synced: 25 Sep 2024

https://github.com/sea-snell/implicit-language-q-learning

Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning"

implicit-q-learning iql language-model nlp offline-rl python pytorch q-learning reinforcement-learning

Last synced: 27 Oct 2024

https://github.com/kavgan/rouge-2.0

ROUGE automatic summarization evaluation toolkit. Support for ROUGE-[N, L, S, SU], stemming and stopwords in different languages, unicode text evaluation, CSV output.

evaluation evaluation-toolkit java metrics nlp rouge rouge-l rouge-n rouge-s rouge-su text-summarization unicode-text

Last synced: 30 Oct 2024

https://github.com/dkpro/dkpro-core

Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.

dkpro java natural-language-processing nlp uima uima-components

Last synced: 13 Nov 2024

https://github.com/fanhuaandluomu/parselawdocuments

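Performs a series of analyses on collected legal documents, including automatic segmentation according to document standards, case similarity computation, case clustering, and legal provision recommendation (experiments are currently based on marriage-related cases and can be extended to other domains).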

law nlp text-classification

Last synced: 12 Nov 2024

https://github.com/MaartenGr/Concept

Concept Modeling: Topic Modeling on Images and Text

computer-vision image-processing nlp topic-modeling

Last synced: 05 Nov 2024

https://github.com/vishwasg217/fin-sight

FinSight - Financial Insights at Your Fingertip: FinSight is a cutting-edge AI assistant tailored for portfolio managers, investors, and finance enthusiasts. It streamlines the process of gaining crucial insights and summaries about a company in a user-friendly manner.

fintech langchain llama-index llms nlp streamlit

Last synced: 10 Oct 2024

https://github.com/stanford-oval/genie-toolkit

The Genie open source kit for voice assistants (formerly known as Almond)

hacktoberfest natural-language nlp semantic-parsers voice-assistant

Last synced: 13 Nov 2024

https://github.com/OpenNewsLabs/guri-vr

https://gurivr.com

nlp virtual-reality vr webvr

Last synced: 06 Aug 2024

https://github.com/textvec/textvec

Text vectorization tool to outperform TFIDF for classification tasks

machine-learning natural-language-processing nlp python text-analysis text-classification text-processing tf-idf

Last synced: 29 Oct 2024

https://github.com/rizerphe/obsidian-companion

Autocomplete your obsidian notes with AI, including ChatGPT, through a copilot-like interface.

ai ai21labs chatgpt groq groq-ai large-language-models llm llm-local nlp obsidian-md obsidian-plugin ollama oobabooga openai

Last synced: 10 Oct 2024

https://github.com/maartengr/concept

Concept Modeling: Topic Modeling on Images and Text

computer-vision image-processing nlp topic-modeling

Last synced: 26 Oct 2024

https://github.com/iPieter/RobBERT

A Dutch RoBERTa-based language model

bert bert-model language-model nlp nlp-resources roberta transformers

Last synced: 03 Aug 2024

https://github.com/WZBSocialScienceCenter/tmtoolkit

Text Mining and Topic Modeling Toolkit for Python with parallel processing power

evaluation nlp parallel-processing python socialscience text-processing topic-modeling

Last synced: 13 Nov 2024

https://github.com/yanndubs/hash-embeddings

PyTorch implementation of Hash Embeddings (NIPS 2017). Submission to the NIPS Implementation Challenge.

embeddings hashing nips nips-challenge nlp pytorch reproducible-research word-embeddings

Last synced: 27 Oct 2024

https://github.com/houbb/word-checker

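🇨🇳🇬🇧 Chinese and English word spelling corrector. (Detection of commonly mistaken Chinese characters, Chinese spelling detection and correction, and an English word spell-checking tool.)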

cc csc english-word java nlp spelling spelling-correction word

Last synced: 07 Nov 2024

https://github.com/milaan9/python_natural_language_processing

This repository consists of a complete guide on natural language processing (NLP) in Python where we'll learn various techniques for implementing NLP including parsing & text processing and understand how to use NLP for text feature engineering.

bag-of-words inversedocumentfrequency ipython-notebook lemmatization named-entity-recognition nlp partofspeech-tagger python4datascience python4everybody sentence-segmentation stemming stopwords termfrequency tf-idf tokenization tutor-milaan9 vocabulary-matching

Last synced: 11 Oct 2024

https://github.com/yaroslavyaroslav/openai-sublime-text

First class Sublime Text AI assistant with GPT-o1 and ollama support!

chatgpt gpt-4 nlp openai sublime-text

Last synced: 11 Nov 2024

https://github.com/guotong1988/NL2SQL-RULE

Content Enhanced BERT-based Text-to-SQL Generation https://arxiv.org/abs/1910.07179

bert deep-learning knowledge knowledge-representation nl2sql nlp pytorch rule-inject-to-model semantic-parsing text2sql

Last synced: 11 Nov 2024

https://github.com/intelligo-mn/neuro

🔮 Neuro.js is a machine learning library for building AI assistants and chat-bots.

ai ai-assistants bot chat-bot chat-bots chatbot machine-learning natural-language-processing nlp nodejs

Last synced: 12 Nov 2024

https://github.com/soumyadip007/microsoft-student-partner-workshop-learning-materials-ai-nlp

This repository contains all code and materials for the current session, including the required code on natural language processing and artificial intelligence.

ai cloud distributed-networking microsoft nlp peer-to-peer workshop

Last synced: 27 Oct 2024

https://github.com/dair-ai/emotion_dataset

😄 Dataset for Emotion Recognition Research

dataset machine-learning nlp pytorch

Last synced: 10 Nov 2024

https://github.com/ines/spacy-js

🎀 JavaScript API for spaCy with Python REST API

javascript natural-language-processing nlp python rest-api spacy

Last synced: 30 Oct 2024

https://github.com/franck-dernoncourt/pubmed-rct

PubMed 200k RCT dataset: a large dataset for sequential sentence classification.

corpus machine-learning medical nlp randomized-controlled-trials sentence-classification

Last synced: 14 Oct 2024

https://github.com/tomasonjo/neogpt-explorer

Knowledge-graph based chatbot using GPT3 and Neo4j

chatbot gpt-3 graph neo4j nlp streamlit

Last synced: 12 Nov 2024

https://github.com/ropensci/tokenizers

Fast, Consistent Tokenization of Natural Language Text

nlp peer-reviewed r r-package rstats text-mining tokenizer

Last synced: 05 Aug 2024

https://github.com/ShawnyXiao/2017-CCF-BDCI-AIJudge

2017 CCF BDCI: Let AI Be the Judge (preliminary round): 7th/415 (Top 1.68%)

2017 bdci ccf data-mining multiclass-classification nlp

Last synced: 01 Nov 2024

https://github.com/Attempto/APE

Parser for Attempto Controlled English (ACE)

ace attempto cnl nlp swi-prolog

Last synced: 02 Aug 2024

https://github.com/explosion/spacymoji

💙 Emoji handling and meta data for spaCy with custom extension attributes

emoji emoji-unicode emojis natural-language-processing nlp spacy spacy-extension spacy-pipeline

Last synced: 07 Oct 2024

https://github.com/houbb/nlp-hanzi-similar

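The hanzi similarity tool. (A tool for computing the similarity of Chinese characters, with an algorithm for visually similar characters. It can be used for correcting handwritten Chinese character recognition, text obfuscation, and so on.)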

chinese data han nlp ocr word-correction

Last synced: 07 Nov 2024

https://github.com/beader/ruijin_round2

Ruijin Hospital MMC AI-assisted knowledge graph construction competition, second round

nlp relation-extraction tianchi

Last synced: 12 Nov 2024

https://github.com/princeton-nlp/trime

[EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674

language-model nlp

Last synced: 11 Nov 2024

https://github.com/mannefedov/compling_nlp_hse_course

Materials for the computational linguistics course at the School of Linguistics, HSE University (NRU HSE)

computational-linguistics course hse machine-learning natural-language-processing nlp python

Last synced: 13 Nov 2024

https://github.com/hrwhisper/SpamMessage

Chinese spam SMS detection (hand-written classifiers)

machine-learning nlp python

Last synced: 04 Aug 2024

https://github.com/thammegowda/nllb-serve

Meta's "No Language Left Behind" models served as web app and REST API

machine-translation multilingual nlp transformers translation

Last synced: 12 Nov 2024

https://github.com/martinomensio/spacy-universal-sentence-encoder

Google USE (Universal Sentence Encoder) for spaCy

models nlp spacy tensorflow-hub use

Last synced: 30 Oct 2024

https://github.com/opensemanticsearch/open-semantic-entity-search-api

Open Source REST API for named entity extraction, named entity linking, named entity disambiguation, recommendation & reconciliation of entities like persons, organizations and places for (semi)automatic semantic tagging & analysis of documents by linked data knowledge graph like SKOS thesaurus, RDF ontology, database(s) or list(s) of names

api disambiguation entity-extraction knowledge-graph knowledgebase linked-data linked-data-api linkeddata named-entities named-entity-recognition natural-language-processing nlp python reconciliation reconciliation-service rest-api semantic semantic-analysis semantic-annotation thesaurus

Last synced: 27 Oct 2024