Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science concerned with how computers process, analyze, and generate human language. In 1950, Alan Turing published "Computing Machinery and Intelligence", which proposed a test of machine intelligence now called the Turing test. More recent techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

https://github.com/vortezwohl/ceo-agentic-ai-framework

An ultra-lightweight agentic AI framework based on the ReAct paradigm that supports mainstream LLMs and is stronger than Swarm.

agent agi ai ai-agent autogen crewai framework gpt langchain llm llm-framework multiagent nlp openai python swarm tool-learning tool-oriented-learning tools transformer

Last synced: 09 Jan 2025

https://github.com/natasha/razdel

Rule-based token and sentence segmentation for the Russian language

nlp python russian sentence-boundary-detection sentence-segmentation tokenization

Last synced: 03 Jan 2025

https://github.com/PlanTL-GOB-ES/lm-spanish

Official source for Spanish language models and resources made at BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).

benchmarks corpora embeddings language-model nlp transformers

Last synced: 22 Nov 2024

https://github.com/zilliztech/akcio

Akcio is a demonstration project for Retrieval Augmented Generation (RAG). It uses an LLM to generate responses and a vector database to fetch relevant documents, improving the quality and relevance of the output.

artificial-intelligence chatbot chatgpt dolly embeddings ernie-bot fastapi gradio langchain llm milvus minimax nlp openai retrieval-augmented-generation retrieval-chatbot semantic-search towhee

Last synced: 29 Nov 2024

https://github.com/cbaziotis/neat-vision

Neat (Neural Attention) Vision is a framework-agnostic visualization tool for the attention mechanisms of deep-learning models for Natural Language Processing (NLP) tasks.

attention attention-mechanism attention-mechanisms attention-scores attention-visualization deep-learning deep-learning-library deep-learning-visualization natural-language-processing nlp self-attention self-attentive-rnn text-visualization visualization vuejs

Last synced: 06 Nov 2024

https://github.com/irlab-sdu/fuzi.mingcha

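Fuzi.Mingcha (夫子•明察) is a Chinese judicial large language model jointly developed by Shandong University, Inspur Cloud, and China University of Political Science and Law. Built on ChatGLM as its base model, it is trained on massive unsupervised Chinese judicial corpora and supervised judicial fine-tuning data. The model supports statute retrieval, case analysis, syllogistic reasoning for judgments, and judicial dialogue, aiming to provide users with comprehensive, highly accurate legal consultation and answers.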

chatglm-6b judicial large-language-models legal legal-ai legalai llms nlp pretrained-models

Last synced: 02 Nov 2024

https://github.com/princeton-nlp/WebShop

[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents

decision-making language language-grounding ml nlp rl rl-environment shopping sim-to-real web-based

Last synced: 09 Nov 2024

https://github.com/kanyun-inc/fairseq-gec

Source code for paper: Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data

grammar nlp

Last synced: 09 Jan 2025

https://github.com/vortezwohl/CEO-Agentic-AI-Framework

An ultra-lightweight agentic AI framework based on the ReAct paradigm that supports mainstream LLMs and is stronger than Swarm.

agent agi ai ai-agent autogen crewai framework gpt langchain llm llm-framework multiagent nlp openai python swarm tool-learning tool-oriented-learning tools transformer

Last synced: 06 Jan 2025

https://github.com/houbb/nlp-hanzi-similar

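The hanzi similar tool. (A tool for computing Chinese character similarity, with an algorithm for visually similar Chinese characters. It can be used for correcting handwritten character recognition, text obfuscation, and so on.)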

chinese data han nlp ocr word-correction

Last synced: 07 Jan 2025

https://github.com/webanno/webanno

🆕 Work continues on INCEpTION 👉 https://github.com/inception-project/inception 👈 -- ⚠️ The official WebAnno repository has reached the end of the line. -- 🚀 To migrate, export your annotation projects from WebAnno, then import them into INCEpTION and continue working.

annotation annotation-editor annotation-tool java nlp web-application

Last synced: 29 Oct 2024

https://webanno.github.io/webanno/

🆕 Work continues on INCEpTION 👉 https://github.com/inception-project/inception 👈 -- ⚠️ The official WebAnno repository has reached the end of the line. -- 🚀 To migrate, export your annotation projects from WebAnno, then import them into INCEpTION and continue working.

annotation annotation-editor annotation-tool java nlp web-application

Last synced: 28 Oct 2024

https://github.com/IBM/transition-amr-parser

SoTA Abstract Meaning Representation (AMR) parsing with word-node alignments in PyTorch. Includes checkpoints and other tools such as statistical significance Smatch.

abstract-meaning-representation amr amr-graphs amr-parser amr-parsing machine-learning nlp semantic-parsing

Last synced: 11 Nov 2024

https://github.com/neomatrix369/nlp_profiler

A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.

google-colab grammar-checks hacktoberfest jupyter kaggle-kernels natural-language-processing nlp nlp-keywords-extraction nlp-library nlp-machine-learning nlp-parsing nlp-profiler profiler profiling profiling-datasets text-mining

Last synced: 04 Jan 2025

https://github.com/davidberenstein1957/concise-concepts

This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.

few-shot-classifcation gensim hacktoberfest machine-learning natural-language-processing ner nlp spacy

Last synced: 08 Jan 2025

https://github.com/louisfb01/best_ai_papers_2023

A curated list of the latest breakthroughs in AI (in 2023) by release date with a clear video explanation, link to a more in-depth article, and code.

ai artificial-intelligence computer-vision machine-learning ml nlp paper papers python research state-of-the-art

Last synced: 08 Jan 2025

https://github.com/explosion/spacy-services

💫 REST microservices for various spaCy-related tasks

falcon natural-language-processing nlp rest-api rest-microservice spacy

Last synced: 25 Sep 2024

https://github.com/brutalcoding/aub.ai

AubAI brings you on-device gen-AI capabilities, including offline text generation and more, directly within your app.

android dart flutter gemini gemini-nano gen-ai genai indiedev ios ipados linux llamacpp localllama macos mistral-7b native-apps nlp on-device on-device-ai pubdev

Last synced: 08 Jan 2025

https://github.com/houbb/word-checker

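🇨🇳🇬🇧Chinese and English word spelling corrector. (Detection of commonly miswritten Chinese characters, Chinese spelling detection and correction, and an English word spell-checking tool.)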

cc csc english-word java nlp spelling spelling-correction word

Last synced: 07 Jan 2025

https://github.com/ajdavidl/portuguese-nlp

List of resources and tools developed with focus on Portuguese.

nlp portuguese portuguese-language

Last synced: 20 Nov 2024

https://github.com/lucasmccabe/emailgpt

a quick and easy interface to generate emails with ChatGPT

chatgpt gpt nlp openai productivity streamlit

Last synced: 09 Jan 2025

https://github.com/ajdavidl/Portuguese-NLP

List of resources and tools developed with focus on Portuguese.

nlp portuguese portuguese-language

Last synced: 12 Nov 2024

https://github.com/devmount/germanwordembeddings

Toolkit to obtain and preprocess German text corpora, train models and evaluate them with generated test sets. Built with Gensim and TensorFlow.

deep-learning deep-neural-networks evaluation gensim german-language model natural-language-processing neural-network nlp training word-embeddings word2vec

Last synced: 06 Jan 2025

https://github.com/ha-lins/metalearning4nlp-papers

A list of recent papers about meta-learning and few-shot learning methods applied to NLP.

dialogue-systems few-shot-learning low-resource meta-learning nlp papers-collection semantic-parsing

Last synced: 05 Dec 2024

https://github.com/telekom/create-tsi

Create-tsi is a generative AI RAG toolkit which generates AI Applications with low code.

ai chatbot llm machine-learning nlp openai-api rag transformer

Last synced: 03 Jan 2025

https://github.com/hxu296/nlp-resume-parser

NLP-powered, GPT-3 enabled Resume Parser from PDF to JSON.

gpt-3 nlp nlp-parsing open-ai parser resume resume-parer

Last synced: 09 Nov 2024

https://github.com/swabhs/open-sesame

A frame-semantic parsing system based on a softmax-margin SegRNN.

crf deep-learning dynet frame-semantic-parsing natural-language-processing nlp python27

Last synced: 08 Jan 2025

https://github.com/j-min/dl-for-chatbot

Deep Learning / NLP tutorial for Chatbot Developers

chatbot deep-learning nlp pytorch tutorial

Last synced: 19 Nov 2024

https://github.com/maxim5/cs224n-2017-winter

All lecture notes, slides and assignments from CS224n: Natural Language Processing with Deep Learning class by Stanford

cs224n deep-learning machine-learning nlp stanford-nlp

Last synced: 23 Dec 2024

https://github.com/seopbo/nlp_classification

Implementing NLP papers relevant to classification with PyTorch and GluonNLP

classification korean-nlp nlp pytorch-implementation pytorch-nlp text-classification

Last synced: 09 Jan 2025

https://github.com/as-ideas/headliner

🏖 Easy training and deployment of seq2seq models.

neural-network nlp python seq2seq tensorflow

Last synced: 07 Nov 2024

https://github.com/lucasmccabe/emailGPT

a quick and easy interface to generate emails with ChatGPT

chatgpt gpt nlp openai productivity streamlit

Last synced: 07 Nov 2024

https://github.com/jaidevd/numerizer

A Python module to convert natural language numerics into ints and floats.

information-extraction nlp regular-expressions spacy spacy-extension

Last synced: 04 Jan 2025

https://github.com/vrasneur/pyfasttext

Yet another Python binding for fastText

fasttext machine-learning nlp numpy python python-bindings word-vectors

Last synced: 07 Nov 2024

https://github.com/BLLIP/bllip-parser

BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser) See http://pypi.python.org/pypi/bllipparser/ for Python module.

ai artificial-intelligence computational-linguistics machine-learning natural-language-processing nlp nlp-library parsing

Last synced: 30 Oct 2024

https://github.com/natasha/slovnet

Deep Learning based NLP modeling for Russian language

bert deep-learning machine-learning morphology ner nlp python pytorch russian syntax

Last synced: 04 Jan 2025

https://github.com/daac-tools/vaporetto

🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer

analyzer japanese morphological-analysis nlp rust segmentation tokenization tokenizer

Last synced: 03 Jan 2025

https://github.com/hppRC/bert-classification-tutorial

[2023 edition] Text classification with BERT

bert deep-learning japanese nlp python pytorch transformers

Last synced: 06 Nov 2024

https://github.com/hpprc/bert-classification-tutorial

[2023 edition] Text classification with BERT

bert deep-learning japanese nlp python pytorch transformers

Last synced: 08 Jan 2025

https://github.com/FedML-AI/FedNLP

FedNLP: An Industry and Research Integrated Platform for Federated Learning in Natural Language Processing, Backed by FedML, Inc. The Previous Research Version is Accepted to NAACL 2022

federated-learning machine-learning natural-language-processing nlp

Last synced: 11 Nov 2024

https://github.com/fedml-ai/fednlp

FedNLP: An Industry and Research Integrated Platform for Federated Learning in Natural Language Processing, Backed by FedML, Inc. The Previous Research Version is Accepted to NAACL 2022

federated-learning machine-learning natural-language-processing nlp

Last synced: 01 Jan 2025

https://github.com/zoho/hawking

A natural language date-time parser that extracts dates and times from text with context and parses them into the required format

configurations-hawking corenlp date-parser dateformat datetime datetimepicker machinelearning nlp nlp-libary parse-dates timeparser timezones

Last synced: 09 Jan 2025

https://github.com/vzhong/embeddings

Fast, DB-backed pretrained word embeddings for natural language processing.

deep-learning neural-network nlp

Last synced: 04 Jan 2025

https://github.com/sunyilgdx/NSP-BERT

The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

bert correference-resolution entity-linking entity-typing natural-language-inference nlp prompt-learning sentence-classification sentiment-analysis tensorflow text-classification zero-shot

Last synced: 16 Nov 2024

https://github.com/mindflowai/mindflow

🧠 AI-powered CLI git wrapper, boilerplate code generator, chat history manager, and code search engine to streamline your dev workflow 🌊

chat-gpt cli code-generation command-line-interface dev-tools git git-wrapper information-retrieval large-language-models llm machine-learning modern-dev-tools nlp openai openai-api python search search-engine

Last synced: 29 Oct 2024

https://github.com/openvenues/node-postal

NodeJS bindings to libpostal for fast international address parsing/normalization

address address-parser binding international native nlp

Last synced: 04 Jan 2025

https://github.com/soskek/bert-chainer

Chainer implementation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

bert chainer google natural-language-processi natural-language-understanding nlp transformer

Last synced: 19 Dec 2024

https://github.com/cohere-ai/sandbox-topically

Topic modeling helpers using managed language models from Cohere. Name text clusters using large GPT models.

machine-learning nlp python topic-modeling

Last synced: 09 Jan 2025

https://github.com/mmxgn/spacy-clausie

Implementation of the ClausIE information extraction system for python+spacy

clausie information-extraction nlp problog python-spacy spacy

Last synced: 30 Sep 2024

https://github.com/IngestAI/embedditor

⚡ GUI for editing LLM vector embeddings. No more blind chunking. Upload content in any file extension, join and split chunks, edit metadata and embedding tokens + remove stop-words and punctuation with one click, add images, and download in .veml to share it with your team.

datapreprocessing datascience embedding-vectors embeddings genai laravel llm markup-language ml nlp nltk php vector-database vector-search vectorization veml

Last synced: 31 Oct 2024

https://github.com/naver/claf

CLaF: Open-Source Clova Language Framework

clova framework language natural-language-processing nlp pytorch

Last synced: 06 Jan 2025

https://github.com/renjunxiang/Competition_CAIL

Personal entry for the 2018 Chinese AI and Law Challenge (CAIL2018, "法研杯")

competition nlp

Last synced: 25 Nov 2024

https://github.com/bnosac/udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit

conll dependency-parser lemmatization natural-language-processing nlp pos-tagging r r-package r-pkg rcpp text-mining tokenizer udpipe

Last synced: 04 Jan 2025

https://github.com/yaroslavyaroslav/openai-sublime-text

First-class Sublime Text AI assistant with GPT-o1 and ollama support!

chatgpt gpt-4 nlp openai sublime-text

Last synced: 04 Jan 2025

https://github.com/himkt/konoha

🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.

janome japanese kytea mecab natural-language-processing nlp sentencepiece sudachi text-processing

Last synced: 08 Jan 2025

https://github.com/ticki/eudex

A blazingly fast phonetic reduction/hashing algorithm.

nlp

Last synced: 09 Jan 2025

https://github.com/akaza-im/akaza

Yet another Japanese IME for IBus/Linux

ibus ime nlp rust

Last synced: 07 Nov 2024

https://github.com/kavgan/rouge-2.0

ROUGE automatic summarization evaluation toolkit. Support for ROUGE-[N, L, S, SU], stemming and stopwords in different languages, unicode text evaluation, CSV output.

evaluation evaluation-toolkit java metrics nlp rouge rouge-l rouge-n rouge-s rouge-su text-summarization unicode-text

Last synced: 03 Jan 2025

https://github.com/davidberenstein1957/classy-classification

This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Huggingface.

few-shot-classifcation hacktoberfest machine-learning natural-language-processing nlp nlu sentence-transformers spacy text-classification

Last synced: 04 Jan 2025

https://github.com/akoksal/Turkish-Word2Vec

Pre-trained Word2Vec Model for Turkish

gensim nlp turkish word2vec

Last synced: 12 Nov 2024

https://github.com/vipul-sharma20/sharingan

Tool to extract news articles from a newspaper and provide context about the news

context-extraction news-extraction nlp opencv

Last synced: 10 Nov 2024

https://github.com/umarbutler/semchunk

A fast and lightweight pure Python library for splitting text into semantically meaningful chunks.

chunking nlp python semantic-chunking splitting text text-chunking text-splitting

Last synced: 04 Jan 2025

https://github.com/erfanzar/easydel

Accelerate and optimize performance with streamlined training and serving options in JAX.

easydel flax gpt jax nlp optax transformers

Last synced: 04 Jan 2025

https://github.com/Fixy-TR/fixy

Our goal is to create an open-source writing assistant/checker that can solve many different problems in the Turkish NLP literature together, proposes unique approaches, and fills the gaps in existing work. It aims to fix spelling mistakes in users' texts with a deep-learning approach while also performing semantic analysis of the texts, so that errors arising in that context can be detected and corrected as well.

acikhack2 ai artificial-intelligence bert data-science deep-learning deeplearning keras natural-language-processing neural-network neural-networks nlp python

Last synced: 12 Nov 2024

https://github.com/erfanzar/EasyDeL

Accelerate and optimize performance with streamlined training and serving options in JAX.

easydel flax gpt jax machine-learning mojo nlp optax transformers

Last synced: 16 Nov 2024

https://github.com/ayaka14732/llama-2-jax

JAX implementation of the Llama 2 model

jax llama llama2 natural-language-processing nlp

Last synced: 03 Jan 2025

https://github.com/thunlp/thuctc

An Efficient Chinese Text Classifier

chinese-nlp nlp

Last synced: 10 Nov 2024

https://github.com/vishwasg217/fin-sight

FinSight - Financial Insights at Your Fingertip: FinSight is a cutting-edge AI assistant tailored for portfolio managers, investors, and finance enthusiasts. It streamlines the process of gaining crucial insights and summaries about a company in a user-friendly manner.

fintech langchain llama-index llms nlp streamlit

Last synced: 08 Jan 2025

https://github.com/sea-snell/implicit-language-q-learning

Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning"

implicit-q-learning iql language-model nlp offline-rl python pytorch q-learning reinforcement-learning

Last synced: 29 Dec 2024

https://github.com/neuml/rag

🚀 Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data.

large-language-models llm machine-learning nlp python rag retrieval-augmented-generation search txtai

Last synced: 20 Oct 2024

https://github.com/coteries/cedille-ai

✒️ Cedille is a large French language model (6B), released under an open-source license

machine-learning nlg nlp

Last synced: 04 Nov 2024

https://github.com/ymcui/lert

LERT: A Linguistically-motivated Pre-trained Language Model (a pre-trained model enhanced with linguistic information)

bert lert nlp plm pre-train pytorch tensorflow transformer

Last synced: 06 Dec 2024

https://github.com/thunlp/THUCTC

An Efficient Chinese Text Classifier

chinese-nlp nlp

Last synced: 08 Nov 2024

https://github.com/FanhuaandLuomu/ParseLawDocuments

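Performs a series of analyses on collected legal documents, including automatic segmentation according to a standard, case similarity computation, case clustering, and recommendation of legal provisions (experiments are currently based on marriage-related cases and can be extended to other domains).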

law nlp text-classification

Last synced: 25 Nov 2024

https://github.com/maartengr/concept

Concept Modeling: Topic Modeling on Images and Text

computer-vision image-processing nlp topic-modeling

Last synced: 06 Jan 2025