Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Natural language processing

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of machine intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

https://github.com/Beomi/KcBERT

🤗 Pretrained BERT model and WordPiece tokenizer trained on Korean comments, with the accompanying dataset

bert bert-model korean-nlp nlp transformers

Last synced: 09 Nov 2024

https://github.com/koaning/whatlies

Toolkit to help understand "what lies" in word embeddings. Also benchmarking!

embeddings nlp visualisations

Last synced: 29 Oct 2024

https://github.com/huggingface/node-question-answering

Fast and production-ready question answering in Node.js

bert nlp nodejs question-answering tensorflow transformers typescript

Last synced: 30 Oct 2024

https://github.com/ematvey/hierarchical-attention-networks

Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: project is currently unmaintained, issues will probably not be addressed.

deep-learning document-classification hierarchical-attention-networks machine-learning nlp tensorflow

Last synced: 06 Nov 2024

https://github.com/LingDong-/cope

A modern IDE for writing classical Chinese poetry (an editor for regulated verse)

bag-of-words chinese chinese-poetry editor electron ide nlp poetry

Last synced: 01 Nov 2024

https://github.com/ynqa/wego

Word Embeddings (e.g. Word2Vec) in Go!

glove go machine-learning nlp word-embeddings word2vec

Last synced: 29 Oct 2024

https://github.com/hendrikstrobelt/detecting-fake-text

Giant Language Model Test Room

ai nlp visualization

Last synced: 30 Oct 2024

https://github.com/jina-ai/examples

Jina examples and demos to help you get started

deep-learning examples jina neural-search nlp onboarding python semantic-search tutorials

Last synced: 01 Nov 2024

https://github.com/ruu3f/freegpt

freeGPT provides free access to text and image generation models.

ai artificial-intelligence chatgpt deep-learning freegpt gpt gpt4all gpt4free llama llm machine-learning nlp python

Last synced: 10 Oct 2024

https://github.com/imgarylai/bert-embedding

🔡 Token level embeddings from BERT model on mxnet and gluonnlp

bert gluonnlp mxnet natural-language-processing nlp word-embeddings

Last synced: 02 Nov 2024

https://github.com/johanmodin/clifs

Contrastive Language-Image Forensic Search allows free text searching through videos using OpenAI's machine learning model CLIP

ai machine-learning nlp openai python search text video

Last synced: 01 Nov 2024

https://github.com/huggingface/large_language_model_training_playbook

An open collection of implementation tips, tricks and resources for training large language models

cuda large-language-models llm nccl nlp performance python pytorch scalability troubleshooting

Last synced: 02 Aug 2024

https://github.com/vas3k/infomate.club

RSS feed aggregator with collections and NLP article summarization

feed nlp nltk python rss telegram

Last synced: 29 Oct 2024

https://github.com/adbar/German-NLP

Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German

computational-linguistics corpus-linguistics german-language natural-language-processing nlp text-mining

Last synced: 26 Oct 2024

https://github.com/ayoungprogrammer/nlquery

Natural Language Engine on WikiData

dbpedia nlp wikidata

Last synced: 04 Aug 2024

https://github.com/magpie-align/magpie

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!

alignment dataset gemma llama2 llama3 llm nlp paper phi3 qwen2 supervised-finetuning synthetic-data synthetic-dataset-generation

Last synced: 10 Oct 2024

https://github.com/Cartus/AGGCN

Attention Guided Graph Convolutional Networks for Relation Extraction (authors' PyTorch implementation for the ACL19 paper)

deep-learning graph-convolutional-networks graph-neural-networks information-extraction nlp relation-extraction

Last synced: 02 Nov 2024

https://github.com/intelligo-mn/intelligo

Intelligo is a powerful chatbot builder that enables anyone to create and deploy chatbots anywhere.

ai artificial-intelligence bot bot-framework bots chatbot machine-learning messenger-api messenger-bot messenger-chatbots nlp nodejs slack slack-bot

Last synced: 01 Nov 2024

https://github.com/zzy99/epidemic-sentence-pair

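First-place solution on the online leaderboard of the Tianchi competition on matching similar epidemic-related sentence pairs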

nlp

Last synced: 03 Aug 2024

https://github.com/pochih/RL-Chatbot

🤖 Deep Reinforcement Learning Chatbot

chatbot deep-learning nlp reinforcement-learning seq2seq-model tensorflow

Last synced: 02 Aug 2024

https://github.com/The-FinAI/PIXIU

This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning data, and evaluation benchmarks to holistically assess financial LLMs. Our goal is to continually push forward the open-source development of financial artificial intelligence (AI).

aifinance chatgpt fintech gpt-4 large-language-models llama machine-learning named-entity-recognition natural-language-processing nlp pixiu question-answering sentiment-analysis stock-price-prediction text-classification

Last synced: 24 Oct 2024

https://github.com/houbb/opencc4j

🇨🇳 Open Chinese Convert is an open-source project for conversion between Traditional Chinese and Simplified Chinese (Java).

chinese dfa java java7 nlp opencc simple-tranditional trie trie-tree

Last synced: 07 Nov 2024

https://github.com/hyunwoongko/kss

KSS: Korean String processing Suite

korean korean-nlp kss nlp sentences split-sentences

Last synced: 30 Oct 2024

https://github.com/jjangsangy/ExplainToMe

Automatic Web Article Summarizer

docker heroku nlp python textrank

Last synced: 26 Oct 2024

https://github.com/llhthinker/NLP-Papers

Natural Language Processing Papers

deep-learning nlp

Last synced: 10 Nov 2024

https://github.com/Droidtown/ArticutAPI

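API of Articut, a Chinese word segmenter with semantic part-of-speech tagging. Word segmentation (also called tokenization) is the foundation of Chinese information processing. Articut uses no machine learning and needs no data model; relying only on the grammar rules of modern vernacular Chinese, it achieves an F1-measure above 94% and a recall above 96% on SIGHAN 2005.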

artificial-intelligence cws natural-language-processing natural-language-understanding nlp nlu part-of-speech-embdding part-of-speech-tagger pos-tagger pos-tagging

Last synced: 30 Oct 2024

https://github.com/IntelLabs/RAGFoundry

Framework for enhancing LLMs for RAG tasks using fine-tuning.

evaluation fine-tuning information-retrieval llm nlp question-answering rag semantic-search

Last synced: 21 Aug 2024

https://github.com/MuQiuJun-AI/bert4pytorch

An ultra-lightweight PyTorch version of BERT, with extensive Chinese comments, an easily modified structure, and ongoing updates

bert nlp pytorch transformer

Last synced: 06 Nov 2024

https://github.com/microsoft/rat-sql

A relation-aware semantic parsing model from English to SQL

dbqa nl2sql nlp program-synthesis question-answering semantic-parsing transformers

Last synced: 07 Oct 2024

https://github.com/erickrf/nlpnet

A neural network architecture for NLP tasks, using cython for fast performance. Currently, it can perform POS tagging, SRL and dependency parsing.

natural-language-processing neural-network nlp parsing pos-tagging semantic-role-labeling

Last synced: 03 Aug 2024

https://github.com/kunalj101/data-science-hacks

Data Science Hacks consists of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and more.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 11 Oct 2024

https://github.com/airaria/Visual-Chinese-LLaMA-Alpaca

Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA)

alpaca chinese llama llm lora multimodal nlp vision-language

Last synced: 08 Aug 2024

https://github.com/microsoft/azureml-bert

End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service

azure-machine-learning azureml-bert bert bert-model finetuning language-model nlp pretrained-models pretraining pytorch tuning

Last synced: 02 Nov 2024

https://github.com/tomaarsen/spanmarkerner

SpanMarker for Named Entity Recognition

huggingface ner nlp spacy spacy-extension transformers

Last synced: 14 Oct 2024

https://github.com/shibing624/nlp-tutorial

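Natural language processing (NLP) tutorial covering word embeddings, lexical analysis, pretrained language models, text classification, semantic text matching, information extraction, translation, and dialogue.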

dialogue language-model machine-translation nlp seq2seq text-classification text-generation torch word-embedding

Last synced: 31 Oct 2024

https://github.com/openmoss/collie

Collaborative Training of Large Language Models in an Efficient Way

deep-learning deepspeed nlp pytorch

Last synced: 09 Nov 2024

https://github.com/Shixzie/nlp

[UNMAINTAINED] Extract values from strings and fill your structs with nlp.

go golang natural-language-processing nlp parse text text-extraction

Last synced: 26 Oct 2024

https://github.com/kermitt2/delft

A deep learning framework for text. Documentation: https://delft.readthedocs.io/

deep-learning keras ner nlp sequence-labeling text-classification

Last synced: 30 Oct 2024

https://github.com/msg-systems/holmes-extractor

Information extraction from English and German texts based on predicate logic

information-extraction machine-learning nlp ontology python semantics spacy spacy-extension

Last synced: 31 Oct 2024

https://github.com/thunlp/few-nerd

Code and data of ACL 2021 paper "Few-NERD: A Few-shot Named Entity Recognition Dataset"

deep-learning entity-typing few-shot-learning named-entity-recognition nlp

Last synced: 10 Nov 2024

https://github.com/huggingface/tflite-android-transformers

DistilBERT / GPT-2 for on-device inference thanks to TensorFlow Lite with Android demo apps

android nlp tensorflow tensorflow-lite transformers

Last synced: 08 Nov 2024

https://github.com/gutfeeling/beginner_nlp

A curated list of beginner resources in Natural Language Processing

natural-language-processing nlp nlp-resources

Last synced: 07 Aug 2024

https://github.com/graykode/commit-autosuggestions

A tool that uses AI to automatically recommend commit messages.

bert commit-autosuggestions natural-language nlp text-generation

Last synced: 30 Oct 2024

https://github.com/judahpaul16/gpt-home

ChatGPT at home! Basically a better Google Nest Hub or Amazon Alexa home assistant. Built on the Raspberry Pi using the OpenAI API.

ai async automation chatgpt docker fastapi home-assistant home-automation iot llm nginx nlp nodejs openai python raspberry-pi react speech-recognition spotify typescript

Last synced: 09 Nov 2024

https://github.com/neurocult/agency

🕵️‍♂️ Library designed for developers eager to explore the potential of Large Language Models (LLMs) and other generative AI through a clean, effective, and Go-idiomatic approach.

agents ai artificial-general-intelligence artificial-intelligence artificial-neural-networks autonomous-agents chatgpt generative-ai go golang gpt language-models llm llmops machine-learning neural-network nlp openai rag vector-database

Last synced: 06 Nov 2024

https://github.com/qipeng/gcn-over-pruned-trees

Graph Convolution over Pruned Dependency Trees Improves Relation Extraction (authors' PyTorch implementation)

dependency-parse-trees dependency-parsing information-extraction natural-language-processing nlp relation-extraction

Last synced: 02 Nov 2024

https://github.com/mb-14/gomarkov

Markov chains in golang

golang markov-chain nlp

Last synced: 02 Aug 2024

https://github.com/omarsar/nlp_highlights

The most important NLP highlights of 2018 (PDF Report)

analytics artificial-intelligence conversational-ai deep-learning health nlp technology

Last synced: 13 Oct 2024

https://github.com/polm/fugashi

A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.

cython-wrapper japanese mecab nlp tokenizer

Last synced: 02 Aug 2024

https://github.com/dair-ai/nlp_fundamentals

📘 Contains a series of hands-on notebooks for learning the fundamentals of NLP

deep-learning education nlp

Last synced: 10 Nov 2024

https://github.com/kefirski/pytorch_RVAE

Recurrent Variational Autoencoder that generates sequential data, implemented with PyTorch

deep-learning nlp python pytorch vae

Last synced: 02 Nov 2024

https://github.com/shamspias/customizable-gpt-chatbot

A dynamic, scalable AI chatbot built with Django REST framework, supporting custom training from PDFs, documents, websites, and YouTube videos. Leveraging OpenAI's GPT-3.5, Pinecone, FAISS, and Celery for seamless integration and performance.

artificial-intelligence autogpt chatbot conversational-ai data-preprocessing django django-rest-framework gpt-3 gpt-voice langchain langchain-python longchain machine-learning natural-language-processing nlp python voice-chat voice-recognition voice-to-text voice-transcription

Last synced: 06 Nov 2024

https://github.com/neuml/tldrstory

📊 Semantic search for headlines and story text

machine-learning nlp python search txtai

Last synced: 01 Nov 2024

https://github.com/kakaobrain/word2word

Easy-to-use word-to-word translations for 3,564 language pairs.

bilingual-lexicon-extraction nlp opensubtitles translation

Last synced: 10 Nov 2024

https://github.com/nashex/gpt4-playground

Clone of OpenAI's ChatGPT and Playground environments to enable experimenting with API keys.

gpt4 gpt4-api nextjs nlp openai playground

Last synced: 09 Nov 2024

https://github.com/hit-scir/huozi

Huozi, a general-purpose large language model

fine-tuning large-language-models llm nlp

Last synced: 10 Nov 2024

https://github.com/planeshifter/node-word2vec

Node.js interface to the Google word2vec tool.

nlp word2vec

Last synced: 30 Oct 2024

https://github.com/dongjunlee/transformer-tensorflow

TensorFlow implementation of 'Attention Is All You Need' (June 2017)

attention deep-learning experiments hb-experiment nlp tensorflow transformer translation

Last synced: 08 Nov 2024