An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with bert-embeddings

A curated list of projects in awesome lists tagged with bert-embeddings.

https://github.com/michaelfeil/infinity

Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, CLIP, CLAP and ColPali

bert-embeddings llm text-embeddings

Last synced: 13 May 2025

https://github.com/boat-group/fancy-nlp

NLP for humans. A fast and easy-to-use natural language processing (NLP) toolkit, satisfying your imagination about NLP.

bert bert-chinese bert-classifier bert-embeddings bert-ner bilstm-crf bimpm chinese-nlp crf esim keras named-entity-recognition nlp python-library semantic-similarity tensorflow text-classification tf2

Last synced: 05 Apr 2025

https://github.com/abhilash1910/clustertransformer

Topic clustering library built on Transformer embeddings and cosine similarity metrics. Compatible with all BERT base transformers from Hugging Face.

albert bert-embeddings clustering distilbert embeddings pytorch pytorch-implementation roberta-model transformer

Last synced: 31 Jul 2025

https://github.com/gurpreetkaurjethra/medical-rag-using-bio-mistral-7b

This is a RAG implementation using an open-source stack. BioMistral 7B has been used to build this app along with PubMedBert as an embedding model, Qdrant as a self-hosted vector DB, and Langchain & Llama CPP as orchestration frameworks.

ai bert-embeddings bootstrap5 deployment docker fastapi generative-ai langchain large-language-models llama mistral pubmed qdrant rag retrieval-augmented-generation sentence-transformers

Last synced: 13 Jul 2025

https://github.com/bhattbhavesh91/bert-topic-modeling

Small tutorial on how you can use BERT for Topic Modeling

bert bert-embeddings bert-model google-bert topic-modeling topic-modelling

Last synced: 12 Aug 2025

https://github.com/olafurjohannsson/edgebert

A pure Rust + WASM implementation for BERT inference with minimal dependencies

bert-embeddings bert-model minilm-l6-v2 rust vector-search wasm

Last synced: 16 Jan 2026

https://github.com/eric11eca/neurallog

A neural-symbolic joint reasoning approach for Natural Language Inference (NLI). Modeling NLI as inference path planning through a search engine. Sequence chunking and neural paraphrase detection for syntactic variation. SOTA result on SICK and MED.

bert-embeddings computational-semantics language-model natural-language-inference natural-language-processing neural-symbolic-reasoning phrase-alignment

Last synced: 06 Oct 2025

https://github.com/r2d4/blog-embeddings

Script to generate embeddings from a blog and use GPT-3.5 to categorize the embedding space

bert-embeddings embeddings gpt t-sne

Last synced: 04 Jul 2025

https://github.com/26hzhang/bert_classification

Token and Sentence Level Classification with Google's BERT (TensorFlow)

bert-bilstm-crf bert-embeddings bert-model tensorflow

Last synced: 12 Oct 2025

https://github.com/sai-123-code/manas-mentalhealth_chatbot

A chatbot built with machine learning techniques such as SVM, BERT embeddings, and sentiment analysis, with a GUI created in tkinter

bert-embeddings chatbot hackathon healthcare machine-learning mental-health scraping sentiment-analysis tkinter-gui

Last synced: 10 Apr 2025

https://github.com/soco-ai/soco-core-python

Python client to use SOCO answer-as-a-service platform.

bert-embeddings nlp-machine-learning question-answering search-engine

Last synced: 14 Dec 2025

https://github.com/bhattbhavesh91/muril-tutorial

Small demo showing how MuRIL (Multilingual Representations for Indian Languages: a BERT model pre-trained on 17 Indian languages) understands Indian languages better

bert-embeddings bert-model google-muril indian-languages multilingual-representations

Last synced: 17 Apr 2025

https://github.com/benja1972/topicphrase

Simple project for extracting key phrases from a single document, based on Sentence Transformers

bert-embeddings clusters embeddings key-phrase-extraction nlp noun-phrases-candidates sentence-transformers topics

Last synced: 22 Jul 2025

https://github.com/kelindar/bert

Go wrapper for bert.cpp embeddings library using cgo/dll approach.

bert bert-embeddings embeddings nlp

Last synced: 06 Sep 2025

https://github.com/shreypandit/volatility-prediction-using-maec-dataset

Problem statement: implement a solution to forecast stock volatility following the release of earnings calls of S&P 1500 companies.

audio-encoder bert bert-embeddings bert-model bert-models stock stock-market stock-price-prediction stock-prices stock-prices-prediction stock-trading stocks time-series volatility

Last synced: 15 Aug 2025

https://github.com/gpavanb1/lyrics-comparison

Get lyrics using an API and translate them to calculate similarity between languages

ai4bharat bert-embeddings lyrics movies nlp python

Last synced: 10 Apr 2025

https://github.com/wxjiao/bert-text-features

BERT-Text-Features for Tokenized Transcripts from P2FA.

bert-embeddings forced-alignment p2fa text-features

Last synced: 22 Jul 2025

https://github.com/arthurdjn/img2poem-pytorch

PyTorch implementation of the paper “Beyond Narrative Description: Generating Poetry from Images” by B. Liu et al., 2018.

bert-embeddings image-captioning image-to-poem image-to-text img2poem img2poem-pytorch liu machine-learning nlp paired-poems pytorch researchmm resnet zhaoyang

Last synced: 13 Apr 2025

https://github.com/ksm26/embedding-models-from-architecture-to-implementation

Understand and build embedding models, focusing on word and sentence embeddings and dual encoder architectures. Learn to train embedding models using contrastive loss and implement them in semantic search and RAG systems.

ai-applications ai-architecture bert bert-embeddings bert-fine-tuning bert-model contrastive-loss dual-encoder embedding-models machine-learning model-training question-answer-retrieval rag-systems semantic-search sentence-embeddings transformer-models word-embeddings word2vec

Last synced: 25 Jul 2025

https://github.com/louisbrulenaudet/tsdae

Transformer-based Denoising AutoEncoder for Sentence Transformers unsupervised pre-training.

bert bert-embeddings lemone lemone-io machine-learning nltk pre-training python sentence-transformers transformers tsdae unsupervised-learning

Last synced: 14 Jul 2025

https://github.com/andreped/nlp-mtl

Training neural networks to solve multiple tasks simultaneously from free text through multi-task learning

bert-embeddings keras multi-task-learning natural-language-processing neural-networks nlp scikit-learn

Last synced: 09 May 2026

https://github.com/chris-santiago/bookmarks-topics

Using unsupervised learning and language modeling to cluster and reorganize web bookmarks.

bert-embeddings bertopic bookmarks clustering generative-modeling hdbscan hydra llm openai taskfile umap unsupervised-learning

Last synced: 11 Apr 2026

https://github.com/elaaatif/nlp-project

An NLP project using a BERT model for Twitter similarity analysis

bert-embeddings bert-model nlp nlp-machine-learning

Last synced: 20 Jun 2025

https://github.com/ornella-gigante/arquitectura_avanzada_ia

The AI work on the Disneyland.csv dataset uses feed-forward, LSTM, and BERT models to analyze and predict positive and negative reviews. The analysis gives park managers insights for enhancing the visitor experience based on customer feedback.

advanced-data-structures ai bert-embeddings bert-model feedforward-neural-network fine-tuning lstm lstm-neural-networks

Last synced: 16 Mar 2025

https://github.com/abhi227070/question-answering-from-a-given-paragraph-using-bert

This project is about creating a QA (Question-Answering) system using BERT, a powerful language model by Google. The goal is to train BERT to understand a given text passage and accurately respond to questions related to that passage. This system can be invaluable for tasks requiring efficient information retrieval and comprehension.

bert bert-embeddings bert-fine-tuning bert-model deep-learning deep-neural-networks deeplearning hugging-face huggingface huggingface-transformers natural-language-processing natural-language-understanding python3

Last synced: 25 Apr 2026

https://github.com/nandahkrishna/amld2020

Submission to the AMLD 2020 Transfer Learning for Crisis Response Challenge organised by DEEP

bert bert-embeddings jupyter-notebook machine-learning natural-language-processing nlp python3 transfer-learning transformers

Last synced: 28 Apr 2026

https://github.com/somjit101/nlp-star-trek-scripts

Using the digitized scripts of the 'Star Trek' science fiction series to perform NLP tasks and answer questions on topic modelling, character properties and the plot as a whole.

bag-of-words bert bert-embeddings cosine-similarity data-mining gensim-doc2vec json lda-model natural-language-processing nlp nlp-machine-learning similarity-matrix star-trek text-mining text-similarity topic-modeling

Last synced: 16 Apr 2026

https://github.com/krisharul26/sentiments-classifier-using-bert

Sentiment analysis is the process of computationally identifying and categorizing opinions expressed in a piece of text, especially to determine whether the writer's attitude towards a particular topic, product, etc. is positive, negative, or neutral. Understanding people’s emotions is essential for businesses since customers are able to express their thoughts and feelings more openly than ever before. By automatically analysing customer feedback, from survey responses to social media conversations, brands are able to listen attentively to their customers, and tailor products and services to meet their needs.

bert-embeddings bert-model deep-learning machine-learning modelevaluation natural-language-processing nlp sentiment-analysis softmax-layer transfer-learning transformers

Last synced: 20 Jun 2025

https://github.com/sayamalt/news-category-classification

Successfully developed a news category classification model using fine-tuned BERT which can accurately classify any news text into its respective category i.e. Politics, Business, Technology and Entertainment.

bert-embeddings exploratory-data-analysis feature-engineering fine-tuning-bert model-evaluation nlp text-classification text-cleaning text-preprocessing text-tokenization

Last synced: 15 Jun 2025

https://github.com/abdulsamie10/emojifyai

EmojifyAI is a Python package that suggests relevant emojis for a given input sentence using natural language processing techniques. It employs the BERT model to generate embeddings for the input sentence and the emojis' descriptions, and then finds the most similar emojis using cosine similarity.

ai artificial-intelligence bert bert-embeddings bert-model bert-models emoji model natural-language-processing nlp python python3 transformer

Last synced: 20 Apr 2026

https://github.com/champ9090/qdrant-self-hosted

🗄️ Run a self-hosted Qdrant vector database using Docker with easy setup, persistent data, and integration for smooth service connections.

agents ai airathalitov airattop audio-search bert-embeddings deployment docker huggingface llama local-ai machine-learning mcp rag retrieval-augmented-generation searxng web-scraping workflows

Last synced: 11 Apr 2026

https://github.com/shudhanshurp/news_recommendation_system

This repository presents a News Recommendation System using Azure Data Factory, Azure Databricks, and Azure Data Lake to create a data pipeline for ML models. It uses BERT for content-based filtering, Neural Collaborative Filtering for user behaviors, and a hybrid model that combines both to enhance news recommendations.

azure azure-data-factory azure-data-lake azure-databricks bert-embeddings ncf pyspark recommendation-system

Last synced: 18 Mar 2025

https://github.com/aranzadata/moviereviewclassifier

BERT-based sentiment analysis model for 45,000 movie reviews, achieving an F1 score of 0.88 by leveraging advanced text preprocessing techniques with NLTK and spaCy

bert-embeddings nltk spacy

Last synced: 05 May 2026

https://github.com/ammahmoudi/image-captioning-rnn

Image Captioning using Recurrent Neural Networks on Flickr images with pretrained ResNet50 model features.

bert-embeddings deep-learning image-captioning lstm machine-learning ml resnet50 rnn

Last synced: 04 Mar 2025

https://github.com/mohammad95labbaf/bert_medical_diagnosis

This repository uses BERT for medical diagnosis, incorporating data augmentation techniques like backtranslation and paraphrasing to improve model performance in predicting diseases from symptom descriptions.

backtranslation bert bert-embeddings bert-model data-augmentation deep-learning large-language-model large-language-models llm marianmt marianmtmodel medical medical-diagnosis natural-language-processing nlp nlp-library nlp-machine-learning

Last synced: 28 Jun 2025

https://github.com/sayamalt/fake-news-classification-using-fine-tuned-bert

Successfully developed a text classification model to predict whether a given news text is fake or not by fine-tuning a pretrained BERT transformer model imported from Hugging Face.

bert-embeddings bert-model data-analysis data-visualization deep-learning fine-tuning-bert model-evaluation model-training-and-evaluation text-classification text-preprocessing text-tokenization tokenizer-nlp wordcloud-visualization

Last synced: 05 Apr 2025

https://github.com/spoortimorabad/twittersentimentanalysis

Analyzing Twitter sentiment using BERT for real-time insights into public opinion and brand perception

bert-embeddings bert-model mit-license python3 pytorch twitter-sentiment-analysis wordcloud

Last synced: 14 Feb 2026

https://github.com/lucasste/havina

A tool to generate Knowledge Graphs from sentences or evaluate language models' text comprehension.

bert bert-embeddings knowledge-graph language-model python pytorch

Last synced: 12 May 2026

https://github.com/ahmedshahriar/restaurant-menu-pricing

Predict menu prices from 5M+ UberEats menus with an end-to-end MLOps pipeline: crawl → DWH → curate → train → deploy on Azure ML (MLflow) via APIM & CLIs.

azure azureml bert-embeddings docker fastapi github-actions huggingface machine-learning mlflow mlops optuna python restaurant-menu scikit-learn scrapy tensorflow transformers uber-eats web-crawler

Last synced: 03 Feb 2026

https://github.com/abhigyan126/similaritymap

This application calculates document similarities using BERT embeddings. It reads text files from a folder, computes pairwise similarities, and generates a heat-map to visualise the results. The similarity matrix can be saved as a CSV file, and the heat-map as an image.

bert-embeddings cosine-similarity embeddings plagarism-detection

Last synced: 18 Oct 2025

https://github.com/mahnoorsheikh16/explainable-fake-news-detection-and-personalized-credible-recommendation-via-graphml

System for detecting fake news and suggesting credible alternatives. Takes a news URL and outputs a credibility score, explanation, and top reliable sources. Uses TF-IDF + Logistic Regression, XGBoost, and DistilBERT with hybrid BERT–LightGCN models, plus SHAP and GNNExplainer for interpretability.

bert-embeddings binary-classification distilbert embedding-models fake-news-detection gnn-explainer graphml graphsage lightgcn logistic-regression pytorch recommendation-system shap tf-idf xgboost-classifier

Last synced: 08 Oct 2025

https://github.com/jhroy/facebook-franco

Full code and most data (in accordance with CrowdTangle’s Terms of Service) supporting an article on what would remain on French-language Facebook if news content was removed

belgique belgium bert bert-embeddings camembert-model canada content-analysis facebook flaubert francais france french-speaking nlp python quebec spacy suisse switzerland text-mining topic-modeling

Last synced: 21 Jan 2026

https://github.com/sathish-1804/content-based-image-search

A simple application that lets you search for images using natural language. Describe what you want to see, and the app will find and display relevant images based on your description.

bert-embeddings bert-model faiss python3 streamlit

Last synced: 27 Jan 2026

https://github.com/blacksujit/quantumlens

QuantumLens is an AI-powered information assistant designed to change how you interact with and process information. It leverages advanced machine learning algorithms and natural language processing techniques.

ai bert bert-embeddings dataanalysis information integration-flow intellij-idea ml model models nlp-machine-learning processing project research spacy spacy-models spacy-nlp spacy-pipeline summeriza summerization

Last synced: 16 Apr 2026

https://github.com/rrayhka/information-retrieval-bert-bm25

Search engine for retrieving decisions of the Supreme Court of Indonesia (Mahkamah Agung) using BERT embeddings and the BM25 model.

bert-embeddings bm25 information-retrieval mahkamahagung nlp putusan search-engine

Last synced: 28 Apr 2026

https://github.com/t4vexx/ia-classification-analysis

This GitHub repository hosts my AI evaluation work, featuring a Kaggle dataset analysis, experiments with three ML algorithms (including hyperparameter tuning), and a detailed exploration of wine quality data through outlier detection, correlation, and normalization techniques.

ai artificial-intelligence bert-embeddings bert-model confusion-matrix correlation-analysis embedding fake-news-detection logistic-regression python3 random-forest-classifier svm-classifier

Last synced: 01 May 2026

https://github.com/antoniakras/semantic-video-search

GPU-optimized semantic search on video transcripts, with benchmarking of FAISS, Pinecone, and PostgreSQL vector databases. Deployed via Docker on FORTH’s GPU infrastructure.

bert-embeddings bert-fine-tuning cuda dokcer embedding-models embeddings-word2vec faiss-vector-database gpu-computing huggingface-transformers nlp-machine-learning pgvector pineconedb postgresql python pytorch retrieval-augmented-generation similarity-search vector-database whisper-ai

Last synced: 03 May 2026

https://github.com/abdelrahman-elshahed/financial_news_sentiment_analysis_ranking

A deep learning-powered platform that analyzes sentiment in financial news articles, ranks them by importance, and delivers actionable insights through an interactive dashboard to inform investment decisions.

bert-embeddings deep-learning-algorithms docker-image fastapi mlflow mlops natural-language-processing postman-api restful-api sentiment-analysis streamlit

Last synced: 03 May 2026

https://github.com/emms21/rag_system

RAG architecture to retrieve and embed PDFs

bert-embeddings openai python rag sveltekit vector-database vector-search

Last synced: 07 May 2026

https://github.com/sayamalt/squad-question-answering-using-bert

Successfully leveraged a pretrained BERT Transformer model for developing a question answering system.

bert bert-embeddings nlp question-answering-system text-generation

Last synced: 15 Jun 2025