Projects in Awesome Lists tagged with text-similarity
A curated list of projects in awesome lists tagged with text-similarity.
https://github.com/srbhr/resume-matcher
Resume Matcher is an open source, free tool to improve your resume. It works by using AI, Reader LLMs, to compare and rank resumes with job descriptions.
applicant-tracking-system ats hacktoberfest machine-learning natural-language-processing nextjs python resume resume-builder resume-parser text-similarity typescript vector-search word-embeddings
Last synced: 08 Apr 2025
https://github.com/srbhr/Resume-Matcher
Resume Matcher is an open source, free tool to improve your resume. It works by using language models to compare and rank resumes with job descriptions.
applicant-tracking-system ats hacktoberfest machine-learning natural-language-processing nextjs python resume resume-builder resume-parser text-similarity typescript vector-search word-embeddings
Last synced: 26 Mar 2025
https://github.com/shibing624/text2vec
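text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.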
embeddings nlp sentence-embeddings similarity text-similarity text2vec word2vec
Last synced: 08 Apr 2025
https://github.com/cluebenchmark/cluedatasetsearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included.
chinese corpus datasets knowledge-graph machine-reading-comprehension machine-translation match ner nlp qa sentiment-analysis text-classification text-similarity text-summarization
Last synced: 13 Apr 2025
https://github.com/CLUEbenchmark/CLUEDatasetSearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included.
chinese corpus datasets knowledge-graph machine-reading-comprehension machine-translation match ner nlp qa sentiment-analysis text-classification text-similarity text-summarization
Last synced: 28 Mar 2025
https://github.com/seanlee97/angle
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
dense-retrieval embeddings information-retrieval llama llama2 llm mteb rag retrieval-augmented-generation semantic-similarity semantic-textual-similarity sentence-embedding sentence-embeddings sentence-vector sts stsbenchmark text-embedding text-similarity text-vector text2vec
Last synced: 13 Apr 2025
https://github.com/nlpodyssey/cybertron
Cybertron: the home planet of the Transformers in Go
bart bert bert-as-service deep-learning huggingface machine-learning machine-translation named-entity-recognition natural-language-processing nlp question-answering summarization text-categorization text-classification text-similarity transformers translation zero-shot-classification
Last synced: 04 Apr 2025
https://github.com/dodona-edu/dolos
Source code plagiarism detection
academic-dishonesty code-similarity collusion-detection dodona education fuzzy-matching hacktoberfest learn-to-code online-learning plagiarism plagiarism-checker plagiarism-checking plagiarism-detection plagiarism-detector plagiarism-prevention software-plagiarism source-code-analysis text-similarity
Last synced: 05 Apr 2025
https://github.com/tlatkowski/multihead-siamese-nets
Implementation of Siamese Neural Networks built upon multihead attention mechanism for text semantic similarity task.
attention deep-architectures deep-learning deep-neural-networks multihead-attention multihead-attention-networks natural-language-processing nlp paraphrase paraphrase-identification python3 quora-question-pairs semantic-similarity sentence-similarity siamese-cnn siamese-lstm siamese-neural-network snli tensorflow text-similarity
Last synced: 13 Apr 2025
https://github.com/lonepatient/torchblocks
A PyTorch-based toolkit for natural language processing
advertising bert multilabel-classification named-entity-recognition nlp pytorch relation-classification siamese-network text-classification text-similarity transformers triplet-loss
Last synced: 07 Apr 2025
https://github.com/nityansuman/marvin
Web app that automatically generates a subjective or objective test and evaluates user responses without human intervention, using machine learning and natural language processing.
examination examination-system final-year-project flask flask-application machine-learning natural-language-processing nltk python text-similarity
Last synced: 16 Mar 2025
https://github.com/adhaamehab/textblob-ar
Arabic support for textblob
arabic-language arabic-nlp machine-learning natural-language-processing nlp part-of-speech-tagger sentiment-analysis spelling-correction text-classification text-similarity textblob word-embeddings
Last synced: 14 Nov 2024
https://github.com/ddangelov/restful-top2vec
Expose a Top2Vec model with a REST API.
document-embedding fastapi rest-api restful-api semantic-search semantic-search-engine text-search text-similarity top2vec topic-model topic-modeling word-embedding
Last synced: 12 Nov 2024
https://github.com/zake7749/CIKM-AnalytiCup-2018
[ACM-CIKM] 2nd place solution at CIKM AnalytiCup 2018, a task for determining short text similarities.
cikm keras natural-language-processing natural-language-understanding semantic-matching semantic-similarity text-similarity
Last synced: 27 Nov 2024
https://github.com/Lipairui/textgo
Text preprocessing, representation, similarity calculation, text search and classification. Let's go and play with text!
bert nlp text-classification text-preprocessing text-representation text-search text-similarity
Last synced: 25 Nov 2024
https://github.com/giacbrd/python-dandelion-eu
A python client for connecting to all the services provided by https://dandelion.eu
api api-client api-wrapper entity-extraction entity-linking language-detection machine-learning python semantic-analysis semantic-similarity sentiment-analysis text-analysis text-classification text-mining text-similarity wikification wikipedia wikipedia-api
Last synced: 17 Mar 2025
https://github.com/sljavi/text-sound-similarity
JavaScript library for measuring the degree of phonetic similarity between texts
metaphone phonetics text-phonetics-similarity text-similarity text-sound
Last synced: 11 Nov 2024
https://github.com/chiragjn/short-text-similarity
Short Text Similarity as described in https://dl.acm.org/citation.cfm?id=2806475
semantic-similarity short-text-semantic-similarity sts text-similarity word-embeddings word-vectors
Last synced: 06 Dec 2024
https://github.com/stephangeorg/trigram-similarity
Determining the similarity of alphanumeric text based on trigram matching.
ngrams postgres similarity text-similarity trigram trigrams
Last synced: 12 Nov 2024
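
As a rough illustration of trigram matching (not code from the repository above), the sketch below scores two strings by the overlap of their character trigrams. The function names `trigrams` and `trigram_similarity`, the word padding (two spaces before, one after, in the style of PostgreSQL's pg_trgm), and the Jaccard-style ratio are all assumptions chosen for the example.

```python
import re


def trigrams(text: str) -> set[str]:
    """Return the set of character trigrams of the alphanumeric words in text."""
    grams: set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Assumed padding: two leading spaces and one trailing space,
        # so that word boundaries also contribute trigrams.
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 1.0  # assumption: two empty inputs count as identical
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(trigram_similarity("cat", "car"))
    print(trigram_similarity("hello world", "hello wordl"))
```

Under these assumptions the score is 1.0 for identical strings and 0.0 for strings that share no trigram.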
https://github.com/osainz59/aqgsas
Automatic Question Generation and Short Answer Scoring system
deep-learning exercise-generator machine-learning question-answering question-generation reading-comprehension text-similarity
Last synced: 13 Apr 2025
https://github.com/rhnvrm/textsimilarity
Go package that computes the similarity between two text documents using cosine similarity and TF-IDF, along with various other useful things.
cosine-similarity golang google keyword-extraction nlp similarity text-similarity tf-idf
Last synced: 11 Apr 2025
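
To make the combination of TF-IDF and cosine similarity concrete, here is a minimal pure-Python sketch. It is not the Go package's implementation: the helper names (`tokenize`, `tfidf_vectors`, `cosine`), the whitespace tokenizer, and the smoothed IDF formula `log((1 + n) / (1 + df)) + 1` are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption), which keeps every weight positive.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    docs = ["the cat sat on the mat", "the cat lay on the rug", "dogs bark loudly"]
    vecs = tfidf_vectors(docs)
    print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Documents that share no terms score 0.0, and identical documents score 1.0 up to floating-point error.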
https://github.com/kvarun07/asag-gt
Multi-Relational Graph Transformer for Automatic Short Answer Grading (NAACL 2022)
asag attention automatic-short-answer-grading deep-learning graph graph-learning graph-similarity knowledge-graphs mitigate multi-relational nlp python text-similarity transformer
Last synced: 10 Apr 2025
https://github.com/themaximalist/vectordb.js
Simple in-memory vector database for text similarity in Node.js
embeddings feature-extraction hnsw nodejs openai text-similarity vectordb
Last synced: 03 Jan 2025
https://github.com/rosette-api-community/text-embeddings-sample
A small Python example showing how to compute the similarity between word embeddings returned from the Rosette API's new /text-embedding endpoint.
machine-learning natural-language-processing nlp python text-embedding text-extraction text-similarity word-similarity
Last synced: 28 Feb 2025
https://github.com/maastrichtu-ids/docona
DoConA (Document Content and Citation Analysis Pipeline) is an open source, configurable and extensible Python tool to analyse the level of agreement between the citation network of a set of textual documents and the textual similarity of these documents.
citation-analysis citation-network text-similarity word-embeddings
Last synced: 14 Feb 2025
https://github.com/eren23/semantic-code-searcher
Basic Python example of semantic code search across GitHub profiles.
bert bert-embeddings code-similarity cosine-distance cosine-similarity embeddings large-language-models llm neural-network openai roberta search search-algorithm semantic sentence-embeddings sentence-transformers text-similarity
Last synced: 26 Mar 2025
https://github.com/kamilhan-karaismailoglu/tversky-text-similarity-calculation-program
Program to Calculate Text Similarity ratio using Tversky Index, Sørensen–Dice coefficient and Jaccard Index. Made with C#. This program was written for the Algorithms and Programming lecture.
jaccard-similarity sorensen-dice text-similarity tversky
Last synced: 05 Mar 2025
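
The three measures named in the entry above are closely related: the Tversky index of sets X and Y is |X ∩ Y| / (|X ∩ Y| + α|X − Y| + β|Y − X|), which reduces to the Jaccard index for α = β = 1 and to the Sørensen–Dice coefficient for α = β = 0.5. The Python sketch below illustrates this on word sets. It is not a port of the C# program; the function name `tversky` and the handling of empty inputs are assumptions.

```python
from typing import Iterable


def tversky(x: Iterable[str], y: Iterable[str], alpha: float, beta: float) -> float:
    """Tversky index of two collections, treated as sets."""
    sx, sy = set(x), set(y)
    if not sx and not sy:
        return 1.0  # assumption: two empty sets count as identical
    common = len(sx & sy)
    denom = common + alpha * len(sx - sy) + beta * len(sy - sx)
    return common / denom if denom else 0.0


if __name__ == "__main__":
    a = "the quick brown fox".split()
    b = "the quick red fox".split()
    print("Jaccard:", tversky(a, b, 1.0, 1.0))  # alpha = beta = 1
    print("Dice:   ", tversky(a, b, 0.5, 0.5))  # alpha = beta = 0.5
```

Unequal α and β make the index asymmetric, weighting elements unique to one text more heavily than those unique to the other.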
https://github.com/ikram-maulana/text-similarity-api
🚀 Text Similarity API is a free and open source API for comparing text similarity
api-rest next-13 similarity-api text-similarity
Last synced: 26 Feb 2025
https://github.com/a-poor/jarowinkler
An implementation of the Jaro-Winkler string similarity algorithm in Go.
jaro-winkler nlp text-similarity
Last synced: 27 Mar 2025
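
For readers unfamiliar with the algorithm, the following Python sketch shows one common formulation of Jaro-Winkler similarity. It is not a translation of the Go package above. The function names and the defaults (prefix scale `p = 0.1`, prefix capped at 4 characters) follow the usual textbook description and should be treated as assumptions.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: based on matching characters and transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and close enough in position.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


if __name__ == "__main__":
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
```

The prefix boost makes the measure well suited to short strings such as names, where typos are more likely later in the word.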
https://github.com/md-emon-hasan/distilbert-model-with-hf-transformer
📝 DistilBERT, a lightweight Transformer model from Hugging Face, for various NLP tasks without requiring custom fine-tuning or datasets.
ai-interface conversational-ai deep-learning distilbert huggingface-transformers language-model machine-learning named-entity-recognition nlp-models pretrained-models question-answering sentiment-analysis text-classification text-similarity text-summarization transformer
Last synced: 05 Apr 2025
https://github.com/somjit101/nlp-star-trek-scripts
Uses digitized scripts of the 'Star Trek' science fiction series to perform NLP tasks and answer questions about topic modelling, character properties, and the plot as a whole.
bag-of-words bert bert-embeddings cosine-similarity data-mining gensim-doc2vec json lda-model natural-language-processing nlp nlp-machine-learning similarity-matrix star-trek text-mining text-similarity topic-modeling
Last synced: 06 Mar 2025
https://github.com/tomlin7/ai-research-assistant
Semantic document search system with pgvector and PGAI
ai assistant document-search machine-learning natural-language-processing ollama pgai pgvector postgres postgresql research-assistant semantic-search sentence-embeddings sentence-transformers sentiment-analysis summarization text-similarity text-summarization
Last synced: 26 Feb 2025
https://github.com/antonio-f/find-duplicate-questions
Find duplicate questions on StackOverflow by their embeddings. From the Natural Language Processing course - Coursera's Advanced Machine Learning specialization.
cosine-similarity discounted-cumulative-gain embeddings gensim natural-language-processing nlp nltk scikit-learn starspace text-similarity word2vec
Last synced: 30 Mar 2025
https://github.com/soumyagautam/text-similarity
Flask-based software that finds the most similar texts in a list.
bert bert-model flask flask-application roberta roberta-model software text text-similarity
Last synced: 20 Feb 2025
https://github.com/maemresen/similarity-detection
An example project that detects cheating in an exam using similarity detection.
data-mining data-science java similarity-analysis text-similarity
Last synced: 01 Mar 2025
https://github.com/chris-santiago/stringcluster
A Scikit-Learn style deduper.
dedupe deduplication scikit-learn text-processing text-similarity transformer
Last synced: 02 Apr 2025
https://github.com/vikrantdeshpande09876/masterize_hospital_entities
The goal was to maintain a ‘single version of truth’ for associated entities across the entire organization’s data sources. The RecordLinkage package is wrapped in a recursive data pipeline that de-duplicates records and generates a master set. The similarity between two text strings determines whether they are a probabilistic match.
c data-wrangling divide-and-conquer-approach etl-pipeline levenshtein-distance linkages machine-learning master-data-management pyspark python r text-similarity visualization
Last synced: 20 Feb 2025
https://github.com/tanyachutani/siamese-network-on-text-data
Siamese network on the Quora Question Pairs similarity data, built with Keras
cnn-on-text deep-learning gru keras lstm one-shot-learning quora-question-pairs siamese-network siamese-neural-network text-similarity word2vec
Last synced: 27 Feb 2025
https://github.com/mintone-creators/string-proximity
String-proximity is a high-performance string comparison library built in Rust, offering efficient similarity and proximity functions.
levenstein-distance string-comparison string-manipulation text-similarity
Last synced: 09 Mar 2025
https://github.com/ycatsh/past-pilot
Manage school resources and navigate past papers with ease.
flask machine-learning nlp nlp-machine-learning python text-similarity
Last synced: 23 Feb 2025
https://github.com/atinyshrimp/tripadvisor-recommendation-ml-nlp
Machine Learning and NLP models for improving text-based recommendations on TripAdvisor, using BM25, TF-IDF, embeddings, and a Hybrid approach.
bm25 data-science embeddings kaggle-dataset machine-learning nlp nlp-machine-learning python recommandation-system sentence-embeddings sentence-transformers text-similarity tripadvisor
Last synced: 27 Mar 2025
https://github.com/kaoutarmi/contextual-text-similarity
Contextual Text Similarity with Sentence-BERT is a project for measuring the similarity between sentences using Sentence-BERT and cosine similarity. It retrieves the sentences in a dataset that are contextually closest to a given sentence.
cosine-similarity huggingface machine-learning nlp python semantic-search sentence-bert text-similarity
Last synced: 07 Mar 2025
https://github.com/jokerdii/nlp-projects
Built text classifiers by fine-tuning pre-trained BERT models
albert bert nlp sentiment-analysis text-similarity
Last synced: 24 Feb 2025
https://github.com/khalidbelk/jaccard
Calculate the similarity index between two texts
nlp ocaml similarity-analysis text-similarity
Last synced: 16 Feb 2025