Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with gensim
A curated list of projects in awesome lists tagged with gensim.
https://github.com/piskvorky/gensim
Topic Modelling for Humans
data-mining data-science document-similarity fasttext gensim information-retrieval machine-learning natural-language-processing neural-network nlp python topic-modeling word-embeddings word-similarity word2vec
Last synced: 29 Sep 2024
https://github.com/rare-technologies/gensim
Topic Modelling for Humans
data-mining data-science document-similarity fasttext gensim information-retrieval machine-learning natural-language-processing neural-network nlp python topic-modeling word-embeddings word-similarity word2vec
Last synced: 07 Aug 2024
https://github.com/RaRe-Technologies/gensim
Topic Modelling for Humans
data-mining data-science document-similarity fasttext gensim information-retrieval machine-learning natural-language-processing neural-network nlp python topic-modeling word-embeddings word-similarity word2vec
Last synced: 04 Aug 2024
https://github.com/dipanjans/text-analytics-with-python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
clustering gensim natural-language natural-language-processing nltk pattern python scikit-learn semantic sentiment sentiment-analysis spacy stanford-nlp text-analytics text-classification text-summarization
Last synced: 26 Sep 2024
https://github.com/dipanjanS/text-analytics-with-python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
clustering gensim natural-language natural-language-processing nltk pattern python scikit-learn semantic sentiment sentiment-analysis spacy stanford-nlp text-analytics text-classification text-summarization
Last synced: 02 Aug 2024
https://github.com/plasticityai/magnitude
A fast, efficient universal vector embedding utility package.
embeddings fast fasttext gensim glove machine-learning machine-learning-library memory-efficient natural-language-processing nlp python vectors word-embeddings word2vec
Last synced: 30 Sep 2024
https://github.com/explosion/sense2vec
🦆 Contextually-keyed word vectors
gensim gensim-word2vec machine-learning natural-language-processing nlp python sense2vec spacy word2vec
Last synced: 30 Sep 2024
https://github.com/kavgan/nlp-in-practice
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.
gensim machine-learning natural-language-processing nlp text-classification text-mining tf-idf word2vec
Last synced: 30 Sep 2024
https://github.com/piskvorky/gensim-data
Data repository for pretrained NLP models and NLP corpora.
corpora dataset gensim glove-model lda-model lsi-model pretrained-models word2vec-model
Last synced: 03 Aug 2024
https://github.com/oborchers/Fast_Sentence_Embeddings
Compute Sentence Embeddings Fast!
cython document-similarity embeddings fasttext fse gensim gensim-model maxpooling sentence-embeddings sentence-representation sentence-similarity sif swem usif word2vec-model wordembedding
Last synced: 02 Aug 2024
https://github.com/5hirish/adam_qas
ADAM - A Question Answering System. Inspired by IBM Watson.
adam elasticsearch gensim natural-language-processing pandas python question-answering scikit-learn spacy spacy-extension wikipedia
Last synced: 25 Sep 2024
https://github.com/AICoE/log-anomaly-detector
Log Anomaly Detection - Machine learning to detect abnormal event logs
aiops anomaly-detection artificial-intelligence gensim kubernetes log machine-learning-algorithms som stream-processing word2vec
Last synced: 01 Aug 2024
https://github.com/30lm32/ml-projects
ML based projects such as Spam Classification, Time Series Analysis, Text Classification using Random Forest, Deep Learning, Bayesian, Xgboost in Python
ab-testing deep-learning docker gensim geolocation imbalanced-data kdtree keras lstm-neural-networks machine-learning mlflow nlp random-forest spam-classification svm tensorboard tensorflow text-classification timeseries-analysis word2vec
Last synced: 03 Aug 2024
https://github.com/benedekrozemberczki/GEMSEC
The TensorFlow reference implementation of 'GEMSEC: Graph Embedding with Self Clustering' (ASONAM 2019).
clustering community-detection deepwalk deezer embedding facebook gemsec gensim graph-embedding implicit-factorization m-nmf machine-learning matrix-factorization network-embedding neural-network node2vec semisupervised-learning tensorflow unsupervised-learning word2vec
Last synced: 31 Jul 2024
https://github.com/davidberenstein1957/concise-concepts
This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.
few-shot-classifcation gensim hacktoberfest machine-learning natural-language-processing ner nlp spacy
Last synced: 30 Sep 2024
https://github.com/akoksal/Turkish-Word2Vec
Pre-trained Word2Vec Model for Turkish
Last synced: 02 Aug 2024
https://github.com/benedekrozemberczki/Splitter
A Pytorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).
clustering community-detection deep-learning deep-neural-network deepwalk ego-splitting factorization gensim graph-embedding graph-neural-network graph-representation-learning implicit-factorization machine-learning network-embedding node-embedding node2vec overlapping-community-detection pytorch word-vector word2vec
Last synced: 01 Aug 2024
https://github.com/giacbrd/ShallowLearn
An experiment about re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText) with some additional exclusive features and nice API. Written in Python and fully compatible with Scikit-learn.
fasttext gensim machine-learning neural-network online-learning scikit-learn shallow-learning supervised-learning text-classification text-mining word-embeddings word2vec
Last synced: 07 Aug 2024
https://github.com/giacbrd/shallowlearn
An experiment about re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText) with some additional exclusive features and nice API. Written in Python and fully compatible with Scikit-learn.
fasttext gensim machine-learning neural-network online-learning scikit-learn shallow-learning supervised-learning text-classification text-mining word-embeddings word2vec
Last synced: 29 Sep 2024
https://github.com/akutuzov/webvectors
Web-ify your word2vec: framework to serve distributional semantic models online
distributional-semantics embedding-models flask gensim web-app word2vec
Last synced: 07 Aug 2024
https://github.com/benedekrozemberczki/role2vec
A scalable Gensim implementation of "Learning Role-based Graph Embeddings" (IJCAI 2018).
deep-learning deepwalk gensim graph-embedding graph-neural-network graph-wavelet implicit-factorization machine-learning network-embedding network-science node-embedding node2vec pytorch representation-learning research sklearn struc2vec tensorflow weisfeiler-lehman word2vec
Last synced: 29 Sep 2024
https://github.com/platisd/duplicate-code-detection-tool
A simple Python3 tool to detect similarities between files within a repository
Last synced: 31 Jul 2024
https://github.com/benedekrozemberczki/MUSAE
The reference implementation of "Multi-scale Attributed Node Embedding". (Journal of Complex Networks 2021)
aane asne asonam attributed-embedding deep-learning deepwalk embedding gemsec gensim graph-embedding graph-neural-network implicit-factorization musae network-analysis network-embedding node-embedding node2vec tadw walklets word2vec
Last synced: 01 Aug 2024
https://github.com/alisonmitchell/stock-prediction
Technical and sentiment analysis to predict the stock market with machine learning models based on historical time series data and news article sentiment collected using APIs and web scraping.
beautifulsoup bert gensim huggingface keras-tensorflow machine-learning matplotlib mplfinance nlp nltk numpy pandas plotly python scikit-learn scipy seaborn spacy textblob yfinance
Last synced: 30 Sep 2024
https://github.com/dipanjans/nlp_workshop_odsc_europe20
Extensive tutorials for the Advanced NLP Workshop in Open Data Science Conference Europe 2020. We will leverage machine learning, deep learning and deep transfer learning to learn and solve popular tasks using NLP including NER, Classification, Recommendation / Information Retrieval, Summarization, Language Translation, Q&A and Topic Models.
deep-learning gensim jupyter-notebook machine-learning natural-language-processing nltk python pytorch scikit-learn spacy tensorflow transfer-learning transformers
Last synced: 30 Sep 2024
https://github.com/eellak/nlpbuddy
A text analysis application for performing common NLP tasks through a web dashboard interface and an API
fasttext gensim natural-language-processing spacy text-analysis text-classification
Last synced: 30 Sep 2024
https://github.com/benedekrozemberczki/diff2vec
Reference implementation of Diffusion2Vec (Complenet 2018) built on Gensim and NetworkX.
complex-networks deep-learning deepwalk diff2vec diffusion embedding embeddings factorization gensim graph-embedding implicit-factorization machine-learning network-embedding neural-network node-embedding node2vec semisupervised-learning struc2vec tensorflow unsupervised-learning
Last synced: 02 Aug 2024
https://github.com/ibrahimsharaf/doc2vec
📓 Long(er) text representation and classification using Doc2Vec embeddings
doc2vec gensim nlp-machine-learning scikit-learn sentiment-analysis text-classification
Last synced: 07 Aug 2024
https://github.com/benedekrozemberczki/walklets
A lightweight implementation of Walklets from "Don't Walk Skip! Online Learning of Multi-scale Network Embeddings" (ASONAM 2017).
deep-learning deepwalk dimensionality-reduction dont-walk-skip edge-prediction embedding gensim graph-convolution graph-embedding graph-mining graph-neural-networks graphlet machine-learning multiscale node-classification node-embedding node2vec walklet word-embedding word2vec
Last synced: 01 Aug 2024
https://github.com/benedekrozemberczki/SINE
A PyTorch Implementation of "SINE: Scalable Incomplete Network Embedding" (ICDM 2018).
aane asne deep-learning deepwalk dimensionality-reduction gensim graph-embedding grarep machine-learning network-embedding networkx node-embedding node2vec pytorch sklearn tadw tensorflow torch unsupervised-learning walklets
Last synced: 01 Aug 2024
https://github.com/benedekrozemberczki/GraRep
A SciPy implementation of "GraRep: Learning Graph Representations with Global Structural Information" (WWW 2015).
aane asne deep-learning deepwalk embedding gensim graph-embedding graph2vec grarep matrix-factorization network-embedding nmf node-embedding node2vec role2vec singular-value-decomposition svd tadw walklets word2vec
Last synced: 04 Aug 2024
https://github.com/minasmz/Persian-Summarization
Statistical and Semantic Text Summarizer for the Persian Language
doc2vec-model gensim nlp persian-language persian-nlp text-summarization textrank-algorithm
Last synced: 04 Aug 2024
https://github.com/annamalai-nr/subgraph2vec_gensim
Contains the code (and working vm setup) for our KDD MLG 2016 paper titled: "subgraph2vec: Learning Distributed Representations of Rooted Sub-graphs from Large Graphs"
deep-learning gensim graph-kernels kernel-methods word2vec
Last synced: 31 Jul 2024
https://github.com/js1010/cusim
Superfast CUDA implementation of Word2Vec and Latent Dirichlet Allocation (LDA)
cuda gensim gpu lda topic-modeling w2v word-embedding
Last synced: 05 Aug 2024
https://github.com/silviatti/topic-model-diversity
A collection of topic diversity measures for topic modeling
evaluation-metrics gensim latent-dirichlet-allocation lda topic-diversity topic-diversity-measures topic-model topic-modeling topic-modeling-analysis topic-models
Last synced: 02 Aug 2024
https://github.com/sarthakjshetty/pyresearchinsights
End-to-end NLP tool to analyze research publications. Published in Ecology & Evolution 2021.
gensim natural-language-processing nlp python scientific-analysis spacy text-mining
Last synced: 28 Sep 2024
https://github.com/rtrevinnoc/FUTURE
A private, free, open-source search engine built on a P2P network
css3 flask flask-application gensim glove glove-embeddings glove-vectors hnswlib html5 javascript js json lmdb machine-learning mongodb natural-language-processing natural-language-understanding python python3 search-engine
Last synced: 02 Aug 2024
https://github.com/pelican-plugins/similar-posts
Pelican plugin to list similar posts to articles, based on a vector space model.
blog gensim pelican pelican-plugins python similarity tags tf-idf
Last synced: 27 Sep 2024
https://github.com/autonomio/signs
A suite of tools for text preparation, vectorization and processing for deep learning with Keras.
embeddings fasttext gensim glove keras spacy word2vec
Last synced: 30 Sep 2024
https://github.com/snoop2head/instagram_hashtag_analysis
📷 Crawl and Analyze Instagram Hashtag Data: KoNLPY to gensim word2Vec & scikit-learn TF-IDF
adjective gensim gensim-word2vec instagram-hashtag-analysis konlpy natural-language-processing noun scikit-learn scikitlearn tf-idf word2vec
Last synced: 01 Aug 2024
https://github.com/Rajan-sust/WikiTextCorpusDownloader
A Language Independent Wikipedia Text Corpus Downloader
gensim nlp python3 tensorflow wikipedia
Last synced: 31 Jul 2024
https://github.com/benedekrozemberczki/NestedSubtreeHash
A distributed implementation of "Nested Subtree Hash Kernels for Large-Scale Graph Classification Over Streams" (ICDM 2012).
data-mining data-science deepwalk distributed-machine-learning feature-extraction gensim graph-classification graph-kernel graph-mining hashing large-scale-learning machine-learning multi-scale node2vec representation-learning streaming-data streaming-processing word2vec
Last synced: 31 Jul 2024
https://github.com/brucewlee/wiki-text-summarizer-keyword-extractor
Uses Beautiful Soup to read Wiki pages, Gensim to summarize, NLTK to process, and extracts keywords based on entropy: everything in one beautiful code. A simple but effective solution to extractive text summarization.
gensim gensim-model keyword-extraction keyword-identification nltk simple-summarizer text-mining text-summarization text-summarizer wikipedia-summarizer
Last synced: 02 Oct 2024
https://github.com/rmitsch/docker-alpine-python-nlp
Alpine-based Dockerfile with Python, OpenBLAS, NumPy (linked against OpenBLAS), gensim, spaCy, NLTK and pattern.
alpine dockerfile gensim nlp numpy openblas python
Last synced: 02 Oct 2024
https://github.com/bees4ever/seaqube
Semantic Quality Benchmark for Word Embeddings, i.e. Natural Language Models in Python. Acronym `SeaQuBe` or `seaqube`.
augmentation benchmark fasttext gensim nlp spacy spacy-nlp wordembeddings
Last synced: 01 Oct 2024
https://github.com/christoph/robics
Automatic detection of robust parametrizations for LDA and NMF. Compatible with scikit-learn and gensim.
gensim lda natural-language-processing nmf robust-parametrizations scikit-learn topic-modeling topic-models
Last synced: 30 Sep 2024
https://github.com/banjtheman/topicblob
Text to Topics
gensim machine-learning nlp topic-modeling
Last synced: 28 Sep 2024
https://github.com/bkataru/search-m8
A service that scrapes Google and YouTube search results to transcribe and summarize articles and videos. Made for Inventure Hackathon 2020 by team Etwas.
beautifulsoup4 data-mining gensim machine-learning machine-learning-algorithms natural-language-processing scraping summarization sumy transcription web-scraping youtube-search
Last synced: 28 Sep 2024
https://github.com/toshimelonhead/Springboard-Berkshire
Topic model analysis of Berkshire Hathaway annual letters (Completed Capstone Project #2)
gensim nlp spacy springboard textacy topic-modeling
Last synced: 13 Aug 2024
https://github.com/tomhalloin/Springboard-Berkshire
Topic model analysis of Berkshire Hathaway annual letters (Completed Capstone Project #2)
gensim nlp spacy springboard textacy topic-modeling
Last synced: 03 Aug 2024
https://github.com/kavayk29/quora-duplicate-question-pair
This project improves information retrieval by detecting duplicate question pairs in the Quora dataset using data exploration, text preprocessing, feature engineering, and models like Random Forest and LSTM, aiming to streamline question-answering.
beautifulsoup4 bilstm gensim keras lstm matplotlib numpy pandas pytorch random-forest seaborn sklearn tensorflow xgboost
Last synced: 26 Sep 2024
https://github.com/netcodez/book-recommendation-system
Book Recommendation System - NLP
dendrogram gensim natural-language-processing nltk python recommendation-system scipy
Last synced: 01 Oct 2024
https://github.com/jonad/identify_question_pair_with_the_same_intent
Identify question pairs with the same intent using a Convolutional Neural Network
cnn-keras convolutional-neural-networks gensim nltk-library numpy-library pandas python-3-5 word2vec-algorithm xgboost-model
Last synced: 01 Oct 2024