Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with gensim

A curated list of projects in awesome lists tagged with gensim.

https://github.com/dipanjans/text-analytics-with-python

Learn how to process, classify, cluster, and summarize text data, and to understand its syntax, semantics and sentiment, with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python", published by Apress/Springer.

clustering gensim natural-language natural-language-processing nltk pattern python scikit-learn semantic sentiment sentiment-analysis spacy stanford-nlp text-analytics text-classification text-summarization

Last synced: 26 Sep 2024

https://github.com/kavgan/nlp-in-practice

Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.

gensim machine-learning natural-language-processing nlp text-classification text-mining tf-idf word2vec

Last synced: 30 Sep 2024

https://github.com/piskvorky/gensim-data

Data repository for pretrained NLP models and NLP corpora.

corpora dataset gensim glove-model lda-model lsi-model pretrained-models word2vec-model

Last synced: 03 Aug 2024

https://github.com/ThoughtRiver/lmdb-embeddings

Fast word vectors with little memory usage in Python

embeddings fasttext gensim glove lmdb magnitude memory speed text vectors word word2vec

Last synced: 07 Aug 2024

https://github.com/30lm32/ml-projects

ML-based projects such as Spam Classification, Time Series Analysis, and Text Classification using Random Forest, Deep Learning, Bayesian methods, and XGBoost in Python

ab-testing deep-learning docker gensim geolocation imbalanced-data kdtree keras lstm-neural-networks machine-learning mlflow nlp random-forest spam-classification svm tensorboard tensorflow text-classification timeseries-analysis word2vec

Last synced: 03 Aug 2024

https://github.com/davidberenstein1957/concise-concepts

This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.

few-shot-classifcation gensim hacktoberfest machine-learning natural-language-processing ner nlp spacy

Last synced: 30 Sep 2024

https://github.com/akoksal/Turkish-Word2Vec

Pre-trained Word2Vec Model for Turkish

gensim nlp turkish word2vec

Last synced: 02 Aug 2024

https://github.com/giacbrd/ShallowLearn

An experiment about re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText) with some additional exclusive features and nice API. Written in Python and fully compatible with Scikit-learn.

fasttext gensim machine-learning neural-network online-learning scikit-learn shallow-learning supervised-learning text-classification text-mining word-embeddings word2vec

Last synced: 07 Aug 2024

https://github.com/akutuzov/webvectors

Web-ify your word2vec: framework to serve distributional semantic models online

distributional-semantics embedding-models flask gensim web-app word2vec

Last synced: 07 Aug 2024

https://github.com/platisd/duplicate-code-detection-tool

A simple Python3 tool to detect similarities between files within a repository

code-duplication gensim nlp

Last synced: 31 Jul 2024

https://github.com/alisonmitchell/stock-prediction

Technical and sentiment analysis to predict the stock market with machine learning models based on historical time series data and news article sentiment collected using APIs and web scraping.

beautifulsoup bert gensim huggingface keras-tensorflow machine-learning matplotlib mplfinance nlp nltk numpy pandas plotly python scikit-learn scipy seaborn spacy textblob yfinance

Last synced: 30 Sep 2024

https://github.com/dipanjans/nlp_workshop_odsc_europe20

Extensive tutorials for the Advanced NLP Workshop at the Open Data Science Conference Europe 2020. We will leverage machine learning, deep learning and deep transfer learning to learn and solve popular NLP tasks including NER, Classification, Recommendation / Information Retrieval, Summarization, Language Translation, Q&A and Topic Models.

deep-learning gensim jupyter-notebook machine-learning natural-language-processing nltk python pytorch scikit-learn spacy tensorflow transfer-learning transformers

Last synced: 30 Sep 2024

https://github.com/eellak/nlpbuddy

A text analysis application for performing common NLP tasks through a web dashboard interface and an API

fasttext gensim natural-language-processing spacy text-analysis text-classification

Last synced: 30 Sep 2024

https://github.com/ibrahimsharaf/doc2vec

Long(er) text representation and classification using Doc2Vec embeddings

doc2vec gensim nlp-machine-learning scikit-learn sentiment-analysis text-classification

Last synced: 07 Aug 2024

https://github.com/minasmz/Persian-Summarization

Statistical and semantic text summarizer for the Persian language

doc2vec-model gensim nlp persian-language persian-nlp text-summarization textrank-algorithm

Last synced: 04 Aug 2024

https://github.com/annamalai-nr/subgraph2vec_gensim

Contains the code (and working vm setup) for our KDD MLG 2016 paper titled: "subgraph2vec: Learning Distributed Representations of Rooted Sub-graphs from Large Graphs"

deep-learning gensim graph-kernels kernel-methods word2vec

Last synced: 31 Jul 2024

https://github.com/js1010/cusim

Superfast CUDA implementation of Word2Vec and Latent Dirichlet Allocation (LDA)

cuda gensim gpu lda topic-modeling w2v word-embedding

Last synced: 05 Aug 2024

https://github.com/sarthakjshetty/pyresearchinsights

End-to-end NLP tool to analyze research publications. Published in Ecology & Evolution 2021.

gensim natural-language-processing nlp python scientific-analysis spacy text-mining

Last synced: 28 Sep 2024

https://github.com/pelican-plugins/similar-posts

Pelican plugin to list similar posts to articles, based on a vector space model.

blog gensim pelican pelican-plugins python similarity tags tf-idf

Last synced: 27 Sep 2024

https://github.com/autonomio/signs

A suite of tools for text preparation, vectorization and processing for deep learning with Keras.

embeddings fasttext gensim glove keras spacy word2vec

Last synced: 30 Sep 2024

https://github.com/snoop2head/instagram_hashtag_analysis

📷 Crawl and analyze Instagram hashtag data: KoNLPy to gensim Word2Vec & scikit-learn TF-IDF

adjective gensim gensim-word2vec instagram-hashtag-analysis konlpy natural-language-processing noun scikit-learn scikitlearn tf-idf word2vec

Last synced: 01 Aug 2024

https://github.com/Rajan-sust/WikiTextCorpusDownloader

A Language Independent Wikipedia Text Corpus Downloader

gensim nlp python3 tensorflow wikipedia

Last synced: 31 Jul 2024

https://github.com/applenob/nlp_projects

My NLP projects notebook

gensim ipynb nlp notebook rnn rnn-namer

Last synced: 27 Sep 2024

https://github.com/brucewlee/wiki-text-summarizer-keyword-extractor

Uses Beautiful Soup to read Wiki pages, Gensim to summarize, and NLTK to process, and extracts keywords based on entropy, all in one piece of code. A simple but effective solution to extractive text summarization.

gensim gensim-model keyword-extraction keyword-identification nltk simple-summarizer text-mining text-summarization text-summarizer wikipedia-summarizer

Last synced: 02 Oct 2024

https://github.com/rmitsch/docker-alpine-python-nlp

Alpine-based Dockerfile with Python, OpenBLAS, NumPy (linked against OpenBLAS), gensim, spaCy, NLTK and pattern.

alpine dockerfile gensim nlp numpy openblas python

Last synced: 02 Oct 2024

https://github.com/bees4ever/seaqube

Semantic Quality Benchmark for Word Embeddings, i.e. Natural Language Models in Python. Acronym `SeaQuBe` or `seaqube`.

augmentation benchmark fasttext gensim nlp spacy spacy-nlp wordembeddings

Last synced: 01 Oct 2024

https://github.com/christoph/robics

Automatic detection of robust parametrizations for LDA and NMF. Compatible with scikit-learn and gensim.

gensim lda natural-language-processing nmf robust-parametrizations scikit-learn topic-modeling topic-models

Last synced: 30 Sep 2024

https://github.com/ayushsubedi/choto

CLI tool to generate a summary of news/articles right on your terminal. Also a pip package.

articles bert choto cli click gensim news pip python spacy summary

Last synced: 30 Sep 2024

https://github.com/bkataru/search-m8

A service that scrapes Google and YouTube search results to transcribe and summarize articles and videos. Made for Inventure Hackathon 2020 by team Etwas.

beautifulsoup4 data-mining gensim machine-learning machine-learning-algorithms natural-language-processing scraping summarization sumy transcription web-scraping youtube-search

Last synced: 28 Sep 2024

https://github.com/toshimelonhead/Springboard-Berkshire

Topic model analysis of Berkshire Hathaway annual letters (Completed Capstone Project #2)

gensim nlp spacy springboard textacy topic-modeling

Last synced: 13 Aug 2024

https://github.com/tomhalloin/Springboard-Berkshire

Topic model analysis of Berkshire Hathaway annual letters (Completed Capstone Project #2)

gensim nlp spacy springboard textacy topic-modeling

Last synced: 03 Aug 2024

https://github.com/kavayk29/quora-duplicate-question-pair

This project improves information retrieval by detecting duplicate question pairs in the Quora dataset using data exploration, text preprocessing, feature engineering, and models like Random Forest and LSTM, aiming to streamline question-answering.

beautifulsoup4 bilstm gensim keras lstm matplotlib numpy pandas pytorch random-forest seaborn sklearn tensorflow xgboost

Last synced: 26 Sep 2024