https://github.com/snoop2head/instagram_hashtag_analysis

📷 Crawl and Analyze Instagram Hashtag Data: KoNLPy to gensim Word2Vec & scikit-learn TF-IDF

adjective gensim gensim-word2vec instagram-hashtag-analysis konlpy natural-language-processing noun scikit-learn scikitlearn tf-idf word2vec


# instagram_hashtag_analysis
Crawl and Analyze Instagram Hashtag Data

## Number prefixes on file names

* 0: Crawl Instagram posts from the search results for a #keyword
* 1: Create and wrangle the dataset with pandas
* 2: KoNLPy tagging to extract Korean nouns and predicates (verbs, adjectives)
* 3: Train Word2Vec models with gensim and extract similar documents
* 4: TF-IDF implemented without the scikit-learn library
* 5: Extract similar documents with scikit-learn's `TfidfVectorizer`
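
Step 4 can be illustrated with a minimal sketch of TF-IDF using only the Python standard library. This is not the repository's code: the function name `tf_idf`, the sample `posts`, and the exact weighting (term frequency normalized by document length, unsmoothed natural-log inverse document frequency) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical hashtag tokens standing in for crawled posts.
posts = [
    ["coffee", "cafe", "seoul"],
    ["coffee", "latte", "cafe"],
    ["travel", "seoul", "coffee"],
]
scores = tf_idf(posts)
```

With this weighting, a term that occurs in every post (here `coffee`) gets a weight of zero, while a term unique to one post (here `latte`) gets the highest weight in that post.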

## What the number prefix on each file means (translated from the Korean)

* 0: #keyword search, hashtag-based Instagram crawling
* 1: Merging and manipulating the Instagram data with the pandas module
* 2: KoNLPy morphological analysis -> derive the most frequent substantives (nouns) and predicates (verbs, adjectives)
* 3: Deriving a Word2Vec model with gensim and extracting similar documents
* 4: A vanilla TF-IDF example written without the scikit-learn module
* 5: Deriving similar documents with the scikit-learn module's TF-IDF Vectorizer
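
Step 5 might look roughly like the following sketch, which vectorizes posts with scikit-learn's `TfidfVectorizer` and ranks them by cosine similarity. The sample `posts` and the helper `most_similar` are hypothetical, and the repository's own notebooks may preprocess and compare documents differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical posts: tokens already extracted and re-joined with spaces.
posts = [
    "coffee cafe seoul",
    "coffee latte cafe",
    "travel beach summer",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(posts)  # sparse matrix: (n_posts, n_terms)

# Dense (n_posts, n_posts) matrix of pairwise cosine similarities.
similarity = cosine_similarity(tfidf)


def most_similar(index):
    """Return the index of the post closest to posts[index], excluding itself."""
    row = similarity[index].copy()
    row[index] = -1.0  # mask the post's similarity to itself
    return int(row.argmax())
```

Here `most_similar(0)` picks the second post, since it shares `coffee` and `cafe` with the first, while the third post shares no terms with either.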