https://github.com/sarahm44/crypto-sentiment-analysis
Analysis of the sentiment of the latest news articles on Bitcoin and Ethereum using sentiment analysis, natural language processing and named entity recognition.
bitcoin bitcoin-sentiment crypto-sentiment ethereum ethereum-sentiment fintech named-entity-recognition natural-language-processing newsapi sentiment-analysis
- Host: GitHub
- URL: https://github.com/sarahm44/crypto-sentiment-analysis
- Owner: sarahm44
- Created: 2022-05-15T09:12:00.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2022-08-10T03:02:45.000Z (almost 3 years ago)
- Last Synced: 2025-05-06T21:11:17.694Z (2 months ago)
- Topics: bitcoin, bitcoin-sentiment, crypto-sentiment, ethereum, ethereum-sentiment, fintech, named-entity-recognition, natural-language-processing, newsapi, sentiment-analysis
- Language: Jupyter Notebook
- Homepage:
- Size: 2.69 MB
- Stars: 3
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Crypto Sentiment Analysis

## Table of Contents
- [Overview](#overview)
- [Sentiment Analysis](#sentiment-analysis)
  * [Bitcoin Sentiment](#bitcoin-sentiment)
  * [Ethereum Sentiment](#ethereum-sentiment)
- [Natural Language Processing](#natural-language-processing)
  * [Tokenize](#tokenize)
  * [N-grams](#n-grams)
  * [Word Clouds](#word-clouds)
  * [Named Entity Recognition](#named-entity-recognition)

## Overview
In this repository, I applied natural language processing (NLP) to gauge the sentiment of the latest news articles about Bitcoin and Ethereum. I also applied fundamental NLP techniques to surface other factors that may relate to coin prices, such as common words and phrases and the organizations and other entities mentioned in the articles.
I completed the following tasks:
1. Sentiment Analysis
2. Natural Language Processing
3. Named Entity Recognition

The full analysis is in this [Jupyter Lab notebook](https://github.com/sarahm44/crypto-sentiment-analysis/blob/main/crypto_sentiment_1.ipynb).
## Sentiment Analysis
I used the [newsapi](https://newsapi.org/) to pull the latest news articles for Bitcoin and Ethereum and created a DataFrame of sentiment scores for each coin.
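
A minimal sketch of this step, not the notebook's actual code. It assumes the `newsapi-python` client and NLTK's VADER analyzer (suggested by the positive and compound scores discussed in this section, but not confirmed here). The function names `fetch_articles`, `vader_scorer`, and `sentiment_rows` are hypothetical.

```python
def fetch_articles(query, api_key):
    """Fetch English-language articles matching `query` from News API."""
    # Imported lazily so the scoring helper below works without the package.
    from newsapi import NewsApiClient  # assumption: pip install newsapi-python

    client = NewsApiClient(api_key=api_key)
    response = client.get_everything(q=query, language="en", sort_by="relevancy")
    return response["articles"]


def vader_scorer():
    """Return a function mapping text to VADER polarity scores (assumed scorer)."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)
    return SentimentIntensityAnalyzer().polarity_scores


def sentiment_rows(articles, scorer):
    """Build one row of sentiment scores per article that has text content."""
    rows = []
    for article in articles:
        text = article.get("content") or ""
        if not text:
            continue  # skip articles with no body text
        scores = scorer(text)
        rows.append({
            "date": (article.get("publishedAt") or "")[:10],
            "text": text,
            "compound": scores["compound"],
            "positive": scores["pos"],
            "negative": scores["neg"],
            "neutral": scores["neu"],
        })
    return rows
```

With a valid API key, `pd.DataFrame(sentiment_rows(fetch_articles("bitcoin", api_key), vader_scorer()))` would give one row per article, and calling `.describe()` on that frame gives the mean and max scores for each coin.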
### Bitcoin Sentiment
I created the Bitcoin sentiment scores dataframe:

See Bitcoin sentiment below:

### Ethereum Sentiment
I created the Ethereum sentiment scores dataframe:

See Ethereum sentiment as follows:

Some observations:
* Ethereum had the higher mean positive score.
* Ethereum had the higher mean compound score.
* Bitcoin had the higher max compound score.

## Natural Language Processing
In this section, I used NLTK and Python to tokenize text, find n-gram counts, and create word clouds for both coins.
### Tokenize
I used NLTK and Python to tokenize the text for each coin. I completed the following:
1. Changed each word to lowercase.
2. Removed punctuation.
3. Removed stop words.

See relevant code below:
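
A minimal sketch of these three steps, not the notebook's code. It splits on whitespace for brevity (NLTK's `word_tokenize` could be swapped in), and the helper names `load_nltk_stopwords` and `tokenize` are hypothetical.

```python
import string


def load_nltk_stopwords():
    """Load NLTK's English stop-word list (downloads the corpus if needed)."""
    # Imported lazily so `tokenize` can be used with any stop-word set.
    import nltk
    from nltk.corpus import stopwords

    nltk.download("stopwords", quiet=True)
    return set(stopwords.words("english"))


def tokenize(text, stop_words):
    """Lowercase the text, strip punctuation, and drop stop words."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    # Simplification: whitespace split instead of an NLTK tokenizer.
    return [word for word in no_punct.split() if word not in stop_words]
```

Given a DataFrame `df` with a `text` column, a "Tokens" column could then be added with `df["Tokens"] = df["text"].apply(lambda t: tokenize(t, stop_words))`.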

I then added the "Tokens" column of the tokenized text to the dataframe:

### N-grams
Next, I looked at the n-grams and word frequencies for each coin:
1. Used NLTK to produce the n-grams for N = 2.
2. Listed the top 10 words for each coin.

See below the counts of n-grams for N = 2:

See below the code and results for the top 10 words for each coin:
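
A minimal sketch of both counts, not the notebook's code. It uses only the standard library: `zip(tokens, tokens[1:])` yields the same pairs as `nltk.ngrams(tokens, n=2)`. The function names are hypothetical.

```python
from collections import Counter


def bigram_counts(tokens):
    """Count each pair of adjacent tokens (n-grams with N = 2)."""
    return Counter(zip(tokens, tokens[1:]))


def top_words(tokens, n=10):
    """Return the n most frequent tokens with their counts."""
    return Counter(tokens).most_common(n)
```

Here `tokens` is assumed to be one flat list of all tokens for a coin, for example the per-article token lists concatenated together.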

### Word Clouds
Finally, I generated a word cloud for each coin to summarize its news coverage.
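
A minimal sketch of how such a word cloud could be produced, assuming the `wordcloud` and `matplotlib` packages (the notebook's actual libraries and settings are not shown here). The function names and the image size are hypothetical.

```python
def corpus_text(token_lists):
    """Join per-article token lists into one whitespace-separated string."""
    return " ".join(token for tokens in token_lists for token in tokens)


def make_word_cloud(text):
    """Build a word cloud from a single string of text."""
    # Imported lazily so `corpus_text` works without the package installed.
    from wordcloud import WordCloud  # assumption: pip install wordcloud

    return WordCloud(width=800, height=400, background_color="white").generate(text)


def show_word_cloud(cloud, title):
    """Display a generated word cloud with matplotlib."""
    import matplotlib.pyplot as plt

    plt.imshow(cloud, interpolation="bilinear")
    plt.axis("off")
    plt.title(title)
    plt.show()
```

For example, `show_word_cloud(make_word_cloud(corpus_text(token_lists)), "Bitcoin")` would render one cloud from a coin's tokenized articles.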
See Bitcoin word cloud:

See Ethereum word cloud:

### Named Entity Recognition
In this section, I built a named entity recognition (NER) model for both coins and visualized the tags using spaCy.
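
A minimal sketch of entity extraction with spaCy, assuming the pretrained `en_core_web_sm` pipeline (the model actually used in the notebook is not stated here). The function names are hypothetical.

```python
from collections import Counter


def extract_entities(text, model_name="en_core_web_sm"):
    """Run a spaCy pipeline over the text; return the doc and its entities."""
    # Imported lazily so `label_counts` works without spaCy installed.
    import spacy

    nlp = spacy.load(model_name)  # assumption: the model has been downloaded
    doc = nlp(text)
    return doc, [(ent.text, ent.label_) for ent in doc.ents]


def label_counts(entities):
    """Count how often each entity label (ORG, PERSON, ...) appears."""
    return Counter(label for _, label in entities)
```

In a notebook, the tags can be visualized by passing the returned `doc` to `spacy.displacy.render(doc, style="ent")`.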
See Bitcoin NER:

See Ethereum NER:
