https://github.com/charlesyuan02/sentiment-analysis-stock-trader
finbert marketwatch reddit-api sentiment-analysis web-scraping
- Host: GitHub
- URL: https://github.com/charlesyuan02/sentiment-analysis-stock-trader
- Owner: CharlesYuan02
- License: mit
- Created: 2023-02-22T20:40:55.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-04-15T21:06:54.000Z (over 1 year ago)
- Last Synced: 2024-10-10T19:10:07.876Z (26 days ago)
- Topics: finbert, marketwatch, reddit-api, sentiment-analysis, web-scraping
- Language: Python
- Homepage:
- Size: 682 KB
- Stars: 4
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# sentiment-analysis-stock-trader
## Prerequisites
All code was written in Python 3.7.9. Please see requirements.txt for dependencies.
```
beautifulsoup4==4.12.0
pandas==1.2.3
praw==7.7.0
requests==2.28.1
snscrape==0.3.4
tqdm==4.56.2
numpy==1.24.2
scikit-learn==1.2.2
torch==2.0.0+cu117
torchaudio==2.0.1+cu117
torchvision==0.15.1+cu117
transformers==4.27.3
```
## Description of Files
### create_dataset.py
This file calls functions defined in the other files to create a dataset. This is a preliminary proof of concept, not the final dataset that we will be using for sentiment analysis.
### dataset.csv
This is the example dataset created using create_dataset.py.
### finbert.py
This file applies the pretrained FinBERT model to the example dataset.
### scrape_headlines.py
This file contains functions to scrape S&P 500 stock tickers and names from Wikipedia, and to scrape news headlines for any S&P 500 stock from Yahoo Finance and from MarketWatch.
### scrape_reddit.py
This file contains functions to scrape the titles and top comments of top posts from a specified subreddit on Reddit. Note that it requires a file called info.txt saved in the same directory. The file must contain exactly three lines: your Reddit API client ID on the first line, your Reddit API client secret on the second, and your Reddit API user agent on the third.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
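## Example: reading info.txt
The three-line info.txt layout required by scrape_reddit.py can be illustrated with a minimal sketch. The names `RedditCredentials`, `load_credentials`, and `make_reddit_client` are hypothetical and do not come from this repository. The sketch assumes the file holds the client ID, client secret, and user agent on its first three lines, and that the values are passed to `praw.Reddit` for read-only access. How scrape_reddit.py actually parses the file may differ.

```python
from pathlib import Path
from typing import NamedTuple


class RedditCredentials(NamedTuple):
    """Hypothetical container for the three values stored in info.txt."""
    client_id: str
    client_secret: str
    user_agent: str


def load_credentials(path="info.txt"):
    """Read the client ID, client secret, and user agent from the first three lines."""
    raw_lines = Path(path).read_text(encoding="utf-8").splitlines()
    values = [line.strip() for line in raw_lines[:3]]
    # Fail early with a clear message rather than sending empty credentials to Reddit.
    if len(values) < 3 or not all(values):
        raise ValueError(
            f"{path} must contain three non-empty lines: "
            "client ID, client secret, user agent"
        )
    return RedditCredentials(*values)


def make_reddit_client(creds):
    """Build a read-only PRAW client (assumes praw is installed, see requirements.txt)."""
    import praw  # imported lazily so load_credentials works without praw installed

    return praw.Reddit(
        client_id=creds.client_id,
        client_secret=creds.client_secret,
        user_agent=creds.user_agent,
    )
```

With info.txt in place, `make_reddit_client(load_credentials())` would return a client suitable for reading public posts. Because info.txt holds a secret, it is best kept out of version control.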