https://github.com/maxisoft/yahoo-finance-data-downloader
Automate downloading historical and recent stock data from Yahoo Finance.
- Host: GitHub
- URL: https://github.com/maxisoft/yahoo-finance-data-downloader
- Owner: maxisoft
- License: agpl-3.0
- Created: 2022-11-11T17:42:37.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2025-06-22T07:14:38.000Z (12 months ago)
- Last Synced: 2025-06-22T08:20:53.201Z (12 months ago)
- Topics: data, stock-market, yahoo-finance
- Language: Python
- Homepage:
- Size: 158 KB
- Stars: 0
- Watchers: 2
- Forks: 2
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
## Stock Data Downloader
This script automates downloading historical and recent OHLCV (Open, High, Low, Close, Volume)
data from Yahoo Finance for the stock symbols you choose.
Effortlessly build comprehensive datasets for analysis and upload them to platforms like Kaggle!
**Key Features:**
- **Efficient Downloading:** Downloads minute-level data using `yfinance`. ⚡
- **Historical & Recent Data:** Combines historical and recent data for a complete picture.
- **Multithreading:** Leverages multithreading for faster downloads on multi-core systems.
- **Error Handling:** Logs exceptions for troubleshooting.
- **Customizable:** Set the desired stock symbols in a separate `symbols.txt` file.
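As an illustration of the multithreading and error-handling features listed above, here is a minimal sketch, not the repository's actual code. The names `fetch_minute_bars` and `download_all` are hypothetical, and the `period="5d"` window is an assumption chosen because Yahoo only serves 1-minute bars for a short recent period.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

logger = logging.getLogger("downloader")


def fetch_minute_bars(symbol):
    """Fetch recent 1-minute OHLCV bars for one symbol (needs network access)."""
    import yfinance as yf  # imported lazily so the rest of the sketch runs without it

    # Assumption: a 5-day window; 1-minute bars are only available for recent days.
    return yf.Ticker(symbol).history(period="5d", interval="1m")


def download_all(symbols, fetch=fetch_minute_bars, max_workers=4):
    """Run `fetch` for every symbol in a thread pool; log failures and keep going."""
    results, failed = {}, []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its symbol so errors can be attributed.
        futures = {pool.submit(fetch, s): s for s in symbols}
        for future in as_completed(futures):
            symbol = futures[future]
            try:
                results[symbol] = future.result()
            except Exception:
                # Log the full traceback, then continue with the other symbols.
                logger.exception("download failed for %s", symbol)
                failed.append(symbol)
    return results, sorted(failed)
```

Passing the fetch function as a parameter keeps the threading logic testable without network access. Threads help here mainly because each download spends most of its time waiting on the network.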
**Getting Started:**
1. **Prerequisites:**
   - Python 3.x
   - Required libraries: `pandas`, `yfinance`, `pathvalidate`; `numba` is optional and used for optimization
   - Install dependencies with `pip install -r requirements.txt`
2. **Create a `symbols.txt` file:**
   - List the stock symbols you want to download data for, one per line (e.g., `AAPL`, `TSLA`, `GOOG`).
3. **Schedule Automatic Download:** (optional)
   This script is designed to run automatically on a schedule using GitHub Actions. To configure this:
   - Go to your GitHub repository settings -> Actions.
   - Create secrets in your repository settings to store sensitive information such as your Kaggle credentials (`KAGGLE_USERNAME` and `KAGGLE_KEY`).
   - Enable the `python-app.yml` workflow on the Actions page of your fork.
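For reference, a `symbols.txt` file as described in step 2 could be read along these lines. This is a sketch only: the helper name `read_symbols` is hypothetical, and skipping blank lines and duplicate symbols is an assumption rather than documented behavior of the script.

```python
from pathlib import Path


def read_symbols(path="symbols.txt"):
    """Return the ticker symbols listed one per line in `path`."""
    symbols = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        symbol = line.strip()
        # Assumption of this sketch: ignore blank lines and repeated symbols.
        if symbol and symbol not in symbols:
            symbols.append(symbol)
    return symbols
```

A file containing `AAPL`, `TSLA`, and `GOOG` on separate lines would yield those three symbols in file order.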
**Technical Details (for the curious):**
- The script utilizes [`yfinance`](https://github.com/ranaroussi/yfinance) for data retrieval.
- `pandas` is used for efficient data manipulation and storage as CSV files.
- `pathvalidate` ensures valid filenames for downloaded data.
- The `numba` library (optional) can be used for performance optimization (requires installation).
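To show how `pandas` can combine historical and recent data, here is a hedged sketch. The function `merge_bars` is hypothetical, and letting the more recent download win when timestamps overlap is an assumption, not necessarily what the script does.

```python
import pandas as pd


def merge_bars(historical, recent):
    """Combine two OHLCV frames indexed by timestamp into one sorted frame."""
    combined = pd.concat([historical, recent])
    # Where both frames cover the same timestamp, keep the row from `recent`.
    combined = combined[~combined.index.duplicated(keep="last")]
    return combined.sort_index()
```

The merged frame can then be written with `DataFrame.to_csv`. A filename derived from the symbol could first be passed through `pathvalidate.sanitize_filename`, since some symbols contain characters that are awkward in file names.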
**Contributing & Further Development:**
- We welcome contributions and suggestions! Feel free to open pull requests.
**Let's automate your stock data collection!**
## Setting Up GitHub Secrets for Kaggle Upload:
**Here's a table outlining the secrets required for uploading your downloaded data to Kaggle:**
| Secret Name | Description |
|-----------------|----------------------------------------------------|
| KAGGLE_USERNAME | Your Kaggle username |
| KAGGLE_KEY | Your Kaggle API key (create one from your profile) |
| KAGGLE_DATASET | The remote dataset name |
**Note:** These secrets should not be directly added to your code. Instead, create them securely within your GitHub
repository settings -> Secrets.
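Inside the workflow, a script might check for these secrets before attempting an upload. The sketch below assumes the workflow exposes each secret as an environment variable of the same name; the helper `load_kaggle_settings` is hypothetical and not part of this repository.

```python
import os

# Secret names taken from the table above.
REQUIRED_SECRETS = ("KAGGLE_USERNAME", "KAGGLE_KEY", "KAGGLE_DATASET")


def load_kaggle_settings(env=os.environ):
    """Return the Kaggle settings from `env`, failing early if any are missing."""
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        # Report only the names, never the secret values themselves.
        raise RuntimeError("missing secrets: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SECRETS}
```

Failing early with the list of missing names makes a misconfigured fork easier to diagnose than an authentication error later in the run.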