Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/debhere/predict-crypto-price
A Machine Learning based application to predict the Bitcoin price
https://github.com/debhere/predict-crypto-price
bitcoin coinbase crypto predictor
Last synced: 20 days ago
JSON representation
A Machine Learning based application to predict the Bitcoin price
- Host: GitHub
- URL: https://github.com/debhere/predict-crypto-price
- Owner: debhere
- License: gpl-3.0
- Created: 2024-09-23T16:14:41.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2024-10-21T16:09:58.000Z (28 days ago)
- Last Synced: 2024-10-30T10:09:20.122Z (20 days ago)
- Topics: bitcoin, coinbase, crypto, predictor
- Language: Python
- Homepage: https://predict-crypto-price.onrender.com/
- Size: 3.08 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Predict Crypto-Currency Price
This project is a Machine Learning implementation to predict the time weighted average price (TWAP) for a crypto currency asset (currently only Bitcoin). The project is deployed on cloud and can be accessed with through https://predict-crypto-price.onrender.com/
## Authors
- [@debhere](https://www.github.com/debhere)
## History
- ver_0.0.1: deployed on Oct 7, 2024. Base model created with facebook prophet trained with 9 months (6 Oct, 2024) worth of OHLC data from 1 Jan, 2024 to 6 Oct, 2024 to predict ONLY Bitcoin price.
## Tech Stack
**Server:** Streamlit
**Languages:** Python, T-SQL, Bash Script
**Machine Learning:** Meta Prophet
**Database:** Sql-Server 19
## Dataset
Data collection is a key step in any machine learning initiative. The dataset contains 9 month (till Oct 6, 2024) worth of bitcoin ohlc (open, high, low, close) prices over a granularity of 5 mins.
In order to do that, the coinbase api https://docs.cdp.coinbase.com/exchange/reference/exchangerestapi_getproductcandles is invoked and the data is stored in a local sql-server database.
It is a general ETL (Extract, Transform, Load) methodology where the data is extracted from an external api, transformed as applicable and loaded into the database.
## Database
As mentioned earlier, the collected data is stored locally into a sql-server database. Refer the sql script [here](/scripts/database/crypto.sql) to create the database schema.
Presently, only one table is created for Bitcoin.
## Requirements
- **python 3.11+**
- **streamlit**
- **sckit-learn**
- **numpy**
- **pandas**
- **pyyaml**
- **joblib**
- **sqlalchemy**
- **fire**
- **prophet**
- **requests**## Run Locally
Clone the project
```bash
git clone https://github.com/debhere/Predict-Crypto-Price.git
```Go to the project directory
```bash
cd Predict-Crypto-Price
```Create virtual environment
```bash
conda create -p venv python=3.11 -y
```Activate virtual environment
```bash
conda activate venv\
```Install dependencies
```bash
pip install -e .
```create the database schema in sql server. Refer the script at /scripts/database/crypto.sql
collect the data and insert into the database
```bash
./scripts/data_prep.sh
```Start the server
```bash
streamlit run app.py
```## Description
There are 3 main sections (if you will) for this project:-
- Extract-Transform-Load
- Model Implementation & Pipeline
- Streamlit server for front-endThe presentation layer (front-end) is very light-weight at this moment with the solitary aim to grab user input and predict the Bitcoin TWAP price.
### Extract-Transform-Load (ETL)
*scripts* contains the source code of the ETL pipeline. Ensure to update the database server name in the db config file [here](/scripts/_config/database.yaml). Upon running the shell script data_prep.sh, the ohlc data will be inserted into the database. Consider this piece as the pre-requisite as until the data collection is done the learning process cannot be started.
Currently, the data is corrected for the period of 6 months (01-Jan-2024 to 30-Sep-2024). Change the start and end date within the script for any preferred date range. Ensure to delete any existing data from the table before running the shell script or else the data will be appended.
### Model Implementation
*predictcrypto* contains the primary source code of the entire model implmentation mechanism. The primary components is "train" while "cmg" is the component manager.
*pipeline* contains the data-pipeline and tranformation-pipeline
*utils* directory comprises of utility functions and methods that is being referred throughout the project.
trainer.sh script is run to train the model and generate the serialized model in a json file which is preferred format for 'prophet'.
### Front-End
Streamlit framework is used for a fast deployment of the front-end web interface. User can provide the inputs, the model is invoked to return the prediction on the web itself.
### Upcoming Changes
1. Support for multiple crypto assets.
2. Tune the model.
3. Schedule the data collection.
4. Verify the prediction w.r.t current price.
5. Integrate MlOps for efficient model registry.
6. If possible migrate the database from local to cloud.## Support
For any query or discussion, email [email protected] or send me a message on
[![linkedin](https://img.shields.io/badge/linkedin-0A66C2?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/debmalyamondal)