https://github.com/cdeck3r/yasmape
Practical exercise in ML system engineering with an application to stock market prediction
- Host: GitHub
- URL: https://github.com/cdeck3r/yasmape
- Owner: cdeck3r
- License: mit
- Created: 2022-10-02T16:57:01.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-04-06T20:26:56.000Z (about 3 years ago)
- Last Synced: 2025-01-05T10:27:44.621Z (over 1 year ago)
- Language: Jupyter Notebook
- Size: 3.86 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# YASMaPE
YASMaPE stands for _Yet another stock market prediction experiment_.
Technically, this project is a practical exercise in machine learning (ML) system engineering. It is about playing around with powerful tools and experimenting with ML system designs.
The stock market supplies time series data as input to this project. This is highly attractive: fresh as well as historical time series data is available and standardized, but the data-generating process is hidden and exhibits surprisingly complex behavior.
**Disclaimer:** This project is highly experimental. Do not use for investment decisions.
## Project Information
In this project, we want to evaluate chart indicators for stock recommendations. Chart indicators are empirical hearsay and popular folklore about the stock market. There is neither a scientific foundation nor a derivation from first principles.
As a consequence, our analysis is purely based on statistical correlations of stock prices with these indicators. We aim to exploit the situation that _sometimes_ these correlations turn out to be useful. So, for a stock recommendation we formulate the following prediction problem.
> **Prediction Problem:** When do today's indicators correlate with future returns?
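To make the prediction problem concrete, the sketch below measures how strongly one indicator value observed today correlates with the return realized a fixed number of days later. The indicator (price relative to its trailing simple moving average), all function names, and the toy price series are illustrative assumptions and not the project's actual feature set.

```python
from math import sqrt


def forward_returns(prices, horizon):
    """Return r[t] = prices[t + horizon] / prices[t] - 1 wherever it is defined."""
    return [prices[t + horizon] / prices[t] - 1.0 for t in range(len(prices) - horizon)]


def sma_ratio(prices, window):
    """Hypothetical indicator: price divided by its trailing simple moving average.

    Entries are None until a full window of prices is available.
    """
    out = []
    for t in range(len(prices)):
        if t + 1 < window:
            out.append(None)
        else:
            sma = sum(prices[t + 1 - window : t + 1]) / window
            out.append(prices[t] / sma)
    return out


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def indicator_return_correlation(prices, window, horizon):
    """Correlate today's indicator with the return over the next `horizon` days."""
    indicator = sma_ratio(prices, window)
    returns = forward_returns(prices, horizon)
    # zip stops at the shorter list, so only days with a known future return are used
    pairs = [(i, r) for i, r in zip(indicator, returns) if i is not None]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


# Toy data, made up for illustration only
prices = [100, 102, 101, 105, 107, 106, 110, 108, 111, 115, 113, 117]
rho = indicator_return_correlation(prices, window=3, horizon=2)
print(f"correlation: {rho:.3f}")
```

A correlation computed on such a short series says nothing about real markets; the point is only the pairing of today's indicator with a return that lies in the future.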
Further, we aim at traders with a low budget. This constrains our recommendations as follows:
1. single stock investments over diversified portfolios
1. long-term return over short-term decisions
1. buy/sell delays over intraday trading
As a result, we formulate the project goal.
> **Goal:** Enable a solution approach for the prediction problem in an application constrained by budget, diversity and delays.
## Technical Approach
The project's experiments compile features from 70+ chart indicators for a stock and use Uber's deep learning framework [Ludwig](https://ludwig-ai.github.io/ludwig-docs/) to make predictions. In the refined prediction problem, we formulate regression and classification problems involving
* returns across various horizons
* return larger than a given threshold
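The two kinds of targets listed above can be sketched as follows. This is a minimal illustration under assumptions: the function name `build_targets`, the column naming scheme, the horizons, and the threshold are hypothetical and not taken from the project code.

```python
def build_targets(prices, horizons, threshold):
    """Build one row of targets per day.

    For each horizon h the row holds
      ret_<h>d : return from day t to day t + h (regression target)
      up_<h>d  : True if that return exceeds `threshold` (classification target)
    Days without enough future prices get None for that horizon.
    """
    rows = []
    for t in range(len(prices)):
        row = {"day": t}
        for h in horizons:
            if t + h < len(prices):
                ret = prices[t + h] / prices[t] - 1.0
                row[f"ret_{h}d"] = ret
                row[f"up_{h}d"] = ret > threshold
            else:
                row[f"ret_{h}d"] = None
                row[f"up_{h}d"] = None
        rows.append(row)
    return rows


# Toy closing prices, made up for illustration only
prices = [100.0, 104.0, 102.0, 110.0, 108.0]
targets = build_targets(prices, horizons=(1, 3), threshold=0.03)
for row in targets:
    print(row)
```

Columns like these could then be declared as numeric and binary output features for a model, with the indicator columns as inputs.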
YASMaPE is a [multi-container Docker application](https://docs.docker.com/compose/). Each container runs idempotent workflows implemented using [snakemake](https://snakemake.readthedocs.io/en/stable/). Distributed coordination across containers is supported by [celery](https://docs.celeryq.dev/en/stable/index.html).
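As a rough illustration of what "idempotent" means for a workflow step, the sketch below runs a transformation only when its output is missing or older than its input, so repeating the step changes nothing. It is a simplified stand-in written for this explanation: snakemake applies its own, richer rules to decide what to rerun, and the helper `run_step` and the file names are hypothetical.

```python
import os
import tempfile


def run_step(input_path, output_path, transform):
    """Run `transform` only if the output is missing or older than the input.

    Returns True if the step ran and False if it was skipped.
    """
    if os.path.exists(output_path) and (
        os.path.getmtime(output_path) >= os.path.getmtime(input_path)
    ):
        return False  # output is up to date, nothing to do
    with open(input_path) as src:
        result = transform(src.read())
    with open(output_path, "w") as dst:
        dst.write(result)
    return True


with tempfile.TemporaryDirectory() as workdir:
    raw = os.path.join(workdir, "prices.csv")
    out = os.path.join(workdir, "prices_upper.csv")
    with open(raw, "w") as f:
        f.write("date,close\n")
    first = run_step(raw, out, str.upper)   # output missing, so the step runs
    second = run_step(raw, out, str.upper)  # output up to date, so the step is skipped
    print(first, second)
```

Because a repeated call is a no-op, a workflow built from such steps can be restarted after a failure without redoing finished work.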
## Documentation
Please see the [`docs`](docs) folder for extensive documentation.
## Software System Setup
A `docker-compose` file provides a setup for a multi-container application composed of containers as microservices, which are loosely coupled by a distributed task queue. The ML pipelines are controlled via the [director](https://ovh.github.io/celery-director/) on http://localhost:8000.
The diagram below depicts the containers, their dependencies, and their ports (numbers in circles). It was generated following this [medium.com article](https://medium.com/@krishnakummar/creating-block-diagrams-from-your-docker-compose-yml-da9d5a2450b4).

**Setup:** Start in the project's root directory and create a `.env` file with the content shown below.
```
# .env file
# In the container, this is the directory where the code is found
# Example:
APP_ROOT=/YASMaPE
# the HOST directory containing directories to be mounted into containers
# Example:
VOL_DIR=/dev/YASMaPE
```
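Before building the images, it can help to confirm that both variables are actually set. The following is a minimal sketch of such a check; the helper names are hypothetical, and the parser handles only plain `KEY=VALUE` lines and `#` comments, which is a simplification of what Docker Compose accepts.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and `#` comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values, required=("APP_ROOT", "VOL_DIR")):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Same content as the example .env file above
example = """\
# .env file
APP_ROOT=/YASMaPE
VOL_DIR=/dev/YASMaPE
"""
env = parse_env(example)
print(env, missing_keys(env))
```

To check a real file, read it with `open(".env").read()` and pass the text to `parse_env`.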
**Create** the Docker images. Please see the [Dockerfiles directory](Dockerfiles/) for details.
```bash
docker-compose build yasmape
docker-compose build ludwig
docker-compose build jupyter
docker-compose build mlflow
docker-compose build director
```
**Spin up** the containers and, optionally, get a shell in a container.
```bash
docker-compose up -d yasmape
docker-compose up -d ludwig
docker-compose up -d jupyter
docker-compose up -d director
# optionally, get a shell on the yasmape container
docker exec -it yasmape /bin/bash
```
Finally, point your browser to http://localhost:8000 to open the director interface and run the ML pipelines for distributed processing across the containers.
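If the page does not load, a quick scripted check can tell whether anything answers on that port at all. The sketch below uses only the Python standard library; the helper name `is_reachable` is hypothetical, and the check only confirms that some HTTP server responds, not that it is the director.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=3.0):
    """Return True if an HTTP request to `url` gets any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure, ...


if __name__ == "__main__":
    url = "http://localhost:8000"
    print(f"{url} reachable: {is_reachable(url)}")
```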
## License
License information is provided in the [LICENSE](LICENSE) file.