https://github.com/shortthirdman/datascience-jupyter-notebooks
Data Science Jupyter Notebooks
data-science data-visualization jupyterlab-notebooks jyputer-notebook notebook pycryptobot
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/shortthirdman/datascience-jupyter-notebooks
- Owner: shortthirdman
- License: mit
- Created: 2024-01-27T08:02:28.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-11-04T13:57:18.000Z (4 months ago)
- Last Synced: 2024-11-19T05:05:09.894Z (3 months ago)
- Topics: data-science, data-visualization, jupyterlab-notebooks, jyputer-notebook, notebook, pycryptobot
- Language: Jupyter Notebook
- Homepage:
- Size: 5.98 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Big Data MLOps Platform
Data Science and Machine Learning Jupyter Notebooks
[Made with Jupyter](https://jupyter.org/try) [Open In Colab](https://colab.research.google.com/github/Naereen/badges) [Binder](https://mybinder.org/v2/gh/shortthirdman/DataScience-Jupyter-Notebooks/main)
[Docker Image CI](https://github.com/shortthirdman/DataScience-Jupyter-Notebooks/actions/workflows/docker.yaml)
## Tech Stack
shortthirdman/DataScience-Jupyter-Notebooks is built on the following main stack:
- [Python](https://www.python.org) – Languages
- [Docker](https://www.docker.com/) – Virtual Machine Platforms & Containers
- [GitHub Actions](https://github.com/features/actions) – Continuous Integration
- [Jupyter](http://jupyter.org) – Data Science Notebooks
Full tech stack [here](/techstack.md)
## Setup References
- [Ready-to-run Docker images containing Jupyter applications - jupyter/docker-stacks](https://github.com/jupyter/docker-stacks)
- [Kaggle/docker-python - Kaggle Python Docker image](https://github.com/Kaggle/docker-python)
## Dataset References
- [Apple Stock Price](https://www.kaggle.com/datasets/rafsunahmad/apple-stock-price)
- [Spotify Dataset 2023](https://www.kaggle.com/datasets/tonygordonjr/spotify-dataset-2023)
- [Online Retail Transactions](https://www.kaggle.com/datasets/thedevastator/online-retail-transaction-data)
- [Stock Market](https://www.kaggle.com/datasets/jacksoncrow/stock-market-dataset)
- [American Airlines Group Stock](https://www.kaggle.com/datasets/varpit94/american-airlines-group-stock-data)
- [Amazon (AMZN) Historical Stock Price](https://www.kaggle.com/datasets/specter7/amazon-amzn-historical-stock-price-data)
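
The datasets above are hosted on Kaggle. One possible way to fetch them from the command line, assuming the official `kaggle` CLI is installed and an API token is configured (neither is mentioned in this README), is to pass the `owner/dataset` part of each URL to `kaggle datasets download`. The helper name `kaggle_slug` and the `data` target directory below are hypothetical and only illustrate the idea.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of this repository): turn a Kaggle dataset
# page URL into the "owner/dataset" slug that the Kaggle CLI expects.
kaggle_slug() {
    slug="${1#https://www.kaggle.com/datasets/}"  # strip the URL prefix
    slug="${slug%/}"                              # strip a trailing slash, if any
    printf '%s\n' "$slug"
}

# Example usage (assumes the `kaggle` CLI and an API token are set up;
# "data" is an arbitrary target directory):
#   kaggle datasets download -d "$(kaggle_slug https://www.kaggle.com/datasets/rafsunahmad/apple-stock-price)" -p data --unzip
```

The download command is left as a comment because it needs network access and Kaggle credentials.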
## Docker commands
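```shell
# Remove stopped containers and all unused images, networks, volumes, and build cache without prompting
docker system prune --all --volumes --force
```

```shell
# Build the image from the Dockerfile without using the layer cache
docker build --no-cache -f Dockerfile --progress=auto --compress --rm -t shortthirdman-org/bigdata-mlops-platform:latest .
```

```shell
# Build the same image with Buildx
docker buildx build --progress=auto --compress --rm -t shortthirdman-org/bigdata-mlops-platform:latest .
```

```shell
# Run the container in the background, named "mlops", and publish port 8888
docker run -d --name mlops -p 8888:8888 --restart unless-stopped shortthirdman-org/bigdata-mlops-platform:latest
```

## Local Setup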
- Create a Python virtual environment and activate it
```shell
python -m venv dev
```
```shell
.\dev\Scripts\activate
```
- Install the packages and dependencies listed in the requirements file
```shell
pip install -r requirements.txt --no-cache-dir --disable-pip-version-check
```
- Start your development `Jupyter Notebook` or `Jupyter Lab` server
```shell
jupyter lab --notebook-dir=.\notebooks --no-browser
```
```shell
jupyter notebook
```
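
The activation command in the steps above uses the Windows layout (`.\dev\Scripts\activate`). On Linux or macOS, `python -m venv` places the script at `dev/bin/activate` instead. As a minimal sketch for POSIX shells (including Git Bash on Windows), the hypothetical helper `venv_activate_path` below, which is not part of this repository, prints whichever activation script exists in a given environment directory:

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of this repository): print the path of the
# activation script inside a virtual environment directory ($1).
# Returns a non-zero status when no activation script is found.
venv_activate_path() {
    venv_dir="$1"
    if [ -f "$venv_dir/Scripts/activate" ]; then
        # Layout created by Windows builds of Python
        printf '%s\n' "$venv_dir/Scripts/activate"
    elif [ -f "$venv_dir/bin/activate" ]; then
        # Layout created on Linux and macOS
        printf '%s\n' "$venv_dir/bin/activate"
    else
        return 1
    fi
}

# Example usage, after `python -m venv dev`:
#   . "$(venv_activate_path dev)"
```

Checking which file exists avoids guessing the platform from the shell in use.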
```
jupyter_nbextensions_configurator
```

## References
- [TimeGPT: The First Foundation Model for Time Series Forecasting](https://towardsdatascience.com/timegpt-the-first-foundation-model-for-time-series-forecasting-bf0a75e63b3a)
- [Staggering Returns with PyCryptoBot](https://trading-data-analysis.pro/staggering-returns-with-pycryptobot-39dd2ef5ead5)
- [Trading Data Analysis](https://trading-data-analysis.pro/)
- [Phenomenal Returns with PyCryptoBot](https://trading-data-analysis.pro/phenomenal-returns-with-pycryptobot-16e62f5f684)
- [Leveraging PyCryptoBot for Optimal Cryptocurrency Trading](https://coinsbench.com/leveraging-pycryptobot-for-optimal-cryptocurrency-trading-5b7082354cd3)
- [Forecasting Stock Using Deep Learning Along With Indicators | Medium](https://medium.com/@redeaddiscolll/forecasting-stock-using-deep-learning-along-with-indicators-c1523101c08d)
- [Forecasting Stock Using Deep Learning Along With Indicators | OnePageCode@SubStack](https://onepagecode.substack.com/p/forecasting-stock-using-deep-learning-220)
- [Advanced Stock Pattern Prediction using LSTM with the Attention Mechanism in TensorFlow: A step by step Guide with Apple Inc. (AAPL) Data](https://drlee.io/advanced-stock-pattern-prediction-using-lstm-with-the-attention-mechanism-in-tensorflow-a-step-by-143a2e8b0e95)
- [Spark and Docker: Your Spark development cycle just got 10x faster!](https://towardsdatascience.com/spark-and-docker-your-spark-development-cycle-just-got-10x-faster-f41ed50c67fd)
- [datamechanics/spark on Docker Hub](https://hub.docker.com/r/datamechanics/spark)
- [Setting up a Spark standalone cluster on Docker in layman terms](https://medium.com/@MarinAgli1/setting-up-a-spark-standalone-cluster-on-docker-in-layman-terms-8cbdc9fdd14b)
- [Apache Spark Standalone Cluster on Docker](https://github.com/cluster-apps-on-docker/spark-standalone-cluster-on-docker)
- [Visualizing Trading Signals in Python - Plot buy and sell trading signals in Python's graph](https://eodhd.medium.com/visualizing-trading-signals-in-python-3cab01cc5847)
- [Apache Hadoop and Apache Spark for Big Data Analysis](https://towardsdatascience.com/apache-hadoop-and-apache-spark-for-big-data-analysis-daaf659fd0ee)
- [Agent-Based Stock Trading: Design and Implementation](https://medium.com/@redeaddiscolll/agent-based-stock-trading-design-and-implementation-c2141fc8f984)
- [Mastering K-Means Clustering](https://towardsdatascience.com/mastering-k-means-clustering-065bc42637e4)
- [Additive Decision Trees - An interpretable classification and regression model](https://towardsdatascience.com/additive-decision-trees-85f2feda2223)
- [Interpretable kNN (ikNN) - An interpretable classifier](https://towardsdatascience.com/interpretable-knn-iknn-33d38402b8fc)
- [Interpretable Outlier Detection: Frequent Patterns Outlier Factor (FPOF)](https://towardsdatascience.com/interpretable-outlier-detection-frequent-patterns-outlier-factor-fpof-0d9cbf51b17a)
- [Telco Customer Churn - Kaggle](https://www.kaggle.com/datasets/blastchar/telco-customer-churn)
- [End-to-End Machine Learning Project: TelCo Churn Prediction](https://medium.com/@ramazanolmeez/end-to-end-machine-learning-project-churn-prediction-e9c4d0322ac9)
- [Predicting Stock Prices with Monte Carlo Simulations](https://medium.com/@antoine.boucher012/predicting-stock-prices-with-monte-carlo-simulations-0884ef32c35b)
- [Artificial Intelligence (AI) models for Trading - Exploring Random Forests from Machine Learning](https://medium.com/coinmonks/artificial-intelligence-ai-models-for-trading-0bfd308d012d)
- [XGBoost for Stock Price Forecasting](https://medium.com/@bugragultekin/xgboost-for-stock-price-forecasting-64f89719a8e4)
- [Model Interpretability Using Credit Card Fraud Data](https://towardsdatascience.com/model-interpretability-using-credit-card-fraud-data-f219ff7ec89d)
- [How Many Pokémon Fit?](https://towardsdatascience.com/how-many-pok%C3%A9mon-fit-84f812c0387e)
- [Gold Price Prediction Using Machine Learning](https://medium.com/@iabbasali/gold-price-prediction-using-machine-learning-24e23841de52)
- [Data Leakage in Preprocessing, Explained: A Visual Guide with Code Examples](https://towardsdatascience.com/data-leakage-in-preprocessing-explained-a-visual-guide-with-code-examples-33cbf07507b7)