{"id":20976457,"url":"https://github.com/shortthirdman/datascience-jupyter-notebooks","last_synced_at":"2025-08-25T03:37:30.903Z","repository":{"id":219408694,"uuid":"748977813","full_name":"shortthirdman/DataScience-Jupyter-Notebooks","owner":"shortthirdman","description":"Data Science Jupyter Notebooks","archived":false,"fork":false,"pushed_at":"2024-11-04T13:57:18.000Z","size":6272,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-20T05:51:27.499Z","etag":null,"topics":["data-science","data-visualization","jupyterlab-notebooks","jyputer-notebook","notebook","pycryptobot"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shortthirdman.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-27T08:02:28.000Z","updated_at":"2024-11-04T13:57:22.000Z","dependencies_parsed_at":"2024-03-12T16:30:21.837Z","dependency_job_id":"1fdc6b74-60d8-4848-8a5b-646dcba83da0","html_url":"https://github.com/shortthirdman/DataScience-Jupyter-Notebooks","commit_stats":null,"previous_names":["shortthirdman/datascience-jupyter-notebooks"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shortthirdman%2FDataScience-Jupyter-Notebooks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shortthirdman%2FDataScience-Jupyter-Notebooks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositori
es/shortthirdman%2FDataScience-Jupyter-Notebooks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shortthirdman%2FDataScience-Jupyter-Notebooks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shortthirdman","download_url":"https://codeload.github.com/shortthirdman/DataScience-Jupyter-Notebooks/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243374755,"owners_count":20280735,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","data-visualization","jupyterlab-notebooks","jyputer-notebook","notebook","pycryptobot"],"created_at":"2024-11-19T04:54:02.056Z","updated_at":"2025-03-13T09:20:02.404Z","avatar_url":"https://github.com/shortthirdman.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Big Data MLOps Platform\n\nData Science and Machine Learning Jupyter Notebooks\n\n[![Made with Jupyter](https://img.shields.io/badge/Made%20with-Jupyter-orange?style=for-the-badge\u0026logo=Jupyter)](https://jupyter.org/try)\t[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Naereen/badges)\t[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/shortthirdman/DataScience-Jupyter-Notebooks/main)\n\n![GitHub commit activity](https://img.shields.io/github/commit-activity/m/shortthirdman/DataScience-Jupyter-Notebooks)\t![GitHub commit 
activity](https://img.shields.io/github/commit-activity/t/shortthirdman/DataScience-Jupyter-Notebooks)\t![GitHub Created At](https://img.shields.io/github/created-at/shortthirdman/DataScience-Jupyter-Notebooks)\t![GitHub last commit](https://img.shields.io/github/last-commit/shortthirdman/DataScience-Jupyter-Notebooks)\t![GitHub repo size](https://img.shields.io/github/repo-size/shortthirdman/DataScience-Jupyter-Notebooks)\t[![Docker Image CI](https://github.com/shortthirdman/DataScience-Jupyter-Notebooks/actions/workflows/docker.yaml/badge.svg?branch=main\u0026event=workflow_run)](https://github.com/shortthirdman/DataScience-Jupyter-Notebooks/actions/workflows/docker.yaml)\n\n\n## Tech Stack\n\nshortthirdman/DataScience-Jupyter-Notebooks is built on the following main stack:\n\n- \u003cimg width='25' height='25' src='https://img.stackshare.io/service/993/pUBY5pVj.png' alt='Python'/\u003e [Python](https://www.python.org) – Languages\n- \u003cimg width='25' height='25' src='https://img.stackshare.io/service/586/n4u37v9t_400x400.png' alt='Docker'/\u003e [Docker](https://www.docker.com/) – Virtual Machine Platforms \u0026 Containers\n- \u003cimg width='25' height='25' src='https://img.stackshare.io/service/11563/actions.png' alt='GitHub Actions'/\u003e [GitHub Actions](https://github.com/features/actions) – Continuous Integration\n- \u003cimg width='25' height='25' src='https://img.stackshare.io/service/4190/fGBUdNf__400x400.jpg' alt='Jupyter'/\u003e [Jupyter](http://jupyter.org) – Data Science Notebooks\n\nFull tech stack [here](/techstack.md)\n\n\n\n## Setup References\n\n- [Ready-to-run Docker images containing Jupyter applications - jupyter/docker-stacks](https://github.com/jupyter/docker-stacks)\n\n- [Kaggle/docker-python - Kaggle Python Docker image](https://github.com/Kaggle/docker-python)\n\n\n\n## Dataset References\n\n- [Apple Stock Price](https://www.kaggle.com/datasets/rafsunahmad/apple-stock-price)\n\n- [Spotify Dataset 
2023](https://www.kaggle.com/datasets/tonygordonjr/spotify-dataset-2023)\n\n- [Online Retail Transactions](https://www.kaggle.com/datasets/thedevastator/online-retail-transaction-data)\n\n- [Stock Market](https://www.kaggle.com/datasets/jacksoncrow/stock-market-dataset)\n\n- [American Airlines Group Stock](https://www.kaggle.com/datasets/varpit94/american-airlines-group-stock-data)\n\n- [Amazon (AMZN) Historical Stock Price](https://www.kaggle.com/datasets/specter7/amazon-amzn-historical-stock-price-data)\n\n## Docker commands\n\n```shell\ndocker system prune --all --volumes --force\n```\n\n```shell\ndocker build --no-cache -f Dockerfile --progress=auto --compress --rm -t shortthirdman-org/bigdata-mlops-platform:latest .\n```\n\n```shell\ndocker buildx build --progress=auto --compress --rm -t shortthirdman-org/bigdata-mlops-platform:latest .\n```\n\n```shell\ndocker run -d --name mlops -p 8888:8888 --restart unless-stopped shortthirdman-org/bigdata-mlops-platform:latest\n```\n\n\n## Local Setup\n\n  - Create a Python virtual environment and activate it\n\t\n\t```shell\n\tpython -m venv dev\n\t```\n\t\n\t```shell\n\t.\\dev\\Scripts\\activate\n\t```\n\n  - Install the packages and dependencies listed in the requirements file\n\t\n\t```shell\n\tpip install -r requirements.txt --no-cache-dir --disable-pip-version-check\n\t```\n\n  - Start your development `Jupyter Notebook` or `Jupyter Lab` server\n\t\n\t```shell\n\tjupyter lab --notebook-dir=.\\notebooks --no-browser\n\t```\n\t\n\t```shell\n\tjupyter notebook\n\t```\n\t\n\t```shell\n\tjupyter_nbextensions_configurator\n\t```\n\n## References\n\n- [TimeGPT: The First Foundation Model for Time Series Forecasting](https://towardsdatascience.com/timegpt-the-first-foundation-model-for-time-series-forecasting-bf0a75e63b3a)\n\n- [Staggering Returns with PyCryptoBot](https://trading-data-analysis.pro/staggering-returns-with-pycryptobot-39dd2ef5ead5)\n\n- [Trading Data Analysis](https://trading-data-analysis.pro/)\n\n- [Phenomenal 
Returns with PyCryptoBot](https://trading-data-analysis.pro/phenomenal-returns-with-pycryptobot-16e62f5f684)\n\n- [Leveraging PyCryptoBot for Optimal Cryptocurrency Trading](https://coinsbench.com/leveraging-pycryptobot-for-optimal-cryptocurrency-trading-5b7082354cd3)\n\n- [Forecasting Stock Using Deep Learning Along With Indicators | Medium](https://medium.com/@redeaddiscolll/forecasting-stock-using-deep-learning-along-with-indicators-c1523101c08d)\n\n- [Forecasting Stock Using Deep Learning Along With Indicators | OnePageCode@SubStack](https://onepagecode.substack.com/p/forecasting-stock-using-deep-learning-220)\n\n- [Advanced Stock Pattern Prediction using LSTM with the Attention Mechanism in TensorFlow: A step by step Guide with Apple Inc. (AAPL) Data](https://drlee.io/advanced-stock-pattern-prediction-using-lstm-with-the-attention-mechanism-in-tensorflow-a-step-by-143a2e8b0e95)\n\n- [Spark and Docker: Your Spark development cycle just got 10x faster!](https://towardsdatascience.com/spark-and-docker-your-spark-development-cycle-just-got-10x-faster-f41ed50c67fd)\n\n- [datamechanics/spark on Docker Hub](https://hub.docker.com/r/datamechanics/spark)\n\n- [Setting up a Spark standalone cluster on Docker in layman terms](https://medium.com/@MarinAgli1/setting-up-a-spark-standalone-cluster-on-docker-in-layman-terms-8cbdc9fdd14b)\n\n- [Apache Spark Standalone Cluster on Docker](https://github.com/cluster-apps-on-docker/spark-standalone-cluster-on-docker)\n\n- [Visualizing Trading Signals in Python - Plot buy and sell trading signals in Python's graph](https://eodhd.medium.com/visualizing-trading-signals-in-python-3cab01cc5847)\n\n- [Apache Hadoop and Apache Spark for Big Data Analysis](https://towardsdatascience.com/apache-hadoop-and-apache-spark-for-big-data-analysis-daaf659fd0ee)\n\n- [Agent-Based Stock Trading: Design and Implementation](https://medium.com/@redeaddiscolll/agent-based-stock-trading-design-and-implementation-c2141fc8f984)\n\n- [Mastering K-Means 
Clustering](https://towardsdatascience.com/mastering-k-means-clustering-065bc42637e4)\n\n- [Additive Decision Trees - An interpretable classification and regression model](https://towardsdatascience.com/additive-decision-trees-85f2feda2223)\n\n- [Interpretable kNN (ikNN) - An interpretable classifier](https://towardsdatascience.com/interpretable-knn-iknn-33d38402b8fc)\n\n- [Interpretable Outlier Detection: Frequent Patterns Outlier Factor (FPOF)](https://towardsdatascience.com/interpretable-outlier-detection-frequent-patterns-outlier-factor-fpof-0d9cbf51b17a)\n\n- [Telco Customer Churn - Kaggle](https://www.kaggle.com/datasets/blastchar/telco-customer-churn)\n\n- [End-to-End Machine Learning Project: TelCo Churn Prediction](https://medium.com/@ramazanolmeez/end-to-end-machine-learning-project-churn-prediction-e9c4d0322ac9)\n\n- [Predicting Stock Prices with Monte Carlo Simulations](https://medium.com/@antoine.boucher012/predicting-stock-prices-with-monte-carlo-simulations-0884ef32c35b)\n\n- [Artificial Intelligence (AI) models for Trading - Exploring Random Forests from Machine Learning](https://medium.com/coinmonks/artificial-intelligence-ai-models-for-trading-0bfd308d012d)\n\n- [XGBoost for Stock Price Forecasting](https://medium.com/@bugragultekin/xgboost-for-stock-price-forecasting-64f89719a8e4)\n\n- [Model Interpretability Using Credit Card Fraud Data](https://towardsdatascience.com/model-interpretability-using-credit-card-fraud-data-f219ff7ec89d)\n\n- [How Many Pokémon Fit?](https://towardsdatascience.com/how-many-pok%C3%A9mon-fit-84f812c0387e)\n\n- [Gold Price Prediction Using Machine Learning](https://medium.com/@iabbasali/gold-price-prediction-using-machine-learning-24e23841de52)\n\n- [Data Leakage in Preprocessing, Explained: A Visual Guide with Code 
Examples](https://towardsdatascience.com/data-leakage-in-preprocessing-explained-a-visual-guide-with-code-examples-33cbf07507b7)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshortthirdman%2Fdatascience-jupyter-notebooks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshortthirdman%2Fdatascience-jupyter-notebooks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshortthirdman%2Fdatascience-jupyter-notebooks/lists"}