Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2025-02-11 00:07:25 UTC
https://github.com/scrapinghub/aduana
Frontera backend to guide a crawl using PageRank, HITS or other ranking algorithms based on the link structure of the web graph, even when making big crawls (one billion pages).
Last synced: 10 Nov 2024
https://github.com/tommyod/paretoset
Compute the Pareto (non-dominated) set, i.e., skyline operator/query.
data-mining data-science datascience multi-objective-optimization optimization pandas skyline-query
Last synced: 10 Feb 2025
https://github.com/codait/presentations
Talks & Workshops by the CODAIT team
data-science deep-learning fairness-ai fairness-ml machine-learning open-source presentations
Last synced: 09 Nov 2024
https://github.com/ActivitySim/populationsim
An Open Platform for Population Synthesis
activitysim bsd-3-clause data-science microsimulation population-synthesis python
Last synced: 27 Oct 2024
https://github.com/datakitchen/dataops-testgen
DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution by data profiling, new dataset hygiene review, AI generation of data quality validation tests, ongoing testing of data refreshes, & continuous anomaly monitoring
data data-engineering data-observability data-quality data-science data-testing datachecker dataops dataprofiling dataquality datavalidation mssql postgresql python redshift self-hosted snowflake
Last synced: 06 Feb 2025
https://github.com/junpenglao/planet_sakaar_data_science
A colourful collection of codes and notebooks, like Planet Sakaar
bayesian-inference data-science pymc3
Last synced: 02 Nov 2024
https://github.com/alinski29/stonks.jl
Julia library for standardizing financial data retrieval and storage from multiple APIs.
data data-mining data-science dataframe finance julia trading trading-algorithms
Last synced: 02 Nov 2024
https://github.com/stitchfix/mab
Library for multi-armed bandit selection strategies, including efficient deterministic implementations of Thompson sampling and epsilon-greedy.
data-science experimentation go golang multi-armed-bandit multi-armed-bandits multiarmed-bandits reinforcement-learning thompson thompson-sampling
Last synced: 10 Feb 2025
https://github.com/dc-aichara/DS-ML-Public
Python Scripts and Jupyter Notebooks
bayesian-optimization beautifulsoup bitcoin catboost dash dashboard data-analysis data-mining data-science data-visualisation hyperparameter-tuning hyperparameters-optimization lightgbm machine-learning news plotly python telegram web-scraping xgboost
Last synced: 15 Nov 2024
https://github.com/realpython/web-dev-for-data-scientists
data-science flask python webdevelopment
Last synced: 17 Nov 2024
https://github.com/argmaxml/conjugate_prior
Implementation of the conjugate prior table for Bayesian Statistics
bayesian-statistics conjugation data data-science likelihood probabilistic-programming probability statistical-models statistics
Last synced: 28 Jan 2025
https://ddotta.github.io/cookbook-rpolars/
Cookbook to provide solutions to common tasks and problems in using Polars with R
benchmark cookbook data-engineering data-science datatable dplyr polars r tidyr
Last synced: 18 Nov 2024
https://github.com/bovem/publications
My publications on Medium
articles calculus data-science linear-algebra math mathematics matrices python3 science statistics subspaces tensorflow tutorials
Last synced: 11 Jan 2025
https://github.com/fcakyon/instafake-dataset
Dataset for Instagram Fake and Automated Account Detection
bot classification data-science dataset fake instafake instagram machine-learning research
Last synced: 06 Jan 2025
https://github.com/loukesio/ggvolc
ggvolc effortlessly translates differential expression datasets and RNAseq data into informative volcano plots. Highlight genes of interest with unprecedented ease. With just a single line of code, visualize complex datasets, gaining deeper insights and simplifying data representation.
bioinformatics data-science data-visualization gro-seq rna-seq
Last synced: 21 Dec 2024
https://github.com/ulikoehler/uliengineering
A Python library for calculations performed in electronics engineering
data-analysis data-science electronics engineering python
Last synced: 10 Feb 2025
https://github.com/mikeizbicki/cmc-csci046
CMC's Data Structures and Algorithms Course Materials
cmc computer-science course data-science python3
Last synced: 07 Feb 2025
https://github.com/team-fastml/fastml
A Python package built on sklearn for running a series of classification algorithms in a faster and easier way.
algorithms data-science deep-learning machine-learning machine-learning-algorithms neural-network python
Last synced: 04 Feb 2025
https://github.com/Inist-CNRS/lodex
Linked Open Data EXperiment
data-science data-structures datavisualization mongo nodejs
Last synced: 21 Dec 2024
https://github.com/mikeroyal/apache-flink-guide
Apache Flink Guide
data-science database flink flink-kafka flink-stream-processing flink-streaming stream-processing streaming
Last synced: 05 Feb 2025
https://github.com/elshor/dstools
Javascript tools and utilities for the data scientist
Last synced: 27 Oct 2024
https://github.com/dlab-berkeley/Python-Data-Wrangling-Legacy
D-Lab's 3-hour introduction to data wrangling in Python. Learn how to import and manipulate dataframes using pandas.
Last synced: 11 Nov 2024
https://github.com/BojarLab/glycowork
Package for processing and analyzing glycans and their role in biology.
bioinformatics computational-biology data-science glycans glycobiology machine-learning molecular-biology open-source python
Last synced: 18 Jan 2025
https://github.com/ahmed-mohamed-sn/olliePy
OlliePy is a python package which can help data scientists in exploring their data and evaluating and analysing their machine learning experiments by utilising the power and structure of modern web applications. The data scientist only needs to provide the data and any required information and OlliePy will generate the rest.
ai analytics charts dashboard data data-analytics data-science data-scientist eda error-analysis exploratory-data-analysis machine-learning python visualization
Last synced: 15 Nov 2024
https://github.com/svilupp/awesome-generative-ai-meets-julia-language
Comprehensive guide to generative AI projects and resources in Julia.
awesome awesome-list data-science generative-ai julia
Last synced: 28 Oct 2024
https://github.com/pnavaro/big-data
Python tools for big data
dask data-science hadoop jupyter-book notebooks python spark
Last synced: 02 Nov 2024
https://github.com/rthorst/mint_condition
Automatic Sports Trading Card Grading
computer-vision data-science machine-learning sports trading-cards
Last synced: 04 Dec 2024
https://github.com/noahgift/myrepo
continuous integration rep
build circleci continuous-integration data-science jupyter-notebook nbval pytest python testing
Last synced: 10 Feb 2025
https://github.com/asad70/insider-trading
This program extracts insider trading data from the SEC website and stores it in an Excel file for the specified time frame.
algotrading data-science extract-data insider-trading insiders tickers trading trading-strategies
Last synced: 11 Nov 2024
https://github.com/zeeshanahmad4/stock-prices-prediction-ml-flask-dashboard
This program predicts the price of GOOG stock for a specific day using the Machine Learning algorithm called Support Vector Regression (SVR) Linear Regression. Importing the flask module in the project is mandatory. An object of the Flask class is our WSGI application.
classification data-mining data-science data-visualization dataset flask flask-dashboard linear-regression ml prediction prediction-algorithm prediction-model predictive-analytics python stock-analysis stock-market stock-prices stock-prices-prediction stock-trading visualization
Last synced: 24 Jan 2025
https://github.com/arturomoncadatorres/deepsurvk
Implementation of DeepSurv using Keras
data-science deep-learning keras survival-analysis tensorflow2
Last synced: 17 Dec 2024
https://github.com/shlizee/NeuroAI
NeuroAI-UW seminar, a regular weekly seminar for the UW community, organized by NeuroAI Shlizerman Lab.
ai cvpr data-science deep-learning eccv icml neural-networks neurips neuroscience-methods recurrent-neural-networks sfn
Last synced: 12 Nov 2024
https://github.com/sparkfish/shabby-pages
ShabbyPages is a state-of-the-art corpus of born-digital document images with both ground truth and distorted versions appropriate for use in training models to reverse distortions and recover to original denoised documents.
binarization born-digital computer-vision corpus data-science dataset denoising layout-detection
Last synced: 17 Dec 2024
https://github.com/rcdilorenzo/ecce
ML Prediction of Bible Topics and Passages (Python / React)
data-science fastapi fully-connected-network interactive-visualizations keras-tensorflow reactjs
Last synced: 11 Nov 2024
https://github.com/kennethleungty/end-to-end-automl-insurance
An End-to-End Implementation of AutoML with H2O, MLflow, FastAPI, and Streamlit for Insurance Cross-Sell
automl data-science fastapi h2o h2o-automl machine-learning mlflow mlops python streamlit
Last synced: 22 Nov 2024
https://github.com/jacksonburns/astartes
Better Data Splits for Machine Learning
ai data-science machine-learning ml python sampling
Last synced: 19 Dec 2024
https://github.com/kaggledatasets/kaggledatasets
Collection of Kaggle Datasets ready to use for Everyone (Looking for contributors)
data-science datasets deep-learning kaggle keras machine-learning python pytorch scikit-learn tensorflow
Last synced: 13 Oct 2024
https://github.com/patilharshal16/data-structures
Computer science data structures and algorithms implementation from scratch
algorithms computer-science data-science data-structures datascience datastructures deque doubly-linked-list enqueue implementation-from-scratch implementation-of-algorithms implementation-of-data-structures java java-8 linked-list queue searching-algorithms sorting-algorithm sorting-algorithms stack
Last synced: 05 Nov 2024
https://github.com/zincware/ZnTrack
Create, visualize, run & benchmark DVC pipelines in Python & Jupyter notebooks.
data-science data-version-control developer-tools dvc git machine-learning python reproducibility
Last synced: 14 Nov 2024
https://github.com/daun-io/study-data-science
Practical data science notebooks that I used for studying in 2016
data-science jupyter-notebook machine-learning tensorflow
Last synced: 30 Nov 2024
https://github.com/minerva-ml/minerva-training-materials
Learn advanced data science on real-life, curated problems
data-science data-science-experience data-science-learning deep-learning education ipython-notebook knowledge machine-learning machine-learning-algorithms minerva neptune neural-network python python3 training training-materials training-module
Last synced: 21 Jan 2025
https://github.com/PKU-DAIR/mindware
An efficient open-source AutoML system for automating machine learning lifecycle, including feature engineering, neural architecture search, and hyper-parameter tuning.
automl-algorithms automl-pipeline bayesian-optimization blackbox-optimization data-science deep-learning distributed-systems ensemble-learning hyper-parameter-optimization knobs-tuning machine-learning meta-learning neural-architecture-search python
Last synced: 16 Nov 2024
https://github.com/danielhanchen/sciblox
sciblox - Easier Data Science and Machine Learning
boosting data-analysis data-mining data-preprocessing data-science data-visualization imputation machine-learning python sklearn
Last synced: 22 Oct 2024
https://github.com/dsfsi/covid19africa
Africa open COVID-19 data working group
africa collate-data coronavirus coronavirus-pandemic covid-19 covid19 covid19-data data-science dataset doi volunteers
Last synced: 21 Jan 2025
https://github.com/nolanbconaway/pitchfork-data
Analyses on over 18,000 pitchfork reviews.
data-science ipynb jupyter music pitchfork
Last synced: 02 Jan 2025
https://github.com/daun-io/Study-Data-Science
Practical data science notebooks that I used for studying in 2016
data-science jupyter-notebook machine-learning tensorflow
Last synced: 27 Nov 2024
https://github.com/DS2BRAIN/ds2
Easiest way to use AI models without coding (Web UI & API support)
ai annotation-tool auto-labeling automl dalle data-science deep-learning feature-engineering huggingface image-annotation-tool image-to-text machine-learning ml mlops neural-network python pytorch stable-diffusion tensorflow text-annotation
Last synced: 17 Jan 2025
https://github.com/great-northern-diver/loon
A Toolkit for Interactive Statistical Data Visualization
data-analysis data-science data-visualization exploratory-analysis exploratory-data-analysis high-dimensional-data interactive-graphics interactive-visualizations loon python statistical-analysis statistical-graphics statistics tcl-extension tk
Last synced: 22 Nov 2024
https://github.com/jonathandinu/spark-ray-data-science
Supporting content (slides and exercises) for the Pearson video series covering best practices for developing scalable applications with Spark and Ray in the context of a data scientist's standard workflow.
artificial-intelligence data-science distributed-computing machine-learning python ray spark
Last synced: 15 Nov 2024
https://github.com/rubixml/housing
An example project that predicts house prices for a Kaggle competition using a Gradient Boosted Machine.
data-science ensemble gradient-boost gradient-boosted-trees gradient-boosting gradient-boosting-machine gradient-boosting-regressor gradient-descent housing-prices kaggle kaggle-competition machine-learning machine-learning-tutorial php php-machine-learning php-ml predicting-housing-prices regression rubix rubix-ml
Last synced: 04 Dec 2024
https://github.com/credo-ai/credoai_lens
Credo AI Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data assessment, and acts as a central gateway to assessments created in the open source community.
ai artificial-intelligence assessment data-science ethical-artificial-intelligence fairness-ai fairness-ml jupyter machine-learning ml python reporting responsible-ai visualization
Last synced: 21 Jan 2025
https://github.com/lter/lterdatasampler
LTER data samples to teach environmental data science
data-science ecology lter-science r r-package
Last synced: 27 Oct 2024
https://github.com/krassowski/jupyter-helpers
A collection of helpers for Jupyter/IPython
data-science jupyter jupyter-lab jupyter-notebook jupyter-widget jupyterlab jupyterlab-extension
Last synced: 13 Jan 2025
https://github.com/tatevkaren/artificial-neural-network-business_case_study
Business Case Study to predict customer churn rate based on Artificial Neural Network (ANN), with TensorFlow and Keras in Python. This is a customer churn analysis that contains training, testing, and evaluation of an ANN model. (Includes: Case Study Paper, Code)
ann ann-model artificial-neural-network artificial-neural-networks bank-customers case-study churn-analysis data-science deep-learning machine-learning prediction-model predictive-analytics python3 tensorflow-tutorials
Last synced: 12 Nov 2024
https://github.com/microsoft/automated-explanations
Generating and validating natural-language explanations.
artificial-intelligence automated-interpretability data-science explanation fmri fmri-data-analysis gpt gpt4 huggingface interpretability language-model large-language-models machine-learning mechanistic-interpretability neuroscience xai
Last synced: 10 Feb 2025
https://github.com/theengineeringworld/statistics-using-python
These files are part of the YouTube course "Statistics Using Python" offered by The Engineering World: http://youtube.com/theengineeringworld
cleaning data-analysis data-mining data-science data-visualization database jupyter-notebooks python python3 statistics
Last synced: 08 Nov 2024
https://github.com/criccomini/proto-schema-parser
A Pure Python Protobuf Parser
abstract-syntax-tree antlr bufbuild data-engineering data-science lexer lexer-parser parser protobuf protocol-buffers python schema
Last synced: 07 Feb 2025
https://github.com/ahmedbesbes/understanding-deep-convolutional-neural-networks-with-a-practical-use-case-in-tensorflow-and-keras
What makes convnets so powerful at image classification?
article blog computer-vision convnet convolution-filter convolutional-neural-networks data-science dataset deep-learning deep-learning-tutorial image-classification kaggle kaggle-cats kdd keras keras-tutorials python tensorflow
Last synced: 23 Nov 2024
https://github.com/lachhebo/pyclustertend
A python package to assess cluster tendency
cluster-analysis cluster-tendency clustering clustertendency data-science hopkins ivat machine-learning python scikit-learn statistics vat visual-assessment-cluster-tendency
Last synced: 09 Feb 2025
https://github.com/okfn-brasil/whistleblower
🚨A Twitter bot for publicly reporting suspicions found by Rosie, Serenata de Amor's AI
data-science facebook-messenger-bot machine-learning twitter-bot
Last synced: 31 Oct 2024
https://github.com/ahmedbesbes/Understanding-deep-Convolutional-Neural-Networks-with-a-practical-use-case-in-Tensorflow-and-Keras
What makes convnets so powerful at image classification?
article blog computer-vision convnet convolution-filter convolutional-neural-networks data-science dataset deep-learning deep-learning-tutorial image-classification kaggle kaggle-cats kdd keras keras-tutorials python tensorflow
Last synced: 27 Nov 2024
https://github.com/electronick1/stairs
Framework which helps you to make parallel/distributed calculations using data pipelines
data-engineering data-pipeline data-science distributed-computing python
Last synced: 10 Nov 2024
https://github.com/opengeos/geoai
A Python package for using Artificial Intelligence (AI) with geospatial data
ai data-science geoai geopython geospatial jupyter python
Last synced: 11 Nov 2024
https://github.com/tatevkaren/tatevkaren-data-science-portfolio
Data Science Portfolio of Tatev Karen Aslanyan including Case Studies and Research Projects that I have completed that solve business problems or introduce new products. Case Study papers, codes, and additional resources are all included.
blog case-study computer-science data-analysis data-science deep-learning econometrics machine-learning papers portfolio portfolio-website statistics
Last synced: 07 Dec 2024
https://github.com/welding-torch/excel-anonymizer
A Python script that anonymizes an Excel file and synthesizes new data in its place.
data-science microsoft nlp pandas presidio privacy
Last synced: 07 Nov 2024
https://github.com/kb22/GitHub-User-Insights-using-API
The project uses the GitHub API with user authentication to fetch information such as commits and repositories for a specific user and stores it as CSV files for data collection and analysis.
api data-analysis data-science data-scraping github-api python
Last synced: 08 Nov 2024
https://github.com/devinterview-io/data-scientist-interview-questions
🟣 Data Scientist interview questions and answers to help you prepare for your next machine learning and data science interview in 2024.
ai-interview-questions coding-interview-questions coding-interviews data-science data-science-interview data-science-interview-questions data-scientist data-scientist-interview data-scientist-interview-questions data-scientist-questions data-scientist-tech-interview interview-practice interview-preparation machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions software-engineer-interview technical-interview-questions
Last synced: 06 Feb 2025
https://github.com/schochastics/football-data
football (soccer) datasets
data-analysis data-science data-visualization dataset football-data rstats soccer-data
Last synced: 27 Oct 2024
https://github.com/henestrosadev/sololearn
Compilation of all SoloLearn courses with their respective projects and practices and all 72 code challenges for all 7 supported languages.
code-challenge code-practice data-science programming-exercises programming-languages python sololearn sololearn-cert sololearn-solutions
Last synced: 27 Oct 2024
https://github.com/lkuffo/data-viz
More than 50 examples of data visualization and analysis in Matplotlib, Pandas, Seaborn, Plotly, Bokeh, and Networkx
data-analysis data-science dataviz geoviz jupyter jupyter-notebook matplotlib networkx pandas plotly python seaborn
Last synced: 18 Dec 2024
https://github.com/dfinke/PSDuckDB
PSDuckDB is a PowerShell module that provides seamless integration with DuckDB, enabling efficient execution of analytical SQL queries directly from the PowerShell environment.
data-analysis data-science duckdb powershell sql
Last synced: 16 Dec 2024
https://github.com/fremantle-industries/prop
An open and opinionated trading platform using productive & familiar open source libraries and tools for strategy research, execution and operation.
algo-trading data-science defi elixir grafana trading-platform
Last synced: 07 Nov 2024
https://github.com/ztrimus/speech-emotion-recognition
Predicting various emotions in human speech signals by detecting the speech components affected by emotion.
audio-files colab-notebook convolutional-neural-networks data-science deep-learning emotion-detection emotion-recognition jupyter-notebook keras librosa lstm natural-language-processing neural-network python3 pytorch rnn speech-emotion-recognition speech-recoginition supervised-learning voice
Last synced: 22 Jan 2025
https://github.com/jason2brownlee/machinelearningmischief
Machine Learning Mischief: Examples from the dark side of data science
data-science ethics hacking machine-learning statistics
Last synced: 24 Dec 2024
https://github.com/mlabonne/how-to-data-science
Scripts, notebooks, and articles about data science in general.
data-science numpy pandas pandas-dataframe python pytorch
Last synced: 02 Jan 2025
https://github.com/ropensci/rdataretriever
R interface to the Data Retriever
data data-science database datasets r r-package rstats science
Last synced: 04 Dec 2024
https://github.com/giswqs/postgis
Spatial Data Management with PostgreSQL and PostGIS https://gishub.org/sdm
data-science database geospatial postgis postgres postgresql
Last synced: 02 Nov 2024
https://github.com/varir/scikit-hubness
A Python package for hubness analysis and high-dimensional data mining
approximate-nearest-neighbor-search data-mining data-science high-dimensional-data hubness machine-learning nearest-neighbor-search
Last synced: 29 Jan 2025
https://github.com/google/bayesnf
Bayesian Neural Field models for prediction in large-scale spatiotemporal datasets
bayesian-inference data-science machine-learning spatiotemporal-data-analysis statistics
Last synced: 09 Jan 2025
https://github.com/soumyadip007/data-science-using-python-university-course-module
“Data science” is just about as broad of a term as they come. It may be easiest to describe what it is by listing its more concrete components: Data exploration & analysis. Included here: Pandas; NumPy; SciPy; a helping hand from Python's Standard Library.
data-preparation data-preprocessing data-processing data-science data-visualization jupyter-notebook knn numpy panda plotting python
Last synced: 28 Oct 2024
https://github.com/plantinformatics/pretzel
Javascript full-stack framework for Big Data visualisation and analysis
big-data bioinformatics data-science data-visualization ember emberjs express expressjs javascript open-source
Last synced: 23 Jan 2025
https://github.com/imgcook/datacook
Machine Learning and Data Analysis in JavaScript.
data-science feature-engineering javascript machine-learning
Last synced: 13 Nov 2024
https://github.com/kjappelbaum/mofdscribe
An ecosystem for digital reticular chemistry
artificial-intelligence benchmark data-science descriptors featurization hacktoberfest machine-learning metal-organic-frameworks metrics ml mof porous-materials reticular-chemistry splitting
Last synced: 05 Feb 2025
https://github.com/weiji14/deepbedmap
Going beyond BEDMAP2 using a super resolution deep neural network. Also a convenient flat file data repository for high resolution bed elevation datasets around Antarctica.
antarctica bedmap binder chainer data-science deep-neural-network digital-elevation-model flat-file-db generative-adversarial-network glaciology jupyter-notebook optuna pangeo remote-sensing super-resolution
Last synced: 07 Jan 2025
https://github.com/samcomber/spacv
Spatial cross-validation in Python.
cross-validation data-science geographic-data-science machine-learning python scikit-learn scikitlearn-machine-learning sklearn spatial-data-science
Last synced: 09 Feb 2025
https://github.com/bluebrain/nexus-forge
Building and Using Knowledge Graphs made easy
data-management data-science json-ld knowledge-engineering knowledge-graph knowledgegraph rdf shacl
Last synced: 10 Feb 2025
https://github.com/vida-nyu/data-polygamy
Data Polygamy is a topology-based framework that allows users to query for statistically significant relationships between spatio-temporal data sets.
Last synced: 24 Nov 2024
https://github.com/gaelforget/climatemodels.jl
Julia interface to climate models + tracked workflow framework
atmosphere climate cmip data data-science earth-observation ecco git interface ipcc julia mitgcm modeling ocean parameters workflow
Last synced: 19 Dec 2024
https://github.com/datalab-platform/datalab
Open-source Platform for Scientific and Technical Data Processing and Visualization
data-science data-visualization image-processing opencv python scientific-computing scikit-image scipy signal-processing visualization
Last synced: 05 Feb 2025
https://github.com/SamComber/spacv
Spatial cross-validation in Python.
cross-validation data-science geographic-data-science machine-learning python scikit-learn scikitlearn-machine-learning sklearn spatial-data-science
Last synced: 27 Oct 2024
https://github.com/dentrax/data-mining-algorithms
Data Mining Algorithms with C# using LINQ
algorithm apriori apriori-algorithm c45 clustering-algorithm data-mining data-mining-algorithms data-science desiciontree id3 id3-algorithm k-means k-nearest-neighbor linq nearest-neighbors
Last synced: 09 Nov 2024
https://github.com/gaelforget/ClimateModels.jl
Julia interface to climate models + tracked workflow framework
atmosphere climate cmip data data-science earth-observation ecco git interface ipcc julia mitgcm modeling ocean parameters workflow
Last synced: 27 Nov 2024
https://github.com/gianlucatruda/quantified-sleep
Quantified Sleep: Machine learning techniques for observational n-of-1 studies.
biohacking data-science explainable-ai imputation interpretable-machine-learning lasso machine-learning missing-data observational-studies oura-ring prediction quantified-self rescuetime sleep time-series
Last synced: 22 Oct 2024
https://github.com/ploomber/soopervisor
☁️ Export Ploomber pipelines to Kubernetes (Argo), Airflow, AWS Batch, SLURM, and Kubeflow.
airflow argo argo-workflows aws data-science kubeflow kubeflow-pipelines kubernetes machine-learning slurm workflow
Last synced: 19 Dec 2024
https://github.com/younes-charfaoui/feature-selection-techniques
Python source code for the feature selection 👨‍🔬 series on Medium. 📰
autoencoder data-science deep-learning embedded-methods feature-engineering feature-selection filter-methods jupyter-notebook machine-learning machinelearning-python pandas python series sklearn wrapper-methods
Last synced: 28 Oct 2024
https://github.com/alexhallam/tablespoon
🥄✨Time-series Benchmark methods that are Simple and Probabilistic
data-science forecasting mean naive probabilistic probabilistic-programming probability python scipy seasonal-naive simple simple-models time-series uncertainty-quantification
Last synced: 07 Nov 2024
https://github.com/elysian01/data-purifier
A Python library for Automated Exploratory Data Analysis, Automated Data Cleaning, and Automated Data Preprocessing For Machine Learning and Natural Language Processing Applications in Python.
data-analysis data-cleaning data-cleaning-pipeline data-preprocessing data-science data-visualization datapurifier eda exploratory-data-analysis jupyter python-lib python-library python3
Last synced: 07 Nov 2024
https://github.com/rfordatascience/rfordatasciencewiki
Resources for the R4DS Online Learning Community, including answer keys to the text
beginner beginner-friendly beginner-tutorial-series data-science help-wanted r4ds rstats rstudio tidyverse
Last synced: 14 Nov 2024
https://github.com/nrwade0/edX
Data Science courses in R from HarvardX
data-science harvardx inference linear-regression machine-learning r rmarkdown statistical-analysis visualization wrangling
Last synced: 04 Dec 2024