Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

Data Science

Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/ShawhinT/YouTube-Blog

Codes to complement YouTube videos and blog posts on Medium.

data-science example-code machine-learning medium-articles youtube

Last synced: 25 Nov 2024

https://github.com/Chicago/food-inspections-evaluation

This repository contains the code to generate predictions of critical violations at food establishments in Chicago. It also contains the results of an evaluation of the effectiveness of those predictions.

cdph chicago data-science food-poisoning open-data open-science public-health

Last synced: 30 Oct 2024

https://github.com/tobgu/qframe

Immutable data frame for Go

data-frame data-science dataframe go golang immutable

Last synced: 08 Feb 2025

https://github.com/climbsrocks/machinejs

[UNMAINTAINED] Automated machine learning- just give it a data file! Check out the production-ready version of this project at ClimbsRocks/auto_ml

auto-ml automated-machine-learning automl data-science data-scientists javascript javascript-library kaggle machine-learning machine-learning-algorithms machine-learning-library ml numerai scikit-learn

Last synced: 09 Feb 2025

https://github.com/sforaidl/genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL

algorithm-implementations benchmarking data-science deep-learning gym hacktoberfest machine-learning neural-network openai python pytorch reinforcement-learning reinforcement-learning-algorithms

Last synced: 05 Feb 2025

https://github.com/SforAiDl/genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL

algorithm-implementations benchmarking data-science deep-learning gym hacktoberfest machine-learning neural-network openai python pytorch reinforcement-learning reinforcement-learning-algorithms

Last synced: 12 Nov 2024

https://github.com/5agado/data-science-learning

Repository of code and resources related to different data science and machine learning topics. For learning, practice and teaching purposes.

data-science deep-learning jupyter-notebook learning-by-doing machine-learning statistics

Last synced: 08 Nov 2024

https://github.com/platonai/PulsarRPA

Automate webpages at scale, scrape web data completely and accurately with high performance, distributed RPA.

crawler data-mining data-science rpa scraper scraping web-automation web-crawler web-mining web-scraping web-sql

Last synced: 05 Nov 2024

https://github.com/ledell/useR-machine-learning-tutorial

useR! 2016 Tutorial: Machine Learning Algorithmic Deep Dive http://user2016.org/tutorials/10.html

data-science deep-learning ensemble-learning gradient-boosting-machine machine-learning r random-forest tutorial

Last synced: 27 Nov 2024

https://github.com/ledell/user-machine-learning-tutorial

useR! 2016 Tutorial: Machine Learning Algorithmic Deep Dive http://user2016.org/tutorials/10.html

data-science deep-learning ensemble-learning gradient-boosting-machine machine-learning r random-forest tutorial

Last synced: 04 Feb 2025

https://github.com/rio-labs/rio

WebApps in pure Python. No JavaScript, HTML and CSS needed

data-analysis data-science data-visualization deep-learning machine-learning python ui webapp

Last synced: 06 Nov 2024

https://github.com/kevinliao159/mydatascienceportfolio

Applying Data Science and Machine Learning to Solve Real World Business Problems

api data-science data-visualization machine-learning neural-networks nlp recommendation-system spark

Last synced: 05 Feb 2025

https://github.com/kunalj101/Data-Science-Hacks

Data Science Hacks consists of tips, tricks to help you become a better data scientist. Data science hacks are for all - beginner to advanced. Data science hacks consist of python, jupyter notebook, pandas hacks and so on.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 13 Nov 2024

https://github.com/aiguofer/gspread-pandas

A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.

data data-analytics data-engineering data-science dataframes google google-sheets google-spreadsheets gspread pandas python sheets

Last synced: 06 Feb 2025

https://github.com/kunalj101/data-science-hacks

Data Science Hacks consists of tips, tricks to help you become a better data scientist. Data science hacks are for all - beginner to advanced. Data science hacks consist of python, jupyter notebook, pandas hacks and so on.

computer-vision data data-analysis data-science data-visualization dataset hacks image-augmentation ipynb machine-learning nlp nlp-machine-learning numpy pandas pandas-dataframe pandas-python pandas-tutorial python python3 tips-and-tricks

Last synced: 11 Oct 2024

https://github.com/airalcorn2/Michael-s-Data-Science-Curriculum

This is the companion curriculum to my guide to becoming a data scientist.

curriculum data-science machine-learning statistics

Last synced: 22 Nov 2024

https://github.com/epistasislab/scikit-rebate

A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.

data-science feature-selection python

Last synced: 10 Feb 2025

https://github.com/EpistasisLab/scikit-rebate

A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.

data-science feature-selection python

Last synced: 30 Oct 2024

https://github.com/basedosdados/sdk

⚙️ Código de manutenção do datalake (metadados e pacotes de acesso) | 📖 Docs: https://basedosdados.github.io/sdk/

bigquery dados-abertos data-science govtech hacktoberfest hacktoberfest2022 open-data python r sql transparencia

Last synced: 09 Feb 2025

https://github.com/basedosdados/mais

⚙️ Código de manutenção do datalake (metadados e pacotes de acesso) | 📖 Docs: https://basedosdados.github.io/mais/

bigquery dados-abertos data-science govtech hacktoberfest hacktoberfest2022 open-data python r sql transparencia

Last synced: 13 Oct 2024

https://github.com/ptyadana/data-science-and-machine-learning-projects-dojo

collections of data science, machine learning and data visualization projects with pandas, sklearn, matplotlib, tensorflow2, Keras, various ML algorithms like random forest classifier, boosting, etc

boosting-algorithms data-analysis data-science data-visualization deep-learning keras machine-learning machine-learning-algorithms natural-language-processing pandas probability-statistics scikit-learn seaborn tensorflow

Last synced: 05 Feb 2025

https://github.com/solegalli/feature-engineering-for-machine-learning

Code repository for the online course Feature Engineering for Machine Learning

data-science feature-engineering feature-extraction machine-learning python

Last synced: 08 Feb 2025

https://github.com/dagshub/fds

Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc

data-science dvc git

Last synced: 07 Feb 2025

https://github.com/weijie-chen/econometrics-with-python

Tutorials of econometrics featuring Python programming. This is a crash course for reviewing the most important concepts and techniques of basic econometrics, the theories are presented lightly without hustles of derivation and Python codes are straightforward.

data-analysis data-science econometrics economics python statistics time-series

Last synced: 09 Feb 2025

https://github.com/operatorai/modelstore

🏬 modelstore is a Python library that allows you to version, export, and save a machine learning model to your filesystem or a cloud storage provider.

data-science keras machine-learning mlops modelstore python-library pytorch s3-storage scikit-learn tensorflow transformer

Last synced: 09 Feb 2025

https://github.com/plotly/dashR

Create data science and AI web apps in R

dash data-science data-visualization plotly plotly-dash python r react web-application

Last synced: 27 Oct 2024

https://github.com/DagsHub/fds

Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc

data-science dvc git

Last synced: 15 Nov 2024

https://github.com/finos/jupyterlab_templates

Support for jupyter notebook templates in jupyterlab

data-science dataviz jupyter jupyterlab jupyterlab-extension machine-learning notebook

Last synced: 07 Nov 2024

https://github.com/wilsonrljr/sysidentpy

A Python Package For System Identification Using NARMAX Models

data-science dynamical-systems machine-learning narmax narx system-identification time-series

Last synced: 12 Nov 2024

https://github.com/matrix-profile-foundation/matrixprofile

A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.

algorithms anomaly-detection clustering data-mining data-science hacktoberfest matrixprofile motif-discovery python python2 python3 segmentation time-series time-series-analysis

Last synced: 09 Feb 2025

https://github.com/thoughtworks/mlops-platforms

Compare MLOps Platforms. Breakdowns of SageMaker, VertexAI, AzureML, Dataiku, Databricks, h2o, kubeflow, mlflow...

azureml data-science databricks dataiku datarobot google-ai-platform h2oai iguazio knime kubeflow machine-learning mlflow mlops pachyderm sagemaker seldon

Last synced: 12 Nov 2024

https://github.com/jkrumbiegel/chain.jl

A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.

data-analysis data-science julia julia-language julia-package macro pipeline

Last synced: 08 Feb 2025

https://github.com/anothersamwilson/miceforest

Multiple Imputation with LightGBM in Python

data-science imputed-values mice-algorithm python random-forest

Last synced: 10 Feb 2025

https://github.com/aunum/goro

A High-level Machine Learning Library for Go

data-science go golang machine-learning machinelearning

Last synced: 28 Oct 2024

https://github.com/adicherlavenkatasai/ml-workspace

Machine Learning (Beginners Hub), information(courses, books, cheat sheets, live sessions) related to machine learning, data science and python is available

cheat-sheets convolutional-networks data-science deep-learning deep-neural-networks gans harvard-edx interview-questions machine-learning python

Last synced: 31 Oct 2024

https://github.com/jkrumbiegel/Chain.jl

A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.

data-analysis data-science julia julia-language julia-package macro pipeline

Last synced: 19 Nov 2024

https://github.com/aaronpenne/data_visualization

A collection of my data visualizations, mostly in Python.

data-science data-visualization python3 visualization

Last synced: 25 Oct 2024

https://github.com/xoolive/traffic

A toolbox for processing and analysing air traffic data

adsb air-traffic-data data-analytics data-science data-visualisation declarative-pipeline mode-s trajectory

Last synced: 07 Feb 2025

https://github.com/maxhalford/xam

:dart: Personal data science and machine learning toolbox

data-science machine-learning preprocessing python stacking

Last synced: 06 Feb 2025

https://github.com/MaxHalford/xam

:dart: Personal data science and machine learning toolbox

data-science machine-learning preprocessing python stacking

Last synced: 15 Nov 2024

https://github.com/joaquinamatrodrigo/estadistica-con-r

Apuntes personales sobre estadística, machine learning y lenguaje de programación R

bioestadistica data-mining data-science estadistica machine-learning mineria-de-datos r

Last synced: 10 Feb 2025

https://github.com/aeturrell/skimpy

skimpy is a light weight tool that provides summary statistics about variables in data frames within the console.

data-science eda exploratory-data-analysis pandas statistics summary-statistics

Last synced: 14 Nov 2024

https://github.com/olavolav/uniplot

Lightweight plotting to the terminal. 4x resolution via Unicode.

data-analysis data-science plot python

Last synced: 31 Oct 2024

https://github.com/souzatharsis/open-quant-live-book

An open source, hands-on and fully reproducible book in quantitative finance, data science and econophysics. Join us and help Make Wall Street Great Again!

algo-trading altdata data-science econophysics financial-analysis financial-markets machine-learning open-source quantitative-finance

Last synced: 05 Jan 2025

https://github.com/tellery/tellery

Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.

analytics bigquery business-intelligence collaboration dashboard data-analytics data-modeling data-science data-visualization database dbt notebook self-hosted sql

Last synced: 10 Feb 2025

https://github.com/AnotherSamWilson/miceforest

Multiple Imputation with LightGBM in Python

data-science imputed-values mice-algorithm python random-forest

Last synced: 22 Nov 2024

https://github.com/meteostat/meteostat-python

Access and analyze historical weather and climate data with Python.

climate climate-change climate-data data-science meteostat open-data statistics weather weather-data weather-station

Last synced: 27 Nov 2024

https://github.com/InseeFrLab/onyxia

🔬 Data science environment for k8s

bluehats data-science datalab helm insee kubernetes onyxia

Last synced: 27 Dec 2024

https://github.com/finlay-liu/kaggle_public

阿水的数据竞赛开源分支

data-science kaggle-competition

Last synced: 06 Feb 2025

https://github.com/astronomer/astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows

Last synced: 07 Feb 2025

https://github.com/KiranGershenfeld/VisualizingTwitchCommunities

Graphing communities on Twitch.tv in a visually intuitive way

community data-science python twitch visualization

Last synced: 25 Oct 2024

https://github.com/datmo/datmo

Open source production model management tool for data scientists

artificial-intelligence data-science deep-learning machine-learning reproducibility version-control

Last synced: 12 Nov 2024

https://github.com/datalayer/jupyter-ui

⚛️ React.js components 💯% compatible with 🪐 Jupyter - Storybook on https://jupyter-ui-storybook.datalayer.tech

data data-product data-science data-visualisation datalayer ipywidgets jupyter jupyterlab lumino notebook reactjs ui

Last synced: 07 Feb 2025

https://github.com/larswaechter/voici.js

A Node.js library for pretty printing your data on the terminal🎨

console data-science javascript shell terminal tty typescript

Last synced: 31 Oct 2024

https://github.com/yzhao062/data-mining-conferences

Ranking, acceptance rate, deadline, and publication tips

data-mining data-science research

Last synced: 07 Feb 2025

https://github.com/petrobras/3w

Promotes development of ML algorithms for early detection and classification of undesirable events in offshore oil wells.

anomaly-detection data-science machine-learning multivariate-time-series-analysis oil-well-monitoring

Last synced: 10 Feb 2025

https://github.com/anonyfox/elixir-scrape

Scrape any website, article or RSS/Atom Feed with ease!

data-science elixir feed html information-retrieval readability rss scrape scraping

Last synced: 05 Feb 2025

https://github.com/jovianhq/opendatasets

A Python library for downloading datasets from Kaggle, Google Drive, and other online sources.

data-science datasets machine-learning python

Last synced: 08 Feb 2025

https://github.com/Anonyfox/elixir-scrape

Scrape any website, article or RSS/Atom Feed with ease!

data-science elixir feed html information-retrieval readability rss scrape scraping

Last synced: 01 Nov 2024

https://github.com/machine-learning-apps/issue-label-bot

Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"

bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow

Last synced: 23 Jan 2025

https://github.com/machine-learning-apps/Issue-Label-Bot

Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"

bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow

Last synced: 25 Oct 2024

https://github.com/profjsb/python-seminar

Python for Data Science (Seminar Course at UC Berkeley; AY 250)

data-science distributed-computing machine-learning python visualization

Last synced: 27 Nov 2024

https://github.com/maxhumber/redframes

General Purpose Data Manipulation Library

data-science pandas python

Last synced: 06 Feb 2025

https://github.com/tommyod/efficient-apriori

An efficient Python implementation of the Apriori algorithm.

apriori-algorithm association-rules data-mining data-science machinelearning

Last synced: 07 Feb 2025