Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/briatte/dsr

Introduction to Data Science with R (Sciences Po, Paris, 2023)

course data-analysis data-science data-visualization r statistics

Last synced: 27 Oct 2024

https://github.com/elysian01/data-purifier

A Python library for automated exploratory data analysis, automated data cleaning, and automated data preprocessing for machine learning and natural language processing applications.

data-analysis data-cleaning data-cleaning-pipeline data-preprocessing data-science data-visualization datapurifier eda exploratory-data-analysis jupyter python-lib python-library python3

Last synced: 07 Nov 2024

https://github.com/rfordatascience/rfordatasciencewiki

Resources for the R4DS Online Learning Community, including answer keys to the text

beginner beginner-friendly beginner-tutorial-series data-science help-wanted r4ds rstats rstudio tidyverse

Last synced: 14 Nov 2024

https://github.com/ploomber/soopervisor

☁️ Export Ploomber pipelines to Kubernetes (Argo), Airflow, AWS Batch, SLURM, and Kubeflow.

airflow argo argo-workflows aws data-science kubeflow kubeflow-pipelines kubernetes machine-learning slurm workflow

Last synced: 19 Dec 2024

https://github.com/joaopaulolndev/my-data-scientist-roadmap

A description of my roadmap to becoming a data scientist and machine learning engineer

artificial-intelligence data-science deep-learning machine-learning python python3

Last synced: 09 Jan 2025

https://github.com/codait/max-central-repo

Central Repository of Model Asset Exchange project. This repository contains information about the available models, current project status, contribution guidelines and supporting assets.

cloud codait data-science deep-learning ibm-developer kubernetes model-asset-exchange node-red-flow openshift trainable-models watson-machine-learning watson-st

Last synced: 09 Nov 2024

https://github.com/ActuariesInstitute/cookbook

Data and analytics cookbook for actuaries

actuarial analytics data-science hacktoberfest

Last synced: 27 Nov 2024

https://github.com/mlr-org/mlr3torch

Deep learning framework for the mlr3 ecosystem based on torch

data-science deep-learning machine-learning mlr3 r r-package torch

Last synced: 14 Feb 2025

https://github.com/stefan-m-lenz/BoltzmannMachines.jl

A Julia package for training and evaluating multimodal deep Boltzmann machines

data-science deep-boltzmann-machine deep-learning julia machine-learning neural-networks restricted-boltzmann-machine

Last synced: 13 Nov 2024

https://github.com/goldencheetah/scikit-sports

Sports analysis library for Python

data-science sports

Last synced: 08 Nov 2024

https://github.com/lukasmosser/snist

A Benchmark for Seismic Velocity Inversion from Synthetics

data-science deep-learning geology geophysics machine-learning physics seismic waveform

Last synced: 19 Dec 2024

https://github.com/joaquinamatrodrigo/cienciadedatos.net

Outreach website with training material on statistics, machine learning algorithms, data science, and programming in R and Python.

analytics ciencia-de-dados data-science estadistica forecasting machine-learning python r-programming rstats statistics

Last synced: 10 Jan 2025

https://github.com/nicolaskruchten/scipy2021

Data Visualization as the First and Last Mile of Data Science: Plotly Express and Dash

data-analysis data-science data-visualization python visualization

Last synced: 08 Nov 2024

https://github.com/franzdiebold/data-science-cheat-sheets

A collection of Data Science cheat sheets.

cheat-sheet cheat-sheets data-science pandas

Last synced: 15 Feb 2025

https://github.com/facultyai/scala-plotly-client

Visualise your data from Scala using Plotly

data-science graph plot plotly scala visualisation

Last synced: 08 Nov 2024

https://github.com/SOCR/SOCRAT

A Dynamic Web Toolbox for Interactive Data Processing, Analysis, and Visualization

data-analysis data-science data-visualization socr statistics visual-analytics visualization

Last synced: 03 Nov 2024

https://github.com/AidanCooper/shap-analysis-guide

How to Interpret SHAP Analyses: A Non-Technical Guide

data-science machine-learning shap tutorial

Last synced: 12 Nov 2024

https://github.com/repetere/modelscript

REPO MOVED TO https://github.com/repetere/jsonstack-data - Data Science and Machine learning in JavaScript

data-mining data-preprocessing data-science javascript machine-learning

Last synced: 21 Jan 2025

https://github.com/jules32/rmarkdown-website-tutorial

Tutorial for creating websites with R Markdown

data-science rmarkdown rstats teaching tutorial

Last synced: 16 Feb 2025

https://github.com/edinsonrequena/articicial-inteligence-and-data-science

This repository is based mainly on Platzi's machine learning and data science track, but it will also include resources from other platforms and educational institutions.

algebra algorithms articicial-inteligence data-science instituciones-educativas jupyter-notebook platzi python university

Last synced: 27 Oct 2024

https://github.com/m-dadej/marswitching.jl

MarSwitching.jl: Julia package for Markov switching dynamic models 📈

data-science econometrics julia machine-learning markov-chain statistics time-series

Last synced: 17 Feb 2025

https://github.com/dfinke/psduckdb

PSDuckDB is a PowerShell module that provides seamless integration with DuckDB, enabling efficient execution of analytical SQL queries directly from the PowerShell environment.

data-analysis data-science duckdb powershell sql

Last synced: 27 Oct 2024

https://github.com/pjaselin/cubist

A Python package for fitting Quinlan's Cubist regression model

data-science machine-learning python regression scikit-learn

Last synced: 14 Nov 2024

https://github.com/tatevkaren/free-resources-books-papers

Books and papers in mathematics, econometrics, machine learning, finance, etc., at different levels, useful for data scientists, developers, and everyone who is interested in STEM.

books data-science databricks delta-lake developers econometrics free-books free-resources machine-learning mathematics statistics

Last synced: 02 Feb 2025

https://github.com/florents-tselai/greek-wines-analysis

Scraper, Data and Analysis for "Analyzing 1000+ Greek Wines with Python"

beautifulsoup data-science pandas python seaborn web-scraping

Last synced: 31 Oct 2024

https://github.com/leemengtw/gist-evernote

A Python application that syncs GitHub Gists and saves them to an Evernote notebook as screenshots.

data-science evernote gists github github-graphql jupyter-notebook pet-project python selenium sync

Last synced: 22 Nov 2024

https://github.com/apreshill/data-vis-labs-2018

Principles & Practice of Data Visualization, CS631 Spring 2018

data-science data-visualization education rstats teaching

Last synced: 16 Jan 2025

https://github.com/njanakiev/openstreetmap-data-science

Data Science with OpenStreetMap

data-science openstreetmap python

Last synced: 06 Nov 2024

https://github.com/tejzpr/ordered-concurrently

Ordered-concurrently is a Go library for concurrent processing with ordered output. It processes work concurrently and returns the output on a channel in the order of the input. It is useful for processing items in a queue concurrently while getting the output in the order provided by the queue.

concurrent concurrent-data-structure data-pipeline data-science golang golang-library ordered parallel parallel-computing

Last synced: 26 Oct 2024

https://github.com/dMLTquant/openbb_sdk_exporation

Explore OpenBB SDK without having to install anything on your local machine. You just need a GitHub and a GitPod account.

algorithmic-trading data-science financial-data jupyter notebook openbb python

Last synced: 01 Nov 2024

https://github.com/jmari/ipharo

Pharo Smalltalk kernel for Jupyter

data-science jupyter-notebook pharo pharo-smalltalk smalltalk

Last synced: 08 Feb 2025

https://github.com/rrrlw/tdastats

R pipeline for computing persistent homology in topological data analysis. See https://doi.org/10.21105/joss.00860 for more details.

cran data-science ggplot2 homology homology-calculations homology-computation joss persistent-homology pipeline r r-package r-packages ripser tda topological-data-analysis topology topology-visualization visualization

Last synced: 11 Jan 2025

https://github.com/ipeirotis/introduction-to-python

Notes for the "Introduction to Programming for Data Science" class

data-science for-beginners python python3

Last synced: 14 Feb 2025

https://github.com/darribas/gds17

Geographic Data Science'17

data-science gis pysal python

Last synced: 28 Oct 2024

https://github.com/tirendazacademy/chatgpt-with-examples

This repo contains ChatGPT tutorials about data science, machine learning, deep learning, and Python. We show how to use ChatGPT with examples.

chat-gpt chatgpt chatgpt-api chatgpt-python chatgpt3 data-science deep-learning machine-learning

Last synced: 08 Nov 2024

https://github.com/tdeboissiere/cookiecutter-deeplearning

Project folder structure for doing and sharing deep learning work.

data-science project-template

Last synced: 05 Nov 2024

https://github.com/jhwohlgemuth/pwsh-prelude

PowerShell “standard” library for supercharging your productivity. Provides a powerful cross-platform scripting environment enabling efficient analysis and sustainable science in myriad contexts.

applied-mathematics cli cli-app data-science hacktoberfest library mathematics powershell powershell-module statistics text-processing text-to-speech user-interface

Last synced: 27 Oct 2024

https://github.com/leriomaggio/python-data-science

Lecture notes and materials for Python Data Science course

data-science jupyter-notebooks machine-learning materials python-tutorials

Last synced: 29 Oct 2024

https://github.com/ammsa/dtcleaner

DTCleaner: data cleaning using multi-target decision trees.

data-cleaning data-mining data-preprocessing data-quality data-science data-wrangling

Last synced: 28 Oct 2024

https://github.com/jmari/iPharo

Pharo Smalltalk kernel for Jupyter

data-science jupyter-notebook pharo pharo-smalltalk smalltalk

Last synced: 17 Nov 2024

https://github.com/hunar4321/reweight-gpt

Reweight GPT - a simple neural network using transformer architecture for next character prediction

algorithms data-science gpt language-model machine-learning nerual-networks numpy pytorch

Last synced: 14 Nov 2024

https://github.com/tstreamdoth/instacart-market-basket-analysis

Use Instacart public dataset to report which products are often shopped together. 🍋🍉🥑🥦

data-analysis data-science instacart market-basket-analysis

Last synced: 28 Oct 2024

https://github.com/root-11/tablite

Multiprocessing-enabled, out-of-memory data analysis library for tabular data.

data-analysis data-science datatype disk etl excel filereader pandas pivot-tables python table tabular-data

Last synced: 13 Feb 2025

https://github.com/nhsdigital/data-analytics-services

This repo collects the open-source work of the Analytics Service within NHS Digital Data Services

data-science health healthcare nhs nhs-digital nhs-digital-publication pyspark python python3 r rap reproducible-analytical-pipeline sql

Last synced: 23 Dec 2024

https://github.com/aachartmodel/aachartkit-swift-pro

📈📊👑👑👑 AAChartKit-Swift-Pro is the professional version of AAChartKit-Swift, an elegant and friendly chart framework for iOS, iPadOS, and macOS. It is a more powerful data visualization framework that supports more chart types, such as bellcurve, bullet, columnpyramid, cylinder, dependencywheel, heatmap, histogram, networkgraph, organization, packedbubble, pareto, sankey, series, solidgauge, streamgraph, sunburst, tilemap, timeline, treemap, variablepie, variwide, vector, venn, windbarb, wordcloud, and xrange charts.

aacharts chart charting-library data-science data-visualization framework highcharts hybrid ios ipados macos plot swift webview

Last synced: 07 Nov 2024

https://github.com/google-marketing-solutions/feedx

Transparent, robust and trustworthy A/B experimentation for Shopping feeds.

ab-testing data-science experimentation python shopping

Last synced: 05 Dec 2024

https://github.com/megagonlabs/ruler

Data Programming by Demonstration (DPBD) for Document Classification

data-labeling data-programming data-science machine-learning training-data weak-supervision

Last synced: 10 Nov 2024

https://github.com/rafzamb/sknifedatar

sknifedatar is a package that serves primarily as an extension to the modeltime 📦 ecosystem. It also provides some functionality for spatial data and visualization.

data data-analysis data-science data-visualization forecasting r statistics time-series

Last synced: 22 Nov 2024