Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/CJWorkbench/cjworkbench

The data journalism platform with built-in training

data-analysis data-journalism data-science data-visualization journalism notebook

Last synced: 24 Mar 2024

https://github.com/astronomer/astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows

Last synced: 24 Mar 2024

https://jared-fowler.github.io/prettyglm/

prettyglm provides a set of functions that easily create beautiful coefficient summaries that can readily be shared and explained.

classification classification-model data-science data-visualization glm linear-models r r-package regression regression-analysis regression-model regression-models rstats rstats-package statistical-models

Last synced: 23 Mar 2024

https://github.com/shaildeliwala/delbot

It understands your voice commands, searches news and knowledge sources, and summarizes and reads out content to you.

ai bot bots chatbot data-science flask natural-language-processing python

Last synced: 23 Mar 2024

https://github.com/activeloopai/deeplake

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

ai computer-vision cv data-science data-version-control datalake datasets deep-learning image-processing langchain large-language-models llm machine-learning ml mlops python pytorch tensorflow vector-database vector-search

Last synced: 23 Mar 2024

https://github.com/probcomp/bayeslite

BayesDB on SQLite. A Bayesian database table for querying the probable implications of data as easily as SQL databases query the data itself.

automatic-data-modeling data-science databases machine-learning probabilistic-programming

Last synced: 23 Mar 2024

https://github.com/medtagger/MedTagger

A collaborative framework for annotating medical datasets using crowdsourcing.

crowdsourcing data-science data-validation deep-learning labeling medical-imaging

Last synced: 23 Mar 2024

https://github.com/Minyus/pipelinex

PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more

data-engineering data-science deep-learning experimentation machine-learning pipeline

Last synced: 23 Mar 2024

https://github.com/sematic-ai/sematic

An open-source ML pipeline development platform

ai data-science machine-learning ml ml-ops ml-pipeline ml-pipelines mlops pipeline python python3

Last synced: 23 Mar 2024

https://github.com/operatorai/modelstore

🏬 modelstore is a Python library that allows you to version, export, and save a machine learning model to your filesystem or a cloud storage provider.

data-science keras machine-learning mlops modelstore python-library pytorch s3-storage scikit-learn tensorflow transformer

Last synced: 23 Mar 2024

https://github.com/zama-ai/concrete-ml

Concrete ML: Privacy Preserving ML framework built on top of Concrete, with bindings to traditional ML frameworks.

data-science fhe homomorphic-encryption machine-learning ppml privacy python scikit-learn tfhe torch

Last synced: 23 Mar 2024

https://github.com/larswaechter/voici.js

A Node.js library for pretty-printing your data on the terminal 🎨

console data-science javascript shell terminal tty typescript

Last synced: 22 Mar 2024

https://rivasiker.github.io/ggHoriPlot/

A user-friendly, highly customizable R package for building horizon plots in ggplot2

data-science data-visualization ggplot2 horizon-plots r r-package

Last synced: 22 Mar 2024

https://hazyresearch.github.io/snorkel

A system for quickly generating training data with weak supervision

ai data-augmentation data-science data-slicing labeling machine-learning python snorkel training-data weak-supervision

Last synced: 22 Mar 2024

https://github.com/uc-r/Advanced-R

Advanced Analytics with R training material delivered in a 2-day format

data-science educational-materials r training-materials workshop-materials

Last synced: 21 Mar 2024

https://github.com/vkoul/Econ-Data-Science

Articles, journals, and videos related to Economics 📈 and Data Science 📊

casual-inference data-science econometrics economics economist machine-learning social-sciences

Last synced: 21 Mar 2024

https://github.com/wesslen/iviz-rstudio-workshop

Interactive Visualizations with RStudio Workshop for UNCC DSI

data-science htmlwidgets interactive-visualizations rstudio shiny shinyapps tidyverse

Last synced: 21 Mar 2024

https://github.com/vi3k6i5/GuidedLDA

Semi-supervised guided topic model with custom guidedLDA

data-science guided-topic-modeling guidedlda machine-learning seededlda topic-modeling

Last synced: 21 Mar 2024

https://github.com/katrienantonio/workshop-loss-reserv-fraud

Course material for a workshop on loss modelling, reserving and insurance fraud analytics

actuarial-science data-science insurance-claims

Last synced: 21 Mar 2024

https://github.com/sdcastillo/PA-R-Study-Manual

An online study guide for the SOA's predictive analytics exam.

data-science data-visualization machine-learning predictive-modeling r-programming

Last synced: 21 Mar 2024

https://github.com/rstudio/rviews-community

RViews Community Site for Authors and Editors

blog community data-science open-source r r-programming

Last synced: 21 Mar 2024

https://github.com/datmo/datmo

Open source production model management tool for data scientists

artificial-intelligence data-science deep-learning machine-learning reproducibility version-control

Last synced: 21 Mar 2024

https://github.com/FilippoBovo/production-data-science

Production Data Science: a workflow for collaborative data science aimed at production

collaborative data-science production workflow

Last synced: 21 Mar 2024

https://github.com/shervinea/mit-15-003-data-science-tools

Study guides for MIT's 15.003 Data Science Tools

bash data-science git manipulation r retrieval sql study-guide visualization

Last synced: 21 Mar 2024

https://github.com/SforAiDl/genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL

algorithm-implementations benchmarking data-science deep-learning gym hacktoberfest machine-learning neural-network openai python pytorch reinforcement-learning reinforcement-learning-algorithms

Last synced: 21 Mar 2024

https://github.com/neurodata/hyppo

Python package for multivariate hypothesis testing

data-science hacktoberfest hypothesis-testing independence ksample-testing python

Last synced: 21 Mar 2024

https://github.com/zama-ai/concrete-numpy

Concrete-Numpy: A library to turn programs into their homomorphic equivalent.

data-science fhe homomorphic-encryption numpy privacy python tfhe

Last synced: 21 Mar 2024

https://github.com/LaihoE/did-it-spill

Check if you have training samples in your test set

computer-vision data-science deep-learning pytorch semantic-similarity time-series

Last synced: 21 Mar 2024

https://github.com/autonlab/auton-survival

Auton Survival - an open source package for Regression, Counterfactual Estimation, Evaluation and Phenotyping with Censored Time-to-Events

causal-inference counterfactual-inference data-science deep-learning graphical-models machine-learning python regression reliability-analysis survival-analysis time-to-event

Last synced: 21 Mar 2024

https://github.com/mad-lab-fau/tpcp

Pipeline and Dataset helpers for complex algorithm evaluation.

algorithms biosignals data-management data-science machine-learning python

Last synced: 21 Mar 2024

https://github.com/mvlearn/mvlearn

Python package for multi-view machine learning

data-science machine-learning multiview-learning python

Last synced: 21 Mar 2024

https://github.com/dataquestio/project-walkthroughs

Data science, machine learning, and web development project code for https://www.youtube.com/c/Dataquestio.

data-science machine-learning pandas python

Last synced: 19 Mar 2024

https://github.com/shahinrostami/chord

Engaging visualisations, made easy.

data-science data-visualization plotting python visualization

Last synced: 19 Mar 2024

https://github.com/ploomber/sklearn-evaluation

Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.

data-science deep-learning jupyter-notebook machine-learning pytorch scikit-learn sklearn tensorflow

Last synced: 18 Mar 2024

https://github.com/jamesqo/gun-violence-data

A comprehensive, accessible database that contains records of over 260k US gun violence incidents from January 2013 to March 2018.

data-science gun-violence-archive machine-learning statistics

Last synced: 18 Mar 2024

https://github.com/Ibotta/sk-dist

Distributed scikit-learn meta-estimators in PySpark

data-science machine-learning ml scikit-learn spark

Last synced: 18 Mar 2024

https://github.com/prathimacode-hub/Awesome_Python_Scripts

🚀 Curated collection of Awesome Python Scripts which will make you go wow. Dive into this world of 360+ scripts. Feel free to contribute. Show your support by starring ✨ this repository.

algorithms algorithms-datastructures beginner-friendly contributions contributions-welcome data-science data-structures education hacktoberfest hacktoberfest2022 learn open-source practice project python python-script python-scripts python3 search

Last synced: 18 Mar 2024

https://github.com/plantinformatics/pretzel

JavaScript full-stack framework for Big Data visualisation and analysis

big-data bioinformatics data-science data-visualization ember emberjs express expressjs javascript open-source

Last synced: 18 Mar 2024

https://github.com/nicolaskruchten/jupyter_pivottablejs

Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js

data-analysis data-science interactive jupyter-notebook pivot-chart pivot-tables

Last synced: 18 Mar 2024

https://github.com/ml-tooling/ml-hub

🧰 Multi-user development platform for machine learning teams. Simple to set up within minutes.

data-science docker jupyter jupyterhub machine-learning python

Last synced: 18 Mar 2024

https://github.com/Kotlin/kandy

Kotlin plotting library.

data-science graphics jupyter-notebooks kotlin plot

Last synced: 18 Mar 2024

https://github.com/LearnDataSci/articles

A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci

data-analysis data-science data-visualization machine-learning machine-learning-algorithms machinelearning python

Last synced: 18 Mar 2024

https://github.com/coalio/Assistant

A data science library providing flexible dataframes for Lua 5.1+

data-analysis data-science data-structures dataframe lua

Last synced: 18 Mar 2024

https://github.com/Kotlin/dataframe

Structured data processing in Kotlin

data-analysis data-science dataframe kotlin

Last synced: 18 Mar 2024

https://github.com/maxhumber/redframes

General Purpose Data Manipulation Library

data-science pandas python

Last synced: 18 Mar 2024

https://github.com/tidypyverse/tidypandas

A grammar of data manipulation for pandas inspired by tidyverse

data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse

Last synced: 18 Mar 2024

https://github.com/HanXinzi-AI/awesome-python-machine-learning-resources

A collection of awesome machine learning and deep learning Python libraries and tools.

auto-ml awesome awesome-list cv data-analysis data-mining data-science data-visualization deep-learning fintech machine-learning machine-learning-algorithms nlp pytorch recommender-system sklearn tensorflow text-mining time-series

Last synced: 18 Mar 2024

https://github.com/asavinov/prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

business-intelligence data-preparation data-preprocessing data-processing data-science data-wrangling feature-engineering map-reduce olap pandas python spark workflow

Last synced: 18 Mar 2024

https://github.com/metarank/metarank

A low-code machine learning personalized ranking service for articles, listings, search results, and recommendations that boosts user engagement. A friendly learn-to-rank engine.

automl data-engineering data-science deep-learning feature-engineering feature-extraction kubernetes machine-learning neural-networks personalization ranking scala search

Last synced: 17 Mar 2024

https://github.com/machine-learning-apps/Issue-Label-Bot

Code for the Issue Label Bot, an app that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also the code for the blog article "How to automate tasks on GitHub with machine learning for fun and profit".

bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow

Last synced: 17 Mar 2024

https://github.com/pykale/pykale

Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!

computer-vision data-science deep-learning domain-adaptation graph-analysis knowledge-aware-learning machine-learning medical-image-analysis meta-learning multimodal multimodal-learning python pytorch transfer-learning

Last synced: 17 Mar 2024

https://github.com/scottshambaugh/monaco

Quantify uncertainty and sensitivities in your computer models with an industry-grade Monte Carlo library.

data-science monaco monte-carlo python scientific-computing sensitivity-analysis simulation statistics uncertainty-analysis uncertainty-quantification

Last synced: 17 Mar 2024

https://github.com/alibaba/data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷

chinese data-analysis data-science data-visualization dataset gpt gpt-4 instruction-tuning large-language-models llama llm llms machine-learning multi-modal nlp opendata pre-training pytorch streamlit

Last synced: 17 Mar 2024

https://github.com/ZackAkil/friendlier-data-labelling

Code resources for generating a Google Form for labelling data.

data-science google google-apps-script google-forms google-sheets machine-learning

Last synced: 17 Mar 2024

https://github.com/plotly/dash-table

OBSOLETE: now part of https://github.com/plotly/dash

dash data-science data-visualization plotly plotly-dash python react table

Last synced: 16 Mar 2024

https://github.com/piquette/qtrn

A CLI tool to streamline financial markets data analysis 🔧

cli data data-science finance go golang options quotes scraper stock stock-analysis stock-market

Last synced: 16 Mar 2024

https://matheusfacure.github.io/python-causality-handbook/

Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.

causal-inference causality data-science econometrics harmless-econometrics impact-estimation python

Last synced: 16 Mar 2024

https://github.com/matheusfacure/python-causality-handbook

Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.

causal-inference causality data-science econometrics harmless-econometrics impact-estimation python

Last synced: 16 Mar 2024

https://github.com/somdeep/Statball

Statball - Football (soccer) stats analyser for the top 5 European leagues, with data obtained by web scraping from FBref and StatsBomb

csharp data-science data-scraping data-viz dotnet dotnet-core fbref football football-analytics football-data scouting-data scraping soccer soccer-analytics soccer-data statsbomb tableau visualizations

Last synced: 16 Mar 2024