Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/CJWorkbench/cjworkbench
The data journalism platform with built in training
data-analysis data-journalism data-science data-visualization journalism notebook
Last synced: 24 Mar 2024
![](https://github.com/CJWorkbench.png)
https://github.com/astronomer/astro-sdk
Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows
Last synced: 24 Mar 2024
![](https://github.com/astronomer.png)
https://github.com/tirthajyoti/Machine-Learning-with-Python
Practice and tutorial-style notebooks covering a wide variety of machine learning techniques
artificial-intelligence classification clustering data-science decision-trees deep-learning dimensionality-reduction flask k-nearest-neighbours machine-learning matplotlib naive-bayes neural-network numpy pandas pytest random-forest regression scikit-learn statistics
Last synced: 24 Mar 2024
![](https://github.com/tirthajyoti.png)
https://github.com/the-black-knight-01/Data-Science-Competitions
The goal of this repo is to provide solutions to all data science competitions (Kaggle, Data Hack, Machine Hack, Driven Data, etc.).
analytics-vidhya competition-code competitive-data-science-github data-science data-science-competition data-science-competitions datahack-competition kaggle kaggle-competition kaggle-competition-for-beginners kaggle-competition-solutions kaggle-solutions-github kaggle-winning-solutions-github machine-learning machinehack-competition xgboost
Last synced: 24 Mar 2024
![](https://github.com/the-black-knight-01.png)
https://github.com/Daniel-Mietchen/datascience
Keeping track of activities around research data
data-science data-sharing open-data open-science research research-data research-data-management research-funding science-policy
Last synced: 23 Mar 2024
![](https://github.com/Daniel-Mietchen.png)
https://jared-fowler.github.io/prettyglm/
prettyglm provides a set of functions which can easily create beautiful coefficient summaries which can readily be shared and explained.
classification classification-model data-science data-visualization glm linear-models r r-package regression regression-analysis regression-model regression-models rstats rstats-package statistical-models
Last synced: 23 Mar 2024
![](https://github.com/jared-fowler.png)
https://github.com/wandb/wandb
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
collaboration data-science data-versioning deep-learning experiment-track hyperparameter-optimization hyperparameter-search hyperparameter-tuning jax keras machine-learning ml-platform mlops model-versioning pytorch reinforcement-learning reproducibility tensorflow
Last synced: 23 Mar 2024
![](https://github.com/wandb.png)
https://github.com/frjnn/bhtsne
Parallel Barnes-Hut t-SNE implementation written in Rust.
barnes-hut bhtsne data-science data-visualization dimensionality-reduction machine-learning rust similarity-measures
Last synced: 23 Mar 2024
![](https://github.com/frjnn.png)
https://github.com/shaildeliwala/delbot
It understands your voice commands, searches news and knowledge sources, and summarizes and reads out content to you.
ai bot bots chatbot data-science flask natural-language-processing python
Last synced: 23 Mar 2024
![](https://github.com/shaildeliwala.png)
https://github.com/pixiedust/pixiedust
Python Helper library for Jupyter Notebooks
data-science jupyter-notebook pixiedust python python-notebook scala-notebooks spark visualization
Last synced: 23 Mar 2024
![](https://github.com/pixiedust.png)
https://github.com/activeloopai/deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
ai computer-vision cv data-science data-version-control datalake datasets deep-learning image-processing langchain large-language-models llm machine-learning ml mlops python pytorch tensorflow vector-database vector-search
Last synced: 23 Mar 2024
![](https://github.com/activeloopai.png)
https://github.com/bytewax/bytewax
Python Stream Processing
data-engineering data-processing data-science dataflow machine-learning python rust stream-processing streaming-data
Last synced: 23 Mar 2024
![](https://github.com/bytewax.png)
https://github.com/pretzelai/pretzelai
Open-source, browser-local data exploration using DuckDB-Wasm and PRQL
analytics artificial-intelligence business-intelligence businessintelligence dashboard data data-analysis data-analytics data-science data-visualization duckdb notebooks open-source prql reporting sql sql-editor sql-editor-online visualization wasm
Last synced: 23 Mar 2024
![](https://github.com/pretzelai.png)
https://github.com/probcomp/bayeslite
BayesDB on SQLite. A Bayesian database table for querying the probable implications of data as easily as SQL databases query the data itself.
automatic-data-modeling data-science databases machine-learning probabilistic-programming
Last synced: 23 Mar 2024
![](https://github.com/probcomp.png)
https://github.com/code-kern-ai/refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
active-learning annotations artificial-intelligence data-centric-ai data-labeling data-science deep-learning human-in-the-loop labeling labeling-tool machine-learning natural-language-processing neural-search nlp python spacy supervised-learning text-annotation text-classification transformers
Last synced: 23 Mar 2024
![](https://github.com/code-kern-ai.png)
https://github.com/medtagger/MedTagger
A collaborative framework for annotating medical datasets using crowdsourcing.
crowdsourcing data-science data-validation deep-learning labeling medical-imaging
Last synced: 23 Mar 2024
![](https://github.com/medtagger.png)
https://github.com/SETL-Framework/setl
A simple Spark-powered ETL framework that just works 🍺
big-data data-analysis data-engineering data-science data-transformation dataset etl etl-pipeline framework machine-learning modularization pipeline scala setl spark
Last synced: 23 Mar 2024
![](https://github.com/SETL-Framework.png)
https://github.com/Minyus/pipelinex
PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
data-engineering data-science deep-learning experimentation machine-learning pipeline
Last synced: 23 Mar 2024
![](https://github.com/Minyus.png)
https://github.com/airbnb/artificial-adversary
🗣️ Tool to generate adversarial text examples and test machine learning models against them
adversarial-examples black-box-attacks black-box-benchmarking classification data-mining data-science machine-learning metrics python python2 python3 spam spam-classification spam-detection spam-filtering text text-analysis text-classification text-mining text-processing
Last synced: 23 Mar 2024
![](https://github.com/airbnb.png)
https://github.com/sematic-ai/sematic
An open-source ML pipeline development platform
ai data-science machine-learning ml ml-ops ml-pipeline ml-pipelines mlops pipeline python python3
Last synced: 23 Mar 2024
![](https://github.com/sematic-ai.png)
https://github.com/operatorai/modelstore
🏬 modelstore is a Python library that allows you to version, export, and save a machine learning model to your filesystem or a cloud storage provider.
data-science keras machine-learning mlops modelstore python-library pytorch s3-storage scikit-learn tensorflow transformer
Last synced: 23 Mar 2024
![](https://github.com/operatorai.png)
https://github.com/aimhubio/aim
Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
ai data-science data-visualization experiment-tracking machine-learning metadata metadata-tracking ml mlflow mlops prompt-engineering python pytorch tensorboard tensorflow visualization
Last synced: 23 Mar 2024
![](https://github.com/aimhubio.png)
https://github.com/zama-ai/concrete-ml
Concrete ML: Privacy Preserving ML framework built on top of Concrete, with bindings to traditional ML frameworks.
data-science fhe homomorphic-encryption machine-learning ppml privacy python scikit-learn tfhe torch
Last synced: 23 Mar 2024
![](https://github.com/zama-ai.png)
https://github.com/ResponsiblyAI/responsibly
Toolkit for Auditing and Mitigating Bias and Fairness of Machine Learning Systems 🔎🤖🧰
artificial-intelligence audit bias bias-correction bias-finder bias-reduction data-science ethics fairness fairness-ai fairness-awareness-model fairness-ml fairness-testing machine-bias machine-learning natural-language-processing python
Last synced: 23 Mar 2024
![](https://github.com/ResponsiblyAI.png)
https://github.com/larswaechter/voici.js
A Node.js library for pretty printing your data on the terminal🎨
console data-science javascript shell terminal tty typescript
Last synced: 22 Mar 2024
![](https://github.com/larswaechter.png)
https://rivasiker.github.io/ggHoriPlot/
A user-friendly, highly customizable R package for building horizon plots in ggplot2
data-science data-visualization ggplot2 horizon-plots r r-package
Last synced: 22 Mar 2024
![](https://github.com/rivasiker.png)
https://github.com/benedekrozemberczki/NestedSubtreeHash
A distributed implementation of "Nested Subtree Hash Kernels for Large-Scale Graph Classification Over Streams" (ICDM 2012).
data-mining data-science deepwalk distributed-machine-learning feature-extraction gensim graph-classification graph-kernel graph-mining hashing large-scale-learning machine-learning multi-scale node2vec representation-learning streaming-data streaming-processing word2vec
Last synced: 22 Mar 2024
![](https://github.com/benedekrozemberczki.png)
https://hazyresearch.github.io/snorkel
A system for quickly generating training data with weak supervision
ai data-augmentation data-science data-slicing labeling machine-learning python snorkel training-data weak-supervision
Last synced: 22 Mar 2024
![](https://github.com/snorkel-team.png)
https://github.com/uc-r/Advanced-R
Advanced Analytics with R training material delivered in a 2-day format
data-science educational-materials r training-materials workshop-materials
Last synced: 21 Mar 2024
![](https://github.com/uc-r.png)
https://github.com/vkoul/Econ-Data-Science
Articles, journals, and videos related to Economics :chart_with_upwards_trend: and Data Science :bar_chart:
casual-inference data-science econometrics economics economist machine-learning social-sciences
Last synced: 21 Mar 2024
![](https://github.com/vkoul.png)
https://github.com/wesslen/iviz-rstudio-workshop
Interactive Visualizations with RStudio Workshop for UNCC DSI
data-science htmlwidgets interactive-visualizations rstudio shiny shinyapps tidyverse
Last synced: 21 Mar 2024
![](https://github.com/wesslen.png)
https://github.com/TheScientistBr/DataScienceTraining
Data Science training
data-science r-programming treinamento
Last synced: 21 Mar 2024
![](https://github.com/TheScientistBr.png)
https://github.com/vi3k6i5/GuidedLDA
Semi-supervised guided topic model with custom guidedLDA
data-science guided-topic-modeling guidedlda machine-learning seededlda topic-modeling
Last synced: 21 Mar 2024
![](https://github.com/vi3k6i5.png)
https://github.com/katrienantonio/workshop-loss-reserv-fraud
Course material for a workshop on loss modelling, reserving and insurance fraud analytics
actuarial-science data-science insurance-claims
Last synced: 21 Mar 2024
![](https://github.com/katrienantonio.png)
https://github.com/sdcastillo/PA-R-Study-Manual
An online study guide for the SOA's predictive analytics exam.
data-science data-visualization machine-learning predictive-modeling r-programming
Last synced: 21 Mar 2024
![](https://github.com/sdcastillo.png)
https://github.com/rstudio/rviews-community
RViews Community Site for Authors and Editors
blog community data-science open-source r r-programming
Last synced: 21 Mar 2024
![](https://github.com/rstudio.png)
https://github.com/datmo/datmo
Open source production model management tool for data scientists
artificial-intelligence data-science deep-learning machine-learning reproducibility version-control
Last synced: 21 Mar 2024
![](https://github.com/datmo.png)
https://github.com/FilippoBovo/production-data-science
Production Data Science: a workflow for collaborative data science aimed at production
collaborative data-science production workflow
Last synced: 21 Mar 2024
![](https://github.com/FilippoBovo.png)
https://github.com/shervinea/mit-15-003-data-science-tools
Study guides for MIT's 15.003 Data Science Tools
bash data-science git manipulation r retrieval sql study-guide visualization
Last synced: 21 Mar 2024
![](https://github.com/shervinea.png)
https://github.com/SforAiDl/genrl
A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL
algorithm-implementations benchmarking data-science deep-learning gym hacktoberfest machine-learning neural-network openai python pytorch reinforcement-learning reinforcement-learning-algorithms
Last synced: 21 Mar 2024
![](https://github.com/SforAiDl.png)
https://github.com/safreita1/TIGER
Python toolbox to evaluate graph vulnerability and robustness (CIKM 2021)
adversarial-attacks attack cascading-failures data-mining data-science defense diffusion epidemics graph graph-attack graph-mining machine-learning netshield network-attack networks robustness simulation vulnerability
Last synced: 21 Mar 2024
![](https://github.com/safreita1.png)
https://github.com/davisidarta/topometry
Systematically learn and evaluate manifolds from high-dimensional data
clustering data-science data-visualization dimensionality-reduction graph graph-layout hypothesis-generation laplace-beltrami machine-learning manifold-learning scikit-learn single-cell visualization
Last synced: 21 Mar 2024
![](https://github.com/davisidarta.png)
https://github.com/neurodata/hyppo
Python package for multivariate hypothesis testing
data-science hacktoberfest hypothesis-testing independence ksample-testing python
Last synced: 21 Mar 2024
![](https://github.com/neurodata.png)
https://github.com/zama-ai/concrete-numpy
Concrete-Numpy: A library to turn programs into their homomorphic equivalent.
data-science fhe homomorphic-encryption numpy privacy python tfhe
Last synced: 21 Mar 2024
![](https://github.com/zama-ai.png)
https://github.com/alexhallam/tablespoon
🥄✨Time-series Benchmark methods that are Simple and Probabilistic
data-science forecasting mean naive probabilistic probabilistic-programming probability python scipy seasonal-naive simple simple-models time-series uncertainty-quantification
Last synced: 21 Mar 2024
![](https://github.com/alexhallam.png)
https://github.com/LaihoE/did-it-spill
Check if you have training samples in your test set
computer-vision data-science deep-learning pytorch semantic-similarity time-series
Last synced: 21 Mar 2024
![](https://github.com/LaihoE.png)
https://github.com/autonlab/auton-survival
Auton Survival - an open source package for Regression, Counterfactual Estimation, Evaluation and Phenotyping with Censored Time-to-Events
causal-inference counterfactual-inference data-science deep-learning graphical-models machine-learning python regression reliability-analysis survival-analysis time-to-event
Last synced: 21 Mar 2024
![](https://github.com/autonlab.png)
https://github.com/saezlab/decoupler-py
Python package to perform enrichment analysis from omics data.
bioinformatics data-science enrichment enrichment-analysis numba python single-cell spatial-transcriptomics transcriptomics
Last synced: 21 Mar 2024
![](https://github.com/saezlab.png)
https://github.com/vanderschaarlab/hyperimpute
A framework for prototyping and benchmarking imputation methods
data-science imputation imputation-algorithm machine-learning machine-learning-prerequisites preprocessing-data python scikit-learn
Last synced: 21 Mar 2024
![](https://github.com/vanderschaarlab.png)
https://github.com/sissa-data-science/DADApy
Distance-based Analysis of DAta-manifolds in python
data-analysis data-science density-based-clustering density-estimation intrinsic-dimension machine-learning manifolds python
Last synced: 21 Mar 2024
![](https://github.com/sissa-data-science.png)
https://github.com/mad-lab-fau/tpcp
Pipeline and Dataset helpers for complex algorithm evaluation.
algorithms biosignals data-management data-science machine-learning python
Last synced: 21 Mar 2024
![](https://github.com/mad-lab-fau.png)
https://github.com/mvlearn/mvlearn
Python package for multi-view machine learning
data-science machine-learning multiview-learning python
Last synced: 21 Mar 2024
![](https://github.com/mvlearn.png)
https://github.com/NorskRegnesentral/skweak
skweak: A software toolkit for weak supervision applied to NLP tasks
data-science distant-supervision natural-language-processing nlp-library nlp-machine-learning python spacy training-data weak-supervision
Last synced: 21 Mar 2024
![](https://github.com/NorskRegnesentral.png)
https://github.com/tensorflow/probability
Probabilistic reasoning and statistical analysis in TensorFlow
bayesian-methods data-science deep-learning machine-learning neural-networks probabilistic-programming statistics tensorflow
Last synced: 20 Mar 2024
![](https://github.com/tensorflow.png)
https://github.com/dataquestio/project-walkthroughs
Data science, machine learning, and web development project code for https://www.youtube.com/c/Dataquestio .
data-science machine-learning pandas python
Last synced: 19 Mar 2024
![](https://github.com/dataquestio.png)
https://github.com/shahinrostami/chord
Engaging visualisations, made easy.
data-science data-visualization plotting python visualization
Last synced: 19 Mar 2024
![](https://github.com/shahinrostami.png)
https://github.com/pachyderm/pachyderm
Data-Centric Pipelines and Data Versioning
analytics big-data containers data-analysis data-science distributed-systems docker go kubernetes pachyderm
Last synced: 19 Mar 2024
![](https://github.com/pachyderm.png)
https://github.com/nelson-lang/nelson
Nelson numerical interpreter
cpp17 data-science data-structures interpreter mathematical-functions matlab matrix-functions nelson octave programming-language scientific-computing scilab
Last synced: 19 Mar 2024
![](https://github.com/nelson-lang.png)
https://github.com/ploomber/sklearn-evaluation
Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.
data-science deep-learning jupyter-notebook machine-learning pytorch scikit-learn sklearn tensorflow
Last synced: 18 Mar 2024
![](https://github.com/ploomber.png)
https://github.com/jamesqo/gun-violence-data
A comprehensive, accessible database that contains records of over 260k US gun violence incidents from January 2013 to March 2018.
data-science gun-violence-archive machine-learning statistics
Last synced: 18 Mar 2024
![](https://github.com/jamesqo.png)
https://github.com/Ibotta/sk-dist
Distributed scikit-learn meta-estimators in PySpark
data-science machine-learning ml scikit-learn spark
Last synced: 18 Mar 2024
![](https://github.com/Ibotta.png)
https://github.com/prathimacode-hub/Awesome_Python_Scripts
🚀 Curated collection of Awesome Python Scripts which will make you go wow. Dive into this world of 360+ scripts. Feel free to contribute. Show your support by ✨this repository.
algorithms algorithms-datastructures beginner-friendly contributions contributions-welcome data-science data-structures education hacktoberfest hacktoberfest2022 learn open-source practice project python python-script python-scripts python3 search
Last synced: 18 Mar 2024
![](https://github.com/prathimacode-hub.png)
https://github.com/nipy/nipype
Workflows and interfaces for neuroimaging packages
big-data brain-imaging brainweb data-science dataflow dataflow-programming neuroimaging python workflow-engine
Last synced: 18 Mar 2024
![](https://github.com/nipy.png)
https://github.com/deepgraph/deepgraph
Analyze Data with Pandas-based Networks. Documentation:
data-analysis data-mining data-science data-structures data-visualization graph-database graph-theory graphs graphviz interfacing iterative-methods multilayer-networks network network-analysis network-visualization networkx pandas parallel partitioning
Last synced: 18 Mar 2024
![](https://github.com/deepgraph.png)
https://github.com/jasmcaus/caer
High-performance Vision library in Python. Scale your research, not boilerplate.
ai artificial-intelligence augmentation caer computer-vision cuda data-science deep-learning gpu image-classification image-processing image-segmentation machine-learning neural-network opencv python segmentation type-checking video-processing vision
Last synced: 18 Mar 2024
![](https://github.com/jasmcaus.png)
https://github.com/krassowski/jupyter-helpers
A collection of helpers for Jupyter/IPython
data-science jupyter jupyter-lab jupyter-notebook jupyter-widget jupyterlab jupyterlab-extension
Last synced: 18 Mar 2024
![](https://github.com/krassowski.png)
https://github.com/mmkim1210/GeneticsMakie.jl
🧬High-performance genetics- and genomics-related data visualization using Makie.jl
bioinformatics cairomakie colocalization data-science fine-mapping genetics genomics gwas julia julia-language linkage locuszoom makie multi-ethnic multivariate openmendel phewas qtl v2f visualization
Last synced: 18 Mar 2024
![](https://github.com/mmkim1210.png)
https://github.com/plantinformatics/pretzel
Javascript full-stack framework for Big Data visualisation and analysis
big-data bioinformatics data-science data-visualization ember emberjs express expressjs javascript open-source
Last synced: 18 Mar 2024
![](https://github.com/plantinformatics.png)
https://github.com/nicolaskruchten/jupyter_pivottablejs
Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js
data-analysis data-science interactive jupyter-notebook pivot-chart pivot-tables
Last synced: 18 Mar 2024
![](https://github.com/nicolaskruchten.png)
https://github.com/ml-tooling/ml-hub
🧰 Multi-user development platform for machine learning teams. Simple to set up within minutes.
data-science docker jupyter jupyterhub machine-learning python
Last synced: 18 Mar 2024
![](https://github.com/ml-tooling.png)
https://github.com/Kotlin/kandy
Kotlin plotting library.
data-science graphics jupyter-notebooks kotlin plot
Last synced: 18 Mar 2024
![](https://github.com/Kotlin.png)
https://github.com/LearnDataSci/articles
A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci
data-analysis data-science data-visualization machine-learning machine-learning-algorithms machinelearning python
Last synced: 18 Mar 2024
![](https://github.com/LearnDataSci.png)
https://github.com/coalio/Assistant
A data science library providing flexible dataframes for Lua 5.1+
data-analysis data-science data-structures dataframe lua
Last synced: 18 Mar 2024
![](https://github.com/coalio.png)
https://github.com/Kotlin/dataframe
Structured data processing in Kotlin
data-analysis data-science dataframe kotlin
Last synced: 18 Mar 2024
![](https://github.com/Kotlin.png)
https://github.com/maxhumber/redframes
General Purpose Data Manipulation Library
Last synced: 18 Mar 2024
![](https://github.com/maxhumber.png)
https://github.com/tidypyverse/tidypandas
A grammar of data manipulation for pandas inspired by tidyverse
data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse
Last synced: 18 Mar 2024
![](https://github.com/tidypyverse.png)
https://github.com/HanXinzi-AI/awesome-python-machine-learning-resources
A collection of awesome, popular, and practical machine learning and deep learning Python libraries and tools.
auto-ml awesome awesome-list cv data-analysis data-mining data-science data-visualization deep-learning fintech machine-learning machine-learning-algorithms nlp pytorch recommender-system sklearn tensorflow text-mining time-series
Last synced: 18 Mar 2024
![](https://github.com/HanXinzi-AI.png)
https://github.com/asavinov/prosto
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
business-intelligence data-preparation data-preprocessing data-processing data-science data-wrangling feature-engineering map-reduce olap pandas python spark workflow
Last synced: 18 Mar 2024
![](https://github.com/asavinov.png)
https://github.com/metarank/metarank
A low-code machine learning personalized ranking service for articles, listings, search results, and recommendations that boosts user engagement. A friendly Learn-to-Rank engine.
automl data-engineering data-science deep-learning feature-engineering feature-extraction kubernetes machine-learning neural-networks personalization ranking scala search
Last synced: 17 Mar 2024
![](https://github.com/metarank.png)
https://github.com/machine-learning-apps/Issue-Label-Bot
Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"
bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow
Last synced: 17 Mar 2024
![](https://github.com/machine-learning-apps.png)
https://github.com/cjroth/chronist
Long-term analysis of emotion, age, and sentiment using Lifeslice and text records.
data data-analysis data-science data-visualization dataset dataviz emotion emotion-analytics es6 javascript matplotlib pandas photoanalysis python sentiment sentiment-analysis
Last synced: 17 Mar 2024
![](https://github.com/cjroth.png)
https://github.com/natnew/Awesome-Data-Science
Carefully curated list of awesome data science resources.
ai awesome awesome-list data data-science deep-learning explainable-ai interoperability large-scale-machine-learning machine-learning machine-learning-operations ml-operations responsible-ai
Last synced: 17 Mar 2024
![](https://github.com/natnew.png)
https://github.com/h2oai/nitro
Create apps 10x quicker, without Javascript/HTML/CSS.
app apps data-analysis data-science developer-tools devtools graphics h2o-nitro low-code python ui ui-components user-interface web-application webapp widget-library widgets
Last synced: 17 Mar 2024
![](https://github.com/h2oai.png)
https://github.com/jazzdotdev/jazz
The Scripting Engine that Combines Speed, Safety, and Simplicity
actix android chromeos crypto cryptography data-science database development-environment embeddable jazz jinja2 linux lua markdown rust scraping scripting web witness
Last synced: 17 Mar 2024
![](https://github.com/jazzdotdev.png)
https://github.com/nfstream/nfstream
NFStream: a Flexible Network Data Analysis Framework.
artificial-intelligence cybersecurity data-analysis data-mining data-science dataset-generation deep-packet-inspection machine-learning ndpi netflow network-analysis network-monitoring network-security packet-analyser packet-capture pcap python traffic-analysis traffic-classification
Last synced: 17 Mar 2024
![](https://github.com/nfstream.png)
https://github.com/pykale/pykale
Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!
computer-vision data-science deep-learning domain-adaptation graph-analysis knowledge-aware-learning machine-learning medical-image-analysis meta-learning multimodal multimodal-learning python pytorch transfer-learning
Last synced: 17 Mar 2024
![](https://github.com/pykale.png)
https://github.com/scottshambaugh/monaco
Quantify uncertainty and sensitivities in your computer models with an industry-grade Monte Carlo library.
data-science monaco monte-carlo python scientific-computing sensitivity-analysis simulation statistics uncertainty-analysis uncertainty-quantification
Last synced: 17 Mar 2024
![](https://github.com/scottshambaugh.png)
https://github.com/alibaba/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
chinese data-analysis data-science data-visualization dataset gpt gpt-4 instruction-tuning large-language-models llama llm llms machine-learning multi-modal nlp opendata pre-training pytorch streamlit
Last synced: 17 Mar 2024
![](https://github.com/alibaba.png)
https://github.com/kennethleungty/Failed-ML
Compilation of high-profile real-world examples of failed machine learning projects
ai artificial-intelligence classification computer-vision data-engineering data-quality data-science deep-learning failed-data-science failed-machine-learning failed-ml fml forecasting machine-learning ml natural-language-processing production recsys regression
Last synced: 17 Mar 2024
![](https://github.com/kennethleungty.png)
https://github.com/ZackAkil/friendlier-data-labelling
Code resources for generating a Google Form for labelling data.
data-science google google-apps-script google-forms google-sheets machine-learning
Last synced: 17 Mar 2024
![](https://github.com/ZackAkil.png)
https://github.com/plotly/dash-table
OBSOLETE: now part of https://github.com/plotly/dash
dash data-science data-visualization plotly plotly-dash python react table
Last synced: 16 Mar 2024
![](https://github.com/plotly.png)
https://github.com/blockchain-etl/awesome-bigquery-views
Useful SQL queries for Blockchain ETL datasets in BigQuery.
blockchain-analytics crypto cryptocurrency data-analytics data-engineering data-science gcp google-cloud google-cloud-platform on-chain-analysis web3
Last synced: 16 Mar 2024
![](https://github.com/blockchain-etl.png)
https://github.com/piquette/qtrn
A cli tool to streamline financial markets data analysis :wrench:
cli data data-science finance go golang options quotes scraper stock stock-analysis stock-market
Last synced: 16 Mar 2024
![](https://github.com/piquette.png)
https://aymara.github.io/lima/
The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.
ai cpp data-science deep-learning entity-extraction free-software information-extraction linux machine-learning multilingual named-entity-recognition natural-language-processing neural-network nlp nlp-library powerful python relation-extraction tokenization windows
Last synced: 16 Mar 2024
![](https://github.com/aymara.png)
https://matheusfacure.github.io/python-causality-handbook/
Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.
causal-inference causality data-science econometrics harmless-econometrics impact-estimation python
Last synced: 16 Mar 2024
![](https://github.com/matheusfacure.png)
https://github.com/matheusfacure/python-causality-handbook
Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.
causal-inference causality data-science econometrics harmless-econometrics impact-estimation python
Last synced: 16 Mar 2024
![](https://github.com/matheusfacure.png)
https://github.com/somdeep/Statball
Statball: a football (soccer) stats analyser for the top 5 European leagues, with data obtained by web scraping from FBref and StatsBomb
csharp data-science data-scraping data-viz dotnet dotnet-core fbref football football-analytics football-data scouting-data scraping soccer soccer-analytics soccer-data statsbomb tableau visualizations
Last synced: 16 Mar 2024
![](https://github.com/somdeep.png)