Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/shahinrostami/chord

Engaging visualisations, made easy.

data-science data-visualization plotting python visualization

Last synced: 11 Jul 2024

https://github.com/ploomber/sklearn-evaluation

Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.

data-science deep-learning jupyter-notebook machine-learning pytorch scikit-learn sklearn tensorflow

Last synced: 11 Jul 2024

https://github.com/jamesqo/gun-violence-data

A comprehensive, accessible database that contains records of over 260k US gun violence incidents from January 2013 to March 2018.

data-science gun-violence-archive machine-learning statistics

Last synced: 11 Jul 2024

https://github.com/Ibotta/sk-dist

Distributed scikit-learn meta-estimators in PySpark

data-science machine-learning ml scikit-learn spark

Last synced: 11 Jul 2024

https://github.com/prathimacode-hub/Awesome_Python_Scripts

🚀 Curated collection of Awesome Python Scripts which will make you go wow. Dive into this world of 360+ scripts. Feel free to contribute. Show your support by ✨this repository.

algorithms algorithms-datastructures beginner-friendly contributions contributions-welcome data-science data-structures education hacktoberfest hacktoberfest2022 learn open-source practice project python python-script python-scripts python3 search

Last synced: 11 Jul 2024

https://github.com/plantinformatics/pretzel

JavaScript full-stack framework for Big Data visualisation and analysis

big-data bioinformatics data-science data-visualization ember emberjs express expressjs javascript open-source

Last synced: 10 Jul 2024

https://github.com/nicolaskruchten/jupyter_pivottablejs

Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js

data-analysis data-science interactive jupyter-notebook pivot-chart pivot-tables

Last synced: 10 Jul 2024

https://github.com/ml-tooling/ml-hub

🧰 Multi-user development platform for machine learning teams. Simple to set up within minutes.

data-science docker jupyter jupyterhub machine-learning python

Last synced: 10 Jul 2024

https://github.com/Kotlin/kandy

Kotlin plotting library.

data-science graphics jupyter-notebooks kotlin plot

Last synced: 10 Jul 2024

https://github.com/LearnDataSci/articles

A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci

data-analysis data-science data-visualization machine-learning machine-learning-algorithms machinelearning python

Last synced: 10 Jul 2024

https://github.com/coalio/Assistant

A data science library providing flexible dataframes for Lua 5.1+

data-analysis data-science data-structures dataframe lua

Last synced: 10 Jul 2024

https://github.com/Kotlin/dataframe

Structured data processing in Kotlin

data-analysis data-science dataframe kotlin

Last synced: 10 Jul 2024

https://github.com/maxhumber/redframes

General Purpose Data Manipulation Library

data-science pandas python

Last synced: 10 Jul 2024

https://github.com/tidypyverse/tidypandas

A grammar of data manipulation for pandas inspired by tidyverse

data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse

Last synced: 10 Jul 2024

https://github.com/HanXinzi-AI/awesome-python-machine-learning-resources

A collection of awesome, popular, and practical machine learning and deep learning Python libraries and tools.

auto-ml awesome awesome-list cv data-analysis data-mining data-science data-visualization deep-learning fintech machine-learning machine-learning-algorithms nlp pytorch recommender-system sklearn tensorflow text-mining time-series

Last synced: 10 Jul 2024

https://github.com/asavinov/prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

business-intelligence data-preparation data-preprocessing data-processing data-science data-wrangling feature-engineering map-reduce olap pandas python spark workflow

Last synced: 10 Jul 2024

https://github.com/metarank/metarank

A low-code machine learning personalized ranking service for articles, listings, search results, and recommendations that boosts user engagement. A friendly Learn-to-Rank engine.

automl data-engineering data-science deep-learning feature-engineering feature-extraction kubernetes machine-learning neural-networks personalization ranking scala search

Last synced: 09 Jul 2024

https://github.com/machine-learning-apps/Issue-Label-Bot

Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"

bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow

Last synced: 09 Jul 2024

https://github.com/pykale/pykale

Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!

computer-vision data-science deep-learning domain-adaptation graph-analysis knowledge-aware-learning machine-learning medical-image-analysis meta-learning multimodal multimodal-learning python pytorch transfer-learning

Last synced: 09 Jul 2024

https://github.com/scottshambaugh/monaco

Quantify uncertainty and sensitivities in your computer models with an industry-grade Monte Carlo library.

data-science monaco monte-carlo python scientific-computing sensitivity-analysis simulation statistics uncertainty-analysis uncertainty-quantification

Last synced: 09 Jul 2024

https://github.com/ZackAkil/friendlier-data-labelling

Code resources for generating a Google Form for labelling data.

data-science google google-apps-script google-forms google-sheets machine-learning

Last synced: 08 Jul 2024

https://github.com/plotly/dash-table

OBSOLETE: now part of https://github.com/plotly/dash

dash data-science data-visualization plotly plotly-dash python react table

Last synced: 08 Jul 2024

https://github.com/piquette/qtrn

A CLI tool to streamline financial markets data analysis 🔧

cli data data-science finance go golang options quotes scraper stock stock-analysis stock-market

Last synced: 07 Jul 2024

https://github.com/voxel51/voxelgpt

AI assistant that can query visual datasets, search the FiftyOne docs, and answer general computer vision questions

artificial-intelligence chatgpt computer-vision data-science deep-learning fiftyone langchain llm machine-learning openai python

Last synced: 07 Jul 2024

https://matheusfacure.github.io/python-causality-handbook/

Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.

causal-inference causality data-science econometrics harmless-econometrics impact-estimation python

Last synced: 06 Jul 2024

https://github.com/matheusfacure/python-causality-handbook

Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.

causal-inference causality data-science econometrics harmless-econometrics impact-estimation python

Last synced: 06 Jul 2024

https://github.com/somdeep/Statball

Statball: a football (soccer) stats analyser for the top 5 European leagues, with data obtained by web scraping from FBref and StatsBomb

csharp data-science data-scraping data-viz dotnet dotnet-core fbref football football-analytics football-data scouting-data scraping soccer soccer-analytics soccer-data statsbomb tableau visualizations

Last synced: 06 Jul 2024

https://github.com/Azure/AzureDSVM

AzureDSVM is an R package that provides a convenient harness for the Azure DSVM, remote execution of scalable and elastic data science work, and monitoring of on-demand resource consumption.

azure data-science data-science-virtual-machine r

Last synced: 06 Jul 2024

https://github.com/jmari/iPharo

Pharo Smalltalk kernel for Jupyter

data-science jupyter-notebook pharo pharo-smalltalk smalltalk

Last synced: 05 Jul 2024

https://github.com/antonycourtney/tad

A desktop application for viewing and analyzing tabular data

csv data-analysis data-science database desktop-application duckdb parquet-viewer pivot-tables pivots tabular-data

Last synced: 05 Jul 2024

https://github.com/jgoerner/data-science-stack-cookiecutter

🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, Airflow & API Star)

airflow apistar cookiecutter data-science docker docker-image jupyter minio postgres python superset

Last synced: 05 Jul 2024

https://github.com/olavolav/uniplot

Lightweight plotting to the terminal. 4x resolution via Unicode.

data-analysis data-science plot python

Last synced: 05 Jul 2024

https://github.com/TeoMeWhy/teomerefs

A guide to technical references for a career in data

data data-science machine-learning python

Last synced: 05 Jul 2024

https://github.com/RamiKrispin/Introduction-to-Docker

(WIP) Getting started with Docker - An introduction to Docker with data science and engineering applications

data-engineering data-science docker dockerfile

Last synced: 04 Jul 2024

https://github.com/tomasonjo/blogs

Jupyter notebooks that support my graph data science blog posts at https://bratanic-tomaz.medium.com/

data-science graph graph-algorithms neo4j

Last synced: 04 Jul 2024

https://github.com/okfn-brasil/rosie

🤖 Python application responsible for Serenata de Amor's intelligence

artificial-intelligence data-science machine-learning

Last synced: 04 Jul 2024

https://github.com/okfn-brasil/whistleblower

🚨 A Twitter bot for publicly reporting suspicions found by Rosie, Serenata de Amor's AI

data-science facebook-messenger-bot machine-learning twitter-bot

Last synced: 04 Jul 2024

https://github.com/louisfb01/start-machine-learning

A complete guide to starting and improving in machine learning (ML) and artificial intelligence (AI) in 2024 without ANY background in the field, and to staying up-to-date with the latest news and state-of-the-art techniques!

artificial-intelligence cheat-sheets course coursera coursera-machine-learning data-science deep-learning learn-to-code learning learning-python linear-algebra machine-learning neural-networks practice probability-statistics read-articles tutorial tutorials youtube youtube-playlist

Last synced: 04 Jul 2024

https://github.com/akfamily/aktools

AKTools is an elegant and simple HTTP API library for AKShare, built for AKSharers!

akshare asyncio data data-science fastapi openapi pydanti

Last synced: 04 Jul 2024

https://github.com/launchflow/buildflow

BuildFlow is an open source framework for building large scale systems using Python. All you need to do is describe where your input is coming from and where your output should be written, and BuildFlow handles the rest. No configuration outside of the code is required.

batch data-science pipeline python streaming

Last synced: 03 Jul 2024

https://github.com/pymupdf/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

data-science epub extract-data font mupdf ocr pdf pdf-documents pymupdf python table-extraction tesseract text-processing text-shaping xps

Last synced: 03 Jul 2024

https://github.com/MLMI2-CSSI/foundry

Simplifying the discovery and usage of machine-learning ready datasets in materials science and chemistry

chemistry data-science datasets machine-learning materials-science

Last synced: 03 Jul 2024

https://github.com/run-house/runhouse

The fastest way to iterate and deploy AI workloads on your own infra. Unobtrusive, debuggable, PyTorch-like APIs.

api artificial-intelligence aws azure collaboration data-science deployment distributed fastapi gcp infrastructure machine-learning middleware observability python pytorch ray sagemaker serverless

Last synced: 03 Jul 2024

https://github.com/kdr-aus/ogma

Scripting language focused on processing tabular data.

data-science language rust scripting-language table-data

Last synced: 03 Jul 2024

https://github.com/hamelsmu/Seq2Seq_Tutorial

Code For Medium Article "How To Create Data Products That Are Magical Using Sequence-to-Sequence Models"

data-science deep-learning deeplearning keras keras-tutorials machine-learning medium-article nlp-machine-learning rnn-encoder-decoder seq2seq-tutorial sequence-to-sequence

Last synced: 02 Jul 2024

https://github.com/mandiant/ThreatPursuit-VM

Threat Pursuit Virtual Machine (VM): A fully customizable, open-sourced Windows-based distribution focused on threat intelligence analysis and hunting designed for intel and malware analysts as well as threat hunters to get up and running quickly.

analytics cyber data-science fireeye intelligence intelligence-analysis malware mandiant threat threathunting threatintelligence virtual-machine

Last synced: 02 Jul 2024

https://github.com/neonwatty/machine_learning_refined

Notes, examples, and Python demos for the 2nd edition of the textbook "Machine Learning Refined" (published by Cambridge University Press).

artificial-intelligence autograd collab data-science deep-learning jax jupyter-notebook lecture-notes machine-learning machine-learning-algorithms mathematical-optimization neural-network numpy python slides

Last synced: 02 Jul 2024

https://github.com/allenai/allennlp

An open-source NLP research library, built on PyTorch.

data-science deep-learning natural-language-processing nlp python pytorch

Last synced: 02 Jul 2024

https://github.com/PKU-DAIR/Hetu

A high-performance distributed deep learning system targeting large-scale and automated distributed training.

artificial-intelligence autograd data-science deep-learning deep-neural-networks distributed-systems distributed-training embeddings gpu high-dimensional machine-learning python state-of-the-art

Last synced: 01 Jul 2024

https://github.com/pablofrommars/fsharp-notebook

Data Science Notebook for F# interactive

data-science data-visualization fsharp vscode-extension

Last synced: 01 Jul 2024

https://github.com/jgoerner/beyond-jupyter

🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)

airflow apache apistar data-science docker docker-compose jupyter jupyter-notebook minio postgres superset

Last synced: 01 Jul 2024

https://github.com/rhiever/datacleaner

A Python tool that automatically cleans data sets and readies them for analysis.

automation data-science machine-learning python

Last synced: 30 Jun 2024

https://github.com/makcedward/nlp

📝 This repository records my NLP journey.

ai data-science deep-learning machine-learning nlp

Last synced: 30 Jun 2024

https://github.com/ak-coram/cl-duckdb

Common Lisp CFFI wrapper around the DuckDB C API

c-bindings common-lisp data-science duckdb lisp olap parquet sql

Last synced: 30 Jun 2024

https://github.com/mlrun/mlrun

MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications.

data-engineering data-science experiment-tracking kubernetes machine-learning mlops mlops-workflow model-serving python workflow

Last synced: 29 Jun 2024

https://github.com/ploomber/soorgeon

Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊

data-engineering data-science jupyter jupyter-notebooks machine-learning mlops workflow

Last synced: 29 Jun 2024

https://github.com/ploomber/soopervisor

☁️ Export Ploomber pipelines to Kubernetes (Argo), Airflow, AWS Batch, SLURM, and Kubeflow.

airflow argo argo-workflows aws data-science kubeflow kubeflow-pipelines kubernetes machine-learning slurm workflow

Last synced: 29 Jun 2024

https://github.com/aporia-ai/mlnotify

🔔 No need to keep checking your training - just one import line and you'll know the second it's done.

data-science deep-learning deeplearning machine-learning machinelearning machinelearning-python ml notification notifications opensource python python3 tool tools

Last synced: 29 Jun 2024

https://github.com/datacarpentry/semester-biology

Forkable teaching materials for a course on working with data in R

biology data-carpentry data-science r spatial-data sql teaching-materials

Last synced: 29 Jun 2024

https://github.com/iterative/mlem

🐶 A tool to package, serve, and deploy any ML model on any platform. Archived to be resurrected one day🤞

cli data-science deployment developer-tools git machine-learning mlem model-registry python

Last synced: 29 Jun 2024

https://github.com/carloocchiena/the_statistics_handbook

The Statistics Handbook open source repository

data-science latex mathematics statistics

Last synced: 29 Jun 2024