Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2024-07-29 13:36:33 UTC
https://github.com/databricks/koalas
Koalas: pandas API on Apache Spark
big-data data-science dataframe mlflow pandas pydata spark
Last synced: 31 Jul 2024
https://github.com/eto-ai/lance
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.
apache-arrow computer-vision data-analysis data-analytics data-centric data-format data-science dataops deep-learning duckdb embeddings llms machine-learning mlops python rust
Last synced: 02 Aug 2024
https://github.com/lancedb/lance
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.
apache-arrow computer-vision data-analysis data-analytics data-centric data-format data-science dataops deep-learning duckdb embeddings llms machine-learning mlops python rust
Last synced: 31 Jul 2024
https://github.com/khanhnamle1994/cracking-the-data-science-interview
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
concepts data-journalism data-portfolio data-science data-wrangling deep-learning downloadable-cheatsheets machine-learning python statistics
Last synced: 31 Jul 2024
https://github.com/jonkrohn/ML-foundations
Machine Learning Foundations: Linear Algebra, Calculus, Statistics & Computer Science
calculus computer-science data-science data-structures jupyter-notebook linear-algebra machine-learning mathematics numpy probability python pytorch statistics tensorflow
Last synced: 02 Aug 2024
https://github.com/giswqs/geemap
A Python package for interactive geospatial analysis and visualization with Google Earth Engine.
colab data-science dataviz earth-engine earthengine folium geospatial gis google-earth-engine image-processing ipyleaflet ipywidgets jupyter jupyter-notebook landsat mapping python remote-sensing streamlit streamlit-webapp
Last synced: 10 Aug 2024
https://github.com/gee-community/geemap
A Python package for interactive geospatial analysis and visualization with Google Earth Engine.
colab data-science dataviz earth-engine earthengine folium geospatial gis google-earth-engine image-processing ipyleaflet ipywidgets jupyter jupyter-notebook landsat mapping python remote-sensing streamlit streamlit-webapp
Last synced: 30 Jul 2024
https://github.com/spark-notebook/spark-notebook
Interactive and Reactive Data Science using Scala and Spark.
apache-spark data-science notebook reactive scala spark
Last synced: 01 Aug 2024
https://github.com/alibaba/GraphScope
🔨 🍇 💻 🚀 GraphScope: A One-Stop Large-Scale Graph Computing System from Alibaba
analytics big-data data-science graph graph-analytics graph-computation graph-computing graph-data graph-neural-networks gremlin
Last synced: 01 Aug 2024
https://github.com/ethen8181/machine-learning
🌎 Machine learning tutorials (mainly in Python3)
data-science deep-learning jupyter-notebook machine-learning python python3
Last synced: 31 Jul 2024
https://github.com/TDAmeritrade/stumpy
STUMPY is a powerful and scalable Python library for modern time series analysis
anomaly-detection dask data-science matrix-profile motif-discovery numba pattern-matching pydata python time-series-analysis time-series-data-mining time-series-segmentation
Last synced: 31 Jul 2024
https://github.com/moataz-elmesmary/data-science-roadmap
Data Science Roadmap from A to Z
big-data chatgpt cheatsheet cv-template data-analysis data-engineering data-science data-visualization deep-learning interview-questions linear-algebra llms machine-learning mathematics neural-network nlp probability python sql statistics
Last synced: 02 Aug 2024
https://github.com/antonycourtney/tad
A desktop application for viewing and analyzing tabular data
csv data-analysis data-science database desktop-application duckdb parquet-viewer pivot-tables pivots tabular-data
Last synced: 31 Jul 2024
https://github.com/Moataz-Elmesmary/Data-Science-Roadmap
Data Science Roadmap from A to Z
big-data chatgpt cheatsheet cv-template data-analysis data-engineering data-science data-visualization deep-learning interview-questions linear-algebra llms machine-learning mathematics neural-network nlp probability python sql statistics
Last synced: 31 Jul 2024
https://github.com/aksnzhy/xlearn
High-performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM), with Python and CLI interfaces.
data-analysis data-science factorization-machines ffm fm machine-learning statistics
Last synced: 30 Jul 2024
https://github.com/nidhaloff/igel
a delightful machine learning tool that allows you to train, test, and use models without writing code
artificial-intelligence automation automl automl-experiments data-analysis data-science hacktoberfest hacktoberfest2021 machine-learning machine-learning-algorithms machine-learning-library machinelearning neural-network neural-networks preprocessing scikit-learn scikitlearn-machine-learning sklearn
Last synced: 30 Jul 2024
https://github.com/tirthajyoti/Machine-Learning-with-Python
Practice and tutorial-style notebooks covering a wide variety of machine learning techniques
artificial-intelligence classification clustering data-science decision-trees deep-learning dimensionality-reduction flask k-nearest-neighbours machine-learning matplotlib naive-bayes neural-network numpy pandas pytest random-forest regression scikit-learn statistics
Last synced: 07 Aug 2024
https://github.com/shogun-toolbox/shogun
Shōgun
artificial-intelligence c-plus-plus cmake data-science machine-learning swig
Last synced: 30 Jul 2024
https://github.com/bfortuner/ml-glossary
Machine learning glossary
cheatsheets data-science deep-learning deep-learning-tutorial machine-learning neural-network
Last synced: 31 Jul 2024
https://github.com/stellargraph/stellargraph
StellarGraph - Machine Learning on Graphs
data-science deep-learning gcn geometric-deep-learning graph-analysis graph-convolutional-networks graph-data graph-machine-learning graph-neural-networks graphs heterogeneous-networks interpretability link-prediction machine-learning machine-learning-algorithms networkx python saliency-map stellargraph-library
Last synced: 31 Jul 2024
https://github.com/mljar/mljar-supervised
Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
automated-machine-learning automatic-machine-learning automl catboost data-science decision-tree ensemble feature-engineering hyper-parameters hyperparameter-optimization lightgbm machine-learning mljar models-tuning neural-network random-forest scikit-learn shap tuning-algorithm xgboost
Last synced: 30 Jul 2024
https://github.com/giswqs/leafmap
A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment
data-science dataviz folium geoparquet geopython geospatial geospatial-analysis gis ipyleaflet jupyter jupyter-notebook leafmap mapping plotly python streamlit streamlit-webapp whiteboxtools
Last synced: 05 Sep 2024
https://github.com/opengeos/leafmap
A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment
data-science dataviz folium geoparquet geopython geospatial geospatial-analysis gis ipyleaflet jupyter jupyter-notebook leafmap mapping plotly python streamlit streamlit-webapp whiteboxtools
Last synced: 30 Jul 2024
https://github.com/determined-ai/determined
Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
data-science deep-learning distributed-training hyperparameter-optimization hyperparameter-search hyperparameter-tuning keras kubernetes machine-learning ml-infrastructure ml-platform mlops pytorch tensorflow
Last synced: 31 Jul 2024
https://github.com/fbdesignpro/sweetviz
Visualize and compare datasets, target values and associations, with one line of code.
data-analysis data-exploration data-profiling data-science data-visualization eda exploration exploratory-data-analysis machine-learning pandas pandas-dataframe python statistics
Last synced: 30 Jul 2024
https://github.com/parrt/dtreeviz
A python library for decision tree visualization and model interpretation.
data-science decision-trees machine-learning model-interpretation python random-forest scikit-learn visualization xgboost
Last synced: 01 Aug 2024
https://github.com/datafold/data-diff
Compare tables within or across databases
data data-diffing data-engineering data-quality data-quality-monitoring data-science database databricks-sql dataengineering dataquality dbt mysql oracle-database postgres postgresql python rdbms snowflake sql trino
Last synced: 31 Jul 2024
https://github.com/dotnet/interactive
.NET Interactive combines the power of .NET with many other languages to create notebooks, REPLs, and embedded coding experiences. Share code, explore data, write, and learn across your apps in ways you couldn't before.
csharp data-science dotnet-interactive fsharp interactive-programming jupyter notebooks polyglot polyglot-dev powershell
Last synced: 30 Jul 2024
https://github.com/libffcv/ffcv
FFCV: Fast Forward Computer Vision (and other ML workloads!)
data-science machine-learning pytorch
Last synced: 31 Jul 2024
https://github.com/rasbt/deep-learning-book
Repository for "Introduction to Artificial Neural Networks and Deep Learning: A Practical Guide with Applications in Python"
artificial-intelligence data-science deep-learning machine-learning neural-network python pytorch tensorflow
Last synced: 31 Jul 2024
https://github.com/GokuMohandas/mlops-course
Learn how to design, develop, deploy and iterate on production-grade ML applications.
data-engineering data-quality data-science deep-learning distributed-ml llms machine-learning mlops natural-language-processing python pytorch ray
Last synced: 31 Jul 2024
https://github.com/quadratichq/quadratic
Quadratic | Data Science Spreadsheet with Python & SQL
data data-analysis data-engineering data-science etl python quadratic spreadsheet sql wasm webgl
Last synced: 01 Aug 2024
https://github.com/TeamHG-Memex/eli5
A library for debugging/inspecting machine learning classifiers and explaining their predictions
crfsuite data-science explanation inspection lightgbm machine-learning nlp python scikit-learn xgboost
Last synced: 02 Aug 2024
https://github.com/mrdbourke/zero-to-mastery-ml
All course materials for the Zero to Mastery Machine Learning and Data Science course.
data-science deep-learning machine-learning
Last synced: 02 Aug 2024
https://github.com/whylabs/whylogs
An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈
ai-pipelines analytics approximate-statistics calculate-statistics constraints data-constraints data-pipeline data-quality data-science dataops dataset logging machine-learning ml-pipelines mlops model-performance python statistical-properties
Last synced: 01 Aug 2024
https://github.com/justinzm/gopup
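Data interfaces: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest rate data; currency exchange rates; Qianlima and unicorn companies; Xinwen Lianbo transcripts; film and TV box office data; university lists; epidemic data…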
covid19-data data data-analysis data-science datasets economic-data gopup index-data python
Last synced: 31 Jul 2024
https://github.com/afshinea/stanford-cs-221-artificial-intelligence
VIP cheatsheets for Stanford's CS 221 Artificial Intelligence
a-star artificial-intelligence bayesian-networks cheatsheet constraint-satisfaction-problem data-science markov-decision-processes
Last synced: 02 Aug 2024
https://matheusfacure.github.io/python-causality-handbook/
Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.
causal-inference causality data-science econometrics harmless-econometrics impact-estimation python
Last synced: 01 Aug 2024
https://github.com/matheusfacure/python-causality-handbook
Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.
causal-inference causality data-science econometrics harmless-econometrics impact-estimation python
Last synced: 31 Jul 2024
https://github.com/reiinakano/scikit-plot
An intuitive library to add plotting functionality to scikit-learn objects.
data-science machine-learning plot plotting scikit-learn visualization
Last synced: 30 Jul 2024
https://github.com/hosseinmoein/DataFrame
C++ DataFrame for statistical, financial, and ML analysis -- in modern C++ using native types and contiguous memory storage
ai cpp data-analysis data-science dataframe financial-data-analysis financial-engineering heterogeneous-data large-data machine-learning multidimensional-data numerical-analysis pandas polars statistical statistical-analysis tensor tensorboard trading-algorithms trading-strategies
Last synced: 31 Jul 2024
https://github.com/yunabe/lgo
Interactive Go programming with Jupyter
data-science go golang jupyter-notebook jupyter-notebook-kernel machine-learning repl
Last synced: 31 Jul 2024
https://github.com/camDavidsonPilon/lifelines
Survival analysis in Python
cox-regression data-science maximum-likelihood python reliability-analysis statistics survival-analysis
Last synced: 02 Aug 2024
https://github.com/CamDavidsonPilon/lifelines
Survival analysis in Python
cox-regression data-science maximum-likelihood python reliability-analysis statistics survival-analysis
Last synced: 31 Jul 2024
https://github.com/claimed-framework/component-library
The goal of CLAIMED is to enable low-code/no-code rapid prototyping style programming to seamlessly CI/CD into production.
Last synced: 02 Aug 2024
https://github.com/PizzaDeDados/datascience-pizza
🍕 Repository for gathering information on study materials in data analysis and related fields, companies that work with data, and a dictionary of concepts
dados data-science data-scientists hacktoberfest machine-learning
Last synced: 30 Jul 2024
https://github.com/ashishpatel26/Andrew-NG-Notes
Handwritten notes on Andrew Ng's Coursera courses.
andrew-ng andrew-ng-course andrew-ng-machine-learning andrewng coursera coursera-machine-learning data-science deep-learning deep-neural-networks dl machine-learning ml neural-network neural-networks numpy pandas python pytorch reinforcement-learning
Last synced: 03 Aug 2024
https://github.com/mito-ds/mito
The mitosheet package, trymito.io, and other public Mito code.
data data-analysis data-science data-visualization jupyter pandas python streamlit-component
Last synced: 31 Jul 2024
https://github.com/approximatelabs/sketch
AI code-writing assistant that understands data content
ai codex copilot data data-science dataframe datasketch datasketches df ds gpt3 lambdaprompt pandas python sketches tabular-data
Last synced: 01 Aug 2024
https://github.com/tirthajyoti/Papers-Literature-ML-DL-RL-AI
Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
artificial-intelligence data-mining data-science deep-learning game-theory hardware learning-theory literature machine-learning machine-learning-algorithms neural-network paper pattern-recognition reinforcement-learning silicon statistical-learning statistics
Last synced: 31 Jul 2024
https://github.com/mGalarnyk/datasciencecoursera
Data science repo and blog for Johns Hopkins Coursera courses. Please let me know if you have any questions.
data-science jhu-coursera john-hopkins-coursera python r stanford
Last synced: 02 Aug 2024
https://github.com/quantumblacklabs/causalnex
A Python library that helps data scientists to infer causation rather than observing correlation.
bayesian-inference bayesian-networks causal-inference causal-models causal-networks causalnex data-science machine-learning
Last synced: 04 Aug 2024
https://github.com/mckinsey/causalnex
A Python library that helps data scientists to infer causation rather than observing correlation.
bayesian-inference bayesian-networks causal-inference causal-models causal-networks causalnex data-science machine-learning
Last synced: 01 Aug 2024
https://github.com/MacroAnalyst/Linear_Algebra_With_Python
Lecture notes for linear algebra featuring Python. This series of lecture notes walks you through the must-know concepts that form the foundation of data science and advanced quantitative skill sets. Suitable for statisticians, econometricians, quantitative analysts, data scientists, and others who want to quickly refresh their linear algebra with the help of Python computation and visualization.
computational-science data-analysis data-science data-visualization diagonalization eigenvalues eigenvectors gram-schmidt jupyter linear-algebra linear-transformations mathematics matrix matrix-calculations multivariate-normal-distribution null-space python singular-value-decomposition symmetric-matrices vector-space
Last synced: 30 Jul 2024
https://github.com/weijie-chen/Linear-Algebra-With-Python
Lecture notes for linear algebra featuring Python. This series of lecture notes walks you through the must-know concepts that form the foundation of data science and advanced quantitative skill sets. Suitable for statisticians, econometricians, quantitative analysts, data scientists, and others who want to quickly refresh their linear algebra with the help of Python computation and visualization.
computational-science data-analysis data-science data-visualization diagonalization eigenvalues eigenvectors gram-schmidt jupyter linear-algebra linear-transformations mathematics matrix matrix-calculations multivariate-normal-distribution null-space python singular-value-decomposition symmetric-matrices vector-space
Last synced: 31 Jul 2024
https://github.com/github/codesearchnet
Datasets, tools, and benchmarks for representation learning of code.
bert cnn data data-science datasets deep-learning machine-learning machine-learning-on-source-code ml natural-language-processing neural-networks nlp nlp-machine-learning open-data programming-language-theory python representation-learning rnn self-attention tensorflow
Last synced: 03 Aug 2024
https://github.com/chiphuyen/lazynlp
Library to scrape and clean web pages to create massive datasets.
artificial-intelligence data-science language-model natural-language-processing nlp open python text-mining
Last synced: 01 Aug 2024
https://github.com/justmarkham/pandas-videos
Jupyter notebook and datasets from the pandas video series
data-analysis data-cleaning data-science jupyter-notebook pandas python tutorial
Last synced: 31 Jul 2024
https://github.com/wooey/Wooey
A Django app that creates automatic web UIs for Python scripts.
data-science django python python-scripts web wooey workflows
Last synced: 29 Jul 2024
https://github.com/wooey/wooey
A Django app that creates automatic web UIs for Python scripts.
data-science django python python-scripts web wooey workflows
Last synced: 30 Jul 2024
https://github.com/alexhallam/tv
📺(tv) Tidy Viewer is a cross-platform CLI csv pretty printer that uses column styling to maximize viewer enjoyment.
cli column command-line command-line-tool csv csv-cat csv-column csv-pretty-print csv-viewer csv-visualization data-science dataframe datatable pretty-print pretty-printer rust tabular-data terminal tibble
Last synced: 31 Jul 2024
https://github.com/metarank/metarank
A low-code machine learning personalized ranking service for articles, listings, search results, and recommendations that boosts user engagement. A friendly learn-to-rank engine.
automl data-engineering data-science deep-learning feature-engineering feature-extraction kubernetes machine-learning neural-networks personalization ranking scala search
Last synced: 01 Aug 2024
https://github.com/cerlymarco/MEDIUM_NoteBook
Repository containing notebooks of my posts on Medium
artificial-intelligence data-science deep-learning machine-learning notebooks
Last synced: 02 Aug 2024
https://github.com/RubixML/ML
A high-level machine learning and deep learning library for the PHP language.
ai algorithm analytics anomaly-detection artificial-intelligence classification clustering cross-validation data-science deep-learning machine-learning machine-learning-library natural-language-processing neural-network php php-ai php-machine-learning php-ml prediction regression
Last synced: 31 Jul 2024
https://github.com/danijar/handout
Turn Python scripts into handouts with Markdown and figures
data-science notebook productivity prototyping python research
Last synced: 31 Jul 2024
https://github.com/Lightning-AI/torchmetrics
Torchmetrics - Machine learning metrics for distributed, scalable PyTorch applications.
analyses data-science deep-learning machine-learning metrics python pytorch
Last synced: 31 Jul 2024
https://github.com/sfu-db/dataprep
Open-source, low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.
apis apiwrapper cleaning connector data-exploration data-science datacleaning dataconnector dataprep datapreparation eda exploratory-data-analysis webconnector
Last synced: 02 Aug 2024
https://github.com/feathr-ai/feathr
Feathr – A scalable, unified data and AI engineering platform for enterprise
apache-spark artificial-intelligence azure data-engineering data-quality data-science feature-engineering feature-governance feature-management feature-marketplace feature-metadata feature-platform feature-store machine-learning mlops
Last synced: 04 Aug 2024
https://github.com/Eventual-Inc/Daft
Distributed DataFrame for Python designed for the cloud, powered by Rust
big-data data-engineering data-science dataframe distributed-computing machine-learning python rust
Last synced: 01 Aug 2024
https://github.com/BlazingDB/blazingsql
BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
arrow artificial-intelligence blazingsql conda-environment cudf data-science gpu gpu-acceleration gpu-dataframes machine-learning machine-learning-workflow python rapids rapidsai sql sql-engine
Last synced: 31 Jul 2024
https://github.com/pymc-devs/pymc-resources
PyMC educational resources
bayesian-inference bayesian-statistics data-analysis data-science
Last synced: 31 Jul 2024
https://github.com/szilard/benchm-ml
A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
data-science deep-learning gradient-boosting-machine h2o machine-learning python r random-forest spark xgboost
Last synced: 30 Jul 2024
https://github.com/refuel-ai/autolabel
Label, clean and enrich text datasets with LLMs.
anthropic-claude data-science gpt-4 huggingface-transformers langchain large-language-models llm llms machine-learning openai python
Last synced: 31 Jul 2024
https://github.com/milesCranmer/PySR
High-Performance Symbolic Regression in Python and Julia
algorithm automl data-science distributed-systems equation-discovery evolutionary-algorithms explainable-ai genetic-algorithm interpretable-ml julia machine-learning numpy python scikit-learn symbolic symbolic-regression
Last synced: 01 Aug 2024
https://github.com/MilesCranmer/PySR
High-Performance Symbolic Regression in Python and Julia
algorithm automl data-science distributed-systems equation-discovery evolutionary-algorithms explainable-ai genetic-algorithm interpretable-ml julia machine-learning numpy python scikit-learn symbolic symbolic-regression
Last synced: 31 Jul 2024
https://github.com/Ph055a/OSINT_Collection
Maintained collection of OSINT related resources. (All Free & Actionable)
court-search data-science dataset infosec investigation journalism osint research search
Last synced: 01 Aug 2024
https://github.com/diffgram/diffgram
The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.
annotation annotation-tool annotations data data-analytics data-annotation data-science datasets datastore deep-learning image-annotation kubernetes labeling machine-learning training-data video-annotation
Last synced: 30 Jul 2024
https://github.com/pydoit/doit
task management & automation tool
build-automation build-system build-tool data-pipeline data-science hacktoberfest python task-runner workflow workflow-automation workflow-management
Last synced: 31 Jul 2024
https://github.com/shashankvemuri/Finance
150+ quantitative finance Python programs to help you gather, manipulate, and analyze stock market data
algorithmic-trading data-science finance machine-learning pandas python quantitative-finance stock stock-market stocks technical-indicators trading-strategies
Last synced: 01 Aug 2024
https://github.com/the-turing-way/the-turing-way
Host repository for The Turing Way: a how-to guide for reproducible data science
closember community data-science education hacktoberfest hut23 hut23-270 hut23-396
Last synced: 31 Jul 2024
https://github.com/edtechre/pybroker
Algorithmic Trading in Python with Machine Learning
ai algorithmic-trading algotrading artificial-intelligence backtesting crypto cryptocurrency data-science finance framework investment machine-learning python quantitative-finance stocks trading trading-strategies
Last synced: 31 Jul 2024
https://github.com/scverse/scanpy
Single-cell analysis in Python. Scales to >1M cells.
anndata bioinformatics data-science machine-learning python scanpy scverse transcriptomics visualize-data
Last synced: 01 Aug 2024
https://github.com/NannyML/nannyml
nannyml: post-deployment data science in python
data-analysis data-drift data-science deep-learning jupyter-notebook machine-learning machinelearning ml mlops model-monitoring monitoring performance-estimation performance-monitoring postdeploymentdatascience python visualization
Last synced: 02 Aug 2024
https://github.com/Esri/arcgis-python-api
Documentation and samples for ArcGIS API for Python
arcgis data-science gis jupyter jupyterlab-extension mapping python spatial-data spatial-data-analysis
Last synced: 01 Aug 2024
https://github.com/shervinea/mit-15-003-data-science-tools
Study guides for MIT's 15.003 Data Science Tools
bash data-science git manipulation r retrieval sql study-guide visualization
Last synced: 02 Aug 2024
https://github.com/TileDB-Inc/TileDB
The Universal Storage Engine
arrays data-analysis data-science dataframes dense-data hdfs s3 s3-storage scientific-computing sparse-arrays sparse-data storage-engine tiledb
Last synced: 31 Jul 2024
https://github.com/feature-engine/feature_engine
Feature engineering package with sklearn like functionality
data-science feature-engineering feature-extraction feature-selection machine-learning python scikit-learn
Last synced: 31 Jul 2024
https://github.com/featureform/featureform
The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
data-quality data-science embeddings embeddings-similarity feature-engineering feature-store hacktoberfest machine-learning ml mlops python vector-database
Last synced: 01 Aug 2024
https://github.com/featureform/embeddinghub
The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
data-quality data-science embeddings embeddings-similarity feature-engineering feature-store hacktoberfest machine-learning ml mlops python vector-database
Last synced: 31 Jul 2024
https://github.com/alan-turing-institute/MLJ.jl
A Julia machine learning framework
classification clustering data-science ensemble-learning julia machine-learning pipeline pipelines predictive-modeling regression stacking statistics tuning tuning-parameters
Last synced: 31 Jul 2024
https://github.com/variety/variety
Variety: a MongoDB Schema Analyzer
data-science javascript mongo mongodb nosql nosql-analytics
Last synced: 31 Jul 2024
https://github.com/chdb-io/chdb
chDB is an embedded OLAP SQL Engine 🚀 powered by ClickHouse
chdb clickhouse clickhouse-database clickhouse-server data-science database embedded-database olap python sql
Last synced: 31 Jul 2024
https://github.com/justmarkham/scikit-learn-tips
🤖⚡ 50 scikit-learn tips
data-school data-science machine-learning python scikit-learn
Last synced: 02 Aug 2024
https://github.com/letianzj/QuantResearch
Quantitative analysis, strategies and backtests
algorithmic-trading algotrading asset-allocation asset-management backtesting-trading-strategies backtests data-science deep-learning derivatives-pricing financial-analysis machine-learning pairs-trading portfolio-management quantitative-finance quantitative-trading reinforcement-learning risk-management statistical-arbitrage trading-algorithms trading-strategies
Last synced: 01 Aug 2024
https://github.com/iphysresearch/DataSciComp
A collection of popular Data Science Challenges/Competitions || Countdown timers to keep track of the entry deadlines.
challenge competition data-challenge data-science data-science-competitions project
Last synced: 01 Aug 2024
https://github.com/Azure/azureml-examples
Official community-driven Azure Machine Learning examples, tested with GitHub Actions.
azure azure-machine-learning azureml data-science ml
Last synced: 08 Aug 2024
https://github.com/yuankunzhang/charming
A visualization library for Rust
chart data-science rust visualization webassembly
Last synced: 31 Jul 2024
https://github.com/github/covid19-dashboard
A site that displays up to date COVID-19 stats, powered by fastpages.
altair analytics covid-19 covid-data covid19 data-science data-visualisation fastai fastpages github-actions github-pages jupyter matplotlib nteract papermill pymc3 python
Last synced: 31 Jul 2024
https://github.com/ClimbsRocks/auto_ml
[UNMAINTAINED] Automated machine learning for analytics & production
analytics artificial-intelligence automated-machine-learning automl data-science deep-learning deeplearning feature-engineering gradient-boosting hyperparameter-optimization keras lightgbm machine-learning machine-learning-library machine-learning-pipelines production-ready python scikit-learn tensorflow xgboost
Last synced: 01 Aug 2024
https://github.com/neonwatty/machine_learning_refined
Notes, examples, and Python demos for the 2nd edition of the textbook "Machine Learning Refined" (published by Cambridge University Press).
artificial-intelligence autograd collab data-science deep-learning jax jupyter-notebook lecture-notes machine-learning machine-learning-algorithms mathematical-optimization neural-network numpy python slides
Last synced: 31 Jul 2024