Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists analyze and prepare data, and their findings inform high-level decisions in many organizations.

https://github.com/scikit-learn/scikit-learn

scikit-learn: machine learning in Python

data-analysis data-science machine-learning python statistics

Last synced: 23 Dec 2024

https://github.com/pandas-dev/pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

alignment data-analysis data-science flexible pandas python

Last synced: 23 Dec 2024

https://github.com/ray-project/ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

automl data-science deep-learning deployment distributed hyperparameter-optimization hyperparameter-search java llm-serving machine-learning model-selection optimization parallel python pytorch ray reinforcement-learning rllib serving tensorflow

Last synced: 23 Dec 2024

https://github.com/gradio-app/gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

data-analysis data-science data-visualization deep-learning deploy gradio gradio-interface hacktoberfest interface machine-learning models python python-notebook ui ui-components

Last synced: 23 Dec 2024

https://github.com/microsoft/Data-Science-For-Beginners

10 Weeks, 20 Lessons, Data Science for All!

data-analysis data-science data-visualization pandas python

Last synced: 30 Oct 2024

https://github.com/Lightning-AI/pytorch-lightning

Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.

ai artificial-intelligence data-science deep-learning machine-learning python pytorch

Last synced: 29 Oct 2024

https://github.com/lightning-ai/pytorch-lightning

Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.

ai artificial-intelligence data-science deep-learning machine-learning python pytorch

Last synced: 23 Dec 2024

https://github.com/donnemartin/data-science-ipython-notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

aws big-data caffe data-science deep-learning hadoop kaggle keras machine-learning mapreduce matplotlib numpy pandas python scikit-learn scipy spark tensorflow theano

Last synced: 23 Dec 2024

https://github.com/camdavidsonpilon/probabilistic-programming-and-bayesian-methods-for-hackers

aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)

bayesian-methods data-science jupyter-notebook mathematical-analysis pymc statistics

Last synced: 23 Dec 2024

https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)

bayesian-methods data-science jupyter-notebook mathematical-analysis pymc statistics

Last synced: 25 Oct 2024

https://github.com/okulbilisim/awesome-datascience

📝 An awesome Data Science repository to learn from and apply to real-world problems.

analytics awesome-list data-mining data-science data-scientists data-visualization deep-learning hacktoberfest machine-learning science

Last synced: 16 Nov 2024

https://github.com/d2l-ai/d2l-en

Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

book computer-vision data-science deep-learning gaussian-processes hyperparameter-optimization jax kaggle keras machine-learning mxnet natural-language-processing notebook python pytorch recommender-system reinforcement-learning tensorflow

Last synced: 23 Dec 2024

https://github.com/eriklindernoren/ml-from-scratch

Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

data-mining data-science deep-learning deep-reinforcement-learning genetic-algorithm machine-learning machine-learning-from-scratch

Last synced: 23 Dec 2024

https://github.com/eriklindernoren/ML-From-Scratch

Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

data-mining data-science deep-learning deep-reinforcement-learning genetic-algorithm machine-learning machine-learning-from-scratch

Last synced: 26 Oct 2024

https://github.com/fastai/fastbook

The fastai book, published as Jupyter Notebooks

book data-science deep-learning fastai machine-learning notebooks python

Last synced: 23 Dec 2024

https://github.com/qax-os/excelize

Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets

analytics chart data-science ecma-376 excel excelize formula go golang microsoft office ooxml openxml spreadsheet statistics table vba visualization xlsx xml

Last synced: 23 Dec 2024

https://github.com/ipython/ipython

Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.

closember data-science hacktoberfest ipython jupyter notebook python repl spec-0

Last synced: 23 Dec 2024

https://github.com/prefecthq/prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.

automation data data-engineering data-ops data-science infrastructure ml-ops observability orchestration pipeline prefect python workflow workflow-engine

Last synced: 23 Dec 2024

https://github.com/PrefectHQ/prefect

Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines

automation data data-engineering data-ops data-science infrastructure ml-ops observability orchestration pipeline prefect python workflow workflow-engine

Last synced: 29 Oct 2024

https://github.com/dair-ai/ml-youtube-courses

📺 Discover the latest machine learning / AI courses on YouTube.

ai data-science deep-learning machine-learning natural-language-processing nlp

Last synced: 03 Dec 2024

https://github.com/dair-ai/ML-YouTube-Courses

📺 Discover the latest machine learning / AI courses on YouTube.

ai data-science deep-learning machine-learning natural-language-processing nlp

Last synced: 25 Oct 2024

https://github.com/sinaptik-ai/pandas-ai

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.

ai csv data data-analysis data-science database datalake gpt-3 gpt-4 llm pandas sql

Last synced: 23 Dec 2024

https://github.com/Sinaptik-AI/pandas-ai

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.

ai csv data data-analysis data-science database datalake gpt-3 gpt-4 llm pandas sql

Last synced: 29 Oct 2024

https://github.com/mwaskom/seaborn

Statistical data visualization in Python

data-science data-visualization matplotlib pandas python

Last synced: 23 Dec 2024

https://github.com/rasbt/python-machine-learning-book

The "Python Machine Learning (1st edition)" book code repository and info resource

data-mining data-science logistic-regression machine-learning machine-learning-algorithms neural-network python scikit-learn

Last synced: 24 Dec 2024

https://github.com/allenai/allennlp

An open-source NLP research library, built on PyTorch.

data-science deep-learning natural-language-processing nlp python pytorch

Last synced: 29 Sep 2024

https://github.com/openrefine/openrefine

OpenRefine is a free, open source power tool for working with messy data and improving it

data-analysis data-science data-wrangling datacleaning datacleansing datajournalism datamining java journalism opendata reconciliation wikidata

Last synced: 23 Dec 2024

https://github.com/OpenRefine/OpenRefine

OpenRefine is a free, open source power tool for working with messy data and improving it

data-analysis data-science data-wrangling datacleaning datacleansing datajournalism datamining java journalism opendata reconciliation wikidata

Last synced: 27 Oct 2024

https://github.com/trinodb/trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

analytics big-data data-science database databases datalake delta-lake distributed-database distributed-systems hadoop hive iceberg java jdbc presto prestodb query-engine sql trino

Last synced: 23 Dec 2024

https://github.com/fastai/numerical-linear-algebra

Free online textbook of Jupyter notebooks for fast.ai Computational Linear Algebra course

algorithms data-science deep-learning linear-algebra machine-learning numpy python

Last synced: 24 Dec 2024

https://github.com/aws/amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.

aws data-science deep-learning examples inference jupyter-notebook machine-learning mlops reinforcement-learning sagemaker training

Last synced: 23 Dec 2024

https://github.com/dair-ai/ml-papers-of-the-week

🔥Highlighting the top ML papers every week.

ai data-science deeplearning machine-learning nlp

Last synced: 03 Dec 2024

https://github.com/tangyudi/ai-learn

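AI learning roadmap: nearly 200 hands-on case studies and projects, with free accompanying course materials. Suitable for complete beginners and geared toward job-ready practice. Covers popular areas including: Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch, tensorflow, machine-learning, deep-learning, data-analysis, data-mining, mathematics, data-science, artificial-intelligence, python, tensorflow, tensorflow2, caffe, keras, pytorch, algorithm, numpy, pandas, matplotlib, seaborn, nlp, cv, and more.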

algorithm artificial-intelligence caffe cv data-analysis data-mining data-science deep-learning keras machine-learning mathematics matplotlib nlp numpy pandas python pytorch seaborn tensorflow tensorflow2

Last synced: 24 Dec 2024

https://github.com/tangyudi/Ai-Learn

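AI learning roadmap: nearly 200 hands-on case studies and projects, with free accompanying course materials. Suitable for complete beginners and geared toward job-ready practice. Covers popular areas including: Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch, tensorflow, machine-learning, deep-learning, data-analysis, data-mining, mathematics, data-science, artificial-intelligence, python, tensorflow, tensorflow2, caffe, keras, pytorch, algorithm, numpy, pandas, matplotlib, seaborn, nlp, cv, and more.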

algorithm artificial-intelligence caffe cv data-analysis data-mining data-science deep-learning keras machine-learning mathematics matplotlib nlp numpy pandas python pytorch seaborn tensorflow tensorflow2

Last synced: 14 Nov 2024

https://github.com/modin-project/modin

Modin: Scale your Pandas workflows by changing a single line of code

analytics data-science dataframe datascience distributed modin pandas python sql

Last synced: 29 Oct 2024

https://github.com/akfamily/akshare

AKShare is an elegant and simple open-source financial data interface library for Python, built for human beings!

academic akshare asset-pricing bond currency data data-analysis data-science datasets economic-data economics finance finance-api financial-data fundamental futures option quant stock

Last synced: 23 Dec 2024

https://github.com/tflearn/tflearn

Deep learning library featuring a higher-level API for TensorFlow.

data-science deep-learning machine-learning neural-network tensorflow tflearn

Last synced: 23 Dec 2024

https://github.com/chiphuyen/machine-learning-systems-design

A booklet on machine learning systems design with exercises. NOT the repo for the book "Designing Machine Learning Systems"

data-science machine-learning-production mlops

Last synced: 04 Dec 2024

https://github.com/goplus/gop

The Go+ programming language is designed for engineering, STEM education, and data science. Our vision is to enable everyone to become a builder of the digital world.

data-science engineering golang gop goplus low-code programming-language scientific-computing stem stem-education

Last synced: 23 Dec 2024

https://github.com/dair-ai/ML-Papers-of-the-Week

🔥Highlighting the top ML papers every week.

ai data-science deeplearning machine-learning nlp

Last synced: 27 Oct 2024

https://github.com/marimo-team/marimo

A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.

artificial-intelligence dag data-science data-visualization dataflow developer-tools machine-learning notebooks pipeline python reactive web-app

Last synced: 23 Dec 2024

https://github.com/vaexio/vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀

bigdata data-science dataframe hdf5 machine-learning machinelearning memory-mapped-file pyarrow python tabular-data visualization

Last synced: 23 Dec 2024

https://github.com/activeloopai/deeplake

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

ai computer-vision cv data-science datalake datasets deep-learning image-processing langchain large-language-models llm machine-learning ml mlops multi-modal python pytorch tensorflow vector-database vector-search

Last synced: 23 Dec 2024

https://github.com/hugoblox/hugo-blox-builder

🚨 GROW YOUR AUDIENCE WITH HUGOBLOX! 🚀 HugoBlox is an easy, fast no-code website builder for researchers, entrepreneurs, data scientists, and developers. Build stunning sites in minutes with drag-and-drop editing, customizable templates, and built-in SEO tools.

academic blog blog-engine cms data-science documentation-tool github-pages hugo hugo-theme jupyter netlify open-science page-builder portfolio r rmarkdown rstudio static-site-generator theme website-builder

Last synced: 23 Dec 2024

https://github.com/activeloopai/Hub

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

ai computer-vision cv data-science datalake datasets deep-learning image-processing langchain large-language-models llm machine-learning ml mlops multi-modal python pytorch tensorflow vector-database vector-search

Last synced: 08 Dec 2024