Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2024-12-23 00:06:28 UTC
https://github.com/microsoft/ml-for-beginners
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
data-science education machine-learning machine-learning-algorithms machinelearning machinelearning-python microsoft-for-beginners ml python r scikit-learn scikit-learn-python
Last synced: 23 Dec 2024
https://microsoft.github.io/ML-For-Beginners/
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
data-science education machine-learning machine-learning-algorithms machinelearning machinelearning-python ml python r scikit-learn scikit-learn-python
Last synced: 01 Nov 2024
https://github.com/microsoft/ML-For-Beginners
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
data-science education machine-learning machine-learning-algorithms machinelearning machinelearning-python ml python r scikit-learn scikit-learn-python
Last synced: 27 Oct 2024
https://github.com/apache/incubator-superset
Apache Superset is a Data Visualization and Data Exploration Platform
analytics apache apache-superset asf bi business-analytics business-intelligence data-analysis data-analytics data-engineering data-science data-visualization data-viz flask python react sql-editor superset
Last synced: 09 Dec 2024
https://github.com/apache/superset
Apache Superset is a Data Visualization and Data Exploration Platform
analytics apache apache-superset asf bi business-analytics business-intelligence data-analysis data-analytics data-engineering data-science data-visualization data-viz flask python react sql-editor superset
Last synced: 23 Dec 2024
https://github.com/airbnb/caravel
Apache Superset is a Data Visualization and Data Exploration Platform
analytics apache apache-superset asf bi business-analytics business-intelligence data-analysis data-analytics data-engineering data-science data-visualization data-viz flask python react sql-editor superset
Last synced: 23 Nov 2024
https://github.com/keras-team/keras
Deep Learning for humans
data-science deep-learning jax machine-learning neural-networks python pytorch tensorflow
Last synced: 23 Dec 2024
https://github.com/scikit-learn/scikit-learn
scikit-learn: machine learning in Python
data-analysis data-science machine-learning python statistics
Last synced: 23 Dec 2024
https://github.com/pandas-dev/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
alignment data-analysis data-science flexible pandas python
Last synced: 23 Dec 2024
https://github.com/apache/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
airflow apache apache-airflow automation dag data-engineering data-integration data-orchestrator data-pipelines data-science elt etl machine-learning mlops orchestration python scheduler workflow workflow-engine workflow-orchestration
Last synced: 23 Dec 2024
https://github.com/apache/incubator-airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
airflow apache apache-airflow automation dag data-engineering data-integration data-orchestrator data-pipelines data-science elt etl machine-learning mlops orchestration python scheduler workflow workflow-engine workflow-orchestration
Last synced: 23 Nov 2024
https://github.com/gokumohandas/made-with-ml
Learn how to design, develop, deploy and iterate on production-grade ML applications.
data-engineering data-quality data-science deep-learning distributed-ml distributed-training llms machine-learning mlops natural-language-processing python pytorch ray
Last synced: 23 Dec 2024
https://github.com/GokuMohandas/Made-With-ML
Learn how to design, develop, deploy and iterate on production-grade ML applications.
data-engineering data-quality data-science deep-learning distributed-ml distributed-training llms machine-learning mlops natural-language-processing python pytorch ray
Last synced: 27 Oct 2024
https://github.com/streamlit/streamlit
Streamlit — A faster way to build and share data apps.
data-analysis data-science data-visualization deep-learning developer-tools machine-learning python streamlit
Last synced: 23 Dec 2024
https://github.com/ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
automl data-science deep-learning deployment distributed hyperparameter-optimization hyperparameter-search java llm-serving machine-learning model-selection optimization parallel python pytorch ray reinforcement-learning rllib serving tensorflow
Last synced: 23 Dec 2024
https://github.com/gradio-app/gradio
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
data-analysis data-science data-visualization deep-learning deploy gradio gradio-interface hacktoberfest interface machine-learning models python python-notebook ui ui-components
Last synced: 23 Dec 2024
https://github.com/explosion/spacy
💫 Industrial-strength Natural Language Processing (NLP) in Python
ai artificial-intelligence cython data-science deep-learning entity-linking machine-learning named-entity-recognition natural-language-processing neural-network neural-networks nlp nlp-library python spacy text-classification tokenization
Last synced: 23 Dec 2024
https://github.com/explosion/spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
ai artificial-intelligence cython data-science deep-learning entity-linking machine-learning named-entity-recognition natural-language-processing neural-network neural-networks nlp nlp-library python spacy text-classification tokenization
Last synced: 26 Oct 2024
https://github.com/amai-gmbh/ai-expert-roadmap
Roadmap to becoming an Artificial Intelligence Expert in 2022
ai ai-roadmap artificial-intelligence data-analysis data-science deep-learning machine-learning neural-network roadmap study-plan
Last synced: 17 Dec 2024
https://github.com/AMAI-GmbH/AI-Expert-Roadmap
Roadmap to becoming an Artificial Intelligence Expert in 2022
ai ai-roadmap artificial-intelligence data-analysis data-science deep-learning machine-learning neural-network roadmap study-plan
Last synced: 25 Oct 2024
https://github.com/microsoft/data-science-for-beginners
10 Weeks, 20 Lessons, Data Science for All!
data-analysis data-science data-visualization microsoft-for-beginners pandas python
Last synced: 23 Dec 2024
https://github.com/microsoft/Data-Science-For-Beginners
10 Weeks, 20 Lessons, Data Science for All!
data-analysis data-science data-visualization pandas python
Last synced: 30 Oct 2024
https://github.com/Lightning-AI/pytorch-lightning
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
ai artificial-intelligence data-science deep-learning machine-learning python pytorch
Last synced: 29 Oct 2024
https://github.com/lightning-ai/pytorch-lightning
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
ai artificial-intelligence data-science deep-learning machine-learning python pytorch
Last synced: 23 Dec 2024
https://github.com/donnemartin/data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
aws big-data caffe data-science deep-learning hadoop kaggle keras machine-learning mapreduce matplotlib numpy pandas python scikit-learn scipy spark tensorflow theano
Last synced: 23 Dec 2024
https://github.com/eugeneyan/applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
applied-data-science applied-machine-learning computer-vision data-discovery data-engineering data-quality data-science deep-learning machine-learning natural-language-processing production recsys reinforcement-learning search
Last synced: 23 Nov 2024
https://github.com/camdavidsonpilon/probabilistic-programming-and-bayesian-methods-for-hackers
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
bayesian-methods data-science jupyter-notebook mathematical-analysis pymc statistics
Last synced: 23 Dec 2024
https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
bayesian-methods data-science jupyter-notebook mathematical-analysis pymc statistics
Last synced: 25 Oct 2024
https://github.com/okulbilisim/awesome-datascience
📝 An awesome Data Science repository for learning data science and applying it to real-world problems.
analytics awesome-list data-mining data-science data-scientists data-visualization deep-learning hacktoberfest machine-learning science
Last synced: 16 Nov 2024
https://github.com/d2l-ai/d2l-en
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
book computer-vision data-science deep-learning gaussian-processes hyperparameter-optimization jax kaggle keras machine-learning mxnet natural-language-processing notebook python pytorch recommender-system reinforcement-learning tensorflow
Last synced: 23 Dec 2024
https://github.com/eriklindernoren/ml-from-scratch
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
data-mining data-science deep-learning deep-reinforcement-learning genetic-algorithm machine-learning machine-learning-from-scratch
Last synced: 23 Dec 2024
https://github.com/eriklindernoren/ML-From-Scratch
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
data-mining data-science deep-learning deep-reinforcement-learning genetic-algorithm machine-learning machine-learning-from-scratch
Last synced: 26 Oct 2024
https://github.com/plotly/dash
Data Apps & Dashboards for Python. No JavaScript Required.
bioinformatics charting dash data-science data-visualization finance flask gui-framework jupyter modeling plotly plotly-dash productivity python react rstats technical-computing web-app
Last synced: 23 Dec 2024
https://github.com/fastai/fastbook
The fastai book, published as Jupyter Notebooks
book data-science deep-learning fastai machine-learning notebooks python
Last synced: 23 Dec 2024
https://github.com/matplotlib/matplotlib
matplotlib: plotting with Python
data-science data-visualization gtk matplotlib plotting python qt tk wx
Last synced: 23 Dec 2024
https://github.com/ashishpatel26/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code
500 AI Machine learning Deep learning Computer vision NLP Projects with code
artificial-intelligence artificial-intelligence-projects awesome computer-vision computer-vision-project data-science deep-learning deep-learning-project machine-learning machine-learning-projects nlp nlp-projects python
Last synced: 25 Oct 2024
https://github.com/ashishpatel26/500-ai-machine-learning-deep-learning-computer-vision-nlp-projects-with-code
500 AI Machine learning Deep learning Computer vision NLP Projects with code
artificial-intelligence artificial-intelligence-projects awesome computer-vision computer-vision-project data-science deep-learning deep-learning-project machine-learning machine-learning-projects nlp nlp-projects python
Last synced: 25 Sep 2024
https://github.com/qax-os/excelize
Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets
analytics chart data-science ecma-376 excel excelize formula go golang microsoft office ooxml openxml spreadsheet statistics table vba visualization xlsx xml
Last synced: 23 Dec 2024
https://github.com/ml-tooling/best-of-ml-python
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
automl chatgpt data-analysis data-science data-visualization data-visualizations deep-learning gpt gpt-3 jax keras machine-learning ml nlp python pytorch scikit-learn tensorflow transformer
Last synced: 23 Dec 2024
https://github.com/recommenders-team/recommenders
Best Practices on Recommendation Systems
artificial-intelligence azure data-science deep-learning jupyter-notebook kubernetes machine-learning microsoft operationalization python ranking rating recommendation recommendation-algorithm recommendation-engine recommendation-system recommender tutorial
Last synced: 23 Dec 2024
https://github.com/microsoft/recommenders
Best Practices on Recommendation Systems
artificial-intelligence azure data-science deep-learning jupyter-notebook kubernetes machine-learning microsoft operationalization python ranking rating recommendation recommendation-algorithm recommendation-engine recommendation-system recommender tutorial
Last synced: 06 Dec 2024
https://github.com/afshinea/stanford-cs-229-machine-learning
VIP cheatsheets for Stanford's CS 229 Machine Learning
cheatsheet cs229 data-science deep-learning machine-learning ml-cheatsheet supervised-learning unsupervised-learning
Last synced: 16 Nov 2024
https://github.com/ipython/ipython
Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
closember data-science hacktoberfest ipython jupyter notebook python repl spec-0
Last synced: 23 Dec 2024
https://github.com/prefecthq/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
automation data data-engineering data-ops data-science infrastructure ml-ops observability orchestration pipeline prefect python workflow workflow-engine
Last synced: 23 Dec 2024
https://github.com/piskvorky/gensim
Topic Modelling for Humans
data-mining data-science document-similarity fasttext gensim information-retrieval machine-learning natural-language-processing neural-network nlp python topic-modeling word-embeddings word-similarity word2vec
Last synced: 24 Dec 2024
https://github.com/PrefectHQ/prefect
Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines
automation data data-engineering data-ops data-science infrastructure ml-ops observability orchestration pipeline prefect python workflow workflow-engine
Last synced: 29 Oct 2024
https://github.com/dair-ai/ml-youtube-courses
📺 Discover the latest machine learning / AI courses on YouTube.
ai data-science deep-learning machine-learning natural-language-processing nlp
Last synced: 03 Dec 2024
https://github.com/dair-ai/ML-YouTube-Courses
📺 Discover the latest machine learning / AI courses on YouTube.
ai data-science deep-learning machine-learning natural-language-processing nlp
Last synced: 25 Oct 2024
https://github.com/iterative/dvc
🦉 Data Versioning and ML Experiments
ai data-science data-version-control developer-tools machine-learning reproducibility unstructured-data
Last synced: 23 Dec 2024
https://github.com/Microsoft/nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
automated-machine-learning automl bayesian-optimization data-science deep-learning deep-neural-network distributed feature-engineering hyperparameter-optimization hyperparameter-tuning machine-learning machine-learning-algorithms mlops model-compression nas neural-architecture-search neural-network python pytorch tensorflow
Last synced: 09 Nov 2024
https://github.com/microsoft/nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
automated-machine-learning automl bayesian-optimization data-science deep-learning deep-neural-network distributed feature-engineering hyperparameter-optimization hyperparameter-tuning machine-learning machine-learning-algorithms mlops model-compression nas neural-architecture-search neural-network python pytorch tensorflow
Last synced: 29 Sep 2024
https://github.com/virgili0/virgilio
Your new Mentor for Data Science E-Learning.
business-intelligence computer-vision data-science datascience guide guidelines hacktoberfest learning learning-python machine-learning machine-vision nlp path python scikit-learn statistics study studypath tensorflow virgilio
Last synced: 17 Dec 2024
https://github.com/sinaptik-ai/pandas-ai
Chat with your database (SQL, CSV, pandas, polars, MongoDB, NoSQL, etc.). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
ai csv data data-analysis data-science database datalake gpt-3 gpt-4 llm pandas sql
Last synced: 23 Dec 2024
https://github.com/stefan-jansen/machine-learning-for-trading
Code for Machine Learning for Algorithmic Trading, 2nd edition.
artificial-intelligence data-science deep-learning finance investment investment-strategies machine-learning ml4t-workflow synthetic-data trading trading-agent trading-strategies
Last synced: 23 Dec 2024
https://github.com/virgili0/Virgilio
Your new Mentor for Data Science E-Learning.
business-intelligence computer-vision data-science datascience guide guidelines hacktoberfest learning learning-python machine-learning machine-vision nlp path python scikit-learn statistics study studypath tensorflow virgilio
Last synced: 26 Oct 2024
https://github.com/newTendermint/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
awesome awesome-list bigdata data data-analytics data-science data-stream data-visualization data-warehouse database distributed-database series-database stream-processing streaming-data visualize-data
Last synced: 13 Dec 2024
https://github.com/Sinaptik-AI/pandas-ai
Chat with your database (SQL, CSV, pandas, polars, MongoDB, NoSQL, etc.). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
ai csv data data-analysis data-science database datalake gpt-3 gpt-4 llm pandas sql
Last synced: 29 Oct 2024
https://github.com/mwaskom/seaborn
Statistical data visualization in Python
data-science data-visualization matplotlib pandas python
Last synced: 23 Dec 2024
https://github.com/ydataai/ydata-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
big-data-analytics data-analysis data-exploration data-profiling data-quality data-science deep-learning eda exploration exploratory-data-analysis hacktoberfest html-report jupyter jupyter-notebook machine-learning pandas pandas-dataframe pandas-profiling python statistics
Last synced: 23 Dec 2024
https://github.com/ydataai/pandas-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
big-data-analytics data-analysis data-exploration data-profiling data-quality data-science deep-learning eda exploration exploratory-data-analysis hacktoberfest html-report jupyter jupyter-notebook machine-learning pandas pandas-dataframe pandas-profiling python statistics
Last synced: 14 Dec 2024
https://github.com/rasbt/python-machine-learning-book
The "Python Machine Learning (1st edition)" book code repository and info resource
data-mining data-science logistic-regression machine-learning machine-learning-algorithms neural-network python scikit-learn
Last synced: 17 Dec 2024
https://github.com/dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.
analytics dagster data-engineering data-integration data-orchestrator data-pipelines data-science etl metadata mlops orchestration python scheduler workflow workflow-automation
Last synced: 23 Dec 2024
https://github.com/allenai/allennlp
An open-source NLP research library, built on PyTorch.
data-science deep-learning natural-language-processing nlp python pytorch
Last synced: 29 Sep 2024
https://github.com/ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
computer-vision data-centric data-science deep deep-learning deeplearning fine-tuning learning llama llama2 llm llm-training machine-learning machinelearning mistral ml natural-language natural-language-processing neural-network pytorch
Last synced: 23 Dec 2024
https://github.com/openrefine/openrefine
OpenRefine is a free, open source power tool for working with messy data and improving it
data-analysis data-science data-wrangling datacleaning datacleansing datajournalism datamining java journalism opendata reconciliation wikidata
Last synced: 23 Dec 2024
https://github.com/OpenRefine/OpenRefine
OpenRefine is a free, open source power tool for working with messy data and improving it
data-analysis data-science data-wrangling datacleaning datacleansing datajournalism datamining java journalism opendata reconciliation wikidata
Last synced: 27 Oct 2024
https://github.com/trinodb/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
analytics big-data data-science database databases datalake delta-lake distributed-database distributed-systems hadoop hive iceberg java jdbc presto prestodb query-engine sql trino
Last synced: 23 Dec 2024
https://github.com/fastai/numerical-linear-algebra
Free online textbook of Jupyter notebooks for fast.ai Computational Linear Algebra course
algorithms data-science deep-learning linear-algebra machine-learning numpy python
Last synced: 17 Dec 2024
https://github.com/statsmodels/statsmodels
Statsmodels: statistical modeling and econometrics in Python
count-model data-analysis data-science econometrics forecasting generalized-linear-models hypothesis-testing prediction python regression-models robust-estimation statistics timeseries-analysis
Last synced: 23 Dec 2024
https://github.com/aws/amazon-sagemaker-examples
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
aws data-science deep-learning examples inference jupyter-notebook machine-learning mlops reinforcement-learning sagemaker training
Last synced: 23 Dec 2024
https://github.com/lexfridman/mit-deep-learning
Tutorials, assignments, and competitions for MIT Deep Learning related courses.
artificial-intelligence data-science deep-learning deep-reinforcement-learning deep-rl deeplearning jupyter-notebooks machine-learning mit neural-networks segmentation self-driving-cars tensorflow tensorflow-tutorials
Last synced: 20 Dec 2024
https://github.com/dair-ai/ml-papers-of-the-week
🔥Highlighting the top ML papers every week.
ai data-science deeplearning machine-learning nlp
Last synced: 03 Dec 2024
https://github.com/great-expectations/great_expectations
Always know what to expect from your data.
cleandata data-engineering data-profilers data-profiling data-quality data-science data-unit-tests datacleaner datacleaning dataquality dataunittest eda exploratory-analysis exploratory-data-analysis exploratorydataanalysis mlops pipeline pipeline-debt pipeline-testing pipeline-tests
Last synced: 23 Dec 2024
https://github.com/tangyudi/ai-learn
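An AI learning roadmap collecting nearly 200 hands-on case studies and projects, with free accompanying course materials, suitable for complete beginners and geared toward job-ready practice. Covers popular areas including Python, mathematics, algorithms, machine learning, data analysis, data mining, data science, deep learning, computer vision (CV), natural language processing (NLP), PyTorch, TensorFlow, TensorFlow 2, Caffe, Keras, NumPy, pandas, Matplotlib, and seaborn.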
algorithm artificial-intelligence caffe cv data-analysis data-mining data-science deep-learning keras machine-learning mathematics matplotlib nlp numpy pandas python pytorch seaborn tensorflow tensorflow2
Last synced: 24 Dec 2024
https://github.com/yorko/mlcourse.ai
Open Machine Learning Course
algorithms data-analysis data-science docker ipynb kaggle-inclass machine-learning math matplotlib numpy pandas plotly python scikit-learn scipy seaborn vowpal-wabbit
Last synced: 23 Dec 2024
https://github.com/tangyudi/Ai-Learn
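An AI learning roadmap collecting nearly 200 hands-on case studies and projects, with free accompanying course materials, suitable for complete beginners and geared toward job-ready practice. Covers popular areas including Python, mathematics, algorithms, machine learning, data analysis, data mining, data science, deep learning, computer vision (CV), natural language processing (NLP), PyTorch, TensorFlow, TensorFlow 2, Caffe, Keras, NumPy, pandas, Matplotlib, and seaborn.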
algorithm artificial-intelligence caffe cv data-analysis data-mining data-science deep-learning keras machine-learning mathematics matplotlib nlp numpy pandas python pytorch seaborn tensorflow tensorflow2
Last synced: 14 Nov 2024
https://github.com/modin-project/modin
Modin: Scale your Pandas workflows by changing a single line of code
analytics data-science dataframe datascience distributed modin pandas python sql
Last synced: 29 Oct 2024
https://epistasislab.github.io/tpot/
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
adsp ag066833 aiml alzheimer alzheimers automated-machine-learning automation automl data-science feature-engineering gradient-boosting hyperparameter-optimization machine-learning model-selection nia parameter-tuning python random-forest scikit-learn u01ag066833
Last synced: 13 Nov 2024
https://github.com/epistasislab/tpot
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
adsp ag066833 aiml alzheimer alzheimers automated-machine-learning automation automl data-science feature-engineering gradient-boosting hyperparameter-optimization machine-learning model-selection nia parameter-tuning python random-forest scikit-learn u01ag066833
Last synced: 23 Dec 2024
https://github.com/cleanlab/cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
active-learning annotation data-centric-ai data-cleaning data-curation data-labeling data-profiling data-quality data-science data-validation dataops dataquality datasets exploratory-data-analysis labeling llms noisy-labels out-of-distribution-detection outlier-detection weak-supervision
Last synced: 17 Dec 2024
https://github.com/akfamily/akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
academic akshare asset-pricing bond currency data data-analysis data-science datasets economic-data economics finance finance-api financial-data fundamental futures option quant stock
Last synced: 23 Dec 2024
https://github.com/tflearn/tflearn
Deep learning library featuring a higher-level API for TensorFlow.
data-science deep-learning machine-learning neural-network tensorflow tflearn
Last synced: 23 Dec 2024
https://github.com/microsoft/computervision-recipes
Best Practices, code samples, and documentation for Computer Vision.
artificial-intelligence azure computer-vision convolutional-neural-networks data-science deep-learning image-classification image-processing jupyter-notebook kubernetes machine-learning microsoft object-detection operationalization python similarity tutorial
Last synced: 18 Dec 2024
https://github.com/EpistasisLab/tpot
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
adsp ag066833 aiml alzheimer alzheimers automated-machine-learning automation automl data-science feature-engineering gradient-boosting hyperparameter-optimization machine-learning model-selection nia parameter-tuning python random-forest scikit-learn u01ag066833
Last synced: 25 Oct 2024
https://github.com/Yorko/mlcourse.ai
Open Machine Learning Course
algorithms data-analysis data-science docker ipynb kaggle-inclass machine-learning math matplotlib numpy pandas plotly python scikit-learn scipy seaborn vowpal-wabbit
Last synced: 27 Oct 2024
https://github.com/wandb/wandb
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
ai collaboration data-science data-versioning deep-learning experiment-track hyperparameter-optimization hyperparameter-search hyperparameter-tuning jax keras machine-learning ml-platform mlops model-versioning pytorch reinforcement-learning reproducibility tensorflow
Last synced: 23 Dec 2024
https://github.com/chiphuyen/machine-learning-systems-design
A booklet on machine learning systems design with exercises. NOT the repo for the book "Designing Machine Learning Systems"
data-science machine-learning-production mlops
Last synced: 04 Dec 2024
https://github.com/goplus/gop
The Go+ programming language is designed for engineering, STEM education, and data science. Our vision is to enable everyone to become a builder of the digital world.
data-science engineering golang gop goplus low-code programming-language scientific-computing stem stem-education
Last synced: 23 Dec 2024
https://github.com/alexeygrigorev/data-science-interviews
Data science interview questions and answers
data-science data-science-interviews interview-questions machine-learning
Last synced: 02 Dec 2024
https://github.com/voxel51/fiftyone
Refine high-quality datasets and visual AI models
active-learning artificial-intelligence computer-vision data-centric-ai data-cleaning data-curation data-quality data-science deep-learning developer-tools image-classification machine-learning object-detection python unstructured-data vector-search visualization
Last synced: 30 Oct 2024
https://github.com/dair-ai/ML-Papers-of-the-Week
🔥Highlighting the top ML papers every week.
ai data-science deeplearning machine-learning nlp
Last synced: 27 Oct 2024
https://github.com/marimo-team/marimo
A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.
artificial-intelligence dag data-science data-visualization dataflow developer-tools machine-learning notebooks pipeline python reactive web-app
Last synced: 23 Dec 2024
https://github.com/pycaret/pycaret
An open-source, low-code machine learning library in Python
anomaly-detection citizen-data-scientists classification clustering data-science gpu machine-learning ml pycaret python regression time-series
Last synced: 27 Oct 2024
https://github.com/lazyprogrammer/machine_learning_examples
A collection of machine learning examples and tutorials.
data-science deep-learning machine-learning natural-language-processing python reinforcement-learning
Last synced: 23 Dec 2024
https://github.com/xonsh/xonsh
🐚 Python-powered shell. Full-featured and cross-platform.
bash cli command-line console data-engineering data-science devops fish iterm2 python raspberry-pi security-automation shell windows-terminal xonsh zsh
Last synced: 23 Dec 2024
https://github.com/rapidsai/cudf
cuDF - GPU DataFrame Library
arrow cpp cuda cudf dask data-analysis data-science dataframe gpu pandas pydata python rapids
Last synced: 23 Dec 2024
https://github.com/vaexio/vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
bigdata data-science dataframe hdf5 machine-learning machinelearning memory-mapped-file pyarrow python tabular-data visualization
Last synced: 23 Dec 2024
https://github.com/activeloopai/deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
ai computer-vision cv data-science datalake datasets deep-learning image-processing langchain large-language-models llm machine-learning ml mlops multi-modal python pytorch tensorflow vector-database vector-search
Last synced: 23 Dec 2024
https://github.com/hugoblox/hugo-blox-builder
🚨 GROW YOUR AUDIENCE WITH HUGOBLOX! 🚀 HugoBlox is an easy, fast no-code website builder for researchers, entrepreneurs, data scientists, and developers. Build stunning sites in minutes, with drag-and-drop editing, customizable templates, and built-in SEO tools.
academic blog blog-engine cms data-science documentation-tool github-pages hugo hugo-theme jupyter netlify open-science page-builder portfolio r rmarkdown rstudio static-site-generator theme website-builder
Last synced: 23 Dec 2024
https://github.com/activeloopai/Hub
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
ai computer-vision cv data-science datalake datasets deep-learning image-processing langchain large-language-models llm machine-learning ml mlops multi-modal python pytorch tensorflow vector-database vector-search
Last synced: 08 Dec 2024