Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2024-07-29 13:36:33 UTC
https://microsoft.github.io/ML-For-Beginners/
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
data-science education machine-learning machine-learning-algorithms machinelearning machinelearning-python ml python r scikit-learn scikit-learn-python
Last synced: 01 Aug 2024
https://github.com/microsoft/ML-For-Beginners
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
data-science education machine-learning machine-learning-algorithms machinelearning machinelearning-python ml python r scikit-learn scikit-learn-python
Last synced: 31 Jul 2024
https://github.com/keras-team/keras
Deep Learning for humans
data-science deep-learning jax machine-learning neural-networks python pytorch tensorflow
Last synced: 30 Jul 2024
https://github.com/fchollet/keras
Deep Learning for humans
data-science deep-learning jax machine-learning neural-networks python pytorch tensorflow
Last synced: 04 Aug 2024
https://github.com/apache/incubator-superset
Apache Superset is a Data Visualization and Data Exploration Platform
analytics apache apache-superset asf bi business-analytics business-intelligence data-analysis data-analytics data-engineering data-science data-visualization data-viz flask python react sql-editor superset
Last synced: 02 Aug 2024
https://github.com/airbnb/caravel
Apache Superset is a Data Visualization and Data Exploration Platform
analytics apache apache-superset asf bi business-analytics business-intelligence data-analysis data-analytics data-engineering data-science data-visualization data-viz flask python react sql-editor superset
Last synced: 05 Aug 2024
https://github.com/apache/superset
Apache Superset is a Data Visualization and Data Exploration Platform
analytics apache apache-superset asf bi business-analytics business-intelligence data-analysis data-analytics data-engineering data-science data-visualization data-viz flask python react sql-editor superset
Last synced: 30 Jul 2024
https://github.com/scikit-learn/scikit-learn
scikit-learn: machine learning in Python
data-analysis data-science machine-learning python statistics
Last synced: 30 Jul 2024
https://github.com/pydata/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
alignment data-analysis data-science flexible pandas python
Last synced: 09 Aug 2024
https://github.com/pandas-dev/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
alignment data-analysis data-science flexible pandas python
Last synced: 31 Jul 2024
https://github.com/apache/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
airflow apache apache-airflow automation dag data-engineering data-integration data-orchestrator data-pipelines data-science elt etl machine-learning mlops orchestration python scheduler workflow workflow-engine workflow-orchestration
Last synced: 30 Jul 2024
https://github.com/apache/incubator-airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
airflow apache apache-airflow automation dag data-engineering data-integration data-orchestrator data-pipelines data-science elt etl machine-learning mlops orchestration python scheduler workflow workflow-engine workflow-orchestration
Last synced: 05 Aug 2024
https://github.com/GokuMohandas/Made-With-ML
Learn how to design, develop, deploy and iterate on production-grade ML applications.
data-engineering data-quality data-science deep-learning distributed-ml distributed-training llms machine-learning mlops natural-language-processing python pytorch ray
Last synced: 31 Jul 2024
https://github.com/streamlit/streamlit
Streamlit — A faster way to build and share data apps.
data-analysis data-science data-visualization deep-learning developer-tools machine-learning python streamlit
Last synced: 30 Jul 2024
https://github.com/ray-project/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
automl data-science deep-learning deployment distributed hyperparameter-optimization hyperparameter-search java llm-serving machine-learning model-selection optimization parallel python pytorch ray reinforcement-learning rllib serving tensorflow
Last synced: 30 Jul 2024
https://github.com/gradio-app/gradio
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
data-analysis data-science data-visualization deep-learning deploy gradio gradio-interface hacktoberfest interface machine-learning models python python-notebook ui ui-components
Last synced: 30 Jul 2024
https://github.com/explosion/spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
ai artificial-intelligence cython data-science deep-learning entity-linking machine-learning named-entity-recognition natural-language-processing neural-network neural-networks nlp nlp-library python spacy text-classification tokenization
Last synced: 30 Jul 2024
https://github.com/spacy-io/spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
ai artificial-intelligence cython data-science deep-learning entity-linking machine-learning named-entity-recognition natural-language-processing neural-network neural-networks nlp nlp-library python spacy text-classification tokenization
Last synced: 22 Aug 2024
https://github.com/AMAI-GmbH/AI-Expert-Roadmap
Roadmap to becoming an Artificial Intelligence Expert in 2022
ai ai-roadmap artificial-intelligence data-analysis data-science deep-learning machine-learning neural-network roadmap study-plan
Last synced: 30 Jul 2024
https://github.com/lightning-AI/lightning
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
ai artificial-intelligence data-science deep-learning machine-learning python pytorch
Last synced: 27 Aug 2024
https://github.com/Lightning-AI/pytorch-lightning
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
ai artificial-intelligence data-science deep-learning machine-learning python pytorch
Last synced: 31 Jul 2024
https://github.com/Lightning-AI/lightning
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
ai artificial-intelligence data-science deep-learning machine-learning python pytorch
Last synced: 06 Aug 2024
https://github.com/PyTorchLightning/pytorch-lightning
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
ai artificial-intelligence data-science deep-learning machine-learning python pytorch
Last synced: 08 Aug 2024
https://github.com/camdavidsonpilon/probabilistic-programming-and-bayesian-methods-for-hackers
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
bayesian-methods data-science jupyter-notebook mathematical-analysis pymc statistics
Last synced: 03 Aug 2024
https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
bayesian-methods data-science jupyter-notebook mathematical-analysis pymc statistics
Last synced: 30 Jul 2024
https://github.com/microsoft/Data-Science-For-Beginners
10 Weeks, 20 Lessons, Data Science for All!
data-analysis data-science data-visualization pandas python
Last synced: 31 Jul 2024
https://github.com/donnemartin/data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
aws big-data caffe data-science deep-learning hadoop kaggle keras machine-learning mapreduce matplotlib numpy pandas python scikit-learn scipy spark tensorflow theano
Last synced: 30 Jul 2024
https://github.com/eugeneyan/applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
applied-data-science applied-machine-learning computer-vision data-discovery data-engineering data-quality data-science deep-learning machine-learning natural-language-processing production recsys reinforcement-learning search
Last synced: 31 Jul 2024
https://github.com/okulbilisim/awesome-datascience
📝 An awesome Data Science repository to learn from and apply to real-world problems.
analytics awesome-list data-mining data-science data-scientists data-visualization deep-learning hacktoberfest machine-learning science
Last synced: 03 Aug 2024
https://github.com/eriklindernoren/ML-From-Scratch
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
data-mining data-science deep-learning deep-reinforcement-learning genetic-algorithm machine-learning machine-learning-from-scratch
Last synced: 30 Jul 2024
https://github.com/d2l-ai/d2l-en
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
book computer-vision data-science deep-learning gaussian-processes hyperparameter-optimization jax kaggle keras machine-learning mxnet natural-language-processing notebook python pytorch recommender-system reinforcement-learning tensorflow
Last synced: 31 Jul 2024
https://github.com/fastai/fastbook
The fastai book, published as Jupyter Notebooks
book data-science deep-learning fastai machine-learning notebooks python
Last synced: 30 Jul 2024
https://github.com/plotly/dash
Data Apps & Dashboards for Python. No JavaScript Required.
bioinformatics charting dash data-science data-visualization finance flask gui-framework julia jupyter modeling plotly plotly-dash productivity python r react rstats technical-computing web-app
Last synced: 30 Jul 2024
https://github.com/matplotlib/matplotlib
matplotlib: plotting with Python
data-science data-visualization gtk hacktoberfest matplotlib plotting python qt tk wx
Last synced: 30 Jul 2024
https://github.com/ashishpatel26/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code
500 AI Machine learning Deep learning Computer vision NLP Projects with code
artificial-intelligence artificial-intelligence-projects awesome computer-vision computer-vision-project data-science deep-learning deep-learning-project machine-learning machine-learning-projects nlp nlp-projects python
Last synced: 30 Jul 2024
https://github.com/recommenders-team/recommenders
Best Practices on Recommendation Systems
artificial-intelligence azure data-science deep-learning jupyter-notebook kubernetes machine-learning microsoft operationalization python ranking rating recommendation recommendation-algorithm recommendation-engine recommendation-system recommender tutorial
Last synced: 30 Jul 2024
https://github.com/microsoft/recommenders
Best Practices on Recommendation Systems
artificial-intelligence azure data-science deep-learning jupyter-notebook kubernetes machine-learning microsoft operationalization python ranking rating recommendation recommendation-algorithm recommendation-engine recommendation-system recommender tutorial
Last synced: 05 Aug 2024
https://github.com/Luxurioust/excelize
Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets
analytics chart data-science ecma-376 excel excelize formula go golang microsoft office ooxml openxml spreadsheet statistics table vba visualization xlsx xml
Last synced: 05 Aug 2024
https://github.com/360EntSecGroup-Skylar/excelize
Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets
analytics chart data-science ecma-376 excel excelize formula go golang microsoft office ooxml openxml spreadsheet statistics table vba visualization xlsx xml
Last synced: 10 Aug 2024
https://github.com/xuri/excelize
Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets
analytics chart data-science ecma-376 excel excelize formula go golang microsoft office ooxml openxml spreadsheet statistics table vba visualization xlsx xml
Last synced: 05 Aug 2024
https://github.com/qax-os/excelize
Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets
analytics chart data-science ecma-376 excel excelize formula go golang microsoft office ooxml openxml spreadsheet statistics table vba visualization xlsx xml
Last synced: 30 Jul 2024
https://github.com/afshinea/stanford-cs-229-machine-learning
VIP cheatsheets for Stanford's CS 229 Machine Learning
cheatsheet cs229 data-science deep-learning machine-learning ml-cheatsheet supervised-learning unsupervised-learning
Last synced: 01 Aug 2024
https://github.com/ipython/ipython
Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
closember data-science hacktoberfest ipython jupyter notebook python repl spec-0
Last synced: 31 Jul 2024
https://github.com/ml-tooling/best-of-ml-python
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
automl chatgpt data-analysis data-science data-visualization data-visualizations deep-learning gpt gpt-3 jax keras machine-learning ml nlp python pytorch scikit-learn tensorflow transformer
Last synced: 30 Jul 2024
https://github.com/RaRe-Technologies/gensim
Topic Modelling for Humans
data-mining data-science document-similarity fasttext gensim information-retrieval machine-learning natural-language-processing neural-network nlp python topic-modeling word-embeddings word-similarity word2vec
Last synced: 04 Aug 2024
https://github.com/piskvorky/gensim
Topic Modelling for Humans
data-mining data-science document-similarity fasttext gensim information-retrieval machine-learning natural-language-processing neural-network nlp python topic-modeling word-embeddings word-similarity word2vec
Last synced: 30 Jul 2024
https://github.com/rare-technologies/gensim
Topic Modelling for Humans
data-mining data-science document-similarity fasttext gensim information-retrieval machine-learning natural-language-processing neural-network nlp python topic-modeling word-embeddings word-similarity word2vec
Last synced: 07 Aug 2024
https://github.com/PrefectHQ/prefect
Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines
automation data data-engineering data-ops data-science infrastructure ml-ops observability orchestration pipeline prefect python workflow workflow-engine
Last synced: 31 Jul 2024
https://github.com/dair-ai/ML-YouTube-Courses
📺 Discover the latest machine learning / AI courses on YouTube.
ai data-science deep-learning machine-learning natural-language-processing nlp
Last synced: 30 Jul 2024
https://github.com/Microsoft/nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
automated-machine-learning automl bayesian-optimization data-science deep-learning deep-neural-network distributed feature-engineering hyperparameter-optimization hyperparameter-tuning machine-learning machine-learning-algorithms mlops model-compression nas neural-architecture-search neural-network python pytorch tensorflow
Last synced: 01 Aug 2024
https://github.com/microsoft/nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
automated-machine-learning automl bayesian-optimization data-science deep-learning deep-neural-network distributed feature-engineering hyperparameter-optimization hyperparameter-tuning machine-learning machine-learning-algorithms mlops model-compression nas neural-architecture-search neural-network python pytorch tensorflow
Last synced: 31 Jul 2024
https://github.com/virgili0/Virgilio
Your new Mentor for Data Science E-Learning.
business-intelligence computer-vision data-science datascience guide guidelines hacktoberfest learning learning-python machine-learning machine-vision nlp path python scikit-learn statistics study studypath tensorflow virgilio
Last synced: 30 Jul 2024
https://github.com/iterative/dvc
🦉 ML Experiments and Data Management with Git
ai collaboration data-science data-version-control developer-tools git hacktoberfest machine-learning python reproducibility
Last synced: 30 Jul 2024
https://github.com/0xnr/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
awesome awesome-list bigdata data data-analytics data-science data-stream data-visualization data-warehouse database distributed-database series-database stream-processing streaming-data visualize-data
Last synced: 31 Jul 2024
https://github.com/onurakpolat/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
awesome awesome-list bigdata data data-analytics data-science data-stream data-visualization data-warehouse database distributed-database series-database stream-processing streaming-data visualize-data
Last synced: 31 Jul 2024
https://github.com/mwaskom/seaborn
Statistical data visualization in Python
data-science data-visualization matplotlib pandas python
Last synced: 30 Jul 2024
https://github.com/rasbt/python-machine-learning-book
The "Python Machine Learning (1st edition)" book code repository and info resource
data-mining data-science logistic-regression machine-learning machine-learning-algorithms neural-network python scikit-learn
Last synced: 30 Jul 2024
https://github.com/pandas-profiling/pandas-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
big-data-analytics data-analysis data-exploration data-profiling data-quality data-science deep-learning eda exploration exploratory-data-analysis hacktoberfest html-report jupyter jupyter-notebook machine-learning pandas pandas-dataframe pandas-profiling python statistics
Last synced: 03 Aug 2024
https://github.com/ydataai/ydata-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
big-data-analytics data-analysis data-exploration data-profiling data-quality data-science deep-learning eda exploration exploratory-data-analysis hacktoberfest html-report jupyter jupyter-notebook machine-learning pandas pandas-dataframe pandas-profiling python statistics
Last synced: 30 Jul 2024
https://github.com/stefan-jansen/machine-learning-for-trading
Code for Machine Learning for Algorithmic Trading, 2nd edition.
artificial-intelligence data-science deep-learning finance investment investment-strategies machine-learning ml4t-workflow synthetic-data trading trading-agent trading-strategies
Last synced: 01 Aug 2024
https://github.com/allenai/allennlp
An open-source NLP research library, built on PyTorch.
data-science deep-learning natural-language-processing nlp python pytorch
Last synced: 30 Jul 2024
https://github.com/sinaptik-ai/pandas-ai
Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
ai csv data data-analysis data-science database datalake gpt-3 gpt-4 llm pandas sql
Last synced: 02 Aug 2024
https://github.com/Sinaptik-AI/pandas-ai
Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
ai csv data data-analysis data-science database datalake gpt-3 gpt-4 llm pandas sql
Last synced: 31 Jul 2024
https://github.com/gventuri/pandas-ai
Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
ai csv data data-analysis data-science database datalake gpt-3 gpt-4 llm pandas sql
Last synced: 03 Aug 2024
https://github.com/uber/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
computer-vision data-centric data-science deep deep-learning deeplearning fine-tuning learning llama llama2 llm llm-training machine-learning machinelearning mistral ml natural-language natural-language-processing neural-network pytorch
Last synced: 05 Aug 2024
https://github.com/ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
computer-vision data-centric data-science deep deep-learning deeplearning fine-tuning learning llama llama2 llm llm-training machine-learning machinelearning mistral ml natural-language natural-language-processing neural-network pytorch
Last synced: 30 Jul 2024
https://github.com/dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.
analytics dagster data-engineering data-integration data-orchestrator data-pipelines data-science etl metadata mlops orchestration python scheduler workflow workflow-automation
Last synced: 31 Jul 2024
https://github.com/OpenRefine/OpenRefine
OpenRefine is a free, open source power tool for working with messy data and improving it
data-analysis data-science data-wrangling datacleaning datacleansing datajournalism datamining java journalism opendata reconciliation wikidata
Last synced: 31 Jul 2024
https://github.com/lexfridman/mit-deep-learning
Tutorials, assignments, and competitions for MIT Deep Learning related courses.
artificial-intelligence data-science deep-learning deep-reinforcement-learning deep-rl deeplearning jupyter-notebooks machine-learning mit neural-networks segmentation self-driving-cars tensorflow tensorflow-tutorials
Last synced: 30 Jul 2024
https://github.com/trinodb/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
analytics big-data data-science database databases datalake delta-lake distributed-database distributed-systems hadoop hive iceberg java jdbc presto prestodb query-engine sql trino
Last synced: 31 Jul 2024
https://github.com/fastai/numerical-linear-algebra
Free online textbook of Jupyter notebooks for fast.ai Computational Linear Algebra course
algorithms data-science deep-learning linear-algebra machine-learning numpy python
Last synced: 31 Jul 2024
https://github.com/aws/amazon-sagemaker-examples
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
aws data-science deep-learning examples inference jupyter-notebook machine-learning mlops reinforcement-learning sagemaker training
Last synced: 01 Aug 2024
https://github.com/modin-project/modin
Modin: Scale your Pandas workflows by changing a single line of code
analytics data-science dataframe datascience distributed modin pandas python sql
Last synced: 31 Jul 2024
https://github.com/tflearn/tflearn
Deep learning library featuring a higher-level API for TensorFlow.
data-science deep-learning machine-learning neural-network tensorflow tflearn
Last synced: 31 Jul 2024
https://epistasislab.github.io/tpot/
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
adsp ag066833 aiml alzheimer alzheimers automated-machine-learning automation automl data-science feature-engineering gradient-boosting hyperparameter-optimization machine-learning model-selection nia parameter-tuning python random-forest scikit-learn u01ag066833
Last synced: 02 Aug 2024
https://github.com/epistasislab/tpot
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
adsp ag066833 aiml alzheimer alzheimers automated-machine-learning automation automl data-science feature-engineering gradient-boosting hyperparameter-optimization machine-learning model-selection nia parameter-tuning python random-forest scikit-learn u01ag066833
Last synced: 03 Aug 2024
https://github.com/EpistasisLab/tpot
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
adsp ag066833 aiml alzheimer alzheimers automated-machine-learning automation automl data-science feature-engineering gradient-boosting hyperparameter-optimization machine-learning model-selection nia parameter-tuning python random-forest scikit-learn u01ag066833
Last synced: 30 Jul 2024
https://github.com/rhiever/tpot
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
adsp ag066833 aiml alzheimer alzheimers automated-machine-learning automation automl data-science feature-engineering gradient-boosting hyperparameter-optimization machine-learning model-selection nia parameter-tuning python random-forest scikit-learn u01ag066833
Last synced: 22 Aug 2024
https://github.com/statsmodels/statsmodels
Statsmodels: statistical modeling and econometrics in Python
count-model data-analysis data-science econometrics forecasting generalized-linear-models hypothesis-testing prediction python regression-models robust-estimation statistics timeseries-analysis
Last synced: 30 Jul 2024
https://github.com/Yorko/mlcourse.ai
Open Machine Learning Course
algorithms data-analysis data-science docker ipynb kaggle-inclass machine-learning math matplotlib numpy pandas plotly python scikit-learn scipy seaborn vowpal-wabbit
Last synced: 31 Jul 2024
https://github.com/great-expectations/great_expectations
Always know what to expect from your data.
cleandata data-engineering data-profilers data-profiling data-quality data-science data-unit-tests datacleaner datacleaning dataquality dataunittest eda exploratory-analysis exploratory-data-analysis exploratorydataanalysis mlops pipeline pipeline-debt pipeline-testing pipeline-tests
Last synced: 31 Jul 2024
https://github.com/microsoft/computervision-recipes
Best Practices, code samples, and documentation for Computer Vision.
artificial-intelligence azure computer-vision convolutional-neural-networks data-science deep-learning image-classification image-processing jupyter-notebook kubernetes machine-learning microsoft object-detection operationalization python similarity tutorial
Last synced: 31 Jul 2024
https://github.com/cleanlab/cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
active-learning annotation data-analysis data-centric-ai data-cleaning data-curation data-labeling data-profiling data-quality data-science data-validation dataops dataquality datasets labeling llms noisy-labels out-of-distribution-detection outlier-detection weak-supervision
Last synced: 31 Jul 2024
https://github.com/qiniu/qlang
The Go+ programming language is designed for engineering, STEM education, and data science
data-science engineering golang gop goplus low-code programming-language scientific-computing stem stem-education
Last synced: 03 Aug 2024
https://github.com/goplus/gop
The Go+ programming language is designed for engineering, STEM education, and data science
data-science engineering golang gop goplus low-code programming-language scientific-computing stem stem-education
Last synced: 30 Jul 2024
https://github.com/dair-ai/ML-Papers-of-the-Week
🔥Highlighting the top ML papers every week.
ai data-science deeplearning machine-learning nlp
Last synced: 31 Jul 2024
https://github.com/akfamily/akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
academic akshare asset-pricing bond currency data data-analysis data-science datasets economic-data economics finance finance-api financial-data fundamental futures option quant stock
Last synced: 31 Jul 2024
https://github.com/jindaxiang/akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
academic akshare asset-pricing bond currency data data-analysis data-science datasets economic-data economics finance finance-api financial-data fundamental futures option quant stock
Last synced: 03 Aug 2024
https://github.com/pycaret/pycaret
An open-source, low-code machine learning library in Python
anomaly-detection citizen-data-scientists classification clustering data-science gpu machine-learning ml pycaret python regression time-series
Last synced: 31 Jul 2024
https://github.com/lazyprogrammer/machine_learning_examples
A collection of machine learning examples and tutorials.
data-science deep-learning machine-learning natural-language-processing python reinforcement-learning
Last synced: 30 Jul 2024
https://github.com/wandb/wandb
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
collaboration data-science data-versioning deep-learning experiment-track hyperparameter-optimization hyperparameter-search hyperparameter-tuning jax keras machine-learning ml-platform mlops model-versioning pytorch reinforcement-learning reproducibility tensorflow
Last synced: 01 Aug 2024
https://github.com/vaexio/vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
bigdata data-science dataframe hdf5 machine-learning machinelearning memory-mapped-file pyarrow python tabular-data visualization
Last synced: 30 Jul 2024
https://github.com/blue-yonder/tsfresh
Automatic extraction of relevant features from time series
data-science feature-extraction time-series
Last synced: 30 Jul 2024
https://github.com/yzhao062/pyod
A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)
anomaly anomaly-detection autoencoder data-analysis data-mining data-science deep-learning fraud-detection machine-learning neural-networks novelty-detection out-of-distribution-detection outlier-detection outlier-ensembles outliers python python3 unsupervised-learning
Last synced: 30 Jul 2024
https://github.com/yzhao062/Pyod
A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)
anomaly anomaly-detection autoencoder data-analysis data-mining data-science deep-learning fraud-detection machine-learning neural-networks novelty-detection out-of-distribution-detection outlier-detection outlier-ensembles outliers python python3 unsupervised-learning
Last synced: 01 Aug 2024
https://github.com/xonsh/xonsh
🐚 Python-powered, cross-platform, Unix-gazing shell.
bash cli command-line console data-engineering data-science devops fish hacktoberfest iterm2 prompt python python-shell script security-automation shell terminal windows-terminal xonsh zsh
Last synced: 31 Jul 2024
https://github.com/jackzhenguo/python-small-examples
Say goodbye to dull tutorials: a collection of small, practical Python examples. More quality Python tutorials at https://ai-jupyter.com
data-science machine-learning python python-gui python-web pytorch tensorflow
Last synced: 31 Jul 2024
https://github.com/HugoBlox/hugo-blox-builder
😍 EASILY BUILD THE WEBSITE YOU WANT - NO CODE, JUST MARKDOWN BLOCKS! Easily create any kind of website with blocks, no code required. One app, no dependencies, no JS.
academic blog blog-engine cms data-science documentation-tool github-pages hugo hugo-theme jupyter netlify open-science page-builder portfolio r rmarkdown rstudio static-site-generator theme website-builder
Last synced: 30 Jul 2024
https://github.com/catboost/catboost
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
big-data catboost categorical-features coreml cuda data-mining data-science decision-trees gbdt gbm gpu gpu-computing gradient-boosting kaggle machine-learning python r tutorial
Last synced: 30 Jul 2024
https://github.com/activeloopai/Hub
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
ai computer-vision cv data-science data-version-control datalake datasets deep-learning image-processing langchain large-language-models llm machine-learning ml mlops python pytorch tensorflow vector-database vector-search
Last synced: 10 Aug 2024