Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2024-11-18 00:06:52 UTC
https://github.com/explosion/spacy-stanza
💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
corenlp data-science machine-learning natural-language-processing nlp spacy spacy-pipeline stanford-corenlp stanford-machine-learning stanford-nlp stanza
Last synced: 01 Nov 2024
https://github.com/arvkevi/kneed
Knee point detection in Python 📈
data-analysis data-science elbow-method knee-point python scientific-computing systems
Last synced: 28 Oct 2024
https://github.com/glue-viz/glue
Linked Data Visualizations Across Multiple Files
data-science linked-data python visualization
Last synced: 05 Aug 2024
https://github.com/iterative/mlem
🐶 A tool to package, serve, and deploy any ML model on any platform. Archived to be resurrected one day🤞
cli data-science deployment developer-tools git machine-learning mlem model-registry python
Last synced: 30 Oct 2024
https://github.com/erikaduan/r_tips
A repository of R usage tips for data cleaning, data mining, data visualisation, statistical inference and machine learning
data-science data-visualization machine-learning r rstats statistics
Last synced: 13 Aug 2024
https://github.com/pdpipe/pdpipe
Easy pipelines for pandas DataFrames.
data data-science dataframe dataframes pandas pandas-dataframe pipeline
Last synced: 08 Nov 2024
https://github.com/kennethleungty/Failed-ML
Compilation of high-profile real-world examples of failed machine learning projects
ai artificial-intelligence classification computer-vision data-engineering data-quality data-science deep-learning failed-data-science failed-machine-learning failed-ml fml forecasting machine-learning ml natural-language-processing production recsys regression
Last synced: 05 Nov 2024
https://github.com/run-house/runhouse
The fastest way to iterate and deploy AI workloads on your own infra. Unobtrusive, debuggable, PyTorch-like APIs.
api artificial-intelligence aws azure collaboration data-science deployment distributed fastapi gcp infrastructure machine-learning middleware observability python pytorch ray sagemaker serverless
Last synced: 11 Oct 2024
https://github.com/HunterMcGushion/hyperparameter_hunter
Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
ai artificial-intelligence catboost data-science deep-learning experimentation feature-engineering hyperparameter-optimization hyperparameter-tuning keras lightgbm machine-learning ml neural-network optimization python rgf scikit-learn sklearn xgboost
Last synced: 11 Nov 2024
https://github.com/huntermcgushion/hyperparameter_hunter
Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
ai artificial-intelligence catboost data-science deep-learning experimentation feature-engineering hyperparameter-optimization hyperparameter-tuning keras lightgbm machine-learning ml neural-network optimization python rgf scikit-learn sklearn xgboost
Last synced: 10 Oct 2024
https://github.com/alteryx/evalml
EvalML is an AutoML library written in Python.
automl data-science feature-engineering feature-selection hyperparameter-tuning machine-learning model-selection optimization
Last synced: 09 Nov 2024
https://github.com/scikit-mobility/scikit-mobility
scikit-mobility: mobility analysis in Python
complex-systems data-analysis data-science human-mobility mobility-analysis mobility-flows network-science risk-assessment scikit-mobility statistics synthetic-flows
Last synced: 05 Aug 2024
https://github.com/mrsaeeddev/free-ai-resources
🚀 FREE AI Resources - 🎓 Courses, 👷 Jobs, 📝 Blogs, 🔬 AI Research, and many more - for everyone!
ai artificial-intelligence artificial-neural-networks data data-science data-science-learning data-science-projects deep-learning deep-neural-networks hacktoberfest hacktoberfest2020 machine-learning machine-learning-algorithms machinelearning reinforcement-learning research supervised-learning unsupervised-learning
Last synced: 04 Nov 2024
https://github.com/pymc-labs/pymc-marketing
Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.
btyd buy-till-you-die clv customer-lifetime-value data-science marketing media-mix-modeling mmm python
Last synced: 30 Oct 2024
https://github.com/mrankitgupta/data-analyst-roadmap
I am sharing my Journey of 66DaysofData into Data Analytics by participating in Ken Jee's #66daysofdata challenge
ankit ankit-gupta ankitgupta data-analysis data-analytics data-science data-structures data-visualization excel mongodb mysql pandas powerbi python sql sql-server tableau
Last synced: 12 Oct 2024
https://github.com/bacalhau-project/bacalhau
Compute over Data framework for public, transparent, and optionally verifiable computation
ai-art ai-data-collection ai-pipeline batch-processing bioinformatics-pipeline data-analysis data-engineering data-science decentralized decentralized-computing distributed gene-sequencing insulators iot logging-framework orchestration-framework p2p video-processing
Last synced: 12 Nov 2024
https://github.com/litaotao/ipython-dashboard
A standalone, lightweight web server for building and sharing graphs created in IPython. Built for data science and data analysis practitioners, it aims to provide interactive visualization, collaborative dashboards, and real-time streaming graphs.
dashboard data-science ipython ipython-dashboard notebook visualization
Last synced: 12 Nov 2024
https://github.com/dataproofer/Dataproofer
A proofreader for your data
cli command-line csv data-analysis data-mining data-science excel nodejs spreadsheet
Last synced: 01 Nov 2024
https://github.com/litaotao/IPython-Dashboard
A standalone, lightweight web server for building and sharing graphs created in IPython. Built for data science and data analysis practitioners, it aims to provide interactive visualization, collaborative dashboards, and real-time streaming graphs.
dashboard data-science ipython ipython-dashboard notebook visualization
Last synced: 15 Aug 2024
https://github.com/nicolaskruchten/jupyter_pivottablejs
Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js
data-analysis data-science interactive jupyter-notebook pivot-chart pivot-tables
Last synced: 10 Oct 2024
https://github.com/BiomedSciAI/causallib
A Python package for modular causal inference analysis and model evaluations
causal causal-inference causal-models causality data-science machine-learning ml
Last synced: 30 Oct 2024
https://github.com/trainingbypackt/data-science-projects-with-python
A Case Study Approach to Successful Data Science Projects Using Python, Pandas, and Scikit-Learn
data-science machine-learning numpy pandas pandas-dataframe python scikit-learn
Last synced: 14 Nov 2024
https://github.com/google/edward2
A simple probabilistic programming language.
bayesian-methods data-science deep-learning machine-learning neural-networks probabilistic-programming statistics tensorflow
Last synced: 26 Oct 2024
https://github.com/nannyml/the-little-book-of-ml-metrics
The book every data scientist needs on their desk.
book classification-metrics clustering-metrics computer-vision-metrics data-science machine-learning machine-learning-evaluation machine-learning-metrics nlp-metrics python ranking-metrics regression-metrics
Last synced: 12 Nov 2024
https://github.com/jphall663/interpretable_machine_learning_with_python
Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.
accountability data-mining data-science decision-tree fairness fatml gradient-boosting-machine h2o iml interpretability interpretable interpretable-ai interpretable-machine-learning interpretable-ml lime machine-learning machine-learning-interpretability python transparency xai
Last synced: 13 Nov 2024
https://github.com/odpi/opends4all
OpenDS4All project, hosted by LF AI & Data
data-science jupyter-notebooks materials
Last synced: 09 Nov 2024
https://github.com/rweekly/rweekly.org
R Weekly
blog community data-science data-visualization r rweekly statistics visualization weekly
Last synced: 07 Aug 2024
https://github.com/pm4py/pm4py-core
Public repository for the PM4Py (Process Mining for Python) project.
data-mining data-science machine-learning process-mining python
Last synced: 10 Nov 2024
https://github.com/pm4py/pm4py-source
Public repository for the PM4Py (Process Mining for Python) project.
data-mining data-science machine-learning process-mining python
Last synced: 07 Aug 2024
https://github.com/fmind/mlops-python-package
Kickstart your MLOps initiative with a flexible, robust, and productive Python package.
automation data-pipelines data-science machine-learning mlflow mlops pandera pydantic python
Last synced: 13 Nov 2024
https://github.com/faktionai/awesome-ai-usecases
A list of awesome and proven Artificial Intelligence use cases and applications
Last synced: 13 Oct 2024
https://github.com/fastai/fastai2
Temporary home for fastai v2 while it's being developed
data-science deep-learning fastai jupyter machine-learning nbdev python pytorch
Last synced: 07 Aug 2024
https://github.com/janpfeifer/gonb
GoNB, a Go Notebook Kernel for Jupyter
data-science go golang gonb jupyter jupyter-notebook jupyter-notebook-kernel
Last synced: 22 Oct 2024
https://github.com/yzhao062/combo
(AAAI '20) A Python Toolbox for Machine Learning Model Combination
aggregation data-mining data-science ensemble-learning machine-learning machine-learning-pipelines model-combination pipeline-framework python
Last synced: 13 Nov 2024
https://github.com/TrainingByPackt/Data-Science-Projects-with-Python
A Case Study Approach to Successful Data Science Projects Using Python, Pandas, and Scikit-Learn
data-science machine-learning numpy pandas pandas-dataframe python scikit-learn
Last synced: 08 Nov 2024
https://github.com/krish-adi/barfi
Python Flow Based Programming environment that provides a graphical programming environment.
ai-ml data-science dataflow-programming flow-based-programming framework graphical-programming jupyter jupyter-notebook ml python streamlit
Last synced: 10 Oct 2024
https://github.com/aeturrell/coding-for-economists
This repository hosts the code behind the online book, Coding for Economists.
book data-science econometrics economics economics-models jupyter-notebook learning python research vscode
Last synced: 07 Nov 2024
https://github.com/rstojnic/lazydata
Lazydata: Scalable data dependencies for Python projects
data-science datamanagement machine-learning python
Last synced: 29 Oct 2024
https://github.com/cerndb/dist-keras
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
apache-spark data-parallelism data-science deep-learning distributed-optimizers hadoop keras machine-learning optimization-algorithms tensorflow
Last synced: 28 Sep 2024
https://github.com/Squarespace/datasheets
Read data from, write data to, and modify the formatting of Google Sheets
data data-analytics data-science dataframe google pandas python
Last synced: 26 Oct 2024
https://github.com/milescranmer/symbolicregression.jl
Distributed High-Performance Symbolic Regression in Julia
automl data-science distributed-systems equation-discovery evolutionary-algorithms explainable-ai genetic-algorithm interpretable-ml julia machine-learning sciml symbolic symbolic-computation symbolic-regression
Last synced: 12 Nov 2024
https://github.com/chris-greening/instascrape
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
beginner-friendly data-mining data-science instagram instagram-data instagram-scraper lightweight python python-scraper python3 webscraping
Last synced: 06 Nov 2024
https://github.com/blue-yonder/turbodbc
Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with the Python Database API Specification 2.0.
data-science database exasol numpy odbc pep249 pyodbc python python-database-api speedup
Last synced: 15 Oct 2024
https://github.com/erezsh/preql
An interpreted relational query language that compiles to SQL.
data-science database python query sql
Last synced: 15 Nov 2024
https://github.com/oneapi-src/onedal
oneAPI Data Analytics Library (oneDAL)
ai-inference ai-machine-learning ai-training analytics big-data cpp data-analysis data-science hacktoberfest machine-learning machine-learning-algorithms oneapi onedal swrepo
Last synced: 17 Nov 2024
https://github.com/lgienapp/aquarel
Styling matplotlib made easy
data-science data-visualization matplotlib plotting theme theme-development theming visualization
Last synced: 30 Oct 2024
https://github.com/benedekrozemberczki/datasets
A repository of pretty cool datasets that I collected for network science and machine learning research.
benchmark community-detection data-science dataset deepwalk dimensionality-reduction gcn gnn graph-convolution graph-embedding graph-neural-network graph2vec link-prediction machine-learning network-analysis network-embedding network-science node-classification node-embedding node2vec
Last synced: 14 Nov 2024
https://github.com/oneapi-src/oneDAL
oneAPI Data Analytics Library (oneDAL)
ai-inference ai-machine-learning ai-training analytics big-data cpp data-analysis data-science hacktoberfest machine-learning machine-learning-algorithms oneapi onedal swrepo
Last synced: 25 Oct 2024
https://github.com/sebkrantz/collapse
Advanced and Fast Data Transformation in R
cran data-aggregation data-analysis data-manipulation data-processing data-science data-transformation econometrics high-performance panel-data r rstats scientific-computing statistics time-series weighted weights
Last synced: 13 Nov 2024
https://github.com/SebKrantz/collapse
Advanced and Fast Data Transformation in R
cran data-aggregation data-analysis data-manipulation data-processing data-science data-transformation econometrics high-performance panel-data r rstats scientific-computing statistics time-series weighted weights
Last synced: 11 Nov 2024
https://github.com/tuangauss/DataScienceProjects
Code repository for projects and tutorials in R and Python that cover a variety of topics in data visualization, statistics, sports analytics, and general applications of probability theory.
data-science data-visualization statistics
Last synced: 01 Nov 2024
https://github.com/sforaidl/kd_lib
A PyTorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
algorithm-implementations benchmarking data-science deep-learning-library knowledge-distillation machine-learning model-compression pruning pytorch quantization
Last synced: 29 Oct 2024
https://github.com/erezsh/Preql
An interpreted relational query language that compiles to SQL.
data-science database python query sql
Last synced: 29 Oct 2024
https://github.com/Kotlin/kandy
Kotlin plotting library.
data-science graphics jupyter-notebooks kotlin plot
Last synced: 07 Nov 2024
https://github.com/JuliaStats/GLM.jl
Generalized linear models in Julia
data-science glm julia regression statistical-models statistics
Last synced: 12 Nov 2024
https://github.com/jadianes/data-science-your-way
Ways of doing Data Science Engineering and Machine Learning in R and Python
data-frame data-science data-science-engineering exploratory-data-analysis jupyter machine-learning notebook python r tutorial
Last synced: 16 Nov 2024
https://github.com/DiskFrame/disk.frame
Fast Disk-Based Parallelized Data Manipulation Framework for Larger-than-RAM Data
data data-science large-dataset manipulation-data medium-data r
Last synced: 25 Oct 2024
https://github.com/juliastats/glm.jl
Generalized linear models in Julia
data-science glm julia regression statistical-models statistics
Last synced: 12 Oct 2024
https://github.com/alegonz/baikal
A graph-based functional API for building complex scikit-learn pipelines.
data-science graph-based machine-learning python scikit-learn
Last synced: 15 Nov 2024
https://github.com/graspologic-org/graspologic
Python package for graph statistics
data-science graph graph-statistics machine-learning networks python
Last synced: 12 Nov 2024
https://github.com/dataprofessor/streamlit_freecodecamp
Build 12 Data Apps in Python with Streamlit
data-science exploratory-data-analysis machine-learning python streamlit
Last synced: 10 Oct 2024
https://github.com/JacksonWuxs/DaPy
Easy-to-use data analysis / manipulation framework for humans
analysis data-analysis data-science efficiency pypi python statistical-reports
Last synced: 31 Oct 2024
https://github.com/github/codespaces-jupyter
Explore machine learning and data science with Codespaces
codespaces data-science jupyter-notebook machine-learning
Last synced: 07 Oct 2024
https://github.com/ploomber/jupysql
Better SQL in Jupyter. 📊
bigquery clickhouse data-engineering data-science duckdb hive jupyter mysql polars postgres presto python redshift snowflake spark-sql sql sqlite trino tsql
Last synced: 29 Sep 2024
https://github.com/kkulma/climate-change-data
🌍 A curated list of APIs, open data and ML/AI projects on climate change
climate climate-analysis climate-change climate-data data data-science datascience hacktoberfest python r resources rstats
Last synced: 15 Nov 2024
https://github.com/plotly/dash-cytoscape
Interactive network visualization in Python and Dash, powered by Cytoscape.js
bioinformatics biopython computational-biology cytoscape cytoscapejs dash data-science graph-theory network-graph network-visualization plotly plotly-dash
Last synced: 04 Nov 2024
https://github.com/datacleaner/DataCleaner
The premier open source Data Quality solution
data data-analysis data-science database datacleaner dataquality desktop etl mdm profiling
Last synced: 30 Oct 2024
https://github.com/ahmedfgad/numpycnn
Building Convolutional Neural Networks From Scratch using NumPy
cnn computer-vision conv-layer convnet convolution convolutional-neural-networks data-science filter numpy pygad python relu relu-layer
Last synced: 17 Nov 2024
https://github.com/achuthasubhash/Complete-Life-Cycle-of-a-Data-Science-Project
Complete life cycle of a data science project
analysis data-analysis data-science dataset deep-learning eda exploratory-data-analysis feature-engineering federated-learning machine-learning nlp-models python python-library pytorch reinforcement-learning scraper supervised-learning transfer-learning unsupervised-learning web-scraping
Last synced: 13 Nov 2024
https://github.com/dmbee/seglearn
Python module for machine learning time series
data-science machine-learning python time-series
Last synced: 26 Oct 2024
https://dmbee.github.io/seglearn/
Python module for machine learning time series
data-science machine-learning python time-series
Last synced: 02 Nov 2024
https://github.com/Murgio/Food-Recipe-CNN
Food image to recipe with deep convolutional neural networks.
chef classification cnn convolutional-neural-networks cooking-dishes data-science deep-learning dish food food-classification inceptionv3 jupyter-notebook keras machine-learning python3 recipes recognition tsne vgg vgg16
Last synced: 29 Oct 2024
https://github.com/underneathall/pinferencia
Python + Inference - Model Deployment library in Python. Simplest model inference server ever.
ai artificial-intelligence computer-vision data-science deep-learning huggingface inference inference-server machine-learning model-deployment model-serving modelserver nlp paddlepaddle predict python pytorch serving tensorflow transformers
Last synced: 14 Nov 2024
https://github.com/rpy2/rpy2
Interface to use R from Python
cffi data-science interoperability python r statistics
Last synced: 17 Nov 2024
https://github.com/DaoSword/Time-Series-Forecasting-and-Deep-Learning
Resources about time series forecasting and deep learning.
data-science deep-learning forecasting machine-learning series-data series-forecasting time-series time-series-forecasting
Last synced: 30 Oct 2024
https://github.com/siznax/wptools
Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis
api-client commons data-science glam linked-open-data mediawiki mediawiki-api open-data python restbase wikidata wikimedia-commons wikipedia wikipedia-api
Last synced: 14 Oct 2024
https://github.com/GRAAL-Research/poutyne
A simplified framework and utilities for PyTorch
data-science deep-learning keras machine-learning neural-network python pytorch
Last synced: 30 Oct 2024
https://github.com/LearnDataSci/articles
A repository for the source code, notebooks, data, files, and other assets used in the data science and machine learning articles on LearnDataSci
data-analysis data-science data-visualization machine-learning machine-learning-algorithms machinelearning python
Last synced: 07 Nov 2024
https://github.com/starpig1129/ai-data-analysis-mulitagent
AI-Driven Research Assistant: An advanced multi-agent system for automating complex research processes. Leveraging LangChain, OpenAI GPT, and LangGraph, this tool streamlines hypothesis generation, data analysis, visualization, and report writing. Perfect for researchers and data scientists seeking to enhance their workflow and productivity.
agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python
Last synced: 17 Sep 2024
https://github.com/rushter/heamy
A set of useful tools for competitive data science.
data-science machine-learning stacking
Last synced: 13 Nov 2024
https://github.com/firmai/pandapy
PandaPy has the speed of NumPy and the usability of Pandas, 10x to 50x faster (by @firmai)
algorithmic-trading arrays data-science data-structures finance machine-learning numpy pandas structured-data
Last synced: 04 Nov 2024
https://github.com/Lackoftactics/facebook_data_analyzer
Analyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get info about friend rankings by message, vocabulary, contacts, friends-added statistics, and more.
conversation data-science data-visualization english-language facebook facebook-data facebook-data-analyzer ruby ruby-gem scraping script statistics
Last synced: 04 Aug 2024
https://github.com/csinva/csinva.github.io
Slides, paper notes, class notes, blog posts, and research on ML 📉, statistics 📊, and AI 🤖.
ai artificial-intelligence awesome blog computational-neuroscience data-science deep-learning jekyll-themes machine-learning machine-learning-tutorials ml neuroscience notes python pytorch research slides statistics stats website
Last synced: 28 Sep 2024
https://github.com/WecoAI/aideml
AIDE: the state-of-the-art machine learning engineer agent, generating machine learning solution code from natural language descriptions.
ai data-science llm machine-learning
Last synced: 12 Nov 2024
https://github.com/youssefhosni/efficient-python-for-data-scientists
Writing clean and optimized Python code
data-science numpy pandas python
Last synced: 14 Nov 2024
https://github.com/firmai/deltapy
DeltaPy - Tabular Data Augmentation (by @firmai)
augmentation data-augmentation data-science feature-engineering feature-extraction finance machine-learning tabular-data time-series
Last synced: 05 Nov 2024
https://github.com/justmarkham/pycon-2019-tutorial
Data Science Best Practices with pandas
data-science pandas python tutorial vizualisation
Last synced: 13 Nov 2024
https://github.com/bradleyboehmke/data-science-learning-resources
A collection of data science and machine learning resources that I've found helpful (I only post what I've read!)
Last synced: 14 Oct 2024
https://hdi-project.github.io/ATM/
Auto Tune Models - A multi-tenant, multi-data system for automated machine learning (model selection and tuning).
automl data-science distributed-computing hyperparameter-optimization machine-learning
Last synced: 18 Nov 2024
https://github.com/mszell/geospatialdatascience
Course materials for: Geospatial Data Science
course-materials data-science geospatial geospatial-analysis geospatial-data geospatial-visualization gis openstreetmap osmnx python street-networks teaching-materials
Last synced: 13 Nov 2024
https://github.com/HDI-Project/ATM
Auto Tune Models - A multi-tenant, multi-data system for automated machine learning (model selection and tuning).
automl data-science distributed-computing hyperparameter-optimization machine-learning
Last synced: 06 Aug 2024
https://github.com/MilesCranmer/SymbolicRegression.jl
Distributed High-Performance Symbolic Regression in Julia
automl data-science distributed-systems equation-discovery evolutionary-algorithms explainable-ai genetic-algorithm interpretable-ml julia machine-learning sciml symbolic symbolic-computation symbolic-regression
Last synced: 13 Nov 2024
https://github.com/youssefHosni/Efficient-Python-for-Data-Scientists
Writing clean and optimized Python code
data-science numpy pandas python
Last synced: 27 Oct 2024
https://github.com/RunLLM/aqueduct
Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure.
ai data data-science kubernetes llm llms machine-learning ml ml-infrastructure ml-monitoring mlops orchestration python python3
Last synced: 09 Nov 2024
https://github.com/aqueducthq/aqueduct
Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure.
ai data data-science kubernetes llm llms machine-learning ml ml-infrastructure ml-monitoring mlops orchestration python python3
Last synced: 17 Aug 2024
https://github.com/HoloClean/holoclean
A Machine Learning System for Data Enrichment.
data-enrichment data-science inference-engine machine-learning pytorch
Last synced: 12 Nov 2024
https://github.com/insitro/redun
Yet another redundant workflow engine
aws bioinformatics data-engineering data-science docker etl gcp ml python workflow-engine
Last synced: 02 Nov 2024
https://github.com/ashishpatel26/Amazing-Feature-Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
data-analysis data-mining data-science data-scientists data-visualization deep-learning feature-engineering feature-extraction feature-scaling feature-selection features machine-learning scikit-learn
Last synced: 07 Nov 2024