Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2025-06-27 00:07:36 UTC
https://github.com/iphysresearch/DataSciComp
A collection of popular Data Science Challenges/Competitions || Countdown timers to keep track of the entry deadlines.
challenge competition data-challenge data-science data-science-competitions project
Last synced: 02 Apr 2025
https://github.com/iphysresearch/datascicomp
A collection of popular Data Science Challenges/Competitions || Countdown timers to keep track of the entry deadlines.
challenge competition data-challenge data-science data-science-competitions project
Last synced: 17 Jan 2025
https://github.com/enzoampil/fastquant
fastquant - Backtest and optimize your ML trading strategies with only 3 lines of code!
algotrading backtesting cryptocurrency data-science financial-data-science machine-learning quantitative-finance stocks trading-strategies
Last synced: 14 May 2025
https://github.com/mlr-org/mlr
Machine Learning in R
classification clustering cran data-science feature-selection hyperparameters-optimization imbalance-correction learners machine-learning mlr multilabel-classification predictive-modeling r r-package regression stacking statistics survival-analysis tuning tutorial
Last synced: 14 May 2025
https://github.com/climbsrocks/auto_ml
[UNMAINTAINED] Automated machine learning for analytics & production
analytics artificial-intelligence automated-machine-learning automl data-science deep-learning deeplearning feature-engineering gradient-boosting hyperparameter-optimization keras lightgbm machine-learning machine-learning-library machine-learning-pipelines production-ready python scikit-learn tensorflow xgboost
Last synced: 14 May 2025
https://github.com/jadianes/spark-py-notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
big-data bigdata data-analysis data-science ipython ipython-notebook machine-learning mllib notebook pyspark python spark
Last synced: 15 May 2025
https://github.com/ClimbsRocks/auto_ml
[UNMAINTAINED] Automated machine learning for analytics & production
analytics artificial-intelligence automated-machine-learning automl data-science deep-learning deeplearning feature-engineering gradient-boosting hyperparameter-optimization keras lightgbm machine-learning machine-learning-library machine-learning-pipelines production-ready python scikit-learn tensorflow xgboost
Last synced: 07 Apr 2025
https://github.com/github/covid19-dashboard
A site that displays up to date COVID-19 stats, powered by fastpages.
altair analytics covid-19 covid-data covid19 data-science data-visualisation fastai fastpages github-actions github-pages jupyter matplotlib nteract papermill pymc3 python
Last synced: 21 Jan 2025
https://github.com/justmarkham/dat8
General Assembly's 2015 Data Science course in Washington, DC
clustering course data-analysis data-cleaning data-science data-visualization decision-trees ensemble-learning jupyter-notebook linear-regression logistic-regression machine-learning model-evaluation naive-bayes natural-language-processing pandas python regular-expressions scikit-learn web-scraping
Last synced: 15 May 2025
https://github.com/joaomilho/Enterprise
The Enterprise™ programming language
ajax artificial-intelligence cloud crypto data-science disruptive-technology docker enterprise enterprise-development enterprise-services enterprise-software growth jvm kubernetes language money progressive-web-app quantum redux
Last synced: 13 Mar 2025
https://github.com/joaomilho/enterprise
The Enterprise™ programming language
ajax artificial-intelligence cloud crypto data-science disruptive-technology docker enterprise enterprise-development enterprise-services enterprise-software growth jvm kubernetes language money progressive-web-app quantum redux
Last synced: 08 Apr 2025
https://github.com/iamtodor/data-science-interview-questions-and-answers
Data science interview questions with answers. Not ideally (yet)
data-science interview-preparation interview-questions machine-learning
Last synced: 22 Mar 2025
https://github.com/keras-team/keras-contrib
Keras community contributions
data-science deep-learning keras machine-learning neural-networks tensorflow theano
Last synced: 20 Jan 2025
https://github.com/kantord/just-dashboard
:bar_chart: :clipboard: Dashboards using YAML or JSON files
big-data business-intelligence chart csv d3 d3js dashboard data data-driven data-engineering data-science data-visualization gist github-gist json just-dashboard yaml
Last synced: 15 May 2025
https://github.com/moj-analytical-services/splink
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
data-matching data-science deduplicate-data deduplication duckdb em-algorithm entity-resolution fuzzy-matching record-linkage spark uk-gov-data-science
Last synced: 13 May 2025
https://github.com/alinebastos/dev-practice
Practice your skills with these ideas.
back-end backend challenge css css3 data-science development front-end front-end-development frontend frontend-practice frontend-skills game git hackathons hacktoberfest javascript practice vim
Last synced: 15 May 2025
https://github.com/microsoft/responsible-ai-toolbox
Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.
data-analysis data-science data-visualization error-analysis explainability explainable-ai explainable-ml fairness fairness-ai fairness-ml interpretability jupyter machine-learning machinelearning ml responsible-ai ui visualization widget widgets
Last synced: 13 May 2025
https://github.com/mlrun/mlrun
MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications.
data-engineering data-science experiment-tracking kubernetes machine-learning mlops mlops-workflow model-serving python workflow
Last synced: 13 May 2025
https://github.com/nubank/fklearn
fklearn: Functional Machine Learning
data-analysis data-science machine-learning ml python
Last synced: 13 May 2025
https://github.com/axelderomblay/mlbox
MLBox is a powerful Automated Machine Learning python library.
auto-ml automated-machine-learning automl classification data-science deep-learning distributed drift encoding kaggle keras lightgbm machine-learning optimization pipeline prediction preprocessing regression stacking xgboost
Last synced: 15 May 2025
https://github.com/AxeldeRomblay/MLBox
MLBox is a powerful Automated Machine Learning python library.
auto-ml automated-machine-learning automl classification data-science deep-learning distributed drift encoding kaggle keras lightgbm machine-learning optimization pipeline prediction preprocessing regression stacking xgboost
Last synced: 26 Apr 2025
https://github.com/tomasonjo/blogs
Jupyter notebooks that support my graph data science blog posts at https://bratanic-tomaz.medium.com/
data-science graph graph-algorithms neo4j
Last synced: 12 Apr 2025
https://github.com/hi-primus/optimus
:truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
big-data-cleaning bigdata cudf dask dask-cudf data-analysis data-cleaner data-cleaning data-cleansing data-exploration data-extraction data-preparation data-profiling data-science data-transformation data-wrangling machine-learning pyspark spark
Last synced: 14 May 2025
https://github.com/google/uncertainty-baselines
High-quality implementations of standard and SOTA methods on a variety of tasks.
bayesian-methods data-science deep-learning machine-learning neural-networks probabilistic-programming statistics tensorflow
Last synced: 14 May 2025
https://github.com/safe-graph/graph-fraud-detection-papers
A curated list of graph-based fraud, anomaly, and outlier detection papers & resources
academic-publications anomaly-detection awsome-list data-mining data-science dataset deep-learning fraud-detection graph-algorithms graph-convolutional-networks graph-neural-networks machine-learning outlier-detection papers security spam-detection survey
Last synced: 26 Mar 2025
https://github.com/h2oai/h2o-tutorials
Tutorials and training material for the H2O Machine Learning Platform
data-science deep-learning h2o machine-learning python r tutorial
Last synced: 14 May 2025
https://github.com/man-group/ArcticDB
ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.
big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading
Last synced: 12 Mar 2025
https://github.com/scottfreellc/alphapy
Python AutoML for Trading Systems and Sports Betting
backtesting classification cryptocurrency data-science deep-learning iex keras machine-learning pandas portfolio predictive-analytics python regression scikit-learn sports stocks time-series-analysis trading trading-platform trading-strategies
Last synced: 14 May 2025
https://github.com/microsoft/RD-Agent
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are committed to automating these high-value generic R&D processes through our open source R&D automation tool RD-Agent, which lets AI drive data-driven AI.
agent ai automation data-mining data-science development llm research
Last synced: 09 Feb 2025
https://github.com/capitalone/dataprofiler
What's in your data? Extract schema, statistics and entities from datasets
avro csv data-analysis data-labels data-science dataprofiling dataset gdpr graph-data machine-learning network-data nlp npi pandas pii privacy python security sensitive-data tabular-data
Last synced: 14 May 2025
https://github.com/sepandhaghighi/pycm
Multi-class confusion matrix library in Python
accuracy ai artificial-intelligence classification confusion-matrix data data-analysis data-mining data-science deep-learning deeplearning evaluation machine-learning mathematics matrix ml multiclass-classification neural-network statistical-analysis statistics
Last synced: 13 May 2025
https://github.com/capitalone/DataProfiler
What's in your data? Extract schema, statistics and entities from datasets
avro csv data-analysis data-labels data-science dataprofiling dataset gdpr graph-data machine-learning network-data nlp npi pandas pii privacy python security sensitive-data tabular-data
Last synced: 02 Apr 2025
https://github.com/swanhubx/swanlab
⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / LLaMA Factory / Swift / Ultralytics / veRL / MMEngine / Keras etc.
data-science deep-learning jax logging machine-learning mlops model-versioning python pytorch tensorboard tensorflow tracking transformers visualization
Last synced: 13 May 2025
https://github.com/khuyentran1401/efficient_python_tricks_and_tools_for_data_scientists
Efficient Python Tricks and Tools for Data Scientists
Last synced: 14 May 2025
https://github.com/ScottfreeLLC/AlphaPy
Python AutoML for Trading Systems and Sports Betting
backtesting classification cryptocurrency data-science deep-learning iex keras machine-learning pandas portfolio predictive-analytics python regression scikit-learn sports stocks time-series-analysis trading trading-platform trading-strategies
Last synced: 05 May 2025
https://github.com/csinva/imodels
Interpretable ML package for concise, transparent, and accurate predictive modeling (sklearn-compatible).
ai artificial-intelligence bayesian-rule-list data-science explainable-ai explainable-ml imodels interpretability machine-learning ml optimal-classification-tree python rule-learning rulefit rules scikit-learn statistics supervised-learning
Last synced: 13 May 2025
https://github.com/mlreef/mlreef
The collaboration workspace for Machine Learning
artificial-intelligence data-science deep-learning deeplearning machine-learning machine-learning-algorithms mlops mlops-environment models mxnet pytorch reproducibility tensorflow
Last synced: 15 May 2025
https://github.com/CamDavidsonPilon/lifetimes
Lifetime value in Python
data-science python statistics
Last synced: 06 May 2025
https://github.com/MLReef/mlreef
The collaboration workspace for Machine Learning
artificial-intelligence data-science deep-learning deeplearning machine-learning machine-learning-algorithms mlops mlops-environment models mxnet pytorch reproducibility tensorflow
Last synced: 15 Mar 2025
https://github.com/code-kern-ai/refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
active-learning annotations artificial-intelligence data-centric-ai data-labeling data-science deep-learning human-in-the-loop labeling labeling-tool machine-learning natural-language-processing neural-search nlp python spacy supervised-learning text-annotation text-classification transformers
Last synced: 14 May 2025
https://github.com/dltk/dltk
Deep Learning Toolkit for Medical Image Analysis
cnn data-science deep-learning deep-neural-networks dltk dltk-model-zoo machine-learning medical medical-image-processing medical-imaging ml neural-network neural-networks neuroimaging python tensorflow
Last synced: 08 Apr 2025
https://github.com/DLTK/DLTK
Deep Learning Toolkit for Medical Image Analysis
cnn data-science deep-learning deep-neural-networks dltk dltk-model-zoo machine-learning medical medical-image-processing medical-imaging ml neural-network neural-networks neuroimaging python tensorflow
Last synced: 23 Apr 2025
https://github.com/denizyuret/knet.jl
Koç University deep learning framework.
data-science deep-learning julia knet machine-learning neural-networks
Last synced: 14 May 2025
https://github.com/denizyuret/Knet.jl
Koç University deep learning framework.
data-science deep-learning julia knet machine-learning neural-networks
Last synced: 04 May 2025
https://github.com/khuyentran1401/Efficient_Python_tricks_and_tools_for_data_scientists
Efficient Python Tricks and Tools for Data Scientists
Last synced: 29 Apr 2025
https://github.com/demidovakatya/vvedenie-mashinnoe-obuchenie
:memo: A collection of resources on machine learning
collections data-mining data-science deep-learning machine-learning mooc neural-networks nlp russian university
Last synced: 23 Mar 2025
https://github.com/ebay/tsv-utils
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
cli command-line csv d data-mining data-science delimited-files dlang reservoir-sampling sampling shuffle statistics tabular-data tsv uniq
Last synced: 25 Mar 2025
https://github.com/eBay/tsv-utils
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
cli command-line csv d data-mining data-science delimited-files dlang reservoir-sampling sampling shuffle statistics tabular-data tsv uniq
Last synced: 14 Apr 2025
https://github.com/sfirke/janitor
simple tools for data cleaning in R
data-analysis data-cleaning data-science dirty-data excel pivot-tables r spss tabulations tidyverse
Last synced: 13 May 2025
https://github.com/googlecloudplatform/data-science-on-gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
cloud-computing data-analysis data-engineering data-pipeline data-processing data-science data-visualization machine-learning
Last synced: 14 Apr 2025
https://github.com/skrub-data/skrub
Machine learning with dataframes
data data-analysis data-cleaning data-preparation data-preprocessing data-science data-wrangling dataframe dataframes dirty-data machine-learning
Last synced: 13 May 2025
https://github.com/kyleskom/nba-machine-learning-sports-betting
NBA sports betting using machine learning
ai data-science deep-learning gambling gpt keras llm machine-learning nba nba-analytics nba-prediction neural-network python sports sports-analytics sports-betting sports-data tensorflow
Last synced: 14 May 2025
https://github.com/kyleskom/NBA-Machine-Learning-Sports-Betting
NBA sports betting using machine learning
ai data-science deep-learning gambling gpt keras llm machine-learning nba nba-analytics nba-prediction neural-network python sports sports-analytics sports-betting sports-data tensorflow
Last synced: 27 Mar 2025
https://modeloriented.github.io/DALEX/
moDel Agnostic Language for Exploration and eXplanation
black-box dalex data-science explainable-ai explainable-artificial-intelligence explainable-ml explanations explanatory-model-analysis fairness iml interpretability interpretable-machine-learning machine-learning model-visualization predictive-modeling responsible-ai responsible-ml xai
Last synced: 20 Nov 2024
https://github.com/ModelOriented/DALEX
moDel Agnostic Language for Exploration and eXplanation
black-box dalex data-science explainable-ai explainable-artificial-intelligence explainable-ml explanations explanatory-model-analysis fairness iml interpretability interpretable-machine-learning machine-learning model-visualization predictive-modeling responsible-ai responsible-ml xai
Last synced: 14 Mar 2025
https://github.com/modeloriented/dalex
moDel Agnostic Language for Exploration and eXplanation
black-box dalex data-science explainable-ai explainable-artificial-intelligence explainable-ml explanations explanatory-model-analysis fairness iml interpretability interpretable-machine-learning machine-learning model-visualization predictive-modeling responsible-ai responsible-ml xai
Last synced: 14 May 2025
https://github.com/quixio/quix-streams
Python Streaming DataFrames for Kafka
data-engineering data-intensive-applications data-science event-driven-architecture kafka machine-learning python real-time-data-processing stream-processing stream-processor streaming-data streaming-data-pipelines streaming-data-processing time-series-data
Last synced: 13 May 2025
https://github.com/lastancientone/deep_learning_machine_learning_stock
Deep Learning and Machine Learning stocks represent promising opportunities for both long-term and short-term investors and traders.
algorithms data-science deep-learning feature-engineering feature-extraction feature-selection features-extraction financial-engineering machine-learning neural-network prediction stock-analysis stock-data stock-market stock-prediction stock-price-prediction stock-prices stock-trading technical-analysis trading
Last synced: 16 May 2025
https://github.com/sb-ai-lab/lightautoml
Fast and customizable framework for automatic ML model creation (AutoML)
automated-machine-learning automatic-machine-learning automl automl-algorithms binary-classification data-science kaggle lama machine-learning multiclass-classification nlp python regression
Last synced: 14 May 2025
https://github.com/miteshputhran/speech-emotion-analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
audio-files data-science deep-learning deep-neural-networks emotion emotion-recognition keras natural-language-processing natural-language-understanding neural-network python3 speech speech-emotion-recognition speech-recognition voice
Last synced: 16 May 2025
https://github.com/kotartemiy/pygooglenews
If Google News had a Python library
data-science google news python rss
Last synced: 14 May 2025
https://github.com/ropensci/drake
An R-focused pipeline toolkit for reproducibility and high-performance computing
data-science drake high-performance-computing makefile peer-reviewed pipeline r r-package reproducibility reproducible-research ropensci rstats workflow
Last synced: 13 May 2025
https://github.com/ebhy/budgetml
Deploy a ML inference service on a budget in less than 10 lines of code.
api data-science deployment fastapi inference machine-learning mlops
Last synced: 15 May 2025
https://github.com/ahmetozlu/tensorflow_object_counting_api
The TensorFlow Object Counting API is an open source framework built on top of TensorFlow and Keras that makes it easy to develop object counting systems!
computer-vision data-science deep-learning deep-neural-networks image-processing machine-learning object-counting object-counting-api object-detection object-detection-api object-detection-label object-detection-pipelines opencv pedestrian-counting shelf-management shelf-navigation tensorflow tensorflow-api tensorflow-object-detection-api vehicle-counting
Last synced: 15 May 2025
https://github.com/patmartin/dex
Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
d3 d3js data-analysis data-mining data-science data-visualization datavis datavisualization dataviz groovy java javafx visualization
Last synced: 16 May 2025
https://github.com/PatMartin/Dex
Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
d3 d3js data-analysis data-mining data-science data-visualization datavis datavisualization dataviz groovy java javafx visualization
Last synced: 04 May 2025
https://github.com/GoogleCloudPlatform/data-science-on-gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
cloud-computing data-analysis data-engineering data-pipeline data-processing data-science data-visualization machine-learning
Last synced: 27 Nov 2024
https://github.com/ikatsov/tensor-house
A collection of reference Jupyter notebooks and demo AI/ML applications for enterprise use cases: marketing, pricing, supply chain, smart manufacturing, and more.
ai customer-analysis data-science deep-learning llm machine-learning marketing models personalization reinforcement-learning supply-chain
Last synced: 08 Apr 2025
https://github.com/skforecast/skforecast
Time series forecasting with machine learning models
arima autoregressive-forecasting backtesting-forecasters data-science direct-forecasting exogenous-predictors forecasting lightgbm lstm-neural-networks machine-learning multi-series-forecasting multi-step-forecasting multiple-time-series-forecasting probabilistic-forecasting python quantile-forecasting sarimax scikit-learn time-series xgboost
Last synced: 13 May 2025
https://github.com/opendatadiscovery/odd-platform
First open-source data discovery and observability platform. We make a life for data practitioners easy so you can focus on your business.
alerting bigdata data-catalog data-discovery data-engineering data-exploration data-governance data-lineage data-observability data-pipelines data-platform data-profiling data-quality data-science datacatalog lineage metadata metadata-management observability oss
Last synced: 15 May 2025
https://github.com/MiteshPuthran/Speech-Emotion-Analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
audio-files data-science deep-learning deep-neural-networks emotion emotion-recognition keras natural-language-processing natural-language-understanding neural-network python3 speech speech-emotion-recognition speech-recognition voice
Last synced: 27 Mar 2025
https://github.com/MITESHPUTHRANNEU/Speech-Emotion-Analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
audio-files data-science deep-learning deep-neural-networks emotion emotion-recognition keras natural-language-processing natural-language-understanding neural-network python3 speech speech-emotion-recognition speech-recognition voice
Last synced: 14 Dec 2024
https://github.com/jrfiedler/causal_inference_python_code
Python code for part 2 of the book Causal Inference: What If, by Miguel Hernán and James Robins
causal-inference causality data-science python
Last synced: 16 May 2025
https://github.com/nok/sklearn-porter
Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
data-science machine-learning scikit-learn sklearn
Last synced: 15 May 2025
https://github.com/starpig1129/datagen
DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.
agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python
Last synced: 14 May 2025
https://github.com/alan-turing-institute/clevercsv
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3
Last synced: 13 May 2025
https://github.com/scikit-learn-contrib/MAPIE
A scikit-learn-compatible module to estimate prediction intervals and control risks based on conformal predictions.
classification confidence-intervals conformal-prediction data-science python regression sklearn
Last synced: 01 May 2025
https://github.com/scikit-learn-contrib/mapie
A scikit-learn-compatible module to estimate prediction intervals and control risks based on conformal predictions.
classification confidence-intervals conformal-prediction data-science python regression sklearn
Last synced: 13 May 2025
https://github.com/gboeing/ppde642
USC urban data science course series in Python
cities city-government coding course course-materials data-science jupyter network-analysis python spatial-analysis statistics syllabus transport transportation urban-analytics urban-data-science urban-informatics urban-planning urbanism usc
Last synced: 15 May 2025
https://github.com/starpig1129/AI-Data-Analysis-MultiAgent
DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.
agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python
Last synced: 02 May 2025
https://github.com/crazyhottommy/getting-started-with-genomics-tools-and-resources
Unix, R and python tools for genomics and data science
bioinformatics cancer-genomics data-science
Last synced: 14 May 2025
https://github.com/reiinakano/xcessiv
A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python.
automated-machine-learning data-science ensemble-learning hyperparameter-optimization machine-learning scikit-learn stacked-ensembles
Last synced: 15 May 2025
https://github.com/davidadsp/generative_deep_learning_2nd_edition
The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
chatgpt dalle2 data-science deep-learning diffusion-models generative-adversarial-network gpt-3 machine-learning python stable-diffusion tensorflow
Last synced: 15 May 2025
https://github.com/alan-turing-institute/CleverCSV
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3
Last synced: 26 Mar 2025
https://github.com/rocketlaunchr/dataframe-go
DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration
data-science dataframe dataframes go golang machine-learning pandas pandas-dataframe python statistics
Last synced: 15 May 2025
https://github.com/davidADSP/Generative_Deep_Learning_2nd_Edition
The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
chatgpt dalle2 data-science deep-learning diffusion-models generative-adversarial-network gpt-3 machine-learning python stable-diffusion tensorflow
Last synced: 01 May 2025
https://github.com/mandiant/ThreatPursuit-VM
Threat Pursuit Virtual Machine (VM): A fully customizable, open-sourced Windows-based distribution focused on threat intelligence analysis and hunting designed for intel and malware analysts as well as threat hunters to get up and running quickly.
analytics cyber data-science fireeye intelligence intelligence-analysis malware mandiant threat threathunting threatintelligence virtual-machine
Last synced: 21 Nov 2024
https://github.com/mandiant/threatpursuit-vm
Threat Pursuit Virtual Machine (VM): A fully customizable, open-sourced Windows-based distribution focused on threat intelligence analysis and hunting designed for intel and malware analysts as well as threat hunters to get up and running quickly.
analytics cyber data-science fireeye intelligence intelligence-analysis malware mandiant threat threathunting threatintelligence virtual-machine
Last synced: 23 Feb 2025
https://github.com/fireeye/ThreatPursuit-VM
Threat Pursuit Virtual Machine (VM): A fully customizable, open-sourced Windows-based distribution focused on threat intelligence analysis and hunting designed for intel and malware analysts as well as threat hunters to get up and running quickly.
analytics cyber data-science fireeye intelligence intelligence-analysis malware mandiant threat threathunting threatintelligence virtual-machine
Last synced: 05 Dec 2024
https://github.com/ramiawar/dataline
Chat with your data - AI data analysis and visualization on CSV, Postgres, MySQL, Snowflake, SQLite...
ai chart data-science data-visualization llm sql
Last synced: 14 May 2025
https://github.com/nivu/ai_all_resources
A curated list of Best Artificial Intelligence Resources
artificial-intelligence convolutional-neural-networks data-science decision-trees deep-learning gan kmeans knn machine-learning mathematics neural-networks python random-forest regression reinforcement-learning rnn statistics statquest support-vector-machine tensorflow
Last synced: 15 Jun 2025
https://github.com/logicalclocks/hopsworks
Hopsworks - Data-Intensive AI platform with a Feature Store
aws azure data-science feature-engineering feature-management feature-store gcp governance hopsworks kserve machine-learning ml mlops model-serving pyspark python serverless
Last synced: 14 May 2025
https://github.com/fmind/mlops-python-package
Kickstart your MLOps initiative with a flexible, robust, and productive Python package.
automation data-pipelines data-science machine-learning mlflow mlops pandera pydantic python
Last synced: 14 May 2025
https://github.com/sb-ai-lab/LightAutoML
Fast and customizable framework for automatic ML model creation (AutoML)
automated-machine-learning automatic-machine-learning automl automl-algorithms binary-classification data-science kaggle lama machine-learning multiclass-classification nlp python regression
Last synced: 08 May 2025
https://github.com/elixir-explorer/explorer
Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
data-science dataframes elixir rust
Last synced: 14 May 2025
https://github.com/business-science/free_r_tips
Free R-Tips is a FREE Newsletter provided by Business Science. It comes with bite-sized code tutorials every week.
data-science newsletter tips tips-and-tricks
Last synced: 25 Mar 2025
https://github.com/devamoghs/machine-learning-with-python
Small-scale machine learning projects to understand the core concepts. Give a star if it helps you. BONUS: Interview Bank coming up!
beginner-friendly data-science deep-learning exercises machine-learning practice-project python python-3 scikit-learn
Last synced: 14 May 2025
https://github.com/annoviko/pyclustering
pyclustering is a Python, C++ data mining library.
algorithms c-plus-plus clustering data-mining data-science machine-learning neural-networks oscillatory-networks python python3
Last synced: 14 May 2025
https://github.com/devAmoghS/Machine-Learning-with-Python
Small-scale machine learning projects to understand the core concepts. Give a star if it helps you. BONUS: Interview Bank coming up!
beginner-friendly data-science deep-learning exercises machine-learning practice-project python python-3 scikit-learn
Last synced: 27 Nov 2024