Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2024-11-14 00:06:27 UTC
https://github.com/axelderomblay/mlbox
MLBox is a powerful Automated Machine Learning Python library.
auto-ml automated-machine-learning automl classification data-science deep-learning distributed drift encoding kaggle keras lightgbm machine-learning optimization pipeline prediction preprocessing regression stacking xgboost
Last synced: 10 Oct 2024
https://github.com/h2oai/h2o-tutorials
Tutorials and training material for the H2O Machine Learning Platform
data-science deep-learning h2o machine-learning python r tutorial
Last synced: 10 Oct 2024
https://github.com/iamtodor/data-science-interview-questions-and-answers
Data science interview questions with answers. Not ideal (yet).
data-science interview-preparation interview-questions machine-learning
Last synced: 14 Oct 2024
https://github.com/hi-primus/optimus
Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
big-data-cleaning bigdata cudf dask dask-cudf data-analysis data-cleaner data-cleaning data-cleansing data-exploration data-extraction data-preparation data-profiling data-science data-transformation data-wrangling machine-learning pyspark spark
Last synced: 12 Oct 2024
https://github.com/CamDavidsonPilon/lifetimes
Lifetime value in Python
data-science python statistics
Last synced: 13 Nov 2024
https://github.com/sepandhaghighi/pycm
Multi-class confusion matrix library in Python
accuracy ai artificial-intelligence classification confusion-matrix data data-analysis data-mining data-science deep-learning deeplearning evaluation machine-learning mathematics matrix ml multiclass-classification neural-network statistical-analysis statistics
Last synced: 15 Oct 2024
https://github.com/MLReef/mlreef
The collaboration workspace for Machine Learning
artificial-intelligence data-science deep-learning deeplearning machine-learning machine-learning-algorithms mlops mlops-environment models mxnet pytorch reproducibility tensorflow
Last synced: 27 Oct 2024
https://github.com/mlreef/mlreef
The collaboration workspace for Machine Learning
artificial-intelligence data-science deep-learning deeplearning machine-learning machine-learning-algorithms mlops mlops-environment models mxnet pytorch reproducibility tensorflow
Last synced: 29 Oct 2024
https://github.com/denizyuret/Knet.jl
Koç University deep learning framework.
data-science deep-learning julia knet machine-learning neural-networks
Last synced: 13 Nov 2024
https://github.com/DLTK/DLTK
Deep Learning Toolkit for Medical Image Analysis
cnn data-science deep-learning deep-neural-networks dltk dltk-model-zoo machine-learning medical medical-image-processing medical-imaging ml neural-network neural-networks neuroimaging python tensorflow
Last synced: 10 Nov 2024
https://github.com/google/uncertainty-baselines
High-quality implementations of standard and SOTA methods on a variety of tasks.
bayesian-methods data-science deep-learning machine-learning neural-networks probabilistic-programming statistics tensorflow
Last synced: 15 Oct 2024
https://github.com/capitalone/DataProfiler
What's in your data? Extract schema, statistics and entities from datasets
avro csv data-analysis data-labels data-science dataprofiling dataset gdpr graph-data machine-learning network-data nlp npi pandas pii privacy python security sensitive-data tabular-data
Last synced: 03 Nov 2024
https://github.com/denizyuret/knet.jl
Koç University deep learning framework.
data-science deep-learning julia knet machine-learning neural-networks
Last synced: 15 Oct 2024
https://github.com/dltk/dltk
Deep Learning Toolkit for Medical Image Analysis
cnn data-science deep-learning deep-neural-networks dltk dltk-model-zoo machine-learning medical medical-image-processing medical-imaging ml neural-network neural-networks neuroimaging python tensorflow
Last synced: 14 Oct 2024
https://github.com/khuyentran1401/Efficient_Python_tricks_and_tools_for_data_scientists
Efficient Python Tricks and Tools for Data Scientists
Last synced: 11 Nov 2024
https://github.com/capitalone/dataprofiler
What's in your data? Extract schema, statistics and entities from datasets
avro csv data-analysis data-labels data-science dataprofiling dataset gdpr graph-data machine-learning network-data nlp npi pandas pii privacy python security sensitive-data tabular-data
Last synced: 09 Oct 2024
https://github.com/demidovakatya/vvedenie-mashinnoe-obuchenie
A collection of machine learning resources
collections data-mining data-science deep-learning machine-learning mooc neural-networks nlp russian university
Last synced: 14 Oct 2024
https://github.com/eBay/tsv-utils
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
cli command-line csv d data-mining data-science delimited-files dlang reservoir-sampling sampling shuffle statistics tabular-data tsv uniq
Last synced: 08 Nov 2024
https://github.com/ebay/tsv-utils
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
cli command-line csv d data-mining data-science delimited-files dlang reservoir-sampling sampling shuffle statistics tabular-data tsv uniq
Last synced: 15 Oct 2024
https://github.com/safe-graph/graph-fraud-detection-papers
A curated list of graph-based fraud, anomaly, and outlier detection papers & resources
academic-publications anomaly-detection awsome-list data-mining data-science dataset deep-learning fraud-detection graph-algorithms graph-convolutional-networks graph-neural-networks machine-learning outlier-detection papers security spam-detection survey
Last synced: 15 Oct 2024
https://github.com/code-kern-ai/refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
active-learning annotations artificial-intelligence data-centric-ai data-labeling data-science deep-learning human-in-the-loop labeling labeling-tool machine-learning natural-language-processing neural-search nlp python spacy supervised-learning text-annotation text-classification transformers
Last synced: 14 Oct 2024
https://github.com/khuyentran1401/efficient_python_tricks_and_tools_for_data_scientists
Efficient Python Tricks and Tools for Data Scientists
Last synced: 15 Oct 2024
https://github.com/csinva/imodels
Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
ai artificial-intelligence bayesian-rule-list data-science explainable-ai explainable-ml imodels interpretability machine-learning ml optimal-classification-tree python rule-learning rulefit rules scikit-learn statistics supervised-learning
Last synced: 13 Oct 2024
https://github.com/ModelOriented/DALEX
moDel Agnostic Language for Exploration and eXplanation
black-box dalex data-science explainable-ai explainable-artificial-intelligence explainable-ml explanations explanatory-model-analysis fairness iml interpretability interpretable-machine-learning machine-learning model-visualization predictive-modeling responsible-ai responsible-ml xai
Last synced: 25 Oct 2024
https://github.com/modeloriented/dalex
moDel Agnostic Language for Exploration and eXplanation
black-box dalex data-science explainable-ai explainable-artificial-intelligence explainable-ml explanations explanatory-model-analysis fairness iml interpretability interpretable-machine-learning machine-learning model-visualization predictive-modeling responsible-ai responsible-ml xai
Last synced: 15 Oct 2024
https://modeloriented.github.io/DALEX/
moDel Agnostic Language for Exploration and eXplanation
black-box dalex data-science explainable-ai explainable-artificial-intelligence explainable-ml explanations explanatory-model-analysis fairness iml interpretability interpretable-machine-learning machine-learning model-visualization predictive-modeling responsible-ai responsible-ml xai
Last synced: 04 Aug 2024
https://github.com/sfirke/janitor
simple tools for data cleaning in R
data-analysis data-cleaning data-science dirty-data excel pivot-tables r spss tabulations tidyverse
Last synced: 15 Oct 2024
https://github.com/ropensci/drake
An R-focused pipeline toolkit for reproducibility and high-performance computing
data-science drake high-performance-computing makefile peer-reviewed pipeline r r-package reproducibility reproducible-research ropensci rstats workflow
Last synced: 10 Oct 2024
https://github.com/ebhy/budgetml
Deploy a ML inference service on a budget in less than 10 lines of code.
api data-science deployment fastapi inference machine-learning mlops
Last synced: 11 Oct 2024
https://github.com/PatMartin/Dex
Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
d3 d3js data-analysis data-mining data-science data-visualization datavis datavisualization dataviz groovy java javafx visualization
Last synced: 13 Nov 2024
https://github.com/patmartin/dex
Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
d3 d3js data-analysis data-mining data-science data-visualization datavis datavisualization dataviz groovy java javafx visualization
Last synced: 12 Nov 2024
https://github.com/ahmetozlu/tensorflow_object_counting_api
🚀 The TensorFlow Object Counting API is an open source framework built on top of TensorFlow and Keras that makes it easy to develop object counting systems!
computer-vision data-science deep-learning deep-neural-networks image-processing machine-learning object-counting object-counting-api object-detection object-detection-api object-detection-label object-detection-pipelines opencv pedestrian-counting shelf-management shelf-navigation tensorflow tensorflow-api tensorflow-object-detection-api vehicle-counting
Last synced: 15 Oct 2024
https://github.com/mlrun/mlrun
MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications.
data-engineering data-science experiment-tracking kubernetes machine-learning mlops mlops-workflow model-serving python workflow
Last synced: 09 Nov 2024
https://github.com/dagworks-inc/hamilton
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.
dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hacktoberfest lineage llmops machine-learning mlops numpy orchestration pandas python software-engineering
Last synced: 11 Oct 2024
https://github.com/MiteshPuthran/Speech-Emotion-Analyzer
The neural network model can detect five different male/female emotions from audio speech. (Deep Learning, NLP, Python)
audio-files data-science deep-learning deep-neural-networks emotion emotion-recognition keras natural-language-processing natural-language-understanding neural-network python3 speech speech-emotion-recognition speech-recognition voice
Last synced: 30 Oct 2024
https://github.com/googlecloudplatform/data-science-on-gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
cloud-computing data-analysis data-engineering data-pipeline data-processing data-science data-visualization machine-learning
Last synced: 07 Oct 2024
https://github.com/GoogleCloudPlatform/data-science-on-gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
cloud-computing data-analysis data-engineering data-pipeline data-processing data-science data-visualization machine-learning
Last synced: 07 Aug 2024
https://github.com/scikit-learn-contrib/MAPIE
A scikit-learn-compatible module to estimate prediction intervals and control risks based on conformal predictions.
classification confidence-intervals conformal-prediction data-science python regression sklearn
Last synced: 12 Nov 2024
https://github.com/nok/sklearn-porter
Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
data-science machine-learning scikit-learn sklearn
Last synced: 13 Oct 2024
https://github.com/kotartemiy/pygooglenews
If Google News had a Python library
data-science google news python rss
Last synced: 14 Oct 2024
https://github.com/reiinakano/xcessiv
A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python.
automated-machine-learning data-science ensemble-learning hyperparameter-optimization machine-learning scikit-learn stacked-ensembles
Last synced: 13 Oct 2024
https://github.com/bytewax/bytewax
Python Stream Processing
data-engineering data-processing data-science dataflow machine-learning python rust stream-processing streaming-data
Last synced: 15 Oct 2024
https://github.com/gboeing/ppde642
USC urban data science course series with Python and Jupyter
cities city-government coding course course-materials data-science jupyter network-analysis python spatial-analysis statistics syllabus transport transportation urban-analytics urban-data-science urban-informatics urban-planning urbanism usc
Last synced: 11 Oct 2024
https://github.com/alan-turing-institute/CleverCSV
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3
Last synced: 29 Oct 2024
https://github.com/ikatsov/tensor-house
A collection of reference Jupyter notebooks and demo AI/ML applications for enterprise use cases: marketing, pricing, supply chain, smart manufacturing, and more.
ai customer-analysis data-science deep-learning llm machine-learning marketing models personalization reinforcement-learning supply-chain
Last synced: 14 Oct 2024
https://github.com/miteshputhran/speech-emotion-analyzer
The neural network model can detect five different male/female emotions from audio speech. (Deep Learning, NLP, Python)
audio-files data-science deep-learning deep-neural-networks emotion emotion-recognition keras natural-language-processing natural-language-understanding neural-network python3 speech speech-emotion-recognition speech-recognition voice
Last synced: 29 Oct 2024
https://github.com/microsoft/responsible-ai-toolbox
Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.
data-analysis data-science data-visualization error-analysis explainability explainable-ai explainable-ml fairness fairness-ai fairness-ml interpretability jupyter machine-learning machinelearning ml responsible-ai ui visualization widget widgets
Last synced: 11 Oct 2024
https://github.com/lastancientone/deep_learning_machine_learning_stock
Deep Learning and Machine Learning stocks represent promising opportunities for both long-term and short-term investors and traders.
algorithms data-science deep-learning feature-engineering feature-extraction feature-selection features-extraction financial-engineering machine-learning neural-network prediction stock-analysis stock-data stock-market stock-prediction stock-price-prediction stock-prices stock-trading technical-analysis trading
Last synced: 14 Oct 2024
https://github.com/jrfiedler/causal_inference_python_code
Python code for part 2 of the book Causal Inference: What If, by Miguel Hernán and James Robins
causal-inference causality data-science python
Last synced: 29 Oct 2024
https://github.com/alan-turing-institute/clevercsv
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3
Last synced: 22 Oct 2024
https://github.com/mandiant/threatpursuit-vm
Threat Pursuit Virtual Machine (VM): A fully customizable, open-sourced Windows-based distribution focused on threat intelligence analysis and hunting designed for intel and malware analysts as well as threat hunters to get up and running quickly.
analytics cyber data-science fireeye intelligence intelligence-analysis malware mandiant threat threathunting threatintelligence virtual-machine
Last synced: 14 Oct 2024
https://github.com/mandiant/ThreatPursuit-VM
Threat Pursuit Virtual Machine (VM): A fully customizable, open-sourced Windows-based distribution focused on threat intelligence analysis and hunting designed for intel and malware analysts as well as threat hunters to get up and running quickly.
analytics cyber data-science fireeye intelligence intelligence-analysis malware mandiant threat threathunting threatintelligence virtual-machine
Last synced: 04 Aug 2024
https://github.com/sb-ai-lab/LightAutoML
Fast and customizable framework for automatic ML model creation (AutoML)
automated-machine-learning automatic-machine-learning automl automl-algorithms binary-classification data-science kaggle lama machine-learning multiclass-classification nlp python regression
Last synced: 15 Nov 2024
https://github.com/kyleskom/NBA-Machine-Learning-Sports-Betting
NBA sports betting using machine learning
ai data-science deep-learning gambling keras machine-learning nba nba-analytics nba-prediction neural-network python sports sports-analytics sports-betting sports-data tensorflow
Last synced: 30 Oct 2024
https://github.com/business-science/free_r_tips
Free R-Tips is a FREE Newsletter provided by Business Science. It comes with bite-sized code tutorials every week.
data-science newsletter tips tips-and-tricks
Last synced: 15 Oct 2024
https://github.com/sb-ai-lab/lightautoml
Fast and customizable framework for automatic ML model creation (AutoML)
automated-machine-learning automatic-machine-learning automl automl-algorithms binary-classification data-science kaggle lama machine-learning multiclass-classification nlp python regression
Last synced: 15 Oct 2024
https://github.com/kyleskom/nba-machine-learning-sports-betting
NBA sports betting using machine learning
ai data-science deep-learning gambling keras machine-learning nba nba-analytics nba-prediction neural-network python sports sports-analytics sports-betting sports-data tensorflow
Last synced: 15 Oct 2024
https://github.com/devamoghs/machine-learning-with-python
Small-scale machine learning projects to understand the core concepts. Give a star 🌟 if it helps you. BONUS: Interview Bank coming up!
beginner-friendly data-science deep-learning exercises machine-learning practice-project python python-3 scikit-learn
Last synced: 10 Oct 2024
https://github.com/devAmoghS/Machine-Learning-with-Python
Small-scale machine learning projects to understand the core concepts. Give a star 🌟 if it helps you. BONUS: Interview Bank coming up!
beginner-friendly data-science deep-learning exercises machine-learning practice-project python python-3 scikit-learn
Last synced: 07 Aug 2024
https://github.com/rocketlaunchr/dataframe-go
DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration
data-science dataframe dataframes go golang machine-learning pandas pandas-dataframe python statistics
Last synced: 14 Oct 2024
https://github.com/annoviko/pyclustering
pyclustering is a Python, C++ data mining library.
algorithms c-plus-plus clustering data-mining data-science machine-learning neural-networks oscillatory-networks python python3
Last synced: 15 Oct 2024
https://github.com/skrub-data/skrub
Prepping tables for machine learning
data data-analysis data-cleaning data-preparation data-preprocessing data-science data-wrangling dirty-data machine-learning
Last synced: 15 Oct 2024
https://github.com/scikit-learn-contrib/mapie
A scikit-learn-compatible module for estimating prediction intervals.
classification confidence-intervals conformal-prediction data-science python regression sklearn
Last synced: 10 Oct 2024
https://github.com/ScottfreeLLC/AlphaPy
Python AutoML for Trading Systems and Sports Betting
backtesting classification cryptocurrency data-science deep-learning iex keras machine-learning pandas portfolio predictive-analytics python regression scikit-learn sports stocks time-series-analysis trading trading-platform trading-strategies
Last synced: 13 Nov 2024
https://github.com/crazyhottommy/getting-started-with-genomics-tools-and-resources
Unix, R and Python tools for genomics and data science
bioinformatics cancer-genomics data-science
Last synced: 15 Oct 2024
https://github.com/deepwisdom/autodl
Automated Deep Learning without ANY human intervention. 1st Solution for AutoDL challenge@NeurIPS.
ai artificial-intelligence autodl autodl-challenge automated-machine-learning automl big-data data-science deeplearning feature-engineering full-automl lightgbm machine-learning model-selection multi-label nas python pytorch resnet tensorflow
Last synced: 22 Oct 2024
https://github.com/scottfreellc/alphapy
Python AutoML for Trading Systems and Sports Betting
backtesting classification cryptocurrency data-science deep-learning iex keras machine-learning pandas portfolio predictive-analytics python regression scikit-learn sports stocks time-series-analysis trading trading-platform trading-strategies
Last synced: 10 Oct 2024
https://github.com/skforecast/skforecast
Time series forecasting with machine learning models
arima autoregressive-forecasting backtesting-forecasters data-science direct-forecasting exogenous-predictors forecasting lightgbm lstm-neural-networks machine-learning multi-series-forecasting multi-step-forecasting multiple-time-series-forecasting probabilistic-forecasting python quantile-forecasting sarimax scikit-learn time-series xgboost
Last synced: 02 Nov 2024
https://github.com/DeepWisdom/AutoDL
Automated Deep Learning without ANY human intervention. 1st Solution for AutoDL challenge@NeurIPS.
ai artificial-intelligence autodl autodl-challenge automated-machine-learning automl big-data data-science deeplearning feature-engineering full-automl lightgbm machine-learning model-selection multi-label nas python pytorch resnet tensorflow
Last synced: 03 Aug 2024
https://github.com/deepfence/FlowMeter
⭐ ⭐ Use ML to classify flows and packets as benign or malicious. ⭐ ⭐
awesome data-science data-science-projects forensics-tools hacktoberfest infosectools machine-learning machine-learning-projects machinelearning machinelearningproject network-analysis network-security packet-analyser pcap security security-tools tcpdump-like
Last synced: 01 Nov 2024
https://github.com/JuliaStats/Distributions.jl
A Julia package for probability distributions and associated functions.
data-science julia probability-distributions statistics
Last synced: 15 Nov 2024
https://github.com/xorbitsai/xorbits
Scalable Python DS & ML, in an API compatible & lightning fast way.
data-science distributed-systems lightgbm machine-learning ml numpy pandas python scalable xgboost
Last synced: 11 Oct 2024
https://github.com/davidadsp/generative_deep_learning_2nd_edition
The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
chatgpt dalle2 data-science deep-learning diffusion-models generative-adversarial-network gpt-3 machine-learning python stable-diffusion tensorflow
Last synced: 11 Nov 2024
https://github.com/Shujian2015/FreeML
A List of Data Science/Machine Learning Resources (Mostly Free)
data-science deep-learning machine-learning natural-language-processing
Last synced: 13 Nov 2024
https://github.com/deepfence/flowmeter
⭐ ⭐ Use ML to classify flows and packets as benign or malicious. ⭐ ⭐
awesome data-science data-science-projects forensics-tools hacktoberfest infosectools machine-learning machine-learning-projects machinelearning machinelearningproject network-analysis network-security packet-analyser pcap security security-tools tcpdump-like
Last synced: 26 Sep 2024
https://github.com/man-group/arcticdb
ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.
big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading
Last synced: 15 Oct 2024
https://github.com/man-group/ArcticDB
ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.
big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading
Last synced: 24 Oct 2024
https://github.com/elixir-explorer/explorer
Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
data-science dataframes elixir rust
Last synced: 14 Nov 2024
https://github.com/shujian2015/freeml
A List of Data Science/Machine Learning Resources (Mostly Free)
data-science deep-learning machine-learning natural-language-processing
Last synced: 15 Oct 2024
https://github.com/juliastats/distributions.jl
A Julia package for probability distributions and associated functions.
data-science julia probability-distributions statistics
Last synced: 15 Oct 2024
https://github.com/qri-io/qri
you're invited to a data party!
data-science dataset golang hacktoberfest hacktoberfest2021 ipfs opendata p2p qri service trust web3
Last synced: 06 Nov 2024
https://github.com/opendatadiscovery/odd-platform
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
alerting bigdata data-catalog data-discovery data-engineering data-exploration data-governance data-lineage data-observability data-pipelines data-platform data-profiling data-quality data-science datacatalog lineage metadata metadata-management observability oss
Last synced: 14 Oct 2024
https://github.com/sajal2692/data-science-portfolio
Portfolio of data science projects completed by me for academic, self learning, and hobby purposes.
data-science keras machine-learning nlp pandas portfolio python scikit-learn
Last synced: 10 Oct 2024
https://github.com/datumbox/datumbox-framework
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
big-data data-science java machine-learning nlp statistics
Last synced: 15 Oct 2024
https://github.com/novak-99/mlpp
A library created to revitalize C++ as a machine learning front end. Per aspera ad astra.
cpp data-science deep-learning machine-learning
Last synced: 30 Oct 2024
https://github.com/novak-99/MLPP
A library created to revitalize C++ as a machine learning front end. Per aspera ad astra.
cpp data-science deep-learning machine-learning
Last synced: 27 Oct 2024
https://github.com/okfn-brasil/querido-diario
📰 Brazilian government gazettes, accessible to everyone.
civic-tech data-science governments-gazettes govtech hacktoberfest open-data politics scraping spider
Last synced: 14 Oct 2024
https://github.com/TeoMeWhy/teomerefs
A guide to technical references for a career in data
data data-science machine-learning python
Last synced: 29 Oct 2024
https://github.com/teomewhy/teomerefs
A guide to technical references for a career in data
data data-science machine-learning python
Last synced: 14 Oct 2024
https://github.com/makcedward/nlp
This repository records my NLP journey.
ai data-science deep-learning machine-learning nlp
Last synced: 12 Nov 2024
https://github.com/pro1code1hack/your-journey-to-fluent-python
Your Journey To Fluent Python
advanced-programming asyncio beginner-programming coding data-science education exercises functions learning learning-python oop oop-principles projects python python-3 python3 roadmap senior software-engineering tutorials
Last synced: 12 Oct 2024
https://github.com/moj-analytical-services/splink
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
data-matching data-science deduplicate-data deduplication duckdb em-algorithm entity-resolution fuzzy-matching record-linkage spark uk-gov-data-science
Last synced: 12 Oct 2024
https://github.com/areed1192/sigma_coding_youtube
This is a collection of all the code that can be found on my YouTube channel Sigma Coding.
data-science google-maps-api m-language mlanguage office-applications outlook-vba power-bi power-query powerpoint-vba python python-tutorials python-windows vba vba-excel win32 win32com word-vba yelp-fusion-api
Last synced: 30 Oct 2024
https://github.com/squaredtechnologies/thread
AI-powered Jupyter Notebook — use local AI to generate and edit code cells, automatically fix errors, and chat with your data
ai analysis analytics data-science jupyter jupyter-notebook jupyter-notebooks jupyterhub jupyterlab ollama python react reactjs
Last synced: 09 Nov 2024
https://github.com/logicalclocks/hopsworks
Hopsworks - Data-Intensive AI platform with a Feature Store
aws azure data-science feature-engineering feature-management feature-store gcp governance hopsworks kserve machine-learning ml mlops model-serving pyspark python serverless
Last synced: 29 Oct 2024
https://github.com/mrkn/pycall.rb
Calling Python functions from the Ruby language
data-science pycall python ruby rubydatascience rubyml
Last synced: 09 Oct 2024
https://github.com/rhiever/datacleaner
A Python tool that automatically cleans data sets and readies them for analysis.
automation data-science machine-learning python
Last synced: 15 Oct 2024
https://github.com/daochenzha/data-centric-AI
A curated, but incomplete, list of data-centric AI resources.
ai artificial-intelligence data-centric data-centric-ai data-centric-machine-learning data-curation data-engineering data-quality data-science machine-learning
Last synced: 30 Oct 2024
https://github.com/daochenzha/data-centric-ai
A curated, but incomplete, list of data-centric AI resources.
ai artificial-intelligence data-centric data-centric-ai data-centric-machine-learning data-curation data-engineering data-quality data-science machine-learning
Last synced: 14 Oct 2024
https://github.com/nfstream/nfstream
NFStream: a Flexible Network Data Analysis Framework.
artificial-intelligence cybersecurity data-analysis data-mining data-science dataset-generation deep-packet-inspection machine-learning ndpi netflow network-analysis network-monitoring network-security packet-analyser packet-capture pcap python traffic-analysis traffic-classification
Last synced: 15 Oct 2024