Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2025-01-04 00:07:05 UTC
https://github.com/pbiecek/breakdown
Model-agnostic breakDown plots
data-science iml interpretability machine-learning visual-explanations xai
Last synced: 31 Dec 2024
https://github.com/mc2-project/secure-xgboost
Secure collaborative training and inference for XGBoost.
collaborative-learning data-science enclave machine-learning privacy security xgboost
Last synced: 09 Nov 2024
https://github.com/pbiecek/breakDown
Model-agnostic breakDown plots
data-science iml interpretability machine-learning visual-explanations xai
Last synced: 11 Nov 2024
https://github.com/nischalshrestha/Unravel
A fluent code explorer for R. 🔍
data-science datawrangling dplyr r rstats shiny tidyr tidyverse
Last synced: 04 Dec 2024
https://github.com/wyattowalsh/data-science-notes
Open-source project hosted at https://makeuseofdata.com to crowdsource a robust collection of notes related to data science (math, visualization, modeling, etc)
calculus classification compilation crowdsourcing data-science first-timers first-timers-only jupyter-book linear-algebra machine-learning modeling probability regression simulation statistics up-for-grabs visualization
Last synced: 15 Nov 2024
https://github.com/lettier/lda-topic-modeling
A PureScript, browser-based implementation of LDA topic modeling.
bayesian bulma bulma-css clustering data-science functional-programming gibbs-sampling latent-dirichlet-allocation lda machine-learning machine-learning-algorithms natural-language-processing nlp nlp-machine-learning purescript reactive reactive-programming text-mining thermite topic-modeling
Last synced: 14 Oct 2024
https://github.com/pink-gorilla/notebook
Web-based Clojure notebook application and library.
clojure clojurescript codemirror data-science gorilla-notebook gorilla-repl pink-gorilla re-frame reagent vega
Last synced: 02 Jan 2025
https://github.com/target/data-validator
A tool to validate data, built around Apache Spark.
data-science data-validation hacktoberfest
Last synced: 05 Nov 2024
https://github.com/jeroenjanssens/python-polars-the-definitive-guide
Scripts and datasets for the O'Reilly book Python Polars: The Definitive Guide
data-science oreilly oreilly-books polars polars-dataframe python
Last synced: 31 Dec 2024
https://github.com/facultyai/lens
Summarise and explore Pandas DataFrames
dask data-exploration data-science data-visualisation dataframe pandas
Last synced: 08 Nov 2024
https://github.com/jay-johnson/sci-pype
A Machine Learning API with native redis caching and export + import using S3. Analyze entire datasets using an API for building, training, testing, analyzing, extracting, importing, and archiving. This repository can run from a docker container or from the repository.
data-science devops-for-data-science docker docker-compose ipython ipython-notebook jupyter jupyter-notebook jupyter-themes machine-learning machine-learning-api predictive python red10 redis s3 seaborn stock-price-prediction xgb xgboost
Last synced: 11 Oct 2024
https://github.com/oracle/macest
Model Agnostic Confidence Estimator (MACEST) - A Python library for calibrating Machine Learning models' confidence scores
confidence-estimation data-science machine-learning python
Last synced: 19 Dec 2024
https://github.com/TexteaInc/funix
Building web apps without manually creating widgets
app-builder data-science frontend machine-learning
Last synced: 04 Dec 2024
https://github.com/tlverse/sl3
💪 🤔 Modern Super Learning with Machine Learning Pipelines
data-science ensemble-learning ensemble-model machine-learning model-selection r r-package regression stacking statistics
Last synced: 11 Nov 2024
https://github.com/scottshambaugh/monaco
Quantify uncertainty and sensitivities in your computer models with an industry-grade Monte Carlo library.
data-science monaco monte-carlo python scientific-computing sensitivity-analysis simulation statistics uncertainty-analysis uncertainty-quantification
Last synced: 05 Nov 2024
https://github.com/pratapvardhan/notebooks
A collection of Jupyter/IPython notebooks
data-science exploratory-analysis ipython-notebook jupyter-notebook python
Last synced: 11 Oct 2024
https://github.com/danmorales/cursods_profdanilo
Python code for a range of applications, including machine learning and deep learning techniques, fundamentals of statistics, and regression and classification problems. Videos with the theoretical explanations are available on my YouTube channel.
aprendizado-de-maquina ciencia-de-dados data-science deep-learning keras-classification-models keras-layer keras-models keras-neural-networks machine-learning machine-learning-algorithms numpy pandas-dataframe pandas-python scikit-learn scikitlearn-machine-learning scipy tensorflow tensorflow-tutorials
Last synced: 12 Nov 2024
https://github.com/G-Research/fasttrackml
Experiment tracking server focused on speed and scalability
ai apache-spark data-science data-visualization experiment-tracking machine-learning metadata metadata-tracking metrics ml mlflow mlflow-tracking-server mlops pytorch tensorboard tensorflow visualization
Last synced: 02 Dec 2024
https://github.com/xiyanghu/OSDT
Optimal Sparse Decision Trees
accelerate acceleration-model algorithm algorithm-optimization data-mining data-science interpretable-ml machine-learning ml-system mlsys neurips python python3
Last synced: 30 Oct 2024
https://github.com/omegaml/omegaml
MLOps simplified. One platform, all the functionality you need. Swiss made
artificial-intelligence celery data-science deploy docker-image jupyter-notebook machine-learning mlops productized pytorch scikit-learn tensorflow
Last synced: 01 Jan 2025
https://github.com/pityka/nspl
Scala plotting (charting, graphing) library
canvas2d charts data-science data-visualization plots scala scala-js scientific-visualization visualization
Last synced: 28 Dec 2024
https://github.com/g-research/fasttrackml
Experiment tracking server focused on speed and scalability
ai apache-spark data-science data-visualization experiment-tracking machine-learning metadata metadata-tracking metrics ml mlflow mlflow-tracking-server mlops pytorch tensorboard tensorflow visualization
Last synced: 29 Dec 2024
https://github.com/batermj/data_sciences_campaign
[Data Scientist Course Series]
algorithm-analysis algorithm-challenges algorithm-visualisation algorithms architectural-patterns architecture-visualization architectures artificial-intelligence automatic-machine-learning automation data-science design-patterns foundation-framework machine-learning-algorithms mathematical-statistics mathematics product-design robotics startup system-architecture
Last synced: 12 Nov 2024
https://github.com/anna-geller/dataflow-ops
Project demonstrating how to automate Prefect 2.0 deployments to AWS ECS Fargate
analytics analytics-engineering automation aws cicd data data-engineering data-engineering-infrastructure data-engineering-pipeline data-science dataflow dataflow-ops infrastructure-as-code observability orchestration pipeline prefect python serverless
Last synced: 28 Oct 2024
https://github.com/mdeff/ntds_2016
Material for the EPFL master course "A Network Tour of Data Science", edition 2016.
data-science education epfl graphs machine-learning neural-networks
Last synced: 21 Nov 2024
https://github.com/PetoLau/TSrepr
TSrepr: R package for time series representations
data-analysis data-mining data-mining-algorithms data-science r r-package representation time-series time-series-analysis time-series-classification time-series-clustering time-series-data-mining time-series-representations
Last synced: 05 Nov 2024
https://github.com/ieshreya/data-science-resources
Free educational resources for self-taught Data Science! I'm currently learning Data Science and built this repository to help myself, but if it helps you in any way, feel free to star it!
computer-science data-science python resources
Last synced: 14 Oct 2024
https://github.com/alan-turing-institute/environmental-ds-book
A computational notebook community for open environmental data science 🌎
climate-change community-project data-science ecosystem-modeling environmental-monitoring
Last synced: 13 Nov 2024
https://github.com/datakitchen/data-observability-installer
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.
data data-engineering data-observability data-profiling data-quality data-reliability data-science datachecker datacleaner datacleaning dataops dataquality datatesting datavalidation mssql pipeline-tests postgresql redshift self-hosted snowflake
Last synced: 28 Dec 2024
https://github.com/wlandau/targets-tutorial
Short course on the targets R package
data-science make pipeline r r-package reproducibility reproducible-research rstats targets workflow
Last synced: 27 Oct 2024
https://github.com/ideos/gloe
A general-purpose library designed to guide developers in expressing their code as a flow.
clean-code data-science flow functional-programming machine-learning python typing
Last synced: 30 Oct 2024
https://github.com/IlyaGusev/tgcontest
Telegram Data Clustering contest solution by Mindful Squirrel
classification clustering cpp data-science document-similarity fasttext machine-learning nlp
Last synced: 04 Nov 2024
https://github.com/AnonCatalyst/Coeus-OSINT-ToolBox
Coeus 🌐 is an OSINT ToolBox empowering users with tools for effective intelligence gathering from open sources. From social media monitoring 📱 to data analysis 📊, it offers a centralized platform for seamless OSINT investigations.
data-science data-visualization database forensic-analysis forensics forensics-tools framework information-retrieval infosec osint osint-framework osint-python osint-resources osint-tool osint-toolkit people-search reconnaissance
Last synced: 13 Nov 2024
https://github.com/outerbounds/dsbook
Code samples for the Effective Data Science Infrastructure book
data-science infrastructure machine-learning
Last synced: 10 Nov 2024
https://github.com/giswqs/manjaro-linux
Shell scripts for setting up Manjaro Linux for Python programming and deep learning
data-science deep-learning gis kde manjaro manjaro-linux notebook-jupyter python r remote-sensing shell-scripts tensorflow
Last synced: 30 Dec 2024
https://github.com/alan-turing-institute/scivision
scivision: a framework for scientific image analysis
computer-vision data-science hut23 hut23-1205 image-processing machine-learning scientific-research
Last synced: 30 Dec 2024
https://github.com/dagshub/client
DagsHub client libraries
ai data data-science data-streaming dvc hacktoberfest hacktoberfest2023 keras machine-learning machinelearning mlops python pytorch tensorflow
Last synced: 29 Dec 2024
https://github.com/jkoutsikakis/pytorch-wrapper
Provides a systematic and extensible way to build, train, evaluate, and tune deep learning models using PyTorch.
data-science deep-learning machine-learning neural-network python pytorch pytorch-wrapper tensor
Last synced: 27 Nov 2024
https://github.com/GlobalMaksimum/sadedegel
A General Purpose NLP library for Turkish
acikhack2 ai artificial-intelligence bert binder corpus data-science deep-learning embeddings heroku machine-learning natural-language-processing neural-network neural-networks news-summarizer nlp python
Last synced: 12 Nov 2024
https://github.com/mratsim/arch-data-science
Archlinux PKGBUILDs for Data Science, Machine Learning, Deep Learning, NLP and Computer Vision
archlinux cuda cudnn data-science deep-learning lightgbm machine-learning mkl mxnet natural-language-processing natural-language-understanding nervana opencv package pandas pytorch scikit-learn spacy tensorflow xgboost
Last synced: 09 Nov 2024
https://github.com/mratsim/Arch-Data-Science
Archlinux PKGBUILDs for Data Science, Machine Learning, Deep Learning, NLP and Computer Vision
archlinux cuda cudnn data-science deep-learning lightgbm machine-learning mkl mxnet natural-language-processing natural-language-understanding nervana opencv package pandas pytorch scikit-learn spacy tensorflow xgboost
Last synced: 27 Nov 2024
https://github.com/OpenSTEF/openstef
Automated Machine Learning pipelines. Builds the Open Short Term Energy Forecasting package.
data-science energy energy-forecasting forecasting machine-learning python time-series
Last synced: 14 Nov 2024
https://github.com/tidypyverse/tidypandas
A grammar of data manipulation for pandas inspired by tidyverse
data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse
Last synced: 07 Nov 2024
https://github.com/lyltj2010/DataMining
An open-source book on data mining
data-science datamining deeplearning machine-learning
Last synced: 21 Dec 2024
https://github.com/geekplux/timeline-sankey
A project to visualize time range series data using the Sankey diagram.
data-analysis data-science data-visualization sankey sankey-chart sankey-diagram time-series time-series-analysis timeline visualization
Last synced: 27 Oct 2024
https://github.com/eclipse-zenoh-flow/zenoh-flow
zenoh-flow aims at providing a zenoh-based data-flow programming framework for computations that span from the cloud to the device.
autonomous-vehicles data-science dataflow-programming machine-learning robotics ros2 rust-lang
Last synced: 29 Dec 2024
https://github.com/tiledb-inc/tiledb-vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
bioinformatics data-science genomics gwas python spark tiledb variant-calling vcf
Last synced: 31 Dec 2024
https://github.com/cedrickchee/data-science-notebooks
Data science Python notebooks—a collection of Jupyter notebooks on machine learning, deep learning, statistical inference, data analysis and visualization.
data-science deep-learning fastai kaggle keras machine-learning notebooks numpy pandas python pytorch tensorflow
Last synced: 17 Nov 2024
https://github.com/parths007/loan-approval-prediction
Loan Application Data Analysis
accuracy-analysis classification data-analysis data-mining data-science data-visualization juypter logistic-regression machine-learning notebook-jupyter python python3
Last synced: 27 Oct 2024
https://github.com/zetane/zetaforge
Open source AI platform for rapid development of advanced AI and AGI pipelines.
agi ai claude data-science developer-tools gpt kubernetes llm machine-learning ml ml-pipelines mlops python workflow workflow-orchestration zetaforge
Last synced: 29 Dec 2024
https://github.com/asavinov/prosto
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
business-intelligence data-preparation data-preprocessing data-processing data-science data-wrangling feature-engineering map-reduce olap pandas python spark workflow
Last synced: 07 Nov 2024
https://github.com/lsys/lexicalrichness
😸 💬 A module to compute textual lexical richness (aka lexical diversity).
data-mining data-science information-retrieval lexical-analysis lexical-analyzer linguistic-analysis natural-language natural-language-processing nlp python
Last synced: 01 Jan 2025
https://github.com/sb-ai-lab/hypex
Fast and customizable framework for automatic and quick Causal Inference in Python
ab-testing causal-inference causalinference data-science faiss kaggle matching python statistics
Last synced: 28 Dec 2024
https://github.com/layerai-archive/sdk
Metadata store for Production ML
collaboration data-science data-versioning deep-learning experiment-tracking hyperparameter-optimization hyperparameter-tuning keras machine-learning mlops model-versioning python pytorch reinforcement-learning sklearn tensorflow
Last synced: 26 Sep 2024
https://github.com/mkearney/tweetbotornot2
🔍🐦🤖 Detect Twitter Bots!
bot-detection bot-detector classification data-science machine-learning r r-package rstats rtweet twitter twitter-api twitter-bot-detection twitter-bots xgboost
Last synced: 15 Nov 2024
https://github.com/firmai/business-analytics-and-mathematics-python-book
Advanced Business Analytics and Mathematics with Python (by @firmai)
analytics business data-analysis data-science mathematics python
Last synced: 19 Nov 2024
https://github.com/longxingtan/data-competitions
Data competition experience and solutions
data-competition data-mining data-science industry-4 kddcup spatio-temporal tianchi-competition time-series
Last synced: 28 Nov 2024
https://github.com/markvanderloo/simputation
Making imputation easy
data-science imputation officialstatistics r rstats
Last synced: 11 Nov 2024
https://github.com/synthesized-io/fairlens
Identify bias and measure fairness of your data
bias data data-analysis data-science fairness pandas python statistics
Last synced: 15 Nov 2024
https://github.com/slai-labs/get-beam
Run GPU inference and training jobs on serverless infrastructure that scales with you.
artificial-intelligence cloud-computing cost-optimization data-science deep-learning distributed-computing gpu-acceleration gpu-computing hpc llm-serving llm-training machine-learning ml-infrastructure mlops python serverless serverless-architectures
Last synced: 09 Nov 2024
https://github.com/darribas/gds_course
Geographic Data Science, the course
course data-science educational gds-course geographic-data-science gis
Last synced: 27 Oct 2024
https://github.com/soda-inria/carte
Repository for CARTE: Context-Aware Representation of Table Entries
classification data-science graph-transformer machine-learning regression transformers
Last synced: 28 Dec 2024
https://github.com/ResponsiblyAI/responsibly
Toolkit for Auditing and Mitigating Bias and Fairness of Machine Learning Systems 🔎🤖🧰
artificial-intelligence audit bias bias-correction bias-finder bias-reduction data-science ethics fairness fairness-ai fairness-awareness-model fairness-ml fairness-testing machine-bias machine-learning natural-language-processing python
Last synced: 09 Nov 2024
https://github.com/aws-samples/cloud-experiments
Open innovation with 60-minute cloud experiments on AWS
amazon-athena amazon-comprehend amazon-rekognition amazon-s3 amazon-sagemaker aws-cloud aws-glue data-science machine-learning notebooks
Last synced: 27 Nov 2024
https://github.com/TradeMaster-NTU/fintech-literature
Fintech literature, including journals, conferences, books, and useful links
artificial-intelligence data-science machine-learning natural-language-processing quantitative-finance reinforcement-learning
Last synced: 16 Dec 2024
https://github.com/delsner/flask-angular-data-science
Repository for a data science starter app using Flask, Angular and Docker. https://medium.com/@dvelsner/deploying-a-simple-machine-learning-model-in-a-modern-web-application-flask-angular-docker-a657db075280
angular data-science docker flask machine-learning python sklearn typescript
Last synced: 10 Oct 2024
https://github.com/maxim5/hyper-engine
Python library for Bayesian hyper-parameters optimization
bayesian-optimization big-data convolutional-neural-networks data-science deep-learning gaussian-processes hyperparameter-optimization machine-learning model-selection neural-network optimization-algorithms python random-search tensorflow
Last synced: 05 Nov 2024
https://github.com/davisidarta/topometry
Systematically learn and evaluate manifolds from high-dimensional data
clustering data-science data-visualization dimensionality-reduction graph graph-layout hypothesis-generation laplace-beltrami machine-learning manifold-learning scikit-learn single-cell visualization
Last synced: 19 Dec 2024
https://github.com/akgold/do4ds
A book on DevOps for Data Scientists with CRC Press.
data-science devops it python r
Last synced: 10 Nov 2024
https://github.com/microsoft/coml
Interactive coding assistant for data scientists and machine learning developers, empowered by large language models.
automated-machine automl copilot data-science hyperparameter-optimization jupyter jupyter-lab large-language-models llm machine-learning
Last synced: 28 Dec 2024
https://github.com/Nelson-numerical-software/nelson
The Nelson Programming Language
cpp17 data-science data-structures interpreter mathematical-functions matlab matrix-functions nelson octave programming-language scientific-computing scilab
Last synced: 17 Nov 2024
https://github.com/DataKitchen/data-observability-installer
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.
data data-engineering data-observability data-profiling data-quality data-reliability data-science datachecker datacleaner datacleaning dataops dataquality datatesting datavalidation mssql pipeline-tests postgresql redshift self-hosted snowflake
Last synced: 13 Nov 2024
https://github.com/bioconductor/genomicdatacommons
Provide R access to the NCI Genomic Data Commons portal.
api-client bioconductor bioinformatics cancer core-services data-science genomics nci r tcga vignette
Last synced: 29 Dec 2024
https://github.com/aershov24/machine-learning-ds-interview-questions
🔴 1704 Machine Learning, Data Science & Python Interview Questions (ANSWERED) To Kill Your Next ML & DS Interview. Get All Answers + PDFs on MLStack.Cafe. Post your ML Jobs 👉
algorithms-and-data-structures data-analysis data-science interview-practice interview-preparation interview-questions machine-learning machine-learning-algorithms machinelearning
Last synced: 18 Nov 2024
https://github.com/sangaline/reverse-engineering-the-hacker-news-ranking-algorithm
An analysis of historical Hacker News data to determine the ranking algorithm
analysis data-science hacker-news
Last synced: 07 Nov 2024
https://github.com/nuclio/nuclio-jupyter
Nuclio Function Automation for Python and Jupyter
data-science jupyter kubernetes nuclio python
Last synced: 28 Dec 2024
https://github.com/uc-r/uc-r.github.io
Main repository for R programming courses @ University of Cincinnati, courses and tutorials that focus on data wrangling, exploration, visualization, and analysis with R.
classroom data-science data-wrangling machine-learning r tutorial tutorial-code visualization
Last synced: 30 Oct 2024
https://github.com/khuyentran1401/machine-learning-pipeline
Example machine learning pipeline with MLflow and Hydra
data-science hydra machine-learning machine-learning-pipeline mlflow
Last synced: 26 Nov 2024
https://github.com/n3mo/data-science
Data science tooling for Racket
data-science racket sentiment-analysis statistics text-processing
Last synced: 18 Nov 2024
https://github.com/habedi/practicalmachinelearning
A collection of open-source and free machine learning resources
anomaly-detection data-analysis data-mining data-science data-science-resourses datasets deep-learning deep-neural-networks graph-algorithms graph-mining jupyter-notebook kaggle machine-learning pandas python python-machine-learning scikit-learn self-learning zeppelin-notebook
Last synced: 31 Dec 2024
https://github.com/zjuearthdata/geochemistrypi
An open-source, highly automated machine learning Python framework for data-driven geochemistry discovery
dash data-science fastapi flaml geochemistry mlflow nodejs ray reactjs scikit-learn typer
Last synced: 01 Jan 2025
https://github.com/svenkreiss/databench
Data analysis tool.
data-science data-visualization python
Last synced: 26 Dec 2024
https://github.com/svilupp/PromptingTools.jl
Streamline your life using PromptingTools.jl, the Julia package that simplifies interacting with large language models.
data-science generative-ai julia
Last synced: 28 Oct 2024
https://github.com/palashio/nylon
An intelligent, flexible grammar of machine learning.
auto-ml data-science grammar machine-learning
Last synced: 07 Nov 2024
https://github.com/rogerfitz/tutorials
Git repo for articles on the Ergo Sum blog and the YouTube channel https://www.youtube.com/channel/UCiie9CN--dazA7iT2sry5FA
algorithmia data-science draft-kings fan-duel fivethirtyeight google-maps-api ocr python sports tech text-to-speech visualizations
Last synced: 08 Nov 2024
https://github.com/stanfordnlp/edu-convokit
Edu-ConvoKit: An Open-Source Framework for Education Conversation Data
data data-analysis data-science education language natural-language-processing
Last synced: 01 Jan 2025
https://github.com/mmkim1210/geneticsmakie.jl
🧬 High-performance genetics- and genomics-related data visualization using Makie.jl
bioinformatics cairomakie colocalization data-science fine-mapping genetics genomics gwas julia julia-language linkage locuszoom makie multi-ethnic multivariate openmendel phewas qtl v2f visualization
Last synced: 17 Dec 2024
https://github.com/Dumbris/trunklucator
Python module that lets data scientists quickly create annotation projects.
active-learning annotation annotation-tool data-science machine-learning nlp
Last synced: 04 Nov 2024
https://github.com/bcgov/bcdata
An R package for searching & retrieving data from the B.C. Data Catalogue
bcdc citz data-science env r r-package rstats
Last synced: 28 Dec 2024
https://github.com/GDSL-UL/san
Spatial Modelling for Data Scientists
book cross-validation data-science geographically-weighted-regression maps moran-i multilevel-models r r-spatial spatial-analysis spatial-econometrics
Last synced: 04 Dec 2024
https://github.com/seandavi/geoquery
The bridge between the NCBI Gene Expression Omnibus and Bioconductor
bioconductor bioinformatics data-science genomics ncbi-geo r rstats
Last synced: 04 Jan 2025
https://github.com/lettier/interactiveknn
Interactive K-Nearest Neighbors machine learning algorithm in JavaScript.
ai classification data-analysis data-science gui html5 interactive-knearest-neighbors javascript k-nearest-neighbor k-nearest-neighbors k-nearest-neighbours knn machine-learning machine-learning-algorithms nearest-neighbor-search scikit-learn statistics visualization
Last synced: 30 Oct 2024
https://github.com/dspinellis/alexandria3k
Local relational access to openly-available publication data sets
bibliometric-analysis crossref data-science orcid scientometrics
Last synced: 03 Jan 2025
https://github.com/beneath-hq/beneath
Beneath is a serverless real-time data platform ⚡️
analytics beneath data-engineering data-pipelines data-science data-warehouse dataops developer-tools etl go kubernetes mlops python sql streaming
Last synced: 04 Nov 2024
https://github.com/XpressAI/xircuits
Simple visual programming environment for JupyterLab
data-science jupyterlab python
Last synced: 10 Oct 2024
https://github.com/gagolews/datawranglingpy
Minimalist Data Wrangling with Python (Open-Access Textbook)
data-analysis data-science data-visualisation data-wrangling jupyter machine-learning matplotlib modelling numpy pandas python python3 scikit-learn scipy scipy-stats seaborn statistics
Last synced: 01 Jan 2025
https://github.com/sportsdataverse/sportsdataverse-py
sportsdataverse python package
cfb-data college-basketball college-football data-science espn hockey nba nba-stats nfl nflfastr nhl nhl-api python sports sports-analytics sports-data sports-stats sportsdataverse wnba womens-basketball
Last synced: 28 Dec 2024
https://github.com/ropensci/gittargets
Data version control for reproducible analysis pipelines in R with {targets}.
data-science data-version-control data-versioning r r-package reproducibility reproducible-research rstats targets workflow
Last synced: 19 Dec 2024
https://github.com/andrea-ballatore/open-geo-data-education
Open Geospatial Datasets for GIS Education: This is a repository of open geospatial datasets to be used in an educational context. I created these files over years of teaching Geographic Data Science and GIS. All original datasets are freely available online with open data licenses (see the dataset attribution for details). All the datasets in this repository have been selected, cleaned, harmonised, and repackaged for GIS exercises in a higher-education context. This is a pretty time-intensive process that other educators can hopefully avoid by using these versions.
data-science geojson geospatial-data geospatial-datasets gis gis-data gis-education tsv
Last synced: 27 Oct 2024