Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2024-11-19 00:06:52 UTC
https://github.com/dfinke/psduckdb
PSDuckDB is a PowerShell module that provides seamless integration with DuckDB, enabling efficient execution of analytical SQL queries directly from the PowerShell environment.
data-analysis data-science duckdb powershell sql
Last synced: 27 Oct 2024
https://github.com/pjaselin/cubist
A Python package for fitting Quinlan's Cubist regression model
data-science machine-learning python regression scikit-learn
Last synced: 14 Nov 2024
https://github.com/mlr-org/mlr3torch
Deep learning framework for the mlr3 ecosystem based on torch
data-science deep-learning machine-learning mlr3 r r-package torch
Last synced: 06 Nov 2024
https://github.com/florents-tselai/greek-wines-analysis
Scraper, Data and Analysis for "Analyzing 1000+ Greek Wines with Python"
beautifulsoup data-science pandas python seaborn web-scraping
Last synced: 31 Oct 2024
https://github.com/pityka/saddle
SADDLE: Scala Data Library
data data-science dataframe linear-algebra matrix numpy pandas scala
Last synced: 28 Sep 2024
https://github.com/arturoeanton/go-notebook
Go-Notebook is a tool for documenting Golang code, inspired by Project Jupyter.
data data-science data-visualization documentation gobook godoc golang golang-examples golang-notebook golang-tools gomacro jupyter notebook notebook-jupyter plot repl shell-go
Last synced: 12 Oct 2024
https://github.com/apreshill/data-vis-labs-2018
Principles & Practice of Data Visualization, CS631 Spring 2018
data-science data-visualization education rstats teaching
Last synced: 15 Nov 2024
https://github.com/mathworks/climatedatastore
Climate Data Store Toolbox for MATLAB
climate climate-data data-analysis data-science matlab matlab-toolbox
Last synced: 16 Nov 2024
https://github.com/njanakiev/openstreetmap-data-science
Data Science with OpenStreetMap
data-science openstreetmap python
Last synced: 06 Nov 2024
https://github.com/leemengtw/gist-evernote
A Python application that syncs GitHub Gists and saves them to an Evernote notebook as screenshots.
data-science evernote gists github github-graphql jupyter-notebook pet-project python selenium sync
Last synced: 07 Aug 2024
https://github.com/ivanbongiorni/tensorflow2.0_notebooks
Implementation of a series of Neural Network architectures in TensorFlow 2.0
autoencoder autograph batch-gradient-descent classifier cnn-classifier convolutional-neural-networks data-science deep-learning dimensionality-reduction forecast-model lstm machine-learning neural-network python python-3 rnn rnn-tensorflow tensorflow tensorflow-tutorials tensorflow2
Last synced: 23 Oct 2024
https://github.com/jules32/rmarkdown-website-tutorial
Tutorial for creating websites w/ R Markdown
data-science rmarkdown rstats teaching tutorial
Last synced: 06 Nov 2024
https://github.com/leandromineti/ml-knowledge-graph
Organizing concepts related to machine learning and artificial intelligence
artificial-intelligence calculus data-science deep-learning knowledge-graph linear-algebra machine-learning statistical-learning
Last synced: 05 Nov 2024
https://github.com/lkuffo/data-viz
More than 50 examples of data visualization and analysis in Matplotlib, Pandas, Seaborn, Plotly, Bokeh, and NetworkX
data-analysis data-science dataviz geoviz jupyter jupyter-notebook matplotlib networkx pandas plotly python seaborn
Last synced: 05 Nov 2024
https://github.com/tirendazacademy/chatgpt-with-examples
This repo contains ChatGPT tutorials about data science, machine learning, deep learning, and Python. We show how to use ChatGPT with examples.
chat-gpt chatgpt chatgpt-api chatgpt-python chatgpt3 data-science deep-learning machine-learning
Last synced: 08 Nov 2024
https://github.com/jmari/ipharo
Pharo Smalltalk kernel for Jupyter
data-science jupyter-notebook pharo pharo-smalltalk smalltalk
Last synced: 09 Oct 2024
https://github.com/tdeboissiere/cookiecutter-deeplearning
Project folder structure for doing and sharing deep learning work.
Last synced: 05 Nov 2024
https://github.com/jmari/iPharo
Pharo Smalltalk kernel for Jupyter
data-science jupyter-notebook pharo pharo-smalltalk smalltalk
Last synced: 17 Nov 2024
https://github.com/dMLTquant/openbb_sdk_exporation
Explore OpenBB SDK without having to install anything on your local machine. You just need a GitHub and a GitPod account.
algorithmic-trading data-science financial-data jupyter notebook openbb python
Last synced: 01 Nov 2024
https://github.com/opengeos/aws-open-data
A list of open datasets on AWS
amazon-web-services aws data-science deep-learning geospatial machine-learning open-data
Last synced: 11 Nov 2024
https://github.com/ipeirotis/introduction-to-python
Notes for the "Introduction to Programming for Data Science" class
data-science for-beginners python python3
Last synced: 17 Nov 2024
https://github.com/gaelforget/ClimateModels.jl
Julia interface to climate models + tracked workflow framework
atmosphere climate cmip data data-science earth-observation ecco git interface ipcc julia mitgcm modeling ocean parameters workflow
Last synced: 08 Aug 2024
https://github.com/leriomaggio/python-data-science
Lecture notes and materials for Python Data Science course
data-science jupyter-notebooks machine-learning materials python-tutorials
Last synced: 29 Oct 2024
https://github.com/ammsa/dtcleaner
DTCleaner: data cleaning using multi-target decision trees.
data-cleaning data-mining data-preprocessing data-quality data-science data-wrangling
Last synced: 28 Oct 2024
https://github.com/idouble/pandas-python-data-analysis-playground
🐍 Data Analysis with the Pandas Library & Notes 📊📈
analysis csv csv-files data data-analysis data-science data-visualization dataframe examples library pandas pandas-dataframe pandas-library pandas-python python
Last synced: 09 Nov 2024
https://github.com/MitchellHarrison/python-twitch-chatbot
A custom, 100% Python Twitch Chatbot that stores chat/viewership data in a PostgreSQL database.
analytics bot chat-bot data-analysis data-science database insights postgresql python-chat-application python-chatbot python-irc-bot python-postgresql python-twitch-api python-twitch-bot twitch twitch-api twitch-bot twitch-data twitch-irc
Last synced: 16 Nov 2024
https://github.com/tejzpr/ordered-concurrently
Ordered-concurrently is a Go library for concurrent processing with ordered output. It processes work concurrently and returns output on a channel in the order of input. It is useful for concurrently processing items in a queue while getting output in the order provided by the queue.
concurrent concurrent-data-structure data-pipeline data-science golang golang-library ordered parallel parallel-computing
Last synced: 26 Oct 2024
https://github.com/jhwohlgemuth/pwsh-prelude
PowerShell “standard” library for supercharging your productivity. Provides a powerful cross-platform scripting environment enabling efficient analysis and sustainable science in myriad contexts.
applied-mathematics cli cli-app data-science hacktoberfest library mathematics powershell powershell-module statistics text-processing text-to-speech user-interface
Last synced: 27 Oct 2024
https://github.com/nodestream-proj/nodestream
A declarative framework for building, maintaining, and analyzing graph data
api athena aws cli data-engineering data-lake data-science declarative etl framework graph graphql kafka knowledge-graph neo4j python s3 security visualization yaml
Last synced: 10 Oct 2024
https://github.com/hunar4321/reweight-gpt
Reweight GPT - a simple neural network using transformer architecture for next character prediction
algorithms data-science gpt language-model machine-learning nerual-networks numpy pytorch
Last synced: 14 Nov 2024
https://github.com/paulk-asert/groovy-data-science
Some Data Science examples using Groovy
beakerx commons-math constraint-programming data-science deep-learning groovy image-recognition kmeans-clustering linear-programming linear-regression mxnet natural natural-language-processing spark
Last synced: 01 Nov 2024
https://github.com/sametcopur/ruleopt
Optimization-Based Rule Learning for Classification
data-science explainable-ai linear-programming machine-learning machine-learning-library python
Last synced: 05 Nov 2024
https://github.com/ak-coram/cl-duckdb
Common Lisp CFFI wrapper around the DuckDB C API
c-bindings common-lisp data-science duckdb lisp olap parquet sql
Last synced: 13 Nov 2024
https://github.com/ActuariesInstitute/cookbook
Data and analytics cookbook for actuaries
actuarial analytics data-science hacktoberfest
Last synced: 08 Aug 2024
https://github.com/tstreamdoth/instacart-market-basket-analysis
Use Instacart public dataset to report which products are often shopped together. 🍋🍉🥑🥦
data-analysis data-science instacart market-basket-analysis
Last synced: 28 Oct 2024
https://github.com/root-11/tablite
Multiprocessing-enabled, out-of-memory data analysis library for tabular data.
data-analysis data-science datatype disk etl excel filereader pandas pivot-tables python table tabular-data
Last synced: 11 Oct 2024
https://github.com/rafzamb/sknifedatar
sknifedatar is a package that serves primarily as an extension to the modeltime 📦 ecosystem, and also provides some functionality for spatial data and visualization.
data data-analysis data-science data-visualization forecasting r statistics time-series
Last synced: 05 Aug 2024
https://github.com/megagonlabs/ruler
Data Programming by Demonstration (DPBD) for Document Classification
data-labeling data-programming data-science machine-learning training-data weak-supervision
Last synced: 10 Nov 2024
https://github.com/aachartmodel/aachartkit-swift-pro
📈📊👑👑👑AAChartKit-Swift-Pro is a professional version of AAChartKit-Swift, an elegant and friendly chart framework for iOS, iPadOS, and macOS. It is a more powerful data visualization framework that supports more chart types, such as bellcurve, bullet, columnpyramid, cylinder, dependencywheel, heatmap, histogram, networkgraph, organization, packedbubble, pareto, sankey, series, solidgauge, streamgraph, sunburst, tilemap, timeline, treemap, variablepie, variwide, vector, venn, windbarb, wordcloud, and xrange charts, among others.
aacharts chart charting-library data-science data-visualization framework highcharts hybrid ios ipados macos plot swift webview
Last synced: 07 Nov 2024
https://github.com/microsoft/automated-explanations
Explain a black-box module in natural language.
artificial-intelligence automated-interpretability data-science explanation fmri fmri-data-analysis gpt gpt4 huggingface interpretability language-model large-language-models machine-learning mechanistic-interpretability neuroscience xai
Last synced: 07 Oct 2024
https://github.com/scrapinghub/page_clustering
A simple algorithm for clustering web pages, suitable for crawlers
Last synced: 10 Nov 2024
https://github.com/glevv/obscure_stats
A small collection of lesser-known statistical measures
data-analysis data-science descriptive-statistics math numpy python robust-statistics scipy statistical-analysis statistics
Last synced: 14 Nov 2024
https://github.com/datakitchen/dataops-observability
DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from data source to customer value, from any team development environment into production, across every tool, team, environment, and customer so that problems are detected, localized, and understood immediately.
data data-engineering data-observability data-science dataops pipleine-monitoring
Last synced: 06 Nov 2024
https://github.com/benjaminmbrown/real-time-data-viz-d3-crossfilter-websocket-tutorial
Tutorial on real-time data visualization. Python websocket server & d3.js + crossfilter.js frontend
crossfilter d3 d3js data-science data-visualization dcjs tutorial websockets
Last synced: 06 Aug 2024
https://github.com/chandrikadeb7/coursera_ibm_data_science_professional_certificate
This repo contains the lecture PDFs and quiz solutions for all courses in the IBM Data Science Professional Certificate specialization on Coursera.
coursera coursera-assignment coursera-data-science coursera-solutions coursera-specialization data-science ibm ibm-data-science jupyter-notebook lecture-pdfs professional-certificates quiz-solutions solutions specialization
Last synced: 09 Nov 2024
https://github.com/meiyulee/MathAI
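Free number-driven mathematical-model AI | Build mathematical models for your numerical patterns | Portable (no-install) software written in C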
ai artifical-intelligence bigdata chatgpt data-science dataanalytics datadriven math-ai mathai mathematical-modelling mathematics mathgpt numerical-computation numerical-methods portable regression regression-analysis regression-models science statistics-modeling
Last synced: 09 Nov 2024
https://github.com/mlabonne/blog
https://mlabonne.github.io/blog/
blog data-science graph-neural-networks linear-programming reinforcement-learning
Last synced: 16 Nov 2024
https://github.com/lamres/capm_shiny
Demo project of creating an interactive analytical tool for stock market using CAPM.
capm data-science r shiny shinyapps stock-market stocks time-series
Last synced: 13 Aug 2024
https://github.com/sayakpaul/benchmarking-and-mli-experiments-on-the-adult-dataset
Contains benchmarking and interpretability experiments on the Adult dataset using several libraries
data-science fastai h2oai interpretable-machine-learning machine-learning microsoft-interpret tensorflow
Last synced: 22 Oct 2024
https://github.com/dfinke/PSDuckDB
PSDuckDB is a PowerShell module that provides seamless integration with DuckDB, enabling efficient execution of analytical SQL queries directly from the PowerShell environment.
data-analysis data-science duckdb powershell sql
Last synced: 23 Aug 2024
https://github.com/ww-tech/primrose
Primrose modeling framework for simple production models
dag data-science datascience deployment machine-learning primrose python workflows
Last synced: 18 Nov 2024
https://github.com/viodotcom/ppca_rs
Python+Rust implementation of the Probabilistic Principal Component Analysis model
data-science dimensionality-reduction em-algorithm linear-algebra machine-learning machine-learning-algorithms maximum-likelihood maximum-likelihood-estimation missing-data missing-values pca pca-analysis python rust
Last synced: 14 Nov 2024
https://github.com/amitkaps/art-data-science
The Art of Data Science
data-analysis data-science data-visualisation problem-solving workshop-materials
Last synced: 06 Nov 2024
https://github.com/alan-turing-institute/grace
Graph Representation Analysis for Connected Embeddings
computer-vision data-science feature-extraction graphical-models image-processing latent-representations machine-learning neural-networks object-detection
Last synced: 13 Nov 2024
https://github.com/machinable-org/machinable
A modular system for machinable research code
convention-over-configuration data-science framework-agnostic machine-learning python-3 research-and-development
Last synced: 30 Oct 2024
https://github.com/ai4os/DEEPaaS
A REST API to serve machine learning and deep learning models
aiohttp artificial-intelligence data-science deep-hybrid-datacloud deep-learning deepaas-api h2020 http machine-learning machinelearning neural-network neural-networks rest-api restful-api
Last synced: 05 Aug 2024
https://github.com/ivan-bilan/nlp-and-data-science-spotlights
Regular spotlights of underrated NLP and Data Science GitHub repositories
data-science deep-learning natural-language-processing nlp spotlight
Last synced: 08 Nov 2024
https://github.com/scicloj/scicloj-data-science-handbook
Clojure data science handbook - journal style examples of data science
clojure clojurescript data-science notebook scicloj
Last synced: 15 Nov 2024
https://mlabonne.github.io/blog/
https://mlabonne.github.io/blog/
blog data-science graph-neural-networks linear-programming reinforcement-learning
Last synced: 03 Sep 2024
https://github.com/dataprofessor/youtube
Collection of YouTube videos on data science on the Data Professor YouTube channel.
bioinformatics data-professor data-science data-science-toolbox dataprofessor datascience machine-learning machinelearning python python-data-science r scikit-learn streamlit youtube youtube-channel
Last synced: 11 Oct 2024
https://github.com/wri-dssg-omdena/policy-data-analyzer
Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
active-learning bert data-science document-classification environmental huggingface incentives landscape-restoration lda machine-learning nlp policy sbert scraping scrapy sentence-transformers spyder text-classification topic transformers
Last synced: 30 Oct 2024
https://github.com/kalebu/real-time-vehicle-dection-python
A Python project that performs real-time vehicle detection using a trained car-cascade model
article computervision data-science machine-learning object-detection python python-3 python-datascience python-tanzania pythonprojects vehicle-dection-python vehicle-detection
Last synced: 09 Nov 2024
https://github.com/braph-software/BRAPH-2
BRAPH 2.0 is a comprehensive software package for the analysis and visualization of brain connectivity data, offering flexible customization, rich visualization capabilities, and a platform for collaboration in neuroscience research.
biomedical-engineering brain-connectivity-analysis brain-research computational-neuroscience connectomics data-analysis data-science data-visualization deep-learning graph-theory machine-learning matlab network-analysis neuroimaging neuroscience open-source reproducible-research research-tools scientific-software toolbox
Last synced: 12 Nov 2024
https://github.com/mainakrepositor/parkinsons-detector
Detect the possible onset or risk of Parkinson's disease from clinical data using machine learning models.
data-science data-visualization decision-tree-classifier medical-application mini-project parkinsons-disease python-3 random-forest-classifier slider-component streamlit-webapp
Last synced: 12 Nov 2024
https://github.com/virgesmith/ukcensusapi
UK Census Data queries and downloads from Python or R
Last synced: 27 Oct 2024
https://github.com/mlsanigeria/ai-hacktober-mlsa
Contributing to cutting-edge open-source projects in Machine Learning hosted by MLSA Nigeria
artificial-intelligence data-science hacktoberfest machine-learning microsoft-azure mlsa open-source python
Last synced: 06 Nov 2024
https://github.com/IMSoley/cs-study-plan
📚👨🎓 Resources I'm using every day to develop my skills and become a good self-taught programmer ...
artificial-intelligence computer-science data-science data-structures-and-algorithms higher-education machine-learning web-development
Last synced: 04 Aug 2024
https://github.com/center-for-threat-informed-defense/sightings_ecosystem
Sightings Ecosystem gives cyber defenders visibility into what adversaries actually do in the wild. With your help, we are tracking MITRE ATT&CK® techniques observed to give defenders real data on technique prevalence.
ctid cyber-threat-intelligence cybersecurity data-science data-visualization mitre-attack
Last synced: 07 Nov 2024
https://github.com/bfgray3/cattonum
Encode Categorical Features (unmaintained)
categorical-features cran data-science dummies encoding machine-learning r r-package rstats supervised-learning
Last synced: 13 Aug 2024
https://github.com/nadhirfr/rf-ids
Machine Learning-Based Intrusion Detection System
cic-ids-2018 data-science ddos ddos-detection ddos-mitigation django-framework intrusion-detection intrusion-detection-system machine-learning machinelearning random-forest sflow sflow-rt software-defined-network
Last synced: 11 Oct 2024
https://github.com/noaa-mdl/grib2io
Python interface to the NCEP G2C Library for reading and writing GRIB2 messages.
atmospheric-science data-science grib2 grib2-decoder grib2-encoder grib2-tables meteorology ncep ncep-grib2 ndfd-grib2 numpy python python3 weather weather-data
Last synced: 11 Nov 2024
https://github.com/martinfleis/sdsc21-workshop
Materials for SDSC 2021 Workshop
Last synced: 28 Oct 2024
https://github.com/dermatologist/fhiry
FHIR to pandas dataframe for data analytics, AI and ML!
analytics data-analysis data-science ehealth fhir hacktoberfest jupyter-notebook pandas python pytorch tensorflow
Last synced: 10 Oct 2024
https://github.com/adityashrm21/bike-sharing-demand-kaggle
Top 5th percentile solution to the Kaggle knowledge problem - Bike Sharing Demand
bikesharing boxplots data-science datascience decision-trees feature-engineering feature-extraction kaggle kaggle-competition kaggle-dataset kaggle-scripts log-transformation machine-learning r random-forest
Last synced: 11 Nov 2024
https://github.com/ELToulemonde/dataPreparation
Data preparation for data science projects.
data-preparation data-preprocessing data-science date-conversion r speed variable-elimination variable-selection
Last synced: 13 Aug 2024
https://github.com/machinecurve/extra_keras_datasets
📃🎉 Additional datasets for tensorflow.keras
data-science dataset datasets deep-learning emnist-digits emnist-letters iris iris-classification iris-dataset keras keras-datasets keras-tensorflow lowercase-handwritten-letters machine-learning neural-networks svhn tensorflow
Last synced: 01 Nov 2024
https://github.com/saschagrunert/kubeflow-data-science-on-steroids
The blog post about Kubeflow, including all materials
blog blog-post cloud-native data-science datascience kubeflow kubernetes talk
Last synced: 27 Oct 2024
https://github.com/tjmahr/madr_pipelines
Slides and materials for my talk to the Madison R Users Group
data-science dplyr magrittr presentation r
Last synced: 12 Nov 2024
https://github.com/fbruzzesi/timebasedcv
Time based splits for cross validation
cross-validation data-science python time-series time-series-analysis
Last synced: 15 Nov 2024
https://github.com/ahammadmejbah/ai-cheat-sheet
The replication of human intellectual processes by machines, most notably computer systems, is referred to as artificial intelligence (AI for short). Expert systems, natural language processing, voice recognition, and machine vision are all examples of specific uses of artificial intelligence.
cheatsheet data-science deep-learning machine-learning neural-networks
Last synced: 11 Nov 2024
https://github.com/maneprajakta/honours-in-data-science
Resources and Implementation Of Assignment For Honours In Data Science
assignment-solutions data-science honours resources sppu
Last synced: 27 Oct 2024
https://github.com/tirthajyoti/mlr
Multiple linear regression with statistical inference, residual analysis, direct CSV loading, and other features
analytics data-analytics data-science linear-regression machine-learning modeling predictive-modeling python regression scikit-learn statiscal-learning statistical-analysis statistics
Last synced: 12 Oct 2024
https://github.com/Daniel-Mietchen/datascience
Keeping track of activities around research data
data-science data-sharing open-data open-science research research-data research-data-management research-funding science-policy
Last synced: 09 Aug 2024
https://github.com/Smat26/Roman-Urdu-Dataset
Compilation of Manually Tagged Roman Urdu Dataset (Urdu written in Latin/Roman Script), along with other helpful Roman Urdu NLP resources
data-science dataset hindi hindi-language natural-language-processing nlp urdu urdu-language urdu-nlp
Last synced: 18 Nov 2024
https://github.com/epistasislab/rebate
Relief Based Algorithms of ReBATE implemented in Python with Cython optimization. This repository is no longer being updated. Please see scikit-rebate.
cython data-science feature-selection
Last synced: 16 Nov 2024
https://github.com/racinmat/mal-analysis
GitHub repo for MyAnimeList analysis. Also links to the MAL dataset.
analysis anime crawling data-science kaggle-dataset mal scraped-data
Last synced: 06 Nov 2024
https://github.com/BorisNikulin/discord-chat-analysis
Text analysis of a Discord chat group
data-analysis data-mining data-science data-visualization
Last synced: 05 Aug 2024
https://github.com/thudm/kdd-industrial-papers
A list of recent industrial papers in KDD'16–'18
data-mining data-science kdd paper-list
Last synced: 14 Nov 2024
https://github.com/lastancientone/mathematics_for_machine_learning
Learn the mathematics behind machine learning and explore the different areas of mathematics it draws on.
algebra applications calculus curves data-science deep-learning geometry linear-algebra linear-systems machine-learning math mathematics matrix matrix-calculations probability software statistics vector
Last synced: 06 Nov 2024
https://github.com/openghg/openghg
A cloud platform for greenhouse gas (GHG) data analysis and collaboration.
analysis cloud collaboration data-science greenhouse-gas
Last synced: 14 Nov 2024
https://github.com/m-clark/data-processing-and-visualization
This document forms the basis of several workshops/talks that get into everyday programming with R, but also includes mirrored code in Python as Jupyter notebooks.
data-processing data-science datatable dplyr ggplot2 htmlwidgets jupyter-notebooks machine-learning model-criticism modeling numpy pandas programming programming-exercises python r tidyverse visualization workshop workshops
Last synced: 08 Aug 2024
https://github.com/petersontylerd/mlmachine
mlmachine accelerates machine learning experimentation
data-analysis data-science data-visualization machine-learning python
Last synced: 13 Nov 2024
https://github.com/giswqs/learning-r
R Tutorials
data-science gis r statistics tutorials
Last synced: 02 Nov 2024
https://github.com/timkpaine/perspective-parquet
Parquet file reader and editor in JupyterLab, built with `perspective` for pivoting, filtering, aggregating, etc.
data-science data-visualization datavisualization dataviz jupyter jupyterlab jupyterlab-extension jupyterlab-extensions parquet parquet-viewer perspective pivot-tables
Last synced: 27 Oct 2024
https://github.com/kwokhing/yandexcatboost-python-demo
Demo on the capability of Yandex CatBoost gradient boosting classifier on a fictitious IBM HR dataset obtained from Kaggle. Data exploration, cleaning, preprocessing and model tuning are performed on the dataset
catboost data-analysis data-preprocessing data-science feature-selection gradient-boosting gradient-boosting-classifier one-hot-encode pandas pearson-correlation python python27 seaborn variance-analysis visualization yandex-catboost
Last synced: 12 Oct 2024
https://github.com/iesahin/xvc
A robust (🐢) and fast (🐇) MLOps tool for managing data and pipelines in Rust (🦀)
command-line-tool data data-engineering data-pipelines data-science devops machine-learning machine-learning-engineering mlops rust
Last synced: 11 Nov 2024
https://github.com/explosion/vscode-prodigy
🧬 A VS Code extension for annotating data with Prodigy
annotation-tool data-annotation data-labeling data-labeling-tools data-science labeling-tool nlp prodigy spacy vscode vscode-extension
Last synced: 07 Oct 2024
https://github.com/ericmjl/pyds-cli
Helping you manage your data science projects sanely.
data-science workflow-automation
Last synced: 31 Oct 2024