Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2026-07-03 00:07:42 UTC
https://github.com/kennethleungty/tensorflow-transfer-learning-image-classification
Practical Guide to Transfer Learning in TensorFlow for Multiclass Image Classification
artificial-intelligence data-science deep-learning image-classification machine-learning tensorflow transfer-learning
Last synced: 05 Oct 2025
https://github.com/peleiden/pelutils
Utility module for Python
data-science logging machine-learning parsing profiling
Last synced: 14 Jan 2026
https://github.com/dhhruv/kisaani
"Kisaani" is an application that takes required parameters intelligently or from the database of the location (from the cloud) and provides the list of best crops suited for that land. The application should also be able to collect the outcome after cultivation and apply correction as appropriate for further advisories. The details of the crops for the region and conditions are provided. Applications should be interactive, user friendly for farmers (provide local language support) and should provide support in real time.
crop crop-recommendation data-science ieee ieee-hackathon machine-learning
Last synced: 07 Mar 2026
https://github.com/virajbhutada/spotify-track-analysis-and-recommendation
Experience a comprehensive exploration of Spotify's musical landscape seamlessly transitioned from Tableau visualizations to SQL analysis. Dive into track inventory, streaming metrics, and sonic trends via interactive dashboards, while leveraging SQL queries for deeper insights into KPIs and cross-platform rankings.
audio-analysis data-analysis data-analytics data-science data-visualization eda machine-learning-library ml-models mysql recommendation-system spotify spotify-data spotify-dataset sql-database sql-server streaming-metrics tableau tableau-public trends-analysis
Last synced: 28 Apr 2025
https://github.com/valohai/minihai
An open-source application for running notebooks server-side
data-science jupyter jupyter-notebook machine-learning notebook
Last synced: 02 Aug 2025
https://github.com/devinterview-io/cost-function-interview-questions
🟣 Cost Function interview questions and answers to help you prepare for your next machine learning and data science interview in 2024.
ai-interview-questions coding-interview-questions coding-interviews cost-function cost-function-interview-questions cost-function-questions cost-function-tech-interview data-science data-science-interview data-science-interview-questions data-scientist-interview interview-practice interview-preparation machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions software-engineer-interview technical-interview-questions
Last synced: 08 Jan 2026
https://github.com/invia-flights/blitzly
Lightning-fast way to get plots with Plotly ⚡️
data-analysis data-science plotly plotting-in-python python visualization
Last synced: 14 Jan 2026
https://github.com/ahammadmejbah/glossary-of-artificial-intelligence
A "Glossary of Artificial Intelligence" is a concise reference resource defining key terms, concepts, and terminology related to AI. It provides explanations and definitions to help individuals understand and navigate the field of artificial intelligence, making it a valuable tool for both beginners and experts in the AI domain.
artificial-intelligence data data-science deep-learning deep-learning-algorithms detection image-processing machine-learning python
Last synced: 25 Jun 2025
https://github.com/teddyoweh/disease-data-webscrape-analysis
Scraped a table of disease and symptom data from a website, turned it into a dataframe, then exported it to a CSV file
data-analytics data-science data-visualization webscraping
Last synced: 09 Apr 2025
https://github.com/siarheidudko/firebase-admin-cli
CLI for Firebase
authentication cli data-manipulation data-migratation data-science firebase firebase-admin firebase-auth firebase-authentication firebase-database firebase-firestore firebase-firestore-database firebase-realtime-database firebase-storage firebase-tools firebase-ui firestore google-cloud-storage rtdb storage
Last synced: 27 Feb 2026
https://github.com/lintangwisesa/javascript-on-jupyterlab
JavaScript (Node.js) kernel inside Jupyter Lab (Jupyter Notebook)
data-science javascript jupyter-lab jupyter-notebook
Last synced: 26 Apr 2025
https://github.com/just-krivi/real-estate-market-analysis
Streamlit web app using custom ML models (multiple linear regression and one-to-many multiclass kernel SVM) for predicting real estate prices; Scraping and analyzing real estate listings in Serbia
data-science docker gradient-descent machine-learning multiclass-support-vector-machine multiple-linear-regression postgresql python scrapy stramlit svm webscraping
Last synced: 04 Oct 2025
https://github.com/yoshoku/numo-openblas
Numo::OpenBLAS builds and uses OpenBLAS as a background library for Numo::Linalg
data-science machine-learning numo openblas ruby
Last synced: 25 Apr 2025
https://github.com/montanaz0r/mit-6.0002-course
My solutions to the assignments of MIT 6.0002 course.
data-science mit python python3
Last synced: 12 Aug 2025
https://github.com/inseefrlab/grandedim
Code accompanying the working paper "L'économétrie en grande dimension" (High-Dimensional Econometrics)
data-science econometrics high-dimensional-data publication r statistics
Last synced: 13 Jun 2025
https://github.com/alessandrocorradini/microsoft-data-science-professional-program
Repository for the Microsoft Data Science Professional Program
data-analysis data-science data-visualization datascience excel machine-learning machinelearning microsoft microsoft-professional-certificate microsoft-professional-program python python3 sql t-sql
Last synced: 13 Jul 2025
https://github.com/pegah-ardehkhani/cancer-patients-survival-analysis
Survival Analysis of Lung Cancer Patients
cancer-patients cox-model cox-proportional-hazard cox-proportional-hazards coxph-model data-science kaplan-meier kaplan-meier-plot kaplanmeierfitter log-rank-test nelsonaalenfitter python survival survival-analysis survival-models survival-prediction
Last synced: 24 Apr 2025
https://github.com/thetallprogrammer/stock-contender-app
Welcome to Stock Contender, an AI-powered tool designed to assist your market analysis. This tool is not an investment advisor and does not guarantee profits. Invest at your own risk. Stay updated with my latest developments.
artificial-intelligence chat-gpt data-science financial-data-analysis financial-technology fintech investment-analysis machine-learning openai openai-api python stock-market stock-prediction stock-trading
Last synced: 05 Sep 2025
https://github.com/devinterview-io/neural-networks-interview-questions
🟣 Neural Networks interview questions and answers to help you prepare for your next machine learning and data science interview in 2024.
ai-interview-questions coding-interview-questions coding-interviews data-science data-science-interview data-science-interview-questions data-scientist-interview interview-practice interview-preparation machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions neural-networks neural-networks-interview-questions neural-networks-questions neural-networks-tech-interview software-engineer-interview technical-interview-questions
Last synced: 07 Jan 2026
https://github.com/eftekin/data-science-adventures
This repository contains my data science journey, showcasing projects, code snippets, and resources as I learn and explore the world of data science using Python.
Last synced: 28 Feb 2025
https://github.com/zachbateman/evogression
Python Machine Learning using an evolutionary regression algorithm. More intuitive with higher transparency than a neural network while providing much greater power and high-dimensionality capabilities than more simplistic regression techniques.
artificial-intelligence data-science machine-learning neural-network python regression
Last synced: 12 Jun 2025
https://github.com/coatless-textbooks/statistical-concepts-with-shiny-apps
Quarto book illustrating various statistical concepts using Shinylive.
data-science quarto quarto-book r-shiny r-shinylive statistics webr
Last synced: 12 Jun 2025
https://github.com/sanvishal/Exoplanet-Explore
An interactive data visualization of exoplanets
animation d3js data-analysis data-science exoplanet python space visualization
Last synced: 14 Apr 2025
https://github.com/sharatsawhney/character_segmentation
A detailed Research project on Character-Segmentation using Neural Networks!
data-science deep-learning deep-neural-networks keras keras-layer keras-models keras-neural-networks matplotlib neural-network numpy opencv-python
Last synced: 02 Apr 2025
https://github.com/alexeatscake/gigaanalysis
A toolbox for processing data that can be expressed as a dependent and independent variable.
condensed-matter-physics data-science matplotlib numpy physics scipy
Last synced: 03 Jul 2025
https://github.com/giswqs/timelapse
An interactive Streamlit web app for creating satellite timelapses
data-science dataviz earthengine geopython python satellite streamlit
Last synced: 12 May 2025
https://github.com/ryanrudes/wikimedia
A dataset comprised of over 40 million images sourced from Wikimedia Commons
computer-vision data-science data-scraping dataset datasets deep-learning gans image images machine-learning wikimedia wikimedia-commons
Last synced: 13 Sep 2025
https://github.com/kqc-real/streamlit
Multiple-choice tests in German
agiles-projektmanagement data-science deep-learning mathematische-grundlagen
Last synced: 23 Jan 2026
https://github.com/leriomaggio/develer-data-science
Deep dive into Data Science with Python @ Develer
data-science deep-learning keras keras-tensorflow lecture-notes machine-learning numpy python python3 scikit-learn tutorial
Last synced: 21 Jul 2025
https://github.com/waylonwalker/kedro-auto-catalog
Kedro catalog create with default configuration
data data-science kedro kedro-catalog kedro-hook kedro-plugin
Last synced: 12 Jun 2025
https://github.com/felipexw/knn-java-library
Just a simple implementation of K-Nearest Neighbour algorithm.
data-science k-nearest-neighbor-classifier k-nearest-neighbours knn knn-algorithm knn-classification knn-classifier machine-learning supervised-learning supervised-learning-algorithms supervised-learning-classifiers supervised-machine-learning
Last synced: 13 Nov 2025
https://github.com/m-taghizadeh/python-webinar
This webinar offers quick, practical training as a first step toward becoming a professional Python programmer. After watching it, you should have a solid grasp of programming in Python and of using Python for artificial intelligence, machine learning, deep learning, data mining, and backend programming with Flask and Django.
artificial-intelligence convolutional-neural-networks data-science deep-learning django flask machine-learning python
Last synced: 08 Sep 2025
https://github.com/public-health-scotland/technical-docs
Technical documentation, including guidance and best practice for Public Health Scotland (PHS)
data-science documentation git github python r
Last synced: 14 Apr 2025
https://github.com/mine-cetinkaya-rundel/feedback-at-scale
Slides and sample learnr tutorial for rstudio::global(2021) talk
data-science gradethis learnr rstats tutorial
Last synced: 11 Feb 2026
https://github.com/ramonhpr/knot-lib-python
API to get data from the cloud and run some data analytics
data-science iot iot-framework web
Last synced: 26 Jun 2026
https://github.com/dataship/python-dataship
Lightweight tools for reading, writing and storing data, locally and over the internet for python
column-store data-science machine-learning numpy pandas
Last synced: 23 Apr 2025
https://github.com/codelibs/docker-fione
Docker for Fione
ai automl data-science machine-learning
Last synced: 23 Mar 2025
https://github.com/timetoai/timediffusion_forecasting
Research Project on time-series forecasting
data-science deep-learning machine-learning pytorch time-series time-series-forecasting
Last synced: 07 Mar 2026
https://github.com/zmoooooritz/stapy
An easy to use SensorThings API Client written in Python
api cli data-science database ogc python sensor sensor-data sensorthings sensorthings-api
Last synced: 17 Jan 2026
https://github.com/nicodupont/mooc
All my finished MOOCs, mainly on data science
data-analysis data-science data-visualization datacamp jupyter-notebook machine-learning mooc pandas python sas sql
Last synced: 28 Apr 2025
https://github.com/faical-allou/clustering_od
K-means clustering algorithm in pure Python 3.5 (solved with Lloyd's algorithm)
cluster clustering-algorithm data-science k-means k-means-clustering kmeans-clustering python
Last synced: 26 Jul 2025
https://github.com/he7d3r/maratona-behind-the-code-2021
data-science ibm-cloud machine-learning maratona
Last synced: 26 Oct 2025
https://github.com/jeafreezy/rsgis
A python package for basic to advanced GIS operations.
analysis data-science gis python
Last synced: 12 Apr 2025
https://github.com/nsembleai/nsvision
nsvision is an image data pre- and post-processing and data augmentation library. It provides utilities for working with image data.
data-science docker image image-classification image-manipulation image-processing jupyter library normalization numpy object-detection opencv opencv-python pillow python python-3 python-library python3 reduce-image-dimensions split-data
Last synced: 22 Feb 2026
https://github.com/surajv311/udemy_course_resources
List of course resources from my Udemy course "Numpy for Data Science" (2020)
arrays data-science numpy numpy-tutorial python3 udemy udemy-course
Last synced: 16 May 2025
https://github.com/ul-mds/gecko
Python library for the generation and mutation of realistic personal identification data at scale
data-science numpy pandas python record-linkage
Last synced: 24 Apr 2025
https://github.com/memgonzales/pisa-2018-analysis
Jupyter notebook presenting the process of data preparation, research question formulation, data analysis, and data modeling with the goal of extracting insights from the 2018 PISA Dataset
data-cleaning data-modeling data-science data-visualization exploratory-data-analysis jupyter-notebook matplotlib numpy oecd-data pandas pisa scipy statistical-inference
Last synced: 13 Jun 2025
https://github.com/rubentea16/cheat-sheet-road-map
Data Science Cheatsheet and NLP Road Map
data-science machine-learning python
Last synced: 04 Feb 2026
https://github.com/lfrench03/ganaderia-en-cuba
Based on the data provided by the National Office of Statistics and Information ONEI and other alternative trusted sources mentioned in the references, our main objective is to present a detailed vision of how livestock farming has evolved in Cuba during the period until 2022.
cuba data-science dataproduct ganaderia streamlit streamlit-application timeline
Last synced: 26 Jul 2025
https://github.com/stink-po/boxoffice_api
Unofficial Python API for Box Office Mojo
data-science dataset movies-and-cinemas scraper
Last synced: 07 Sep 2025
https://github.com/tchlux/util
My machine learning, optimization, and data science utilities package.
data-science machine-learning numerical-optimization python-utilities splines statistics visualization
Last synced: 02 May 2026
https://github.com/divyanshugit/66daysofdata
This repo contains the source code for a static webpage where you can find out answers to Machine Learning Interview questions.
data-science interview-questions machine-learning
Last synced: 31 Jan 2026
https://github.com/canbula/datascience
Repository for Data Science course given by Assoc. Prof. Dr. Bora Canbula at Computer Engineering Department of Manisa Celal Bayar University.
data-science machine-learning matplotlib numpy pandas python python3 scikit-learn seaborn
Last synced: 04 Apr 2026
https://github.com/nemeslaszlo/social-media-analysis-based-on-covid-19-with-sentiment-analysis-ner-and-information-extraction
This repository contains the social media data scraper and the notebooks for this analysis. We analyze social media posts (tweets) with sentiment analysis, then examine the results with Named Entity Recognition (NER) and information extraction methods to get a more accurate and detailed picture of the sentiment results.
bert data-science data-visualization information-extraction keras named-entity-recognition nltk reccurent-neural-network tensorflow textblob
Last synced: 25 Jul 2025
https://github.com/stefanpeidli/gonet
A students' project on Go
data-science go machine-learning neural-network students
Last synced: 26 Mar 2025
https://github.com/alexcj10/diwali-sales-analysis
This repository contains an analysis of Diwali sales data to uncover trends and patterns in customer behavior. The project aims to provide insights into customer demographics, purchasing habits, and product preferences during the Diwali season.
analysis data-science diwali jupyter-notebook matplotlib numpy pandas python sales seaborn
Last synced: 15 Apr 2025
https://github.com/devinterview-io/model-evaluation-interview-questions
🟣 Model Evaluation interview questions and answers to help you prepare for your next machine learning and data science interview in 2024.
ai-interview-questions coding-interview-questions coding-interviews data-science data-science-interview data-science-interview-questions data-scientist-interview interview-practice interview-preparation machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions model-evaluation model-evaluation-interview-questions model-evaluation-questions model-evaluation-tech-interview software-engineer-interview technical-interview-questions
Last synced: 08 Jan 2026
https://github.com/anaregdesign/vectorize-openai
Tabular calculation with LLM, Spark UDF Builder
apache-spark data-engineering data-science llm machinelearning openai openai-api pandas python spark-sql spark-udf
Last synced: 14 Mar 2025
https://github.com/mdh266/crimetime
Python web application for exploring and forecasting crime rates in NYC
data-science docker flask-application forecasting-crime-rates geospatial-analysis pandas python statsmodels time-series-analysis
Last synced: 30 Jul 2025
https://github.com/stappit/blog
I often post solutions to textbook exercises, including: Bayesian Data Analysis (BDA) by Gelman et al; Causal Inference in Statistics Primer (CISP) by Pearl et al; Purely Functional Data Structures (PFDS) by Okasaki.
bayesian-data-analysis blog data-analysis data-science gelman hakyll haskell pearl purely-functional-data-structures solutions stan static-site statistical-inference statistics
Last synced: 14 Mar 2025
https://github.com/toros-astro/corral
The Powerful Pipeline Framework
astronomy data-science database framework oop pipeline python python3
Last synced: 27 Jul 2025
https://github.com/reddyprasade/dataset-for-ml-and-data-science
Freely Available Data Sets For Real world Problems
data-science dataset datasets machine-learning machinelearningdataset python
Last synced: 18 Sep 2025
https://github.com/jamesquinlan/intro-python
Introduction to Programming and Data Science with Python
data-science nlp python python-3
Last synced: 18 Aug 2025
https://github.com/sondosaabed/data-analyst-nanodegree
I acquired a full scholarship from Google Launchpad. Advanced data wrangling skills to work with messy, complex real-world datasets. Highly customized visualizations using the Matplotlib Python library.
data-science dataanalysis datawrangling nanodegree python udacity-nanodegree
Last synced: 09 Apr 2025
https://github.com/tiagoantao/virtual-core
A data science core based on Docker containers
Last synced: 14 Mar 2026
https://github.com/brunocampos01/predict-which-customers-a-call-center-should-contact
Predict which customers a call center should call for greater sales effectiveness
analytics call-center call-center-analytics challenge correlation data-engineering data-science dataset keyrus linear-regression linear-regression-models machine-learning polynomial-regression pt-br python random-forest random-forest-classifier
Last synced: 03 Sep 2025
https://github.com/archie-cm/churn-analysis-ecommerce-customer
The objective of this project is to predict customer churn and lost opportunity, and to provide recommendations to the business team so the company can apply customer personas in its retention strategy and monitor results through an interactive dashboard.
data-science feature-engineering machine-learning python scikit-learn
Last synced: 23 Apr 2025
https://github.com/aaaastark/false-data-injection-attack
False Data Injection Attack (FDIA) with Long Short-Term Memory (LSTM) Model using Python
adversarial-attacks data-science data-visualization deep-learning false-data-injection false-data-injection-attack injection-attack keras lstm lstm-neural-networks lstm-sentiment-analysis machine-learning matplotlib numpy pandas python seaborn sklearn tensorflow time-series-analysis
Last synced: 24 Sep 2025
https://github.com/techn0man1ac/toxiccommentclassification
This project aims to develop a model capable of identifying and classifying different levels of toxicity in comments, using the power of BERT (Bidirectional Encoder Representations from Transformers) for text analysis.
analysis bert-model classifying data-science docker machine-learning python streamlit text-classification transformers-models
Last synced: 18 Aug 2025
https://github.com/tsg405/sql-for-data-science----coursera
This repo contains starter files, coursework, and programming assignments for the course SQL for Data Science from the University of California, Davis (Coursera).
california chinook-database coursera data-science query-language quiz sql sqlite ucdavis-datalab yelp-dataset
Last synced: 14 Apr 2025
https://github.com/rafaelpermec/live-broker-api
A study of back-end data scraping, simulating a brokerage that handles asset buy and sell operations and client cash flow in real time.
authentication authorization backend-api cheerio data-science express helmet jwt-authentication mysql nodejs typescript web-scraping
Last synced: 19 Apr 2025
https://github.com/arbox/learning-scala-for-data-science
Data Science: Scala for brave and impatient
big-data bigdata data-science datascience scala spark
Last synced: 10 Mar 2026
https://github.com/iamyajat/whatsapp-chat-analyzer-api
An API to analyse WhatsApp chats and generate insights
data-analysis data-science fastapi python whatsapp
Last synced: 17 Oct 2025
https://github.com/westlake-ai/dmt-learn
An Explainable Deep Network for Dimension Reduction
data-science dimension-reduction python
Last synced: 07 Aug 2025
https://github.com/ruivieira/nim-mentat
A Nim library for data science and machine learning
data-science library machine-learning nim scientific-computing
Last synced: 10 Aug 2025
https://github.com/frauddi/dataspot
Find data concentration patterns and hotspots. Built for fraud detection and risk analysis.
anomalies anomalies-detection data-analysis data-science fraud-detection hotspots pattern-mining python
Last synced: 09 Apr 2026
https://github.com/bhattbhavesh91/auto-sklearn-tutorial
Small tutorial on auto-sklearn which is an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
auto-ml auto-sklearn automl data-science machine-learning python tutorial
Last synced: 27 Oct 2025
https://github.com/marwandebbiche/tsgen
Time Series Generator
data-science time-series time-series-analysis
Last synced: 06 Apr 2025
https://github.com/npow/awesome-metaflow
Every Metaflow extension worth knowing, curated and organized
awesome-list data-science extensions machine-learning metaflow mlops python workflow-orchestration
Last synced: 23 May 2026
https://github.com/devinterview-io/curse-of-dimensionality-interview-questions
🟣 Curse of Dimensionality interview questions and answers to help you prepare for your next machine learning and data science interview in 2024.
ai-interview-questions coding-interview-questions coding-interviews curse-of-dimensionality curse-of-dimensionality-interview-questions curse-of-dimensionality-questions curse-of-dimensionality-tech-interview data-science data-science-interview data-science-interview-questions data-scientist-interview interview-practice interview-preparation machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions software-engineer-interview technical-interview-questions
Last synced: 19 Feb 2026
https://github.com/timkong21/polyp-segmentation
Polyp segmentation tool utilizing U-Net for accurate medical image analysis, designed to enhance early detection and diagnosis of colorectal cancer. Features a user-friendly Streamlit web app for easy image processing and analysis, leveraging the Kvasir-SEG dataset for improved healthcare outcomes.
aws-s3 cancer-detection colonoscopy computer-vision data-augmentation data-science deep-learning diagnostics healthcare machine-learning medical-application medical-image-analysis medical-image-processing medical-image-segmentation opencv polyp-segmentation python streamlit tensorflow u-net
Last synced: 14 Apr 2025
https://github.com/mohidex/data-pipeline-on-gcp
The Real-time Ecommerce Data Collection and Processing project empowers businesses with real-time insights by efficiently extracting, processing, and storing ecommerce data from multiple sources. Combining Golang and Python, this cutting-edge solution streamlines data handling from diverse ecommerce websites.
beautifulsoup data-engineer data-pipeline data-science database datastore dependency-injection firebase firestore gcp go golang google google-cloud pubsub python solid-principles storage web-scraping
Last synced: 14 Apr 2025
https://github.com/navdeep-g/sdss-2019
Interpretable Machine Learning with rsparkling
data-science h2o-3 machine-learning r rsparkling spark sparklyr xai
Last synced: 07 Apr 2025
https://github.com/miguelgfierro/miguelgfierro
I help people understand and apply AI
ai data-science machine-learning
Last synced: 03 Jan 2026
https://github.com/tomaztk/datasetr
Generate datasets for R projects
data data-frame data-science r-language r-programming sample sample-data sample-data-generator
Last synced: 16 May 2025
https://github.com/vbyan/deeva
Deeva - your smart analytics companion for Object Detection datasets
data data-science data-visualization datasets deeva machine-learning object-detection plotly python statistics streamlit visualization
Last synced: 26 Jun 2025
https://github.com/ruban2205/data-science-introduction
Welcome to the Data Science Introduction repository! This repository is designed to provide an introduction to the field of data science, covering various topics and techniques commonly used in the industry.
classification-algorithm data-science data-visualization decision-tree-classifier exploratory-data-analysis knn knn-classification python simple-linear-regression
Last synced: 11 Jul 2025
https://github.com/ZackAkil/friendlier-data-labelling
Code resources for generating a google form for labelling data.
data-science google google-apps-script google-forms google-sheets machine-learning
Last synced: 04 Apr 2025
https://github.com/robinthibaut/project_template
Template for Python scientific projects
data-science python science-research template template-project vcs
Last synced: 14 Aug 2025
https://github.com/yisaienkov/tinysets
The project aims to collect various datasets for tasks such as classification, clustering, object detection... The purpose of these datasets is to quickly check model and algorithm performance.
algorithms classification data data-science dataset datasets kaggle kaggle-dataset lego lego-minifigures lego-sets object-detection pypi python regression text-classification tinysets
Last synced: 14 Apr 2025
https://github.com/ucla-biostat-203b/2024winter
biostatistics data-science machine-learning
Last synced: 26 Feb 2025
https://github.com/macropin/random-name-generator
Generate random male and female names with real-world probability.
data-science python random-generation test-data-generator
Last synced: 17 Jul 2025
https://github.com/abtinz/machine-learning-with-python
Machine Learning with Python in Jupyter
data-mining data-science fuzzy-logic machine-learning matplotlib numpy pandas preprocessing regression
Last synced: 29 Jul 2025
https://github.com/markziemann/5pillars
Five pillars of computational reproducibility
bioinformatics computational-biology data-science journal-article reproducible-research
Last synced: 18 Feb 2026
https://github.com/pchtsp/pytups
Powerful dictionaries and tuple lists for data wrangling
data data-science dictionaries optimization tuples
Last synced: 14 Apr 2025
https://github.com/judftteam/aiida-jutools
Tools for simplifying daily work with the AiiDA workflow engine
aiida computational-materials-science computational-science data-science density-functional-theory dft forschungszentrum-juelich high-throughput judft materials-informatics materials-science pandas provenance toolkit utility workflow
Last synced: 26 Jan 2026