Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
- GitHub: https://github.com/topics/data-science
- Wikipedia: https://en.wikipedia.org/wiki/Data_science
- Related Topics: data-analysis, data-mining, machine-learning, big-data, data-visualization
- Aliases: datasciences, data-science-project, data-science-algorithm
- Last updated: 2026-07-03 00:07:42 UTC
https://github.com/adamvvu/snapshot_ensemble
Train TensorFlow Keras models with cosine annealing and save an ensemble of models with no additional computational expense.
data-science deep-learning keras machine-learning python tensorflow
Last synced: 28 Oct 2025
https://github.com/jcolechanged/josh.meanings
A k-means implementation in Clojure that supports clustering on datasets larger than memory but smaller than storage.
assumption-free-k-mc clojure-library clustering data-science k-mc k-means k-means-clustering k-means-parallel k-means-plus-plus machine-learning medium-data
Last synced: 13 Apr 2025
https://github.com/stefanrmmr/kaggle_twitter_airline_sentiment
Kaggle Twitter US Airline Sentiment, Implementation of a Tweet Text Sentiment Analysis Model, using custom trained Word Embeddings and LSTM-Deep learning [TUM-Data Analysis&ML summer 2021] @adrianbruenger @stefanrmmr
data-science deep-learning kaggle-airline-dataset kaggle-sentiment-analysis kaggle-us-airlines lstm-neural-networks python sentiment-analysis skipgram text-sentiment-classification tweepy tweet-classification tweet-sentiment-analysis twitter twitter-sentiment-analysis us-airline-dataset word2vec
Last synced: 19 Mar 2025
https://github.com/zai-kun/2d-chess-pieces-detection
YOLO11n model for detecting 2D chess boards and pieces
ai chess data-science machine-learning onnxruntime python yolov11
Last synced: 19 Jul 2025
https://github.com/crissyro/base-of-ds
This repository serves as a foundation for projects in Data Science and Machine Learning.
clustering-algorithm data-science data-visualization machine-learning
Last synced: 04 Jul 2025
https://github.com/wlandau/targets-intro
Introduction to the {targets} R package
data-science high-performance-computing make pipeline r r-package r-targetopia reproducibility reproducible-research rstats targets workflow
Last synced: 20 Mar 2025
https://github.com/weecology/retriever-recipes
data data-retrieval data-science dataset datasets hacktobefest
Last synced: 17 Feb 2026
https://github.com/devinterview-io/chatgpt-interview-questions
ChatGPT interview questions and answers to help you prepare for your next machine learning and data science interview in 2024.
ai-interview-questions chatgpt chatgpt-interview-questions chatgpt-questions chatgpt-tech-interview coding-interview-questions coding-interviews data-science data-science-interview data-science-interview-questions data-scientist-interview interview-practice interview-preparation machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions software-engineer-interview technical-interview-questions
Last synced: 05 May 2025
https://github.com/eikevons/pandas-paddles
Access the parent Pandas data frame in loc[], iloc[], assign(), and other Pandas helpers
data-analysis data-exploration data-science pandas pandas-dataframe pandas-library pandas-loc
Last synced: 16 Jun 2025
https://github.com/hugo-strang/silhouette-upper-bound
An upper bound of the Average Silhouette Width.
cluster-analysis clustering clustering-evaluation data-mining data-science machine-learning python python3 silhouette-coefficient silhouette-score upper-bound
Last synced: 14 Dec 2025
https://github.com/praveen1664/chatbot
This is a chatbot written in Python that takes its inputs directly from a SQL database
chatbot data-science database json nlp python3 sqlite sqlite3
Last synced: 11 Jul 2025
https://github.com/blmoore/summerdatachallenge
My entry for: http://summerdatachallenge.com (I came 3rd)
analytics data-science london r real-estate rstats
Last synced: 30 Apr 2025
https://github.com/chiraag-kakar/pubg
What's the best strategy to win in PUBG? Should you sit in one spot and hide your way into victory, or do you need to be the top shot? Let's let the data do the talking!
data-science feature-engineering machine-learning-algorithms project pubg-api random-forest
Last synced: 07 May 2025
https://github.com/leonard-seydoux/scientific-computing-for-geophysical-problems
Lecture notes and Jupyter notebooks for geophysical problems
ambient-noise data-science geomagnetism inverse-problems ipgp jupyter labs lecture-notes master notebooks python seismic seismology
Last synced: 10 Apr 2025
https://github.com/mikehlee/pydelt
Generalized functional derivative and integral calculus methods for noisy data
applied-mathematics applied-statistics calculus data-science feature-engineering feature-extractor signal-processing time-series time-series-analysis vector-calculus
Last synced: 01 Jul 2026
https://github.com/geco-bern/r_proj_template
GECO R project template
data-science developer-tools development r template
Last synced: 30 Oct 2025
https://github.com/nationalparkservice/qckit
QCkit provides useful functions for data quality control and manipulation including updating data to DarwinCore standards, unit conversions, and data flagging.
darwin-core data-quality data-science npsdataverse quality-control r r-package rstats
Last synced: 22 Jun 2025
https://github.com/vedadiyan/genql
GenQL is a generic querying language fully written in Go
data-analysis data-mapping data-processing data-science data-translation json json-data sql
Last synced: 22 Jun 2025
https://github.com/rahul-jha98/restauranttrends.stats
Visualise the trends in food and restaurant choices of customers in a city by scraping data from Zomato.
data-analysis data-science visualization vuejs zomato zomato-api zomato-scraper
Last synced: 08 Jul 2025
https://github.com/marwan116/supreme-task
A prefect extension that builds on top of the task decorator to reduce negative engineering!
data-ops data-science infrastructure ml-ops orchestration prefect python workflow
Last synced: 23 Jun 2025
https://github.com/devinterview-io/light-gbm-interview-questions
LightGBM interview questions and answers to help you prepare for your next machine learning and data science interview in 2024.
ai-interview-questions coding-interview-questions coding-interviews data-science data-science-interview data-science-interview-questions data-scientist-interview interview-practice interview-preparation light-gbm light-gbm-interview-questions light-gbm-questions light-gbm-tech-interview machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions software-engineer-interview technical-interview-questions
Last synced: 11 Jan 2026
https://github.com/neo4j-graph-examples/contact-tracing
Contact Tracing graph for pandemic spread e.g. COVID-19 based on http://blog.bruggen.com/search/label/contact%20tracing
contact-tracing covid-data covid19 data-science dataset example-data graphdb healthcare neo4j neo4j-approved
Last synced: 18 Jul 2025
https://github.com/tjpalanca/tjcloud
TJ Palanca's Personal Cloud
chromebooks cloud data-science docker kubernetes kubernetes-cluster rstudio terraform
Last synced: 13 Apr 2025
https://github.com/yusufcinarci/web-scraping-projects
In these project files, I will host the web scraping examples that I will make day by day.
data-analysis data-science jupyter-notebook python web-scraping
Last synced: 01 May 2025
https://github.com/semasuka/income-classification
Predicting whether an individual makes more than 50K using different features
aws-s3 binary-classification data-analysis data-science data-visualization eda finance-analytics machine-learning precision python random-forest-classifier scikit-learn streamlit
Last synced: 14 Jul 2025
https://github.com/krishnaura45/stresssense
Estimation of Stress Levels Using PPG Signals
data-science datadriven feature-engineering machine-learning mental-health research-project signal-processing stress
Last synced: 13 Apr 2025
https://github.com/iguptashubham/online-retail-sales
This Power BI dashboard, designed for marketing strategists, analyzes sales trends and customer behavior. It provides key insights empowering them to identify sales opportunities and optimize marketing campaigns, ultimately boosting business sales.
dashboard data data-analysis data-analysis-project data-analysis-project-powerbi data-analysis-python data-project data-science powerbi project
Last synced: 19 Mar 2026
https://github.com/Nelson-Gon/mde
mde: Missing Data Explorer
data-analysis data-cleaning data-exploration data-science datacleaner datacleaning exploratory-data-analysis missing missing-data missing-value-treatment missing-values missingness omit r r-package r-stats recode replace rstats statistics
Last synced: 30 Jul 2025
https://github.com/visokio/omniscope-custom-blocks
Public repository for custom blocks for Omniscope
business-intelligence data-science dataanalytics datapreparation python rstats
Last synced: 06 Apr 2026
https://github.com/armanx200/gold-price-prediction
Predicting the future adjusted closing price of Gold ETF using machine learning!
arman-kianian data-science data-visualization finance gold-price-prediction machine-learning prediction-models python random-forest regression stock-market time-series-analysis
Last synced: 30 Apr 2025
https://github.com/lucasrodes/pyphoon
Tools for Digital Typhoon DL/ML Project
data-science dataset environment machine-learning tropical-cyclone
Last synced: 18 Mar 2025
https://github.com/hunterdii/iriswise
IrisWise is a machine learning application for predicting Iris flower species. Built with Streamlit, this app provides a user-friendly interface to input flower measurements and receive predictions using various models, including K-Nearest Neighbors, (Random Forest, SVM, and Logistic Regression) **(Working On It...)**.
classifier-model data-science flowers-recognition iris-dataset iris-recognition knn-classification machine-learning pickle python python3 streamlit streamlit-webapp
Last synced: 21 Feb 2026
https://github.com/zaman-hamza/citadel-datathon
My submission to the 2022 East Coast Datathon. The event started on the 21st of March and ended on the 28th, lasting about a whole week. I was in a team of two where we analyzed the non-conventional indicators and instigators of traffic.
citadel data-science data-visualization datathon
Last synced: 10 Apr 2025
https://github.com/davidssmith/rawarray.jl
Raw array (RA) file format for simple, robust, and user-friendly N-dimensional array storage
bytes complex-numbers data-science file-format julia large-dataset large-files ra-format rawarray scientific-computing storage
Last synced: 10 Sep 2025
https://github.com/wazzabeee/twitter-sentiment-analysis-pyspark
Comparative study of classification algorithms implemented in PySpark on the Sentiment 140 dataset.
apache-spark data data-science gcp google-cloud logistic-regression naive-bayes-classifier natural-language-processing nlp nlp-machine-learning pyspark python python3 sentiment-analysis sentiment-classification sentiment140-dataset sentimental-analysis spark tweet twitter
Last synced: 06 May 2025
https://github.com/gjtorikian/destroy-all-monuments
This is data taken from the SPLC report titled "Whose Heritage? Public Symbols of the Confederacy" from April 21, 2016
data-science government-data social-justice
Last synced: 10 Apr 2025
https://github.com/imsanjoykb/r-programming-practice
R Programming Practice For Data Science
data-science machine-learning programming-exercises r rprogramming
Last synced: 30 Oct 2025
https://github.com/tensorsense/vlm_databuilder
This SDK generates datasets for training Video LLMs from YouTube videos.
data-generation data-science llm video-llms vlm
Last synced: 11 Sep 2025
https://github.com/jbris/stan-cmdstanr-gpu-docker
A Docker image to run Stan, cmdstanr, and brms for Bayesian statistical modelling. GPU support using OpenCL is available.
bayes bayesian-inference brms cmdstan cmdstanr data-science docker posterior probabilistic-programming projpred rstan rstanarm shinystan stan stan-gpu stan-lang stan-math-library tidybayes tidyverse
Last synced: 04 May 2025
https://github.com/whizsid/kddbscan-rs
A Rust library inspired by the kDDBSCAN clustering algorithm
clustering data-science density-based-clustering deviation machine-learning-algorithms pinned
Last synced: 10 Apr 2025
https://github.com/carlos-gg/digitalgarden
NO LONGER MAINTAINED. Go to: https://aigarden.vercel.app/
artificial-intelligence data-science digital-garden knowledge-management machine-learning
Last synced: 07 May 2025
https://github.com/leomaurodesenv/data-science-api-framework
A simple framework to test and deploy your Data Science API
api api-rest data-science dataops docker flask-api python
Last synced: 09 Sep 2025
https://github.com/nikbarb810/pattern-recognition
Basic pattern recognition algorithms implemented in Python
data-science ipynb-jupyter-notebook matplotlib numpy pattern-recognition python
Last synced: 06 Mar 2026
https://github.com/polakowo/textai
Applications using state-of-the-art in NLP
bert data-science gpt-2 machine-learning nlp telegram-bot transformers
Last synced: 07 May 2025
https://github.com/blockchain-etl/anomalous-transactions-detector-dataflow
Dataflow pipeline for detecting anomalous transactions on the Ethereum and Bitcoin blockchains
anomaly-detection apache-beam bitcoin blockchain-analytics crypto cryptocurrency data-analytics data-engineering data-science ethereum gcp google-cloud google-cloud-platform google-dataflow google-pubsub on-chain-analysis real-time real-time-analytics stream-processing web3
Last synced: 15 Apr 2026
https://github.com/newjerseystyle/litepolis
The package manager of a customizable e-democracy opinion collection and insight mining system. Built using Python and optimized for scalability and performance.
civic-tech data-science deliberative-democracy litepolis package-manager participatory-democracy
Last synced: 28 Feb 2026
https://github.com/pfed-prog/catalonia_data
We analyzed air quality in Catalonia using data from the Catalan Transparency Portal.
data-science dspyt jupyter-notebook ocean oceanprotocol python3
Last synced: 05 Oct 2025
https://github.com/amirhosseinhonardoust/algorithmic-empath-human-fallibility
A deep exploration of Algorithmic Empathy, the next frontier in AI understanding. This project examines how machines can learn from human fallibility, model disagreement, and align with moral reasoning. It blends psychology, fairness metrics, interpretability, and co-learning design into one framework for humane intelligence.
ai algorithmic-bias co-learning cognitive-science data-science empathy ethics fairness human-centered-ai intelligence interpretability machine-learning neural-networks neurosymbolic philosophy psychology reflective-ai research responsible-ai xai
Last synced: 28 Feb 2026
https://github.com/oscarqjh/ntu_sc1015_project
A mini project for NTU's data science and artificial intelligence mod - Analysis on League of Legends competitive matches
data-science machine-learning pandas python scikit-learn
Last synced: 09 Apr 2025
https://github.com/jmcph4/open-engine-data
Open dataset of various technical specifications for automotive engines
auto automotive cars csv data-science dataset engine mechanical-engineering open-access open-data reference-data technical-specifications
Last synced: 22 Jan 2026
https://github.com/hritik5102/shala2020
MastAI ki paathSHALA: Data Science, Machine Learning, and Deep Learning code with explanations and reference links
artificial-intelligence computer-vision data-science deep-learning machine-learning statistics
Last synced: 12 Oct 2025
https://github.com/trafficgcn/optimal_path_dijkstra_for_data_science
Plotting the Optimal Route in Python for Data Scientists using the Dijkstra Algorithm
data-science dijkstra dijkstra-algorithm dijkstra-shortest-path map mapping open-street-map optimal-route osm osmnx python
Last synced: 27 Oct 2025
https://github.com/sevdanurgenc/r-programming-for-data-science-lecture-notes
In this repo, I have the course contents of the R Programming For Data Science training, to be given to Sigorta Bilgi ve Gözetim Merkezi through the cooperation of Academy Peak Information Technologies Training and Consultancy, 21 - 23 March 2023.
data-analysis data-science data-visualization r r-programming r-programming-projects
Last synced: 11 Oct 2025
https://github.com/mlr-org/mlr3ordinal
Ordinal Regression for mlr3
data-science machine-learning mlr3 ordinal-regression r r-package regression
Last synced: 13 Oct 2025
https://github.com/joaopfonseca/ml-research
A Python library with utilities for Machine Learning research and algorithm implementations
active-learning data-science machine-learning python scikit-learn
Last synced: 26 Oct 2025
https://github.com/ahammadmejbah/python-problem-statement-and-solutions
Create a Python Problem Statement to challenge programmers. Specify a task, input/output requirements, and constraints. Ensure clarity and complexity to evaluate coding skills effectively.
data-science machine machine-learning python python3
Last synced: 27 Apr 2025
https://github.com/caerbannogwhite/aargh
A library that helps you out of data nightmares in Go.
csv data data-science data-wrangling dataframe go golang html json linq statistics stats xlsx xpt
Last synced: 14 Jan 2026
https://github.com/sferez/twitter_toolbox
Complete Toolbox for Scraping, Streaming, Interacting with the API, Cleaning, Preprocessing, and Applying NLP on Twitter Data
data-collection data-science nlp preprocessing twitter twitter-api twitter-scraping twitter-streaming-api
Last synced: 10 Apr 2025
https://github.com/marianogappa/sctool
Starcraft: Remastered replay analyzer library and CLI tool
cli cli-app data-science replays starcraft starcraft-broodwar starcraft-remastered
Last synced: 17 Apr 2026
https://github.com/zsxkib/ttds-g35-cw3
TTDS Group Project: Video Games Search Engine. Sakib Ahamed, Dan Buxton, Kenza Amira, Wini Lau, Mansoor Ahmad
corpora data-science neural-ranking-models pagerank query search-engine technologies text text-analysis text-classification ttds web-search
Last synced: 10 Apr 2025
https://github.com/taharallouche/hakeem
Flexible crowdsourced data labeling solutions for scarce and incomplete annotations
crowdsourcing data-science datalabeling python
Last synced: 10 Oct 2025
https://github.com/insightsengineering/nest
Website for the Nest project
clinical-trial-analysis data-science nest r shiny website
Last synced: 12 Sep 2025
https://github.com/gbeckers/birdwatcher
A Python computer vision library for animal behavior
animal behavior computer-vision data-science ffmpeg opencv python science
Last synced: 13 Oct 2025
https://github.com/milos-agathon/forest_map_europe
This repo demonstrates how to easily overlay community polygons on forest cover data and make a beautiful map using R and ggplot2
data-science data-visualization ggplot2 gis r satellite-imagery spatial-analysis zonal-statistics
Last synced: 04 Apr 2026
https://github.com/akcarsten/non_negative_matrix_factorization
From scratch Python implementation of the Non-Negative Matrix Factorization algorithm.
clustering data-science machine-learning python
Last synced: 11 Mar 2026
https://github.com/cricksmaidiene/mids_machine_learning
A unified repository of coursework fragments from UC Berkeley MIDS ML courses
coursework data-science generative-ai jupyter-notebook machine-learning numpy pandas prompt-engineering scikit-learn spark tensorflow uc-berkeley
Last synced: 10 Oct 2025
https://github.com/ucla-biostat-203b/2023winter
Course webpage for UCLA Biostat 203B (Intro. to Data Science)
biostatistics data-science docker machine-learning r
Last synced: 07 Sep 2025
https://github.com/espoirmur/student-grade-predictor
This repository contains all my work for my Machine Learning Project
data-mining data-science education educational-data-mining machine-learning numpy pandas seaborn sklearn statistics
Last synced: 17 Jan 2026
https://github.com/joschnitzbauer/dalymi
A lightweight, data-focused and non-opinionated pipeline manager written in and for Python.
dag data data-science pipeline python workflow
Last synced: 14 Jan 2026
https://github.com/cgivre/drill-geoip-functions
GeoIP Functions for Apache Drill
apache-drill city country data data-analysis data-science drill geoip-functions ip-address ipv4
Last synced: 12 Apr 2025
https://github.com/raynardj/forgebox
The deep learning tool box
data-science machine-learning nlp pandas-dataframe
Last synced: 16 Oct 2025
https://github.com/pufanyi/genderrecognitionbyvoice
NTU SC1015 Group Project - Gender Recognition by Voice
data-science machine-learning voice-recognition
Last synced: 09 Feb 2026
https://github.com/adityashrm21/exploratory_data_analysis
A collection of exploratory data analysis techniques and resources
data data-analysis data-exploration data-science data-visualization dataset datasets eda exploratory-data-analysis insights kaggle
Last synced: 29 Apr 2025
https://github.com/devinterview-io/ml-design-patterns-interview-questions
ML Design Patterns interview questions and answers to help you prepare for your next machine learning and data science interview in 2024.
ai-interview-questions coding-interview-questions coding-interviews data-science data-science-interview data-science-interview-questions data-scientist-interview interview-practice interview-preparation machine-learning machine-learning-and-data-science machine-learning-interview machine-learning-interview-questions ml-design-patterns ml-design-patterns-interview-questions ml-design-patterns-questions ml-design-patterns-tech-interview software-engineer-interview technical-interview-questions
Last synced: 08 Jan 2026
https://github.com/ikegwukc/csc-405-605_spring_2022
Introductory Data Science Course Taught at UNCG (Spring 2022)
Last synced: 09 Oct 2025
https://github.com/wlongxiang/dutch_traffic_monitor
Visualize traffic on the Dutch highway A9 as an example
computer-vision data-science deep-learning object-detection opencv visualization
Last synced: 04 Aug 2025
https://github.com/tuliosg/cdp
Repository for the course "Ciência de Dados para Pesquisa" (Data Science for Research).
data-analysis data-manipulation data-science data-visualization google-colab jupyter-notebook python
Last synced: 03 Mar 2026
https://github.com/erictleung/data-science
Repository for teaching materials and notes on machine learning and data science for freeCodeCamp
data-cleaning data-engineering data-science data-visualization freecodecamp learning machine-learning mathematics notes python statistics
Last synced: 25 Mar 2025
https://github.com/aianytime/fishvision
FishVision built using Streamlit identifies the different species of Fishes in a given image. It is trained on "A Large Scale Fish Data" available on Kaggle using the pre-trained model "MobileNetV2".
data-science deep-learning deep-neural-networks heroku heroku-deployment machine-learning machine-learning-algorithms mobilenetv2 python python3 streamlit streamlit-webapp
Last synced: 01 Sep 2025
https://github.com/zapier/awsjavasdk
Boilerplate rJava Access to the AWS Java SDK
Last synced: 14 Apr 2025
https://github.com/abhinav-ark/mal_lyrics_analysis
Preprocessing and EDA on a Dataset of Malayalam Songs and Lyrics
data-science eda jupyter-notebook python
Last synced: 22 Jul 2025
https://github.com/espoirMur/Student-Grade-Predictor
This repository contains all my work for my Machine Learning Project
data-mining data-science education educational-data-mining machine-learning numpy pandas seaborn sklearn statistics
Last synced: 09 Aug 2025
https://github.com/snowflakedb/snowpark-checkpoints
Snowpark Python / Spark Migration Testing Tools
data-analytics data-engineering data-science python snowflake sql
Last synced: 31 Aug 2025
https://github.com/prakalp-pande/twitter-sentiment-analysis
Analyze public opinion expressed in tweets by mining and processing live Twitter data. Employ machine learning to classify tweets as positive, negative, or neutral, then visualize sentiment trends and identify key influencers.
data-science twitter-sentiment-analysis
Last synced: 18 Aug 2025
https://github.com/fusky-labs/pacopanda-drawing-stats
A case study and data analysis project that collects drawings from the furry artist Paco Panda
data-science data-visualization fastapi furries furry furry-fandom pandas python
Last synced: 09 Aug 2025
https://github.com/chris-santiago/steps
A SciKit-Learn style feature selector using best subsets and stepwise regression.
best-subset-selection data-science python scikit-learn stepwise-selection
Last synced: 28 Jun 2025
https://github.com/kennethleungty/responsible-ai-masterclass
Responsible AI Masterclass (June 2024 Run)
ai aif360 artificial-intelligence bcg data-science gen-ai generative-ai machine-learning nemo-guardrails rai responsible-ai veritas
Last synced: 03 Jan 2026
https://github.com/datadistillr/datadistillr-python-sdk
A Python SDK for Programmatically Interacting with DataDistillr
apache-drill data data-science datadistillr jupyter sql
Last synced: 01 Jul 2025
https://github.com/omarsar/friendly_machine_learning
Mini blog for notes and guides on Machine Learning (Open Notes)
artificial-intelligence classification clustering cnn data-mining data-science deep-learning dnn lstm machine-learning neural-network reinforcement-learning rnn supervised-learning unsupervised-learning
Last synced: 16 Mar 2025
https://github.com/ekagra-ranjan/Optimal-Bidding
Inter IIT Techmeet 2017, IIT Madras - Data Science Competition
bid-price bidding boosting-algorithm cross-validation data-science data-science-challenges data-science-competition electricity-market ensemble feature-engineering feature-extraction feature-selection gradient-boosting-machine interiit knn linear-regression market-price random-forest sklearn svm
Last synced: 08 May 2025
https://github.com/wgierke/git_better
3rd-placed solution for the informatiCup2017
data-science docker docker-image heroku machine-learning tensorboard
Last synced: 24 Mar 2025
https://github.com/vatshayan/machine-learning-and-data-science-projects
Top New and research Paper Based Projects for your final year and college projects. #FREE HELP #Live_explanation #Research papers #PPT # Report #Code
college-project data-science data-science-projects final-project final-year-project finalproject finalyearproject machine-learning machine-learning-projects machinelearning ml-project opencv-project projects research-project semester-project semester-projects
Last synced: 17 Aug 2025
https://github.com/rehan-ankalgi-7t2/software-architect-roadmap
Comprehensive reference guides, tips, best practices, and cheatsheets for various technologies in software engineering
api architecture aws business-analytics cloud containerization data-science design-patterns frameworks libraries networking orchestration programming-languages system-design
Last synced: 27 Feb 2025
https://github.com/pegah-ardehkhani/pneumonia-diagnosis-in-chest-x-ray-images
Detect Pneumonia Using Deep Learning Models (CNN and InceptionV3)
cnn data-science deep-learning deep-learning-algorithms deep-learning-healthcare deep-neural-networks healthcare image-classification image-processing lungs machine-learning machine-learning-algorithms pneumonia pneumonia-classification pneumonia-detection pneumonia-diagnosis pneumoniac-xray tensorflow transfer-learning x-ray-images
Last synced: 14 Apr 2025
https://github.com/c0sogi/auxeticmop-with-abaqus
Finding metamaterial structures with the NSGA genetic algorithm and ABAQUS CAE
abaqus abaqus-python-script cae data-science genetic-algorithm mechanical-engineering metamaterial-design metamaterials nsga
Last synced: 21 Aug 2025
https://github.com/ahmedshahriar/pulsepoint-data-analytics
EDA, data processing, cleaning, and extensive geospatial analysis on a Selenium-based web-scraped dataset
clustering data-science data-visualization folium folium-choropleth-map folium-map folium-python geopy hierarchical-clustering jupyter-notebook k-means-clustering machine-learning matplotlib numpy pandas plotly-python python scikit-learn seaborn wordcloud
Last synced: 27 Sep 2025
https://github.com/arasgungore/job-posting-duplicate-detection
A project aiming to leverage text embeddings and Milvus, a high-performance vector search engine, to detect duplicate job postings.
data-science docker-compose dockerfile duplicate-detection duplicates embedding embeddings exploratory-data-analysis job-posting job-postings machine-learning milvus natural-language-processing sentence-embedding sentence-embeddings sentence-encoder sentence-encoding sentence-transformers text-embedding vector-search-engine
Last synced: 06 May 2026
https://github.com/firelink-sh/evolve-py
A highly efficient, composable, and lightweight ETL and data integration framework.
analytics arrow big-data data data-engineering data-integration data-science duckdb elt etl ingestion ingress ml olap pipeline polars postgresql python s3
Last synced: 10 Mar 2026
https://github.com/nelson-gon/mde
mde: Missing Data Explorer
data-analysis data-cleaning data-exploration data-science datacleaner datacleaning exploratory-data-analysis missing missing-data missing-value-treatment missing-values missingness omit r r-package r-stats recode replace rstats statistics
Last synced: 24 Jul 2025
https://github.com/ahammadmejbah/ultimate-data-science-resources
Welcome to the Unlimited Data Science Resources community! Dive into a wealth of knowledge with curated tutorials, courses, and insights. Elevate your data science journey with boundless learning opportunities!
data-engineering data-mining data-science data-visualization database datascience
Last synced: 26 Feb 2025