Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data Science

Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/gabrieldim/a1on-webscraping-pandas-data-science

Learning web scraping using Pandas in Python. - Data Science

data data-science pandas sciecne web-scraping

Last synced: 20 Nov 2024

https://github.com/microsoft/autobrewml

The AutoBrewML Framework greatly reduces the time it takes to get production-ready ML models, with ease and efficiency.

anomaly-detection azure-automl cleansing-data data-science datavisualization machine-learning microsoft nlp-machine-learning responsible-ml sampling-strategies text-analysis text-classification text-summarization

Last synced: 08 Nov 2024

https://github.com/sidgupta234/codingninjas_datascience_machinelearning

The notebooks are written in a way that they are sufficient on their own to learn the basics of Python, Machine Learning and Data Science.

data-science jupyter-notebook machine-learning python-3

Last synced: 30 Nov 2024

https://github.com/chalmerlowe/machine_learning

A gentle introduction to machine learning: data handling, linear regression, naive bayes, clustering

data data-science linear-regression machine-learning nearest-neighbors python scikit-learn

Last synced: 16 Feb 2025

https://github.com/jameslamb/talks

Conference talks, meetup talks, and misc. writing

conference-talk data-science machine-learning open-source presentations python r

Last synced: 01 Jan 2025

https://github.com/github/mlops

Use GitHub to facilitate automation, collaboration and reproducibility in your machine learning workflows.

actions cicd data-science devops-tools machine-learning mlops pages primer primer-design

Last synced: 23 Jan 2025

https://github.com/brunorosilva/todoist-analytics

Just a simple app for weekly and monthly reviewing of tasks in Todoist.

analytics dashboard data-science streamlit todoist

Last synced: 04 Dec 2024

https://github.com/thechymera/behaviopy

Behavioral data analysis and plotting in Python.

animal-behavior biomedical data-science foss multimodality plotting

Last synced: 16 Nov 2024

https://github.com/lourd/react-google-sheet

Pulling data from Google Sheets with React components

api-client data-science google-sheets javascript react spreadsheets

Last synced: 14 Oct 2024

https://github.com/incubated-geek-cc/Text-To-Speech-App

A Fusion of OCR Technology (Tesseract.js) & Web Speech API. Standalone, portable and works offline.

data-science javascript machine-learning ocr ocr-recognition tesseract tesseract-ocr tesseract-ocr-api tesseractjs webapp

Last synced: 08 Nov 2024

https://github.com/florents-tselai/pandas-sets

Set-oriented Operations in Pandas

data-science pandas set-operations sets

Last synced: 31 Oct 2024

https://github.com/tomasonjo/bitcoin-to-neo4jdash

Project that listens to a Bitcoin WebSocket API for new transactions and stores them in Neo4j for analysis

bitcoin dashboard data data-science graph graphdatabase neo4j python websocket

Last synced: 22 Oct 2024

https://github.com/gagandeepb/frames-beam

Accessing Postgres in a data frame in Haskell

data-science database postgres

Last synced: 19 Dec 2024

https://github.com/climopy-dev/climopy

🌍🌏🌎 A succinct toolset for analyzing climate data. This project is a work-in-progress.

climate-analysis climate-science data-science python xarray xarray-accessor

Last synced: 27 Nov 2024

https://github.com/mainakrepositor/covid19-india-bcr

A bar chart race demonstrating the start and trends of COVID-19 in India

barchartrace covid-19 data-science data-visualization dataanalysisandmlusingpython visualization

Last synced: 12 Nov 2024

https://github.com/brpy/ml-books

A list of freely available Machine Learning related books.

books data-science free freely machine-learning statistics

Last synced: 14 Feb 2025

https://github.com/solegalli/feature-selection-in-machine-learning-book

Code repository for the book Feature Selection in Machine Learning

data-science feature-selection machine-learning python

Last synced: 02 Nov 2024

https://github.com/mainakrepositor/brs

Recommend books using Machine Learning Techniques

data-science python-3

Last synced: 12 Nov 2024

https://github.com/bukson/nancorrmp

Parallel correlation calculation of big numpy arrays or pandas dataframes with NaNs and infs.

correlation correlation-matrices data-science machine-learning multiprocessing numpy pandas python

Last synced: 17 Dec 2024

https://github.com/facultyai/boltzmannclean

Fill missing values in Pandas DataFrames using Restricted Boltzmann Machines

data-cleaning data-science dataframe pandas restricted-boltzmann-machine

Last synced: 08 Nov 2024

https://github.com/medoidai/skrobot

skrobot is a Python module for designing, running and tracking Machine Learning experiments / tasks. It is built on top of scikit-learn framework.

artificial-intelligence data-science feature-engineering feature-selection hyperparameter-tuning machine-learning model-evaluation model-selection model-training model-tuning modelling predictive-modelling python scikit-learn

Last synced: 27 Oct 2024

https://github.com/1ambda/practical-data-pipeline

GitBook repo for Practical Data Pipeline 🔥

aws data-pipeline data-science gitbook

Last synced: 18 Nov 2024

https://github.com/humburg/reportmd

Create multi-page HTML reports in R

data-science r rmarkdown rstudio

Last synced: 27 Oct 2024

https://github.com/paulosalem/gpt3-poc-tutorial-with-braindump

A demo application to support my tutorial on building applications with GPT-3.

data-science gpt gpt-3 natural-language-understanding openai proof-of-concept

Last synced: 12 Nov 2024

https://github.com/RConsortium/r-collaboration

Open Collaboration, Data Registry, and Use Cases Developed by the R Community

data-analysis-in-r data-analytics data-science r

Last synced: 27 Nov 2024

https://github.com/nuhmanpk/webtrench

A powerful and easy-to-use web scraper for collecting data from the web. Supports scraping of images, text, videos, metadata, and more. Ideal for machine learning and deep learning engineers. Download and extract data with just one line of code.

audio-datasets data data-collection data-science dataset-generation deep-learning image-data-generator machine-learning python scarper text-datasets

Last synced: 28 Oct 2024

https://github.com/danlessa/coursera-networkx

Notebooks used in the Network Data Science with NetworkX and Python guided course

course-project coursera data-science network-science networkx

Last synced: 08 Feb 2025

https://github.com/isala404/speculo

Realtime face detection and recognition using deep learning

data-science face-recognition faces footages opencv python3 reactjs speculo surveillance tensorflow typescript

Last synced: 15 Oct 2024

https://github.com/gagolews/genie

Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)

cluster cluster-analysis clustering data-analysis data-mining data-science datascience genie hierarchical-clustering-algorithm machine-learning machine-learning-algorithms outliers r

Last synced: 22 Nov 2024

https://github.com/montanaz0r/mma-parser-for-sherdog-and-ufc-data

Python web scraper for Sherdog & UFC data. Creates output of your choice in csv or json format.

beautifulsoup data-science mma python ufc webscraping

Last synced: 14 Dec 2024

https://github.com/ahmedosamamath/statistics-basics

A comprehensive guide to applying statistical techniques in machine learning, including data preprocessing, model development, evaluation metrics, and real-world applications. This repository provides beginner-to-advanced insights into the statistical foundations of machine learning.

artificial-intelligence data-analysis data-science machine-learning statistics

Last synced: 05 Feb 2025

https://github.com/rfordatascience/r4dswebsite

Public repository for the R4DS community website.

blogdown data-analysis data-analytics data-science data-visualization r r4ds tidyverse

Last synced: 14 Nov 2024

https://github.com/koonimaru/omniplot

Hassle-free statistical analysis, clustering, and visualization of scientific data

data-science matplotlib numpy pandas python

Last synced: 16 Nov 2024

https://github.com/graphbookai/graphbook

The framework for AI-driven data pipelines. Build interactive, highly efficient data pipelines with PyTorch. ⭐ Leave a star to support us!

ai data-processing data-processing-pipelines data-science framework machine-learning ml pytorch research workflow

Last synced: 02 Dec 2024

https://github.com/maastrichtu-ids/dsri-documentation

📖 Documentation for the Data Science Research Infrastructure at Maastricht University

data-science data-science-research documentation dsri kubernetes openshift

Last synced: 21 Dec 2024

https://github.com/loaiabdalslam/dbd

Demo By Demo Machine Learning Book Written in Arabic

book data-science deep-learning machine-learning

Last synced: 15 Feb 2025

https://github.com/Azure/azure-data-labs

Terraform templates to deploy Azure Data resources

analytics azure blueprints data data-science github github-actions labs terraform

Last synced: 13 Nov 2024

https://github.com/nrennie/data-science-resources

Resources relating to data science.

data-science resources

Last synced: 08 Feb 2025

https://github.com/tushar2704/powerbi-portfolio

Welcome to my personal Power BI portfolio repository! Here you will find a collection of Power BI projects and dashboards that demonstrate my skills and expertise in data visualization, business intelligence, and analytics using Power BI.

artificial-intelligence dashboards data-science data-visualization powerbi streamlit-tushar2704 tushar2704

Last synced: 18 Feb 2025

https://github.com/sanjinkurelic/casebasedreasoning

Find missing values in a data set using Euclidean distance, normalization, information value, and weight of evidence

case-based-reasoning csv data-science influence information-value machine-learning numpy pandas python3 weight-of-evidence

Last synced: 06 Nov 2024

https://github.com/ragibhasan894/phishing_website_detection

This project detects phishing, fraudulent, and malicious websites using Random Forest classification. Implemented using the Python programming language and the Django framework.

cyber-security data-mining data-science django django-framework machine-learning phsihing python random-forest scikit-learn security

Last synced: 12 Feb 2025

https://github.com/naqvis/crysda

Crystal library for Data Analysis, Wrangling, Munging

crystal crystal-lang crystal-language crystal-shard data-a data-science data-wrangling

Last synced: 09 Nov 2024

https://github.com/code2k13/feed-visualizer

Feed Visualizer creates interactive visualizations by clustering RSS/Atom feed items based on semantic similarity. Feed Visualizer also attempts to automatically predict the labels for each cluster. This application will create a "semantic summary" of a website's contents by scanning its RSS/Atom feed, allowing for easy discovery and navigation to topics of interest. Feed Visualizer creates interactive visualizations in the form of static HTML and JS files, which may be edited and sent to a server.

artificial-intelligence atom data-science data-visualization machine-learning no-code python rss semantic-similarity visualization

Last synced: 13 Nov 2024

https://github.com/gbeckers/darr

A Python library for numpy arrays that persist on disk in a format that is simple, self-documented and tool-independent, and maximizes universal readability.

array bsd-3-clause data-science data-sharing data-storage idl interoperability jagged-array julia-language maple mathematica matlab numeric octave python r ragged-array science scilab

Last synced: 19 Dec 2024

https://github.com/kalebu/desktop-chatbot-app

A Python knowledge-based chatbot application built with Tkinter

chatbot chatbot-application data-science nlp nlp-projects python-tanzania python3 tanzania

Last synced: 09 Nov 2024

https://github.com/mainakrepositor/whosthegoat

Find out which footballer is the greatest of all time from their La Liga stats. Is it Leo Messi or CR7?

data-science data-visualization football-data messi ronaldo streamlit webapp

Last synced: 12 Nov 2024

https://github.com/khuyentran1401/python_snippet

Python and data science snippets on the command line

cli command-line command-line-tool data-science python python3 snippet

Last synced: 03 Dec 2024

https://github.com/pyurbans/urbans

A tool for translating text from source grammar to target grammar (context-free) with corresponding dictionary.

artificial-intelligence data-science machine-translation nlp python

Last synced: 10 Nov 2024

https://github.com/zhoudaxia233/pyalpha

A process mining tool written in Python3

alpha-miner data-science petri-net process-mining

Last synced: 15 Dec 2024

https://github.com/rpodcast/shinycal

The Data Science StreamRs Calendar!

data-science r shiny streaming

Last synced: 05 Nov 2024

https://github.com/imvladikon/yandex-practicum

Tasks and projects from the data science course by Yandex.Practicum

data-science jupyter-notebook

Last synced: 09 Nov 2024

https://github.com/pgebert/bike-sharing-dataset

Analysis and model development for the Kaggle Bike Sharing Dataset.

bike-sharing-dataset bikesharing data-science jupyter kaggle python

Last synced: 27 Jan 2025

https://github.com/cfpb/aurora

An open source enterprise data warehousing and analysis platform.

ansible data-science data-warehousing

Last synced: 07 Dec 2024

https://github.com/alenrajsp/tcxreader

tcxreader is a reader / parser for Garmin’s TCX file format. It also works well with missing data!

data-mining data-science python sports-analytics tcx tcx-parser

Last synced: 07 Nov 2024

https://github.com/hneth/ds4psy

Data science for psychologists (ds4psy): R package supporting book and course

data-literacy data-science education exploratory-data-analysis psychology r r-package social-sciences visualisation

Last synced: 01 Nov 2024

https://github.com/ikivanc/data-driven-cycling-and-workout-prediction

Data-Driven Cycling using Strava data and GPX data analysis. Digital Personal Trainer using old cycling workout data to predict new workouts

botframework chatbot csharp cycling cycling-workouts data-science digital-assistant fastapi gpx-files jupyter-notebook machine-learning machine-learning-algorithms microsoft-teams python strava strava-data

Last synced: 09 Nov 2024

https://github.com/chiarorosa/ia_aprendizado_maquina_basico

Basic material on Artificial Intelligence applying Machine Learning and Data Science

artificial-intelligence data-science machine-learning python

Last synced: 14 Nov 2024

https://github.com/ajl2718/whereabouts

Fast, accurate, open-source geocoding in Python

data-science duckdb geocoding geospatial record-linkage

Last synced: 04 Jan 2025

https://github.com/charlywargnier/keywordmapperforbrightonseo

As part of my BrightonSEO talk, I created a mighty Streamlit app which auto-maps your keywords to your crawled URLs!

data-science python seo streamlit

Last synced: 19 Nov 2024

https://github.com/hassaku/ds-and-ml-with-screen-reader

Data science and machine learning resources for screen reader users

colaboratory data-science machine-learning python screen-reader visually-impaired

Last synced: 17 Feb 2025

https://github.com/aws/amazon-finspace-examples

This repo contains sample code and sample notebooks to illustrate how to work with Amazon FinSpace

aws data-science data-versioning examples finspace timeseries-analysis

Last synced: 05 Feb 2025

https://github.com/tjpalanca/facebook-news-analysis

Analysis of Facebook News in the Philippines

analysis data data-science facebook news philippines

Last synced: 14 Oct 2024