Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/florents-tselai/pandas-sets

Set-oriented Operations in Pandas

data-science pandas set-operations sets

Last synced: 31 Oct 2024

https://github.com/tomasonjo/bitcoin-to-neo4jdash

Project that listens to the Bitcoin WebSocket API for new transactions and stores them in Neo4j for analysis

bitcoin dashboard data data-science graph graphdatabase neo4j python websocket

Last synced: 22 Oct 2024

https://github.com/anitagraser/eda-protocol-movement-data

Step-by-step exploratory movement data analysis protocol in a Jupyter notebook

data-quality-assessment data-science exploratory-data-analysis movement-data

Last synced: 10 Nov 2024

https://github.com/bgroenks96/normalizing-flows

Implementations of normalizing flows using Python and TensorFlow

data-science machine-learning machine-learning-algorithms normalizing-flows

Last synced: 28 Oct 2024

https://github.com/incubated-geek-cc/text-to-speech-app

A Fusion of OCR Technology (Tesseract.js) & Web Speech API. Standalone, portable and works offline.

data-science javascript machine-learning ocr ocr-recognition tesseract tesseract-ocr tesseract-ocr-api tesseractjs webapp

Last synced: 15 Nov 2024

https://github.com/luiscib3r/solar-rad-forecasting

These notebooks capture the entire research and implementation process for building various neural-network-based machine learning models that predict solar radiation levels from historical data recorded by meteorological stations.

convolutional-neural-networks data-science deep-learning forecasting machine-learning rnn rnn-tensorflow

Last synced: 05 Nov 2024

https://github.com/paulosalem/gpt3-poc-tutorial-with-braindump

A demo application to support my tutorial on building applications with GPT-3.

data-science gpt gpt-3 natural-language-understanding openai proof-of-concept

Last synced: 12 Nov 2024

https://github.com/chalmerlowe/machine_learning

A gentle introduction to machine learning: data handling, linear regression, naive bayes, clustering

data data-science linear-regression machine-learning nearest-neighbors python scikit-learn

Last synced: 12 Oct 2024

https://github.com/facultyai/boltzmannclean

Fill missing values in Pandas DataFrames using Restricted Boltzmann Machines

data-cleaning data-science dataframe pandas restricted-boltzmann-machine

Last synced: 08 Nov 2024

https://github.com/aatmunbaxi/orgroamtools

Helper library for data analysis of org-roam collections

data-science emacs exploratory-data-analysis library org-roam personal-knowledge-management python

Last synced: 12 Oct 2024

https://github.com/humburg/reportmd

Create multi-page HTML reports in R

data-science r rmarkdown rstudio

Last synced: 27 Oct 2024

https://github.com/medoidai/skrobot

skrobot is a Python module for designing, running and tracking Machine Learning experiments / tasks. It is built on top of the scikit-learn framework.

artificial-intelligence data-science feature-engineering feature-selection hyperparameter-tuning machine-learning model-evaluation model-selection model-training model-tuning modelling predictive-modelling python scikit-learn

Last synced: 27 Oct 2024

https://github.com/1ambda/practical-data-pipeline

Gitbook Repo for Practical Data Pipeline :fire:

aws data-pipeline data-science gitbook

Last synced: 18 Nov 2024

https://github.com/jameslamb/talks

Conference talks, meetup talks, and misc. writing

conference-talk data-science machine-learning open-source presentations python r

Last synced: 28 Oct 2024

https://github.com/RConsortium/r-collaboration

Open Collaboration, Data Registry, and Use Cases Developed by the R Community

data-analysis-in-r data-analytics data-science r

Last synced: 08 Aug 2024

https://github.com/nuhmanpk/webtrench

A powerful and easy-to-use web scraper for collecting data from the web. Supports scraping of images, text, videos, metadata, and more. Ideal for machine learning and deep learning engineers. Download and extract data with just one line of code.

audio-datasets data data-collection data-science dataset-generation deep-learning image-data-generator machine-learning python scarper text-datasets

Last synced: 28 Oct 2024

https://github.com/solegalli/feature-selection-in-machine-learning-book

Code repository for the book Feature Selection in Machine Learning

data-science feature-selection machine-learning python

Last synced: 02 Nov 2024

https://github.com/mainakrepositor/brs

Recommend books using Machine Learning Techniques

data-science python-3

Last synced: 12 Nov 2024

https://github.com/mainakrepositor/covid19-india-bcr

A bar chart race demonstrating the start and trends of COVID-19 in India

barchartrace covid-19 data-science data-visualization dataanalysisandmlusingpython visualization

Last synced: 12 Nov 2024

https://github.com/isala404/speculo

Realtime face detection and recognition using deep learning

data-science face-recognition faces footages opencv python3 reactjs speculo surveillance tensorflow typescript

Last synced: 15 Oct 2024

https://github.com/koonimaru/omniplot

Hassle-free statistical analysis, clustering, and visualization of scientific data

data-science matplotlib numpy pandas python

Last synced: 16 Nov 2024

https://github.com/azure/azure-data-labs

Terraform templates to deploy Azure Data resources

analytics azure blueprints data data-science github github-actions labs terraform

Last synced: 07 Oct 2024

https://github.com/datalab-platform/datalab

Open-source Platform for Scientific and Technical Data Processing and Visualization

data-science data-visualization image-processing opencv python scientific-computing scikit-image scipy signal-processing visualization

Last synced: 11 Oct 2024

https://github.com/brunorosilva/todoist-analytics

A simple app for weekly and monthly review of tasks in Todoist.

analytics dashboard data-science streamlit todoist

Last synced: 13 Aug 2024

https://github.com/naqvis/crysda

Crystal library for Data Analysis, Wrangling, Munging

crystal crystal-lang crystal-language crystal-shard data-a data-science data-wrangling

Last synced: 09 Nov 2024

https://github.com/sanjinkurelic/casebasedreasoning

Find missing values in a data set using Euclidean distance, normalization, information value, and weight of evidence

case-based-reasoning csv data-science influence information-value machine-learning numpy pandas python3 weight-of-evidence

Last synced: 06 Nov 2024

https://github.com/climopy-dev/climopy

🌍🌏🌎 A succinct toolset for analyzing climate data. This project is a work-in-progress.

climate-analysis climate-science data-science python xarray xarray-accessor

Last synced: 08 Aug 2024

https://github.com/code2k13/feed-visualizer

Feed Visualizer creates interactive visualizations by clustering RSS/Atom feed items based on semantic similarity. Feed Visualizer also attempts to automatically predict the labels for each cluster. This application will create a "semantic summary" of a website's contents by scanning its RSS/Atom feed, allowing for easy discovery and navigation to topics of interest. Feed Visualizer creates interactive visualizations in the form of static HTML and JS files, which may be edited and sent to a server.

artificial-intelligence atom data-science data-visualization machine-learning no-code python rss semantic-similarity visualization

Last synced: 13 Nov 2024

https://github.com/rfordatascience/r4dswebsite

Public repository for the R4DS community website.

blogdown data-analysis data-analytics data-science data-visualization r r4ds tidyverse

Last synced: 14 Nov 2024

https://github.com/rpodcast/shinycal

The Data Science StreamRs Calendar!

data-science r shiny streaming

Last synced: 05 Nov 2024

https://github.com/gagolews/genie

Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)

cluster cluster-analysis clustering data-analysis data-mining data-science datascience genie hierarchical-clustering-algorithm machine-learning machine-learning-algorithms outliers r

Last synced: 26 Oct 2024

https://github.com/charlywargnier/keywordmapperforbrightonseo

As part of my BrightonSEO talk, I created a mighty Streamlit app which auto-maps your keywords to your crawled URLs!

data-science python seo streamlit

Last synced: 19 Nov 2024

https://github.com/mainakrepositor/whosthegoat

Find out which footballer is the greatest of all time from their La Liga stats. Is it Leo Messi or CR7?

data-science data-visualization football-data messi ronaldo streamlit webapp

Last synced: 12 Nov 2024

https://github.com/tjpalanca/facebook-news-analysis

Analysis of Facebook News in the Philippines

analysis data data-science facebook news philippines

Last synced: 14 Oct 2024

https://github.com/autonomio/studio

GUI for Keras and TensorFlow with integrated hyperparameter optimization and NLP

ai artificial-intelligence data-science deep-learning hyperparameter-optimization hyperparameter-tuning keras tensorflow

Last synced: 06 Nov 2024

https://github.com/chiarorosa/ia_aprendizado_maquina_basico

Basic material on Artificial Intelligence, applying Machine Learning and Data Science

artificial-intelligence data-science machine-learning python

Last synced: 14 Nov 2024

https://github.com/heavyai/heavyai.jl

Julia client for OmniSci GPU-accelerated SQL engine and analytics platform

cuda data-science database gpu julia-language julia-package julialang sql

Last synced: 31 Oct 2024

https://github.com/onlyphantom/textmining

Beginner's Introduction to Text Mining: An App Store Reviews Exercise

app appstore data-science r reviews sentiment-analysis text-mining wordcloud

Last synced: 08 Nov 2024

https://github.com/hneth/ds4psy

Data science for psychologists (ds4psy): R package supporting book and course

data-literacy data-science education exploratory-data-analysis psychology r r-package social-sciences visualisation

Last synced: 01 Nov 2024

https://github.com/zhoudaxia233/pyalpha

A process mining tool written in Python3

alpha-miner data-science petri-net process-mining

Last synced: 18 Nov 2024

https://github.com/ikivanc/data-driven-cycling-and-workout-prediction

Data-Driven Cycling using Strava data and GPX data analysis. Digital Personal Trainer using old cycling workout data to predict new workouts

botframework chatbot csharp cycling cycling-workouts data-science digital-assistant fastapi gpx-files jupyter-notebook machine-learning machine-learning-algorithms microsoft-teams python strava strava-data

Last synced: 09 Nov 2024

https://github.com/kalebu/desktop-chatbot-app

A Python knowledge-based chatbot application built with Tkinter

chatbot chatbot-application data-science nlp nlp-projects python-tanzania python3 tanzania

Last synced: 09 Nov 2024

https://github.com/aws/amazon-finspace-examples

This repo contains sample code and sample notebooks to illustrate how to work with Amazon FinSpace

aws data-science data-versioning examples finspace timeseries-analysis

Last synced: 07 Oct 2024

https://github.com/imvladikon/yandex-practicum

Tasks and projects from the data science course by Yandex.Practicum

data-science jupyter-notebook

Last synced: 09 Nov 2024

https://github.com/pyurbans/urbans

A tool for translating text from a source grammar to a target grammar (context-free) with a corresponding dictionary.

artificial-intelligence data-science machine-translation nlp python

Last synced: 10 Nov 2024

https://github.com/alenrajsp/tcxreader

tcxreader is a reader / parser for Garmin’s TCX file format. It also works well with missing data!

data-mining data-science python sports-analytics tcx tcx-parser

Last synced: 07 Nov 2024

https://github.com/ccao-data/model-res-avm

Automated valuation model for all class 200 residential properties in Cook County (except vacant land and condos)

assessment data-science machine-learning model property-taxes r res tidymodels

Last synced: 14 Nov 2024

https://github.com/njanakiev/scalable-geospatial-data-science

Scripts and notebooks for scalable geospatial data science

data-science geospatial python

Last synced: 06 Nov 2024

https://github.com/saranshbansal/data-science-with-python

Data science with Python: This repository mostly contains DataCamp data-science courses/exercises that I have completed.

data-analysis data-science datacamp-exercises numpy python

Last synced: 09 Nov 2024

https://github.com/inab/biolitmap

Code for the paper "BIOLITMAP: a web-based geolocated and temporal visualization of the evolution of bioinformatics publications" in Oxford Bioinformatics.

data-mining data-science data-visualization machine-learning maps natural-language-processing research research-paper science social-analytics-team

Last synced: 10 Nov 2024

https://github.com/anselmoo/spectrafit

📊📈🔬 SpectraFit is a command-line and Jupyter-notebook tool for quick data-fitting based on the regular expression of distribution functions.

console-application curve-fitting data-analysis data-science fitting juypter-notebook numpy pandas python science science-research scientific-plotting spectral-analysis spectroscopy

Last synced: 17 Nov 2024

https://github.com/ahammadmejbah/pytorch-developers-roadmap

PyTorch is an open-source machine learning framework that provides a flexible platform for building, training, and deploying deep learning models. It is widely used for research and development in artificial intelligence, offering dynamic computation, GPU acceleration, and a rich ecosystem of libraries and tools.

ai data-science deep-learning developer machine-learning python python3 pytroch

Last synced: 11 Nov 2024

https://github.com/amine-smahi/r-learning-journey

Some of the projects I made when starting to learn R for Data Science at university

afc cpa data-cleaning data-integration data-science datascience r r-language

Last synced: 27 Oct 2024

https://github.com/gyrdym/ml_preprocessing

Implementation of popular data preprocessing algorithms for Machine learning

data-preprocessing data-science machine-learning machine-learning-algorithms onehot-encoder ordinal-encoder

Last synced: 28 Oct 2024

https://github.com/ahammadmejbah/ahammadmejbah

Data Science || Machine Learning || Deep Learning || Computer Vision || NLP Enthusiast Talks about #datascience, #deeplearning, #dataanalytics, #machinelearning, and #machinelearningalgorithms

artificial-intelligence computer-vision data-science deep-learning machine-learning nlp python

Last synced: 11 Nov 2024

https://github.com/mainakrepositor/brain-stroke-detection

Detects Brain Stroke using machine learning models with the highest optimal probability

data-science deployment-automation gui-application machine-learning streamlit-webapp

Last synced: 12 Nov 2024

https://github.com/PySloth/pysloth

A Python Package for Probabilistic Prediction

data-analysis data-science machine-learning python statistics

Last synced: 17 Nov 2024

https://github.com/OGFris/GoStats

GoStats is a Go library for mathematical statistics, mostly used in ML domains; it covers most statistical measure functions.

data-science go golang gostats machine-learning math mathematics mit-license statistical-measures statistics stats

Last synced: 25 Oct 2024

https://github.com/ragibhasan894/phishing_website_detection

This project detects phishing, fraudulent, and malicious websites using Random Forest classification. It is implemented in Python with the Django framework.

cyber-security data-mining data-science django django-framework machine-learning phsihing python random-forest scikit-learn security

Last synced: 11 Oct 2024

https://github.com/bartczernicki/ArtificialIntelligence-Presentations

Public location of delivered Artificial Intelligence & Machine Intelligence Presentations

analytics artificial-intelligence data-science machine-learning

Last synced: 09 Nov 2024

https://github.com/tjmahr/polypoly

Helper functions for orthogonal polynomials in R

data-science r statistics

Last synced: 12 Nov 2024

https://github.com/somdeep/Statball

Statball: a football (soccer) stats analyser for the top 5 European leagues, with data obtained by web scraping from FBref and StatsBomb

csharp data-science data-scraping data-viz dotnet dotnet-core fbref football football-analytics football-data scouting-data scraping soccer soccer-analytics soccer-data statsbomb tableau visualizations

Last synced: 03 Nov 2024

https://github.com/rurlus/diptest

Python/C++ implementation of Hartigan & Hartigan's dip test, based on Martin Maechler's R package

data-science modality python statistics unimodal

Last synced: 15 Nov 2024

https://github.com/catdevnull/preciazo

Price analysis of retail supermarkets. Constantly evolving. https://preciazo.nulo.in

data data-science price-tracker scraper supermarket

Last synced: 27 Oct 2024

https://github.com/bcgov/bcgroundwater

An R package to facilitate analysis and visualization of groundwater data from the British Columbia groundwater observation well network

data-science env r rstats

Last synced: 08 Aug 2024

https://github.com/bcgov/wqbc

An R package for water quality thresholds and index calculation for British Columbia

data-science env r r-package rstats

Last synced: 08 Aug 2024