Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/climopy-dev/climopy

🌍🌏🌎 A succinct toolset for analyzing climate data. This project is a work-in-progress.

climate-analysis climate-science data-science python xarray xarray-accessor

Last synced: 08 Aug 2024

https://github.com/code2k13/feed-visualizer

Feed Visualizer creates interactive visualizations by clustering RSS/Atom feed items based on semantic similarity. Feed Visualizer also attempts to automatically predict the labels for each cluster. This application will create a "semantic summary" of a website's contents by scanning its RSS/Atom feed, allowing for easy discovery and navigation to topics of interest. Feed Visualizer creates interactive visualizations in the form of static HTML and JS files, which may be edited and sent to a server.

artificial-intelligence atom data-science data-visualization machine-learning no-code python rss semantic-similarity visualization

Last synced: 13 Nov 2024

https://github.com/sanjinkurelic/casebasedreasoning

Finds missing values in a data set using Euclidean distance, normalization, information value, and weight of evidence

case-based-reasoning csv data-science influence information-value machine-learning numpy pandas python3 weight-of-evidence

Last synced: 06 Nov 2024

https://github.com/naqvis/crysda

Crystal library for Data Analysis, Wrangling, Munging

crystal crystal-lang crystal-language crystal-shard data-a data-science data-wrangling

Last synced: 09 Nov 2024

https://github.com/heavyai/heavyai.jl

Julia client for OmniSci GPU-accelerated SQL engine and analytics platform

cuda data-science database gpu julia-language julia-package julialang sql

Last synced: 31 Oct 2024

https://github.com/alenrajsp/tcxreader

tcxreader is a reader / parser for Garmin’s TCX file format. It also works well with missing data!

data-mining data-science python sports-analytics tcx tcx-parser

Last synced: 07 Nov 2024

https://github.com/onlyphantom/textmining

Beginner's Introduction to Text Mining: An App Store Reviews Exercise

app appstore data-science r reviews sentiment-analysis text-mining wordcloud

Last synced: 08 Nov 2024

https://github.com/kalebu/desktop-chatbot-app

A Python knowledge-based chatbot application built with Tkinter

chatbot chatbot-application data-science nlp nlp-projects python-tanzania python3 tanzania

Last synced: 09 Nov 2024

https://github.com/chiarorosa/ia_aprendizado_maquina_basico

Basic material on Artificial Intelligence applying Machine Learning and Data Science

artificial-intelligence data-science machine-learning python

Last synced: 14 Nov 2024

https://github.com/autonomio/studio

GUI for Keras and TensorFlow with integrated hyperparameter optimization and NLP

ai artificial-intelligence data-science deep-learning hyperparameter-optimization hyperparameter-tuning keras tensorflow

Last synced: 06 Nov 2024

https://github.com/ikivanc/data-driven-cycling-and-workout-prediction

Data-Driven Cycling using Strava data and GPX data analysis. Digital Personal Trainer using old cycling workout data to predict new workouts

botframework chatbot csharp cycling cycling-workouts data-science digital-assistant fastapi gpx-files jupyter-notebook machine-learning machine-learning-algorithms microsoft-teams python strava strava-data

Last synced: 09 Nov 2024

https://github.com/hneth/ds4psy

Data science for psychologists (ds4psy): R package supporting book and course

data-literacy data-science education exploratory-data-analysis psychology r r-package social-sciences visualisation

Last synced: 01 Nov 2024

https://github.com/aws/amazon-finspace-examples

This repo contains sample code and sample notebooks to illustrate how to work with Amazon FinSpace

aws data-science data-versioning examples finspace timeseries-analysis

Last synced: 07 Oct 2024

https://github.com/pyurbans/urbans

A tool for translating text from source grammar to target grammar (context-free) with corresponding dictionary.

artificial-intelligence data-science machine-translation nlp python

Last synced: 10 Nov 2024

https://github.com/imvladikon/yandex-practicum

Tasks and projects from the data science course by Yandex.Practicum

data-science jupyter-notebook

Last synced: 09 Nov 2024

https://github.com/zhoudaxia233/pyalpha

A process mining tool written in Python3

alpha-miner data-science petri-net process-mining

Last synced: 03 Aug 2024

https://github.com/gagolews/genie

Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)

cluster cluster-analysis clustering data-analysis data-mining data-science datascience genie hierarchical-clustering-algorithm machine-learning machine-learning-algorithms outliers r

Last synced: 26 Oct 2024

https://github.com/rpodcast/shinycal

The Data Science StreamRs Calendar!

data-science r shiny streaming

Last synced: 05 Nov 2024

https://github.com/tjpalanca/facebook-news-analysis

Analysis of Facebook News in the Philippines

analysis data data-science facebook news philippines

Last synced: 14 Oct 2024

https://github.com/mainakrepositor/whosthegoat

Find out which footballer is the greatest of all time based on their La Liga stats. Is it Leo Messi or CR7?

data-science data-visualization football-data messi ronaldo streamlit webapp

Last synced: 12 Nov 2024

https://github.com/njanakiev/scalable-geospatial-data-science

Scripts and notebooks for scalable geospatial data science

data-science geospatial python

Last synced: 06 Nov 2024

https://github.com/saranshbansal/data-science-with-python

Data science with Python: This repository mostly contains DataCamp data-science courses/exercises that I have completed.

data-analysis data-science datacamp-exercises numpy python

Last synced: 09 Nov 2024

https://github.com/inab/biolitmap

Code for the paper "BIOLITMAP: a web-based geolocated and temporal visualization of the evolution of bioinformatics publications" in Oxford Bioinformatics.

data-mining data-science data-visualization machine-learning maps natural-language-processing research research-paper science social-analytics-team

Last synced: 10 Nov 2024

https://github.com/ccao-data/model-res-avm

Automated valuation model for all class 200 residential properties in Cook County (except vacant land and condos)

assessment data-science machine-learning model property-taxes r res tidymodels

Last synced: 14 Nov 2024

https://github.com/anselmoo/spectrafit

📊📈🔬 SpectraFit is a command-line and Jupyter-notebook tool for quick data-fitting based on the regular expression of distribution functions.

console-application curve-fitting data-analysis data-science fitting juypter-notebook numpy pandas python science science-research scientific-plotting spectral-analysis spectroscopy

Last synced: 27 Oct 2024

https://github.com/ahammadmejbah/ahammadmejbah

Data Science || Machine Learning || Deep Learning || Computer Vision || NLP Enthusiast Talks about #datascience, #deeplearning, #dataanalytics, #machinelearning, and #machinelearningalgorithms

artificial-intelligence computer-vision data-science deep-learning machine-learning nlp python

Last synced: 11 Nov 2024

https://github.com/ahammadmejbah/pytorch-developers-roadmap

PyTorch is an open-source machine learning framework that provides a flexible platform for building, training, and deploying deep learning models. It is widely used for research and development in artificial intelligence, offering dynamic computation, GPU acceleration, and a rich ecosystem of libraries and tools.

ai data-science deep-learning developer machine-learning python python3 pytroch

Last synced: 11 Nov 2024

https://github.com/amine-smahi/r-learning-journey

Some of the projects I made when starting to learn R for Data Science at university

afc cpa data-cleaning data-integration data-science datascience r r-language

Last synced: 27 Oct 2024

https://github.com/gyrdym/ml_preprocessing

Implementation of popular data preprocessing algorithms for Machine learning

data-preprocessing data-science machine-learning machine-learning-algorithms onehot-encoder ordinal-encoder

Last synced: 28 Oct 2024

https://github.com/nuhmanpk/Webtrench

A powerful and easy-to-use web scraper for collecting data from the web. Supports scraping of images, text, videos, metadata, and more. Ideal for machine learning and deep learning engineers. Download and extract data with just one line of code.

audio-datasets data data-collection data-science dataset-generation deep-learning image-data-generator machine-learning python scarper text-datasets

Last synced: 04 Aug 2024

https://github.com/mainakrepositor/brain-stroke-detection

Detects Brain Stroke using machine learning models with the highest optimal probability

data-science deployment-automation gui-application machine-learning streamlit-webapp

Last synced: 12 Nov 2024

https://github.com/ragibhasan894/phishing_website_detection

This project detects phishing, fraudulent, and malicious websites using Random Forest classification. Implemented in Python with the Django framework.

cyber-security data-mining data-science django django-framework machine-learning phsihing python random-forest scikit-learn security

Last synced: 11 Oct 2024

https://github.com/PySloth/pysloth

A Python Package for Probabilistic Prediction

data-analysis data-science machine-learning python statistics

Last synced: 03 Aug 2024

https://github.com/OGFris/GoStats

GoStats is a Go library for mathematical statistics, mostly used in ML domains; it covers most statistical measure functions.

data-science go golang gostats machine-learning math mathematics mit-license statistical-measures statistics stats

Last synced: 25 Oct 2024

https://github.com/catdevnull/preciazo

Analysis of prices in retail supermarkets. Constantly evolving. https://preciazo.nulo.in

data data-science price-tracker scraper supermarket

Last synced: 27 Oct 2024

https://github.com/dataprofessor/streamlit-for-datascience

Streamlit for Data Science shows how to build interactive data apps powered by data visualization and machine learning!

data-science machine-learning numpy pandas python

Last synced: 11 Nov 2024

https://github.com/rurlus/diptest

Python/C++ implementation of Hartigan & Hartigan's dip test, based on Martin Maechler's R package

data-science modality python statistics unimodal

Last synced: 15 Nov 2024

https://github.com/tjmahr/polypoly

Helper functions for orthogonal polynomials in R

data-science r statistics

Last synced: 12 Nov 2024

https://github.com/somdeep/Statball

Statball - Football (soccer) stats analyser for the top 5 European leagues, with data obtained by web scraping from Fbref and Statsbomb

csharp data-science data-scraping data-viz dotnet dotnet-core fbref football football-analytics football-data scouting-data scraping soccer soccer-analytics soccer-data statsbomb tableau visualizations

Last synced: 03 Nov 2024

https://github.com/bcgov/bcgroundwater

An R package to facilitate analysis and visualization of groundwater data from the British Columbia groundwater observation well network

data-science env r rstats

Last synced: 08 Aug 2024

https://github.com/bcgov/wqbc

An R package for water quality thresholds and index calculation for British Columbia

data-science env r r-package rstats

Last synced: 08 Aug 2024

https://github.com/leerenjie/100-days-of-code-in-python

Angela Yu's Udemy course has 100 projects for students to make, one each day, with 2 hours of classes per day. This repository will store all the related projects.

100-days-of-code api backend-webdevelopment data-science database flask frontend-web game-development version-control

Last synced: 07 Nov 2024

https://github.com/bartczernicki/ArtificialIntelligence-Presentations

Public location of delivered Artificial Intelligence & Machine Intelligence Presentations

analytics artificial-intelligence data-science machine-learning

Last synced: 09 Nov 2024

https://github.com/gbeckers/darr

A Python library for numpy arrays that persist on disk in a format that is simple, self-documented and tool-independent, and maximizes universal readability.

array bsd-3-clause data-science data-sharing data-storage idl interoperability jagged-array julia-language maple mathematica matlab numeric octave python r ragged-array science scilab

Last synced: 11 Oct 2024

https://github.com/brianruizy/2019-microsoft-iot-hackathon

🥇 1st place winner | Bump.IT - Pothole detection and mapping using data science methods of analysis, mobile phone telemetry, and computer vision, deployed through Azure.

computer-vision data-science geocoding internet-of-things pothole-detection

Last synced: 27 Oct 2024

https://github.com/brakmic/data-science-for-losers

📈 Articles on Data Science, Jupyter, and Pandas

data-science jupyter machine-learning python

Last synced: 08 Nov 2024

https://github.com/adamvvu/tsfracdiff

Efficient and easy-to-use fractional differentiation transformations for stationarizing time series data in Python.

data-science machine-learning python quantitative-finance

Last synced: 12 Nov 2024

https://github.com/facultyai/faculty

A Python library for interacting with the Faculty platform

data-science faculty-platform python

Last synced: 08 Nov 2024

https://github.com/jmshea/foundations-of-data-science-with-python

Interactive flashcards and quizzes, as well as additional tutorials, animations, and code, for "Foundations of Data Science with Python" by John M. Shea

data-science data-visualization probability statistics statistics-course

Last synced: 07 Nov 2024

https://github.com/noahgift/core-stats-datascience

Core Statistics for Datascience

core data-science pragmaticai statistics

Last synced: 11 Oct 2024

https://github.com/mauroluzzatto/explainy

explainy is a Python library for generating machine learning model explanations for humans

data-science explanation machine-learning machine-learning-explainability python scikit-learn

Last synced: 11 Nov 2024

https://github.com/mkearney/tfse

🛠 Useful R functions for various things

data-science functions mkearney-r-package r-language rstats utility

Last synced: 15 Nov 2024

https://github.com/bcgov/groundwater-levels-indicator

R scripts for an indicator on long-term trends in groundwater levels in B.C. published on Environmental Reporting BC

data-science env r rstats soe

Last synced: 08 Aug 2024

https://github.com/smathot/eeg_eyetracking_parser

Python routines for parsing of combined EEG and eye-tracking data

data data-science eeg eye eye-tracking mne pupillometry python

Last synced: 07 Nov 2024

https://github.com/csinva/data-viz-utils

Functions for easily making publication-quality figures with matplotlib.

big-data data-analysis data-science data-visualization eda legend matplotlib python python3 scatterplot time-series

Last synced: 09 Nov 2024

https://github.com/hfawaz/miccai18

Evaluating surgical skills from kinematic data using convolutional neural networks

class-activation-maps cnn cnn-keras data-science deep-learning research-paper surgery surgical time-series-classification

Last synced: 06 Nov 2024

https://github.com/alastairrushworth/tdf

🚴🏅📊 Tour de France winners and stages data

data-science dataframe exploratory-data-analysis rstats tdf tour-de-france

Last synced: 14 Oct 2024

https://github.com/morgan-sell/caiso-price-forecast

Predicts the CAISO day-ahead market hourly prices using different forecasting methods including ARIMA and LSTM.

arima data-science electricity-prices lstm neural-networks python time-series

Last synced: 23 Oct 2024

https://github.com/laurentrdc/javelin

Haskell implementation of series, or labeled one-dimensional arrays.

data-science data-structures-and-algorithms haskell quantitative-finance

Last synced: 02 Nov 2024

https://github.com/nirala96/bangalore-house-prediction-app

Predicts home prices of Bangalore. Used Flutter, Flask and Jupyter Notebook.

data-science datacleaning exploratory-data-analysis flask-api flutter jupyter-notebook linear-regression python

Last synced: 28 Oct 2024

https://github.com/robinlovelace/opengeohub2023

Content for lecture at OpenGeoHub 2023 on spatial data and the tidyverse

course data-science opengeohub osgeo practical r reproducible summer-school tidy-data

Last synced: 27 Oct 2024

https://github.com/amitkaps/multidim

Visualising Multi Dimensional Data

data-science data-visualization grammar python r visualization

Last synced: 06 Nov 2024