Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

https://github.com/imgcook/datacook

Machine Learning and Data Analysis in JavaScript.

data-science feature-engineering javascript machine-learning

Last synced: 02 Aug 2024

https://github.com/briatte/dsr

Introduction to Data Science with R (Sciences Po, Paris, 2023)

course data-analysis data-science data-visualization r statistics

Last synced: 02 Aug 2024

https://github.com/kb22/GitHub-User-Insights-using-API

Uses the GitHub API with user authentication to fetch information such as a user's commits and repositories, and stores it as CSV files for data collection and analysis.

api data-analysis data-science data-scraping github-api python

Last synced: 01 Aug 2024

https://github.com/weiji14/deepbedmap

Going beyond BEDMAP2 using a super resolution deep neural network. Also a convenient flat file data repository for high resolution bed elevation datasets around Antarctica.

antarctica bedmap binder chainer data-science deep-neural-network digital-elevation-model flat-file-db generative-adversarial-network glaciology jupyter-notebook optuna pangeo remote-sensing super-resolution

Last synced: 08 Aug 2024

https://github.com/stefan-m-lenz/BoltzmannMachines.jl

A Julia package for training and evaluating multimodal deep Boltzmann machines

data-science deep-boltzmann-machine deep-learning julia machine-learning neural-networks restricted-boltzmann-machine

Last synced: 02 Aug 2024

https://github.com/SOCR/SOCRAT

A Dynamic Web Toolbox for Interactive Data Processing, Analysis, and Visualization

data-analysis data-science data-visualization socr statistics visual-analytics visualization

Last synced: 01 Aug 2024

https://github.com/AidanCooper/shap-analysis-guide

How to Interpret SHAP Analyses: A Non-Technical Guide

data-science machine-learning shap tutorial

Last synced: 02 Aug 2024

https://github.com/leemengtw/gist-evernote

A Python application that syncs GitHub Gists and saves them to an Evernote notebook as screenshots.

data-science evernote gists github github-graphql jupyter-notebook pet-project python selenium sync

Last synced: 07 Aug 2024

https://github.com/jmari/iPharo

Pharo Smalltalk kernel for Jupyter

data-science jupyter-notebook pharo pharo-smalltalk smalltalk

Last synced: 03 Aug 2024

https://github.com/tejzpr/ordered-concurrently

Ordered-concurrently is a Go library for concurrent processing with ordered output. It processes work concurrently and returns output on a channel in the order of input. It is useful for concurrently processing items in a queue while getting output in the order provided by the queue.

concurrent concurrent-data-structure data-pipeline data-science golang golang-library ordered parallel parallel-computing

Last synced: 30 Jul 2024

https://github.com/jhwohlgemuth/pwsh-prelude

PowerShell “standard” library for supercharging your productivity. Provides a powerful cross-platform scripting environment enabling efficient analysis and sustainable science in myriad contexts.

applied-mathematics cli cli-app data-science hacktoberfest library mathematics powershell powershell-module statistics text-processing text-to-speech user-interface

Last synced: 13 Aug 2024

https://github.com/ActuariesInstitute/cookbook

Data and analytics cookbook for actuaries

actuarial analytics data-science hacktoberfest

Last synced: 08 Aug 2024

https://github.com/rafzamb/sknifedatar

sknifedatar is a package that serves primarily as an extension to the modeltime 📦 ecosystem, with some additional functionality for spatial data and visualization.

data data-analysis data-science data-visualization forecasting r statistics time-series

Last synced: 05 Aug 2024

https://github.com/AnonCatalyst/Coeus-OSINT-ToolBox

Coeus 🌐 is an OSINT ToolBox empowering users with tools for effective intelligence gathering from open sources. From social media monitoring 📱 to data analysis 📊, it offers a centralized platform for seamless OSINT investigations.

data-science data-visualization database forensic-analysis forensics forensics-tools framework information-retrieval infosec osint osint-framework osint-python osint-resources osint-tool osint-toolkit people-search reconnaissance

Last synced: 02 Aug 2024

https://github.com/benjaminmbrown/real-time-data-viz-d3-crossfilter-websocket-tutorial

Tutorial on real-time data visualization. Python websocket server & d3.js + crossfilter.js frontend

crossfilter d3 d3js data-science data-visualization dcjs tutorial websockets

Last synced: 06 Aug 2024

https://github.com/lamres/capm_shiny

Demo project that creates an interactive analytical tool for the stock market using CAPM.

capm data-science r shiny shinyapps stock-market stocks time-series

Last synced: 13 Aug 2024

https://github.com/dfinke/PSDuckDB

PSDuckDB is a PowerShell module that provides seamless integration with DuckDB, enabling efficient execution of analytical SQL queries directly from the PowerShell environment.

data-analysis data-science duckdb powershell sql

Last synced: 23 Aug 2024

https://github.com/dMLTquant/openbb_sdk_exporation

Explore OpenBB SDK without having to install anything on your local machine. You just need a GitHub and a GitPod account.

algorithmic-trading data-science financial-data jupyter notebook openbb python

Last synced: 01 Aug 2024

https://github.com/center-for-threat-informed-defense/sightings_ecosystem

Sightings Ecosystem gives cyber defenders visibility into what adversaries actually do in the wild. With your help, we are tracking MITRE ATT&CK® techniques observed to give defenders real data on technique prevalence.

ctid cyber-threat-intelligence cybersecurity data-science data-visualization mitre-attack

Last synced: 04 Aug 2024

https://github.com/ak-coram/cl-duckdb

Common Lisp CFFI wrapper around the DuckDB C API

c-bindings common-lisp data-science duckdb lisp olap parquet sql

Last synced: 02 Aug 2024

https://github.com/IMSoley/cs-study-plan

📚👨‍🎓 Resources I'm using every day to develop my skills and become a good self-taught programmer ...

artificial-intelligence computer-science data-science data-structures-and-algorithms higher-education machine-learning web-development

Last synced: 04 Aug 2024

https://github.com/wri-dssg-omdena/policy-data-analyzer

Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.

active-learning bert data-science document-classification environmental huggingface incentives landscape-restoration lda machine-learning nlp policy sbert scraping scrapy sentence-transformers spyder text-classification topic transformers

Last synced: 31 Jul 2024

https://github.com/Smat26/Roman-Urdu-Dataset

Compilation of Manually Tagged Roman Urdu Dataset (Urdu written in Latin/Roman Script), along with other helpful Roman Urdu NLP resources

data-science dataset hindi hindi-language natural-language-processing nlp urdu urdu-language urdu-nlp

Last synced: 04 Aug 2024

https://github.com/petersontylerd/mlmachine

mlmachine accelerates machine learning experimentation

data-analysis data-science data-visualization machine-learning python

Last synced: 02 Aug 2024

https://github.com/JieZheng-ShanghaiTech/KG4SL

Synthetic lethality (SL) is a promising gold mine for the discovery of anti-cancer drug targets. KG4SL is the first graph neural network (GNN)-based model that uses a knowledge graph for SL prediction.

ai4science bioinformatics cancer data-science drug-discovery machine-learning

Last synced: 02 Aug 2024

https://github.com/m-clark/data-processing-and-visualization

This document forms the basis of several workshops/talks that get into everyday programming with R, but also includes mirrored code in Python as Jupyter notebooks.

data-processing data-science datatable dplyr ggplot2 htmlwidgets jupyter-notebooks machine-learning model-criticism modeling numpy pandas programming programming-exercises python r tidyverse visualization workshop workshops

Last synced: 08 Aug 2024

https://github.com/braph-software/BRAPH-2

BRAPH 2.0 is a comprehensive software package for the analysis and visualization of brain connectivity data, offering flexible customization, rich visualization capabilities, and a platform for collaboration in neuroscience research.

biomedical-engineering brain-connectivity-analysis brain-research computational-neuroscience connectomics data-analysis data-science data-visualization deep-learning graph-theory machine-learning matlab network-analysis neuroimaging neuroscience open-source reproducible-research research-tools scientific-software toolbox

Last synced: 02 Aug 2024

https://github.com/bluegreen-labs/daymetr

An R Interface to the Daymet Web Services

climate-data data-science daymet gridded-data netcdf ornl-daac r-package rstats

Last synced: 08 Aug 2024

https://github.com/alagoa/youtube-or-pornhub

Service identification on ciphered traffic.

capture data-science machinelearning ml pcap python3 spotify traffic tshark youtube

Last synced: 01 Aug 2024

https://github.com/datapane/examples

Datapane Examples

data-science datapane jupyter python

Last synced: 09 Aug 2024

https://github.com/oldratlee/data-science-practice

Data science practice

anaconda data-science python statistics

Last synced: 01 Aug 2024

https://github.com/Azure/aml-run

GitHub Action that allows you to submit a run to your Azure Machine Learning Workspace.

aml azure azure-machine-learning data-science machine-learning mlops

Last synced: 13 Aug 2024

https://github.com/0x0be/scrapeadvisor

A user-friendly python-based GUI which provides sentiment analysis of users' reviews toward a specific TripAdvisor facility

data-mining data-science python3 r scraping sentiment-analysis sentiment-classification text-mining tripadvisor tripadvisor-scraper web-scraping

Last synced: 01 Aug 2024

https://github.com/iesahin/xvc

A robust (🐢) and fast (🐇) MLOps tool for managing data and pipelines in Rust (🦀)

command-line-tool data data-engineering data-pipelines data-science devops machine-learning machine-learning-engineering mlops rust

Last synced: 02 Aug 2024

https://github.com/denadai2/google_street_view_deep_neural

Deep Neural Network model to predict security perception from Google Street View images. Model based on AlexNet CNNs

computational-social-science computer-vision data-science deep-learning urban-planning urban-science

Last synced: 31 Jul 2024

https://github.com/rjbergerud/open-source-for-common-good

A list I'm keeping of active open source projects that serve a social or environmental goal.

citizen-science civic-tech community data-science humanity non-profit social social-impact sustainability

Last synced: 01 Aug 2024

https://github.com/incubated-geek-cc/Text-To-Speech-App

A Fusion of OCR Technology (Tesseract.js) & Web Speech API. Standalone, portable and works offline.

data-science javascript machine-learning ocr ocr-recognition tesseract tesseract-ocr tesseract-ocr-api tesseractjs webapp

Last synced: 01 Aug 2024

https://github.com/openghg/openghg

A cloud platform for greenhouse gas (GHG) data analysis and collaboration.

analysis cloud collaboration data-science greenhouse-gas

Last synced: 03 Aug 2024

https://github.com/goplus/pandas

Flexible and powerful data analysis / manipulation library for Go+, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

data-analysis data-science data-tech go golang gop goplus pandas scientific-computing

Last synced: 02 Aug 2024

https://github.com/RConsortium/r-collaboration

Open Collaboration, Data Registry, and Use Cases Developed by the R Community

data-analysis-in-r data-analytics data-science r

Last synced: 08 Aug 2024

https://github.com/humburg/reportmd

Create multi-page HTML reports in R

data-science r rmarkdown rstudio

Last synced: 31 Jul 2024

https://github.com/climopy-dev/climopy

🌍🌏🌎 A succinct toolset for analyzing climate data. This project is a work-in-progress.

climate-analysis climate-science data-science python xarray xarray-accessor

Last synced: 08 Aug 2024

https://github.com/brunorosilva/todoist-analytics

Just a simple app for weekly and monthly review of tasks in Todoist.

analytics dashboard data-science streamlit todoist

Last synced: 13 Aug 2024

https://github.com/rpodcast/shinycal

The Data Science StreamRs Calendar!

data-science r shiny streaming

Last synced: 13 Aug 2024

https://github.com/Azure/azure-data-labs

Terraform templates to deploy Azure Data resources

analytics azure blueprints data data-science github github-actions labs terraform

Last synced: 02 Aug 2024

https://github.com/hneth/ds4psy

Data science for psychologists (ds4psy): R package supporting book and course

data-literacy data-science education exploratory-data-analysis psychology r r-package social-sciences visualisation

Last synced: 05 Aug 2024

https://github.com/zhoudaxia233/pyalpha

A process mining tool written in Python3

alpha-miner data-science petri-net process-mining

Last synced: 03 Aug 2024

https://github.com/gagolews/genie

Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)

cluster cluster-analysis clustering data-analysis data-mining data-science datascience genie hierarchical-clustering-algorithm machine-learning machine-learning-algorithms outliers r

Last synced: 31 Jul 2024

https://github.com/nuhmanpk/Webtrench

A powerful and easy-to-use web scraper for collecting data from the web. Supports scraping of images, text, videos, metadata, and more. Ideal for machine learning and deep learning engineers. Download and extract data with just one line of code.

audio-datasets data data-collection data-science dataset-generation deep-learning image-data-generator machine-learning python scarper text-datasets

Last synced: 04 Aug 2024

https://github.com/PySloth/pysloth

A Python Package for Probabilistic Prediction

data-analysis data-science machine-learning python statistics

Last synced: 03 Aug 2024

https://github.com/OGFris/GoStats

GoStats is a Go library for mathematical statistics, mostly used in ML domains. It covers most statistical measure functions.

data-science go golang gostats machine-learning math mathematics mit-license statistical-measures statistics stats

Last synced: 30 Jul 2024

https://github.com/pyurbans/urbans

A tool for translating text from source grammar to target grammar (context-free) with corresponding dictionary.

artificial-intelligence data-science machine-translation nlp python

Last synced: 02 Aug 2024

https://github.com/bcgov/wqbc

An R package for water quality thresholds and index calculation for British Columbia

data-science env r r-package rstats

Last synced: 08 Aug 2024

https://github.com/bcgov/bcgroundwater

An R package to facilitate analysis and visualization of groundwater data from the British Columbia groundwater observation well network

data-science env r rstats

Last synced: 08 Aug 2024

https://github.com/bcgov/groundwater-levels-indicator

R scripts for an indicator on long-term trends in groundwater levels in B.C. published on Environmental Reporting BC

data-science env r rstats soe

Last synced: 08 Aug 2024

https://github.com/somdeep/Statball

Statball: a football (soccer) stats analyser for the top 5 European leagues, with data obtained by web scraping from FBref and StatsBomb

csharp data-science data-scraping data-viz dotnet dotnet-core fbref football football-analytics football-data scouting-data scraping soccer soccer-analytics soccer-data statsbomb tableau visualizations

Last synced: 01 Aug 2024

https://github.com/mkearney/tfse

🛠 Useful R functions for various things

data-science functions mkearney-r-package r-language rstats utility

Last synced: 13 Aug 2024

https://github.com/wlandau/targetsketch

Sketch a pipeline of targets in an interactive web app

data-science high-performance-computing pipeline r reproducibility rstats shiny targets workflow

Last synced: 05 Aug 2024

https://github.com/lvalnegri/workshops-setup_cloud_analytics_machine

Tips and tricks to set up a cloud machine for Analytics and Data Science with R, RStudio and Shiny Servers, Python and JupyterLab

analytics cloud dashboard data-science docker dockerfile jupyterlab linux machine-learning python r raspberry-pi rmarkdown rstats rstudio-server scipy shiny shiny-apps shiny-server ubuntu

Last synced: 13 Aug 2024

https://github.com/Jeniffen/projectr

Set up 📂-structure for data science projects

data-science package r rstats setup

Last synced: 13 Aug 2024