Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/mGalarnyk/datasciencecoursera

Data Science repo and blog for Johns Hopkins Coursera courses. Please let me know if you have any questions.

data-science jhu-coursera john-hopkins-coursera python r stanford

Last synced: 08 Apr 2024

https://github.com/imgcook/datacook

Machine Learning and Data Analysis in JavaScript.

data-science feature-engineering javascript machine-learning

Last synced: 08 Apr 2024

https://drivendata.github.io/cookiecutter-data-science/

A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

ai cookiecutter cookiecutter-data-science cookiecutter-template data-science machine-learning

Last synced: 08 Apr 2024

https://github.com/senderle/topic-modeling-tool

A point-and-click tool for creating and analyzing topic models produced by MALLET.

data-science digital-humanities mallet text-analytics topic-modeling

Last synced: 08 Apr 2024

https://github.com/ipython-books/cookbook-2nd-code

Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]

computing data-analysis data-mining data-science data-visualization ipython jupyter jupyter-notebook machine-learning numerical-computation python visualization

Last synced: 08 Apr 2024

https://github.com/HoloClean/holoclean

A Machine Learning System for Data Enrichment.

data-enrichment data-science inference-engine machine-learning pytorch

Last synced: 08 Apr 2024

https://github.com/jobream/List-of-Learning-Resources

This collection provides a list of educational resources for Software Engineers. Feel free to add your favorite resources as well and help others in their journey of learning.

competitive-programming computer-science data-science resources software-engineering web-development

Last synced: 08 Apr 2024

https://github.com/alenrajsp/tcxreader

tcxreader is a reader / parser for Garmin’s TCX file format. It also works well with missing data!

data-mining data-science python sports-analytics tcx tcx-parser

Last synced: 08 Apr 2024

https://github.com/firefly-cpp/tcx-test-files

A collection of sports activity (TCX) test files.

data-mining data-science garmin-connect tcx-files tcx-parser

Last synced: 08 Apr 2024

https://codeformunich.github.io/radlquartier/

Command-line tool to prepare and extract bike sharing data, plus example implementations of visualizations and an example website.

data-science data-visualization munich open-data visualization

Last synced: 08 Apr 2024

https://github.com/0x0be/scrapeadvisor

A user-friendly Python-based GUI that provides sentiment analysis of users' reviews of a specific TripAdvisor facility.

data-mining data-science python3 r scraping sentiment-analysis sentiment-classification text-mining tripadvisor tripadvisor-scraper web-scraping

Last synced: 07 Apr 2024

https://github.com/davendw49/k2

Code and datasets for the paper "K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization" in WSDM-2024

ai4science data-science geoai geoscience kg large-language-models llm

Last synced: 07 Apr 2024

https://github.com/rasbt/python-machine-learning-book

The "Python Machine Learning (1st edition)" book code repository and info resource

data-mining data-science logistic-regression machine-learning machine-learning-algorithms neural-network python scikit-learn

Last synced: 07 Apr 2024

https://github.com/Smat26/Roman-Urdu-Dataset

Compilation of Manually Tagged Roman Urdu Dataset (Urdu written in Latin/Roman Script), along with other helpful Roman Urdu NLP resources

data-science dataset hindi hindi-language natural-language-processing nlp urdu urdu-language urdu-nlp

Last synced: 06 Apr 2024

https://github.com/zhoudaxia233/pyalpha

A process mining tool written in Python3

alpha-miner data-science petri-net process-mining

Last synced: 06 Apr 2024

https://github.com/Mybridge/python-articles

Monthly Series - Top 10 Python Articles

data-science data-visualization django flask python python3

Last synced: 05 Apr 2024

https://github.com/Mybridge/machine-learning-open-source

Monthly Series - Machine Learning Top 10 Open Source Projects

ai algorithm artificial-intelligence data-science machine-learning neural-network

Last synced: 05 Apr 2024

https://github.com/pyxelr/recommendations-for-engineers

All of my recommendations for aspiring engineers in a single place, coming from various areas of interest.

awesome awesome-list cybersecurity data-science lists machine-learning macos mlops pyxelr-setup resources windows

Last synced: 05 Apr 2024

https://github.com/youssefHosni/Practical-Machine-Learning

Practical machine learning notebooks and articles covering the end-to-end machine learning life cycle.

data-science machine-learning

Last synced: 05 Apr 2024

https://github.com/Ph055a/OSINT_Collection

Maintained collection of OSINT-related resources. (All Free & Actionable)

court-search data-science dataset infosec investigation journalism osint research search

Last synced: 05 Apr 2024

https://github.com/jeroenjanssens/python-polars-the-definitive-guide

Scripts and datasets for the O'Reilly book Python Polars: The Definitive Guide

data-science oreilly oreilly-books polars python

Last synced: 04 Apr 2024

https://github.com/Lackoftactics/facebook_data_analyzer

Analyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get info about friends ranked by messages, vocabulary, contacts, friends-added statistics, and more.

conversation data-science data-visualization english-language facebook facebook-data facebook-data-analyzer ruby ruby-gem scraping script statistics

Last synced: 04 Apr 2024

https://github.com/google/starthinker

Reference framework for building data workflows provided by Google. Accelerates authentication, logging, scheduling, and deployment of solutions using GCP. To borrow a tagline: "The framework for professionals with deadlines."

airflow app-engine automation bigquery cloud-functions cm360 colab-notebook data-science django dv360 google-ads google-analytics logger python scheduler ui workflows

Last synced: 03 Apr 2024

https://github.com/chiphuyen/python-is-cool

Cool Python features for machine learning that I used to be too afraid to use. Will be updated as I have more time / learn more.

advanced-python data-science machine-learning python-tutorials python3

Last synced: 03 Apr 2024

https://github.com/dlab-berkeley/Python-Fundamentals-Legacy

D-Lab's 12-hour introduction to Python. Learn how to create variables and functions, use control flow structures, use libraries, import data, and more, using Python and Jupyter Notebooks.

data-science introduction-to-python jupyter python

Last synced: 02 Apr 2024

https://github.com/dlab-berkeley/Python-Data-Wrangling-Legacy

D-Lab's 3-hour introduction to data wrangling in Python. Learn how to import and manipulate dataframes using pandas in Python.

data-science pandas python

Last synced: 02 Apr 2024

https://github.com/Azure/DataScienceVM

Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)

ai azure big-data data-analysis data-science deep-learning dsvm machine-learning ml python r sqlserver

Last synced: 01 Apr 2024

https://github.com/Azure/azureml-examples

Official community-driven Azure Machine Learning examples, tested with GitHub Actions.

azure azure-machine-learning azureml data-science ml

Last synced: 01 Apr 2024

https://github.com/DeutscheAktuarvereinigung/Data_Science_Challenge_2020_Betrugserkennung

In this notebook we take a look at a relevant project that is frequently encountered by insurers: fraud detection. For this purpose we use a car data set from a public source and show the steps necessary to establish automated fraud detection.

actuarial-modeling betrugserkennung challenge data-science datasciencechallenge fraud-detection frauddetection

Last synced: 01 Apr 2024

https://github.com/adityakamble49/loss-ratio-prediction

Predicting Loss Ratios for Auto Insurance Portfolios - ITCS 6100 Big Data Analytics for Competitive Advantage

big-data big-data-analytics data-science insurance jupyter-notebook politics python

Last synced: 01 Apr 2024

https://github.com/rasgointelligence/RasgoQL

Write Python locally, execute SQL in your data warehouse

data-analysis data-science pandas python sql

Last synced: 01 Apr 2024

https://github.com/woz-u/DS-Student-Resources

Data Science Student Companion Notebooks and Data Lake

data-analysis data-science data-visualization machine-learning nosql python r sql statistics

Last synced: 01 Apr 2024

https://github.com/ActuariesInstitute/cookbook

Data and analytics cookbook for actuaries

actuarial analytics data-science hacktoberfest

Last synced: 01 Apr 2024

https://github.com/plotly/dashR

Create data science and AI web apps in R

dash data-science data-visualization plotly plotly-dash python r react web-application

Last synced: 01 Apr 2024

https://github.com/InseeFrLab/onyxia

🔬 Web app to simplify data science environment setup on Kubernetes

bluehats data-science datalab helm insee kubernetes onyxia

Last synced: 01 Apr 2024

https://github.com/spsanderson/steveondata

Repository for R and SQL tips and tricks for @steveondata every Friday

ai blog data data-science machinelearning-r ml r sql time-series tipoftheday

Last synced: 01 Apr 2024

https://github.com/rivasiker/ggHoriPlot

A user-friendly, highly customizable R package for building horizon plots in ggplot2

data-science data-visualization ggplot2 horizon-plots r r-package

Last synced: 01 Apr 2024

https://github.com/glm-tools/pyglmnet

Python implementation of elastic-net regularized generalized linear models

data-science elastic-net glm lasso machine-learning python

Last synced: 01 Apr 2024

https://github.com/iesahin/xvc

A robust (🐢) and fast (🐇) MLOps tool for managing data and pipelines in Rust (🦀)

command-line-tool data data-engineering data-pipelines data-science devops machine-learning machine-learning-engineering mlops rust

Last synced: 01 Apr 2024

https://github.com/Visualize-ML/Book3_Elements-of-Mathematics

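Book 3, "Elements of Mathematics" | Iris Book series: From Arithmetic to Machine Learning; now on sale; corrections are still welcome, and readers who submit many corrections will receive a free copy!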

data-science linear-algebra machine-learning mathematics matrix

Last synced: 01 Apr 2024

https://github.com/dwhitena/gophernet

A simple from-scratch neural net written in Go

artificial-intelligence data-science go golang machine-learning neural-network

Last synced: 01 Apr 2024

https://github.com/njtierney/rmd4sci

Rmarkdown for Scientists

book bookdown data-science r rmarkdown rstats science

Last synced: 31 Mar 2024

https://github.com/briatte/dsr

Introduction to Data Science with R (Sciences Po, Paris, 2023)

course data-analysis data-science data-visualization r statistics

Last synced: 31 Mar 2024

https://github.com/bradleyboehmke/data-science-learning-resources

A collection of data science and machine learning resources that I've found helpful (I only post what I've read!)

data-science machine-learning

Last synced: 31 Mar 2024

https://github.com/rebecca-vickery/data-science-learning-resources

A comprehensive list of free resources for learning data science

artificial-intelligence data data-science machine-learning python

Last synced: 31 Mar 2024

https://github.com/tlverse/sl3

💪 🤔 Modern Super Learning with Machine Learning Pipelines

data-science ensemble-learning ensemble-model machine-learning model-selection r r-package regression stacking statistics

Last synced: 31 Mar 2024

https://hdi-project.github.io/ATM/

Auto Tune Models - A multi-tenant, multi-data system for automated machine learning (model selection and tuning).

automl data-science distributed-computing hyperparameter-optimization machine-learning

Last synced: 31 Mar 2024

https://github.com/PKU-DAIR/mindware

An efficient open-source AutoML system for automating machine learning lifecycle, including feature engineering, neural architecture search, and hyper-parameter tuning.

automl-algorithms automl-pipeline bayesian-optimization blackbox-optimization data-science deep-learning distributed-systems ensemble-learning hyper-parameter-optimization knobs-tuning machine-learning meta-learning neural-architecture-search python

Last synced: 31 Mar 2024

https://github.com/swoop-inc/spark-alchemy

Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive

data-engineering data-science scala spark

Last synced: 31 Mar 2024

https://github.com/benjaminmbrown/real-time-data-viz-d3-crossfilter-websocket-tutorial

Tutorial on real-time data visualization. Python websocket server & d3.js + crossfilter.js frontend

crossfilter d3 d3js data-science data-visualization dcjs tutorial websockets

Last synced: 31 Mar 2024

https://github.com/capitalone/datacompy

Pandas and Spark DataFrame comparison for humans and more!

compare dask data data-science dataframes fugue numpy pandas polars pyspark python spark

Last synced: 31 Mar 2024

https://github.com/great-expectations/great_expectations_action

A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows.

actions continuous-integration data-integrity data-quality data-science mlops

Last synced: 31 Mar 2024

https://github.com/gdsbook/book

This book serves as an introduction to a whole new way of thinking systematically about geographic data, using geographical analysis and computation to unlock new insights hidden within data.

data-analysis-python data-science geographic-data geographical-information-system spatial-analysis spatial-data-analysis spatial-statistics statistics

Last synced: 31 Mar 2024

https://github.com/aws/amazon-redshift-python-driver

Redshift Python Connector. It supports Python Database API Specification v2.0.

amazon-redshift aws-redshift data-analysis data-science

Last synced: 30 Mar 2024

https://github.com/RDeconomist/RDeconomist.github.io

RapidCharts - a site for teaching and demonstrating Data Science and Visualisation techniques

data data-science data-visualization economics politics sports

Last synced: 30 Mar 2024

https://github.com/uclatommy/tweetfeels

Real-time sentiment analysis in Python using Twitter's streaming API

data-mining data-science python-3-6 sentiment-analysis twitter

Last synced: 30 Mar 2024

https://github.com/awesomecosmos/MS-Data-Science

Repository for my MS in Data Science at Pace University.

data-science masters-degree pace-university

Last synced: 30 Mar 2024

https://github.com/benthecoder/yt-channels-DS-AI-ML-CS

A comprehensive list of 180+ YouTube channels for data science, data engineering, machine learning, deep learning, computer science, programming, software engineering, etc.

ai artificial-intelligence awesome awesome-list coding data data-analysis data-engineering data-science deep-learning machine-learning math ml programming python resources software-engineering statistics web-development youtube

Last synced: 29 Mar 2024

https://github.com/shenwei356/awesome

Awesome resources on bioinformatics, data science, machine learning, programming languages (Python, Golang, R, Perl), and miscellaneous stuff.

awesome data-science git golang linux perl programing-language python

Last synced: 28 Mar 2024

https://github.com/ottogroup/palladium

Framework for setting up predictive analytics services

data-science machine-learning scikit-learn

Last synced: 28 Mar 2024

https://github.com/incubated-geek-cc/Text-To-Speech-App

A Fusion of OCR Technology (Tesseract.js) & Web Speech API. Standalone, portable and works offline.

data-science javascript machine-learning ocr ocr-recognition tesseract tesseract-ocr tesseract-ocr-api tesseractjs webapp

Last synced: 28 Mar 2024

https://github.com/modelscope/data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

chinese data-analysis data-science data-visualization dataset gpt gpt-4 instruction-tuning large-language-models llama llava llm llms multi-modal nlp opendata pre-training pytorch sora streamlit

Last synced: 28 Mar 2024