Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/inphyt/covid19-italy-integrated-surveillance-data

COVID-19 integrated surveillance data provided by the Italian Institute of Health and processed via UnrollingAverages.jl to deconvolve the weekly moving averages.

covid-19 covid19-data data data-analysis data-structures data-visualization data-wrangling database dataset epidemiological-data epidemiology italy italy-data italy-dataset open-data surveillance surveillance-data time-series time-series-analysis

Last synced: 12 Nov 2024
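
As a rough illustration of what deconvolving a weekly moving average involves, here is a minimal Python sketch. It is not the method used by UnrollingAverages.jl. It assumes, hypothetically, that the first `window - 1` raw values are already known, which reduces the inversion to a simple recurrence. The names `moving_average` and `unroll_average` are illustrative only.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; output has len(values) - window + 1 entries."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def unroll_average(averages: Sequence[float], window: int,
                   seed: Sequence[float]) -> List[float]:
    """Recover the raw series from its trailing moving average.

    Assumption: `seed` holds the first `window - 1` raw values. Each later
    value follows from avg * window = x_t + (sum of the previous window - 1 values).
    """
    if len(seed) != window - 1:
        raise ValueError("seed must contain exactly window - 1 values")
    raw = list(seed)
    for avg in averages:
        # Sum of the last (window - 1) recovered values; empty when window == 1.
        previous = sum(raw[len(raw) - (window - 1):])
        raw.append(avg * window - previous)
    return raw
```

With rounded or noisy averages, errors in this recurrence can carry forward from one recovered value to the next, so treat it as a sketch of the idea rather than a robust procedure.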

https://github.com/stellar/stellar-etl

Stellar ETL will enable real-time analytics on the Stellar network

bitcoin blockchain data-analysis ethereum etl-framework etl-pipeline stellar stellar-lumens stellar-network

Last synced: 06 Nov 2024

https://github.com/petersontylerd/mlmachine

mlmachine accelerates machine learning experimentation

data-analysis data-science data-visualization machine-learning python

Last synced: 13 Nov 2024

https://github.com/kwokhing/yandexcatboost-python-demo

Demo of the capabilities of the Yandex CatBoost gradient boosting classifier on a fictitious IBM HR dataset obtained from Kaggle. Data exploration, cleaning, preprocessing, and model tuning are performed on the dataset.

catboost data-analysis data-preprocessing data-science feature-selection gradient-boosting gradient-boosting-classifier one-hot-encode pandas pearson-correlation python python27 seaborn variance-analysis visualization yandex-catboost

Last synced: 12 Oct 2024

https://github.com/davidchall/ipaddress

Data analysis for IP addresses and networks

cyber data-analysis ip-address ipv4 ipv6 r vctrs

Last synced: 13 Aug 2024

https://github.com/cdhunt/pselect

PowerShell DSL for aggregating data

data-analysis dsl powershell powershell-module

Last synced: 28 Oct 2024

https://github.com/hackersandslackers/pandas-sqlalchemy-tutorial

🐼 💻 Load or insert data into a SQL database using Pandas DataFrames.

data-analysis data-science dataframes pandas pandas-sqlalchemy-tutorial python sql-database sqlalchemy tutorial

Last synced: 09 Nov 2024

https://github.com/alexbykoff/datafield

Sort, select, filter, evaluate and perform maths on your arrays of data

arrays collections data-analysis data-structures filtering sorting

Last synced: 09 Nov 2024

https://github.com/rshkarin/quanfima

Quanfima (Quantitative Analysis of Fibrous Materials)

data-analysis material-science morphological-analysis volumetric-data

Last synced: 14 Nov 2024

https://github.com/vvzen/houdini-geospatial-tools

Tools for geospatial exploration in Houdini (IPython notebooks + GeoJSON Python library)

data-analysis data-visualization geojson geospatial geotiff houdini python27

Last synced: 12 Oct 2024

https://github.com/mainakrepositor/data-analysis

Different types of data analytics projects: EDA, PDA, DDA, TSA, and more.

data-analysis data-science deeplearning machine-learning-algorithms neural-networks time-series-analysis tsa

Last synced: 12 Nov 2024

https://github.com/theengineeringworld/python-data-science

Python Data Science contains all the data sets and Jupyter notebook files for the YouTube course "Python Data Science Course" at http://youtube.com/theengineeringworld.

data data-analysis data-mining data-science data-visualization jupyter-notebook jupyter-notebooks machine-learning python python27

Last synced: 12 Oct 2024

https://github.com/activitywatch/aw-research

Tools to analyse and experiment with ActivityWatch data

activitywatch data-analysis python quantified-self

Last synced: 08 Nov 2024

https://github.com/ActivityWatch/aw-research

Tools to analyse and experiment with ActivityWatch data

activitywatch data-analysis python quantified-self

Last synced: 12 Nov 2024

https://github.com/mkcor/advanced-pandas

Pandas is a powerful tool for data exploration and analysis (including timeseries).

data-analysis data-science labeled-data notebooks python3 teaching-materials

Last synced: 16 Oct 2024

https://github.com/computationalcore/introduction-to-python

A very useful collection of Jupyter Notebooks, which aims to introduce the Python programming language.

data-analysis data-science fundamental google-colab jupyter-notebook jupyter-notebooks numpy pandas python python-language python-programming python3

Last synced: 10 Nov 2024

https://github.com/staircase-dev/piso

Pandas Interval Set Operations: providing methods for set operations, analytics, lookups and joins on pandas' Interval, IntervalArray and IntervalIndex

data-analysis data-science data-structures interval interval-arithmetic interval-set pandas set set-operations set-theory

Last synced: 09 Nov 2024
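
To make "set operations on intervals" concrete, the following is a minimal sketch in plain Python. It does not use piso's or pandas' API. It assumes closed intervals represented as `(left, right)` tuples, and the function names `union` and `intersection` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval (left, right)


def union(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for left, right in sorted(intervals):
        if merged and left <= merged[-1][1]:
            # Overlaps (or touches) the last merged interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], right))
        else:
            merged.append((left, right))
    return merged


def intersection(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two interval sets; inputs are normalised with union() first."""
    a, b = union(a), union(b)
    result: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        left = max(a[i][0], b[j][0])
        right = min(a[i][1], b[j][1])
        if left <= right:
            result.append((left, right))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result
```

For example, `union([(1, 3), (2, 5), (7, 8)])` merges the first two intervals, and intersecting that result with `[(4, 7.5)]` keeps only the overlapping pieces.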

https://github.com/PiotrZakrzewski/merge-chance

Source code of https://merge-chance.info

analysis data data-analysis open-source

Last synced: 29 Oct 2024

https://github.com/pravj/ospi

Open Source Presence Infographic of Indian Startups

data-analysis data-visualization india open-source startup

Last synced: 14 Oct 2024

https://github.com/goplus/pandas

Flexible and powerful data analysis / manipulation library for Go+, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

data-analysis data-science data-tech go golang gop goplus pandas scientific-computing

Last synced: 12 Nov 2024

https://github.com/itzmeanjan/chanalyze

A simple WhatsApp chat analyzer (for both private and group chats), made with ❤️

chat-analysis data-analysis datascience dataviz matplotlib python3 visualization whatsapp whatsapp-chat whatsapp-chat-analyzer

Last synced: 30 Sep 2024

https://github.com/rfordatascience/r4dswebsite

Public repository for the R4DS community website.

blogdown data-analysis data-analytics data-science data-visualization r r4ds tidyverse

Last synced: 14 Nov 2024

https://github.com/skyzh/meteor

🚆 Fine-grained analysis and visualization of the Hangzhou Metro for efficient travel in the metro system. Project report, slides, and presentation video included.

cmake data-analysis hangzhou metro qt sqlite visualize

Last synced: 28 Oct 2024

https://github.com/gagolews/genie

Genie: A Fast and Robust Hierarchical Clustering Algorithm (this R package has now been superseded by genieclust)

cluster cluster-analysis clustering data-analysis data-mining data-science datascience genie hierarchical-clustering-algorithm machine-learning machine-learning-algorithms outliers r

Last synced: 26 Oct 2024

https://github.com/skyzh/Meteor

🚆 Fine-grained analysis and visualization of the Hangzhou Metro for efficient travel in the metro system. Project report, slides, and presentation video included.

cmake data-analysis hangzhou metro qt sqlite visualize

Last synced: 07 Nov 2024

https://github.com/danvk/march-madness-data

NCAA brackets in JSON form

data-analysis ncaa-basketball sports

Last synced: 14 Nov 2024

https://github.com/simfg/etcd-analysis

🔦 Etcd Data Analysis Tool

data-analysis etcd go raft

Last synced: 28 Oct 2024

https://github.com/anselmoo/spectrafit

📊📈🔬 SpectraFit is a command-line and Jupyter-notebook tool for quick data-fitting based on the regular expression of distribution functions.

console-application curve-fitting data-analysis data-science fitting juypter-notebook numpy pandas python science science-research scientific-plotting spectral-analysis spectroscopy

Last synced: 27 Oct 2024

https://github.com/PySloth/pysloth

A Python Package for Probabilistic Prediction

data-analysis data-science machine-learning python statistics

Last synced: 03 Aug 2024

https://github.com/saranshbansal/data-science-with-python

Data science with Python: This repository mostly contains DataCamp data-science courses/exercises that I have completed.

data-analysis data-science datacamp-exercises numpy python

Last synced: 09 Nov 2024

https://github.com/alejandrodumas/kodiak

Enhance your feature engineering workflow with Kodiak

data-analysis pandas

Last synced: 07 Aug 2024

https://github.com/davidgasquez/filecoin-data-portal

🧮 Open and local-first data hub for Filecoin!

data-analysis data-platform filecoin

Last synced: 03 Aug 2024

https://github.com/cengel/R-data-wrangling

Materials for my R data workshop. https://cengel.github.io/R-data-wrangling/

data-analysis data-workshop datascience material r rstats social-sciences teaching tidyverse workshop

Last synced: 13 Nov 2024

https://github.com/bcgov/shinyrems

An R package to launch shinyrems, an online application that allows a user to access, download, clean, plot, and calculate simple statistics using data from the B.C. government Environmental Monitoring System database.

data-analysis environment environmental-data water-quality

Last synced: 13 Aug 2024

https://github.com/csinva/data-viz-utils

Functions for easily making publication-quality figures with matplotlib.

big-data data-analysis data-science data-visualization eda legend matplotlib python python3 scatterplot time-series

Last synced: 09 Nov 2024

https://github.com/jpenuchot/ctbench

Compiler-assisted variable size benchmarking for the study of C++ metaprogram compile times.

benchmark clang compilation data-analysis data-visualization gcc metaprogramming

Last synced: 12 Nov 2024

https://github.com/chyikwei/bnp

Bayesian nonparametric models for python

bayesian data-analysis probabilistic-graphical-models python topic-modeling

Last synced: 13 Nov 2024

https://github.com/andrewreynen/lazylyst

Lazylyst is a GUI created for time series review, using a flexible framework for new workflows

data-analysis earthquakes gui python qt seismology

Last synced: 19 Oct 2024

https://github.com/hoangsonww/north-carolina-household-analysis

🏠 This repository contains data analysis scripts for the 2022 American Community Survey (ACS) focusing on individuals aged 25 and over in North Carolina, based on 75,340 observations. This repository offers valuable insights into demographic and economic patterns across North Carolina's urban areas.

confidence-interval confidence-score data data-analysis data-analytics data-science data-visualization ggplot2 hypothesis-testing hypothesis-tests north-carolina r r-language r-programming stata

Last synced: 14 Nov 2024

https://github.com/rob-med/everything-shapelets

This repo contains useful links to research papers and implementations of shapelets discovery/learning techniques from different sources.

data-analysis data-mining shapelets time-series-analysis timeseries

Last synced: 14 Nov 2024

https://github.com/mcwaage1/qs

Quantified Self: A Personal Data Aggregator and Dashboard for Self-Trackers and Quantified Self Enthusiasts

activity-tracking data-analysis data-visualization fitbit goodreads google-sheets lastfm mood personal-data quantified-self quantifiedself self-tracking writing

Last synced: 08 Nov 2024

https://github.com/llnl/topoms

Topological Analysis for Molecular Systems

data-analysis data-viz

Last synced: 11 Nov 2024

https://github.com/nceas/metajam

Bringing data and metadata togetheR

data data-analysis metadata r repositories

Last synced: 10 Oct 2024

https://github.com/cmudig/texture

Visualize your text data with structured attributes

data-analysis llm text visualization

Last synced: 09 Nov 2024

https://github.com/zekeriyyaa/pyspark-structured-streaming-ros-kafka-apachespark-cassandra

Structured streaming applied to robot data from a ROS-Gazebo simulation environment using Apache Spark. Data is collected in Kafka, analyzed by Apache Spark, and stored in Cassandra.

apache-cassandra apache-kafka apache-spark cqlsh data-analysis kafka-consumer kafka-producer pyspark python python3 ros ros-noetic spark-cassandra spark-cassandra-connector spark-kafka-connector spark-kafka-integration spark-sql spark-streaming structured-streaming

Last synced: 12 Oct 2024

https://github.com/tsffarias/data-analysis-queries

This repository was carefully created to provide an extensive collection of SQL queries intended to make the work of data analysts easier across many areas of a company, including marketing, logistics, sales, finance, human resources, operations, legal, support, and more.

business-intelligence comercial data-analysis data-insights esg finance-management fraud-prevention human-resources juridico kpis logistics marketing marketing-analytics operacao pricing sql suporte

Last synced: 05 Nov 2024

https://github.com/kongruksiamza/python-datascience

Teaching materials covering Python data science and machine learning work.

data-analysis data-science numpy pandas python

Last synced: 09 Nov 2024

https://github.com/hoangsonww/global-covid19-analysis

🌍 This repository hosts an in-depth analysis of COVID-19's impact across five key countries from Jan 2020 to Dec 2021. Through advanced data analysis and visualization, we aim to provide insights into how the pandemic evolved differently across these nations, shedding light on the effectiveness of various health measures and vaccination campaigns.

covid covid-19 covid19-tracker data data-analysis data-analytics data-science data-visualization ggplot2 julia julia-language python r r-language r-markdown r-programming sas sas-programming stata vaccination

Last synced: 12 Oct 2024

https://github.com/rugk/crops-parser

🌱🍎🍆 A shell script to parse the data by the Food and Agriculture Organization of the United Nations on crops/fruits.

agriculture agriculture-research crop crops data-analysis data-science food fruit fruits statistics streetcomplete tree vegetables

Last synced: 23 Oct 2024

https://github.com/tanglespace/hotstepper

A Numpy based step function library for analysis and profit. More than just taking you up and down.

data-analysis kernel-methods linear-algebra numpy pandas step-functions time-series

Last synced: 27 Oct 2024

https://github.com/msyriac/orphics

A library containing analysis and theory tools for cosmological data.

analysis cmb cosmology data-analysis theory

Last synced: 27 Oct 2024

https://github.com/omarsar/data_mining_2017_fall_lab

Contains information and instructions for the first Data Mining lab session for 2017 Fall.

data data-analysis data-mining data-science data-visualization

Last synced: 13 Oct 2024

https://github.com/librariesio/metrics

📈 What to measure, how to measure it.

data-analysis measure metrics open-source

Last synced: 10 Nov 2024

https://github.com/awadell1/datamaster

Tool for accessing and extracting data from MoTeC Log Files

data-analysis data-visualization fsae matlab motec

Last synced: 08 Nov 2024

https://github.com/dawievlill/datascience-871

Data science module for economists written mostly in Julia and R

data-analysis data-science machine-learning

Last synced: 11 Nov 2024

https://github.com/engali94/twitter-account-analyzer

Uses various Python libraries such as Pandas, Tweepy, JSON, and Matplotlib to take a sneak peek at your Twitter account using Google Colab.

data-analysis data-visualization python3 twitter-api twitter-sentiment-analysis twitter-streaming-api

Last synced: 11 Nov 2024

https://github.com/charliezcr/Kpop-Data-Analysis

Data analysis of the K-pop industry, artists, and companies. Visualizes the business performance of public K-pop companies and analyzes artist management and international marketing strategies.

data-analysis data-visualization kpop pandas python

Last synced: 08 Nov 2024

https://github.com/wildtreetech/explore-open-data

📈🔍 Exploring open data from Zürich

binder data-analysis notebook open-data opendata python tutorial

Last synced: 12 Nov 2024