Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/globeandmail/upstartr

An R package powering core startr functionality

data data-analysis data-journalism data-visualization journalism news r r-package

Last synced: 18 Dec 2024

https://github.com/ztjhz/sc1015-project

Predict the success of an anime using data science and machine learning (regression + classification)

anime data-analysis data-science machine-learning

Last synced: 28 Oct 2024

https://github.com/abmenzel/messenger-wrapped

Explore your Facebook Messenger chat history, in a manner inspired by Spotify Wrapped

data-analysis data-visualization facebook meta nextjs reactjs spotify

Last synced: 25 Nov 2024

https://github.com/cosmoduende/r-ggsoccer

StrangeR things: Visualizing Soccer Data with R… on a Soccer Pitch? How to analyze, visualize and report soccer data and strategies on a soccer pitch with the "ggsoccer" package

data-analysis data-analysis-in-r data-analytics ggsoccer package packages r-language r-package r-programming r-programming-projects r-studio soccer soccer-analytics soccer-data soccer-game soccer-matches soccer-simulation

Last synced: 07 Nov 2024

https://github.com/cosmoduende/r-spotify-history-analysis

Explore your activity on Spotify with R and "spotifyr": How to analyze and visualize your streaming history and music tastes. Find out how and how much you consume from Spotify, using a copy of your personal data and the "spotifyr" package

analisis-de-data analytics data-analysis data-analysis-r data-analytics data-visualization r-language r-programming sentiment-analysis spotify-analysis spotify-api spotify-connect spotify-data spotify-playlist spotify-streaming-history spotify-web-api spotifyr streaming-history visualizacion-de-datos visualizaciones

Last synced: 07 Nov 2024

https://github.com/ahmed-maher77/wind-turbine-power-prediction-app-using-machine-learning

"Wind Power Predictor" is a machine learning project that forecasts turbine output using real-time data from Turkish wind farms. Its web app interface offers convenient access to predictions, enabling informed decisions for maximizing energy production and advancing renewable energy usage.

ai catboost data-analysis data-science flask html-css-javascript javascript machine-learning matplotlib numpy pandas predictive-modeling pwa python sklearn web web-development wind wind-turbine

Last synced: 24 Nov 2024

https://github.com/kaustubhgupta/blogathon-analysis

Analytics Vidhya Blogathon Data Analysis: Python Data Extraction with PowerBI dashboard

data-analysis data-mining data-scraping excel pandas powerbi powerbi-report project python tqdm

Last synced: 14 Oct 2024

https://github.com/arv-anshul/yt-watch-history-v2

Analyse your YouTube watch history using ML with Graphs.

data-analysis docker fastapi machine-learning python streamlit youtube

Last synced: 25 Dec 2024

https://github.com/zeeshanahmad4/my-path-to-python

Templates and reference code in Python for web development, data science, analysis, visualization, cleaning and scraping, machine learning, and artificial intelligence

code data-analysis data-visualization machine-learning python refereance selenium template

Last synced: 14 Dec 2024

https://github.com/neelshah18/scopus-analysis-for-indian-researcher

Data analysis of Indian researchers in Scopus journals from 2000 to 2016. It also includes the dataset.

artificial-intelligence data-analysis deep-learning india industry jupyter-notebook machine-learning scopus-analysis universities world

Last synced: 30 Nov 2024

https://github.com/ayaanhossain/sharedb

An on-disk pythonic embedded key-value store based on LMDB for compressed data storage and distributed data analysis

data-analysis data-storage distributed embedded-database key-value lmb msgpack multiprocessing parallel python python-3 python-library python-script python2 python3 store

Last synced: 15 Oct 2024

https://github.com/otherwa/csvdash

Analyze and visualize CSV data with ease using this Streamlit-powered data analytics tool

analytics data-analysis huggingface llama-index pandas python statistics streamlit

Last synced: 28 Nov 2024

https://github.com/nico-curti/data-analysis

Data Analysis Utilities (from Statistics to Machine Learning)

data-analysis deep-learning-algorithms linear-algebra machine-learning-algorithms statistics

Last synced: 07 Nov 2024

https://github.com/acs/ghtorrent

Experimental GitHub project analysis based on GHTorrent

data-analysis data-visualization ghtorrent github visualization

Last synced: 27 Oct 2024

https://github.com/stappit/blog

I often post solutions to textbook exercises, including: Bayesian Data Analysis (BDA) by Gelman et al; Causal Inference in Statistics Primer (CISP) by Pearl et al; Purely Functional Data Structures (PFDS) by Okasaki.

bayesian-data-analysis blog data-analysis data-science gelman hakyll haskell pearl purely-functional-data-structures solutions stan static-site statistical-inference statistics

Last synced: 25 Oct 2024

https://github.com/bpkaur/word-frequency-in-moby-dick

Find the most frequent words in the novel Moby Dick using Python.

beautifulsoup data-analysis data-science moby-dick nltk notebook-jupyter python3

Last synced: 11 Dec 2024

https://github.com/qzcool/sac

Data Analysis and Visualization for Securities Association of China (SAC).

china data-analysis finance python

Last synced: 21 Nov 2024

https://github.com/sanvishal/Exoplanet-Explore

An interactive data visualization of exoplanets

animation d3js data-analysis data-science exoplanet python space visualization

Last synced: 08 Nov 2024

https://github.com/anotherkamila/stalky

Self-hosted app to keep track of anything you want. Don't let others stalk you, stalk yourself!

csv data-analysis data-storage geeks privacy-by-design self-hosted timeseries tracking

Last synced: 25 Nov 2024

https://github.com/coalio/Assistant

A data science library providing flexible dataframes for Lua 5.1+

data-analysis data-science data-structures dataframe lua

Last synced: 07 Nov 2024

https://github.com/drkostas/eestech-bigdata-challenge

EESTech Challenge is a new competition organized by EESTEC that aims to create opportunities for European students to gain knowledge in the field of EECS and develop a professional network. The technological topic of the 2017-2018 competition was Big Data. This is the code I submitted with my team (BFS), which consisted of 3 members in total.

apache-spark data-analysis data-visualization eestec machine-learning matplotlib python

Last synced: 10 Nov 2024

https://github.com/hrolive/from-data-to-insights-with-google-cloud-platform

A four-course accelerated online specialization that teaches participants how to derive insights through data analysis and visualization using the Google Cloud Platform

data-analysis data-cleaning data-preparation data-visualization sql

Last synced: 09 Nov 2024

https://github.com/kingabzpro/annual-recycled-energy-saved-in-singapore

Learn how much energy Singapore saves per year by recycling plastics, paper, glass, and ferrous and non-ferrous metals

cleaning-data data-analysis data-science deepnote energy environment

Last synced: 17 Nov 2024

https://github.com/bydevmar/master_masd_fpo

This GitHub repository gathers all the courses, lab work (TP), tutorials (TD), projects, and exercises from my master's program in applied mathematics for data science. Browse it for a complete view of my academic path, offering a detailed perspective on my learning in this field.

acp afc algebra big-data-analytics dashboards data-analysis datascience economics english graph-theory latex linear-algebra non-linear-algebra probability prog python scientific-research software-package statistics

Last synced: 17 Nov 2024

https://github.com/sukanyabag/statistical-analysis-of-my-medium-articles

This repository contains an exploratory data analysis of my writer data at Medium. I use it to carry out data analysis once every 4 months to see audience and fan growth, and the topics they love! You can check out the articles here👇

data-analysis data-storytelling matplotlib pandas seaborn statistical-analysis sweetviz web-scraping

Last synced: 10 Nov 2024

https://github.com/amrrs/iq18_workshop

Data Science in Python - Workshop Data and Notebook

data-analysis ipl python

Last synced: 15 Nov 2024

https://github.com/danieldacosta/airbnb-analysis

Data analysis of Airbnb website history in the city of Rio de Janeiro

airbnb-analysis airbnb-website-history data-analysis

Last synced: 12 Nov 2024

https://github.com/shubham18024/census_analysis

This repository contains code and resources for a summer research project focused on statistical analysis of census data. The project aims to analyze demographic trends, population distributions, and other relevant metrics derived from census datasets.

census-data csv-files data-analysis data-visualization jupyter-notebook mitosheet python report statistics

Last synced: 08 Nov 2024

https://github.com/leonism/sample-superstore

A Python analysis of the legendary Sample Superstore dataset using Pandas

data-analysis datamining datascience dataset eda jupyter-notebook machine-learning python

Last synced: 08 Dec 2024

https://github.com/koldlight/r4ds

R for data science course

course data-analysis data-science data-viz r

Last synced: 09 Nov 2024

https://github.com/lpsm-dev/twitter-sentimental-analysis-covid

✔️ Twitter sentiment analysis of Covid-19 (using TextBlob, Naive Bayes) + Python Flask backend + Docker + Docker Compose + MongoDB

adminmongo alpine backend corona coronavirus-analysis covid covid-19 data-analysis docker docker-compose docker-compose-wait flask flask-restful mongodb mongoku python python-api sentiment-analysis twitter

Last synced: 09 Nov 2024

https://github.com/vishnu-t-r/sql_functions_reference

This repository contains intermediate to complex SQL queries that explain SQL concepts. It can serve as a reference when writing queries that involve complex concepts. (Most queries include DDL and DML commands for practice.)

complex-sql data-analysis data-mining sql sql-query

Last synced: 10 Nov 2024

https://github.com/pawelgoj/envelope-for-qe-ph-calculations

Create envelopes for IR and Raman spectra calculated by PHonon from Quantum Espresso.

appium data-analysis matplotlib numpy pytest python quantum-chemistry scipy spetroscopy tkinter-gui tkinter-python

Last synced: 29 Nov 2024

https://github.com/alandefreitas/scistats

High-Performance Descriptive Statistics and Hypothesis Tests in C++20

bayesian-statistics data-analysis descriptive-statistics hypothesis-testing performance-statistics statistics

Last synced: 13 Oct 2024

https://github.com/pseudomanifold/enchiridion-tda

An enchiridion for instructing mortals in the hidden arts of topological data analysis

data-analysis graph-kernels persistent-homology topology

Last synced: 06 Nov 2024

https://github.com/memoryfraction/LLSDA-Lightning-Location-System-Data-Analyzer

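LLSDA is a public benefit project that helps lightning engineers and lightning scientists analyze lightning distribution. It is a cross-platform foundation class library for lightning location system (LLS) data analysis tools, used for spatiotemporal feature analysis of LLS data, helping lightning analysts (researchers, students) improve development efficiency and avoid reinventing the wheel.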

data-analysis data-visualization lightning-location-system public-benefits

Last synced: 13 Nov 2024

https://github.com/aditeyabaral/lok-sabha-election-twitter-analysis

Twitter feeds were analysed during the Lok Sabha Elections 2019 to gauge the overall popularity of each party and predict the winner based solely on the tweets made by the population. This was made as a part of our Data Science course (UE18CS203) at PES University.

data-analysis data-science data-visualization elections loksabha nlp prediction probabilistic-graphical-models probability python python3 sentiment-analysis sentiment-classification sentiment-polarity sentiment-scores social-media socialmediaanalytics statistical-analysis statistical-models twitter

Last synced: 16 Nov 2024

https://github.com/labrijisaad/data-scientist-tools-pandas

In this hands-on training notebook, we'll see all the basics of the Pandas library used for data science/data analysis and machine learning tasks.

data-analysis data-science dataframe-objects machine-learning pandas python series-objects

Last synced: 06 Nov 2024

https://github.com/haroldeustaquio/sql-coding-challenges

Repository dedicated to solving SQL problems from HackerRank, DataLemur and other challenges. Contains solutions to improve skills in database querying, optimization, and data manipulation.

challenge data-analysis database hackerrank-solutions mysql query sql sqlite t-sql-exercises

Last synced: 22 Nov 2024

https://github.com/zvtvz/zvdata

an extendable library for recording and analyzing data

analyzing-data dash data-analysis pandas plotly-dash sqlalchemy

Last synced: 12 Nov 2024

https://github.com/stefen-taime/nifi-etl-data-pipeline

This post will demonstrate the creation of a containerized data engineering environment using Docker Stacks.

apache api big-data cloud data-analysis data-engineering docker-compose etl-pipeline machine-learning nifi postgresql-database slack zookeeper

Last synced: 16 Nov 2024

https://github.com/yaricom/english-article-correction

An experiment in applying NLP to the correction of definite/indefinite articles in an English text corpus

data-analysis glove-vectors nlp nlp-machine-learning numpy pandas scikit-learn umbc-webbase-corpus

Last synced: 05 Nov 2024

https://github.com/mahshaaban/intro_data_r

A gentle introduction to data analysis in R

data-analysis image-analysis qpcr-analysis r

Last synced: 11 Nov 2024

https://github.com/aglove2189/appias

Machine learning workflow toolkit ✨🦋✨

appias data-analysis data-science machine-learning pandas python sklearn workflow

Last synced: 16 Nov 2024

https://github.com/pjagielski/worldcup

2018 World Cup match data analysis

clojure data-analysis repl world-cup-2018

Last synced: 05 Nov 2024

https://github.com/theakashshukla/r-project

🎓 A collection of programming assignments for the R language

algorithms data-analysis data-science data-science-projects ml r

Last synced: 30 Nov 2024

https://github.com/duhaime/douglasduhaime.com

Data analysis and visualization

data-analysis data-visualization website

Last synced: 14 Oct 2024

https://github.com/jthomperoo/holtwinters

Holt-Winters exponential smoothing implemented in Go.

data-analysis exponential-smoothing go golang holt-winters prediction prediction-algorithm

Last synced: 23 Nov 2024

https://github.com/volkansah/python-modules-overview

This repository provides an overview of some common and useful Python modules, categorized by their functionality. This list is not exhaustive but serves as a starting point for exploring various Python libraries.

data-analysis modules overview overview-page phyton phyton3 python python-3

Last synced: 09 Dec 2024

https://github.com/iondv/report

IONDV. Framework: the Report module is used to build analytical reports.

analytics businessintelligence css data data-analysis data-visualization iondv iondv-module reporting

Last synced: 10 Nov 2024

https://github.com/jagadishmali567/tata-data-visualisation-empowering-business-with-effective-insights

This repository holds all of the assignments I was required to complete for the TATA Data Visualization Empowering Business with Effective Insights Virtual Experience Program. 📊 📈 📉

analysis-and-reporting analytics analytics-and-decision-science charts communications dashboards data-analysis data-cleanup data-interpretation data-storytelling data-visualizations graph insights power-bi tableau visual-basic visualizations

Last synced: 18 Dec 2024

https://github.com/apache/incubator-devlake-playground

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.

dashboard-friendly data data-analysis data-engineering data-integration data-transfers devops domain-layer dora etl hacktoberfest integration jira open-source python user-friendly

Last synced: 07 Oct 2024

https://github.com/mribeirodantas/vidente

R package to parse and preprocess the Surveillance, Epidemiology, and End Results (SEER) Program data from NIH/NCI

cancer-patients cancer-research data-analysis data-science data-structures r-package seer

Last synced: 15 Oct 2024

https://github.com/ankman007/cricket-statsguru

Streamlit-based Nepali cricket visualization dashboard that utilizes various python tools & libraries to display interactive and engaging charts & statistics.

data-analysis data-visualization machine-learning python streamlit

Last synced: 20 Nov 2024

https://github.com/vedadiyan/genql

GenQL is a generic querying language fully written in Go

data-analysis data-mapping data-processing data-science data-translation json json-data sql

Last synced: 11 Nov 2024

https://github.com/ethan-wickstrom/rrrs

Welcome to RRRS, a rapid, hyper-optimized CSV random sampling tool designed with performance and efficiency at its core. Crafted meticulously in Rust, RRRS offers an unparalleled solution for extracting random data samples from CSV files swiftly and effortlessly.

analytics cli command-line command-line-tool data data-analysis data-science dataset rust rust-lang sample samples

Last synced: 19 Nov 2024

https://github.com/rahul-jha98/restauranttrends.stats

Visualise the trends in food and restaurant choices of customers in a city by scraping data from Zomato.

data-analysis data-science visualization vuejs zomato zomato-api zomato-scraper

Last synced: 19 Nov 2024

https://github.com/ptyadana/dv-data-visualization-with-python

Data analysis and data visualization of countries' GDP, life expectancy comparison across continents, GDP per capita relative growth, population relative growth comparison, etc., using Pandas and Matplotlib.

csdojo data-analysis data-visualization datavisualization matplotlib matplotlib-pyplot numpy pandas pluralsight python python3

Last synced: 15 Nov 2024

https://github.com/theengineeringworld/numpy-data-science

NumPy Data Science Essential Training Course. Part of the YouTube course offered by TheEngineeringWorld.

data-analysis data-science numpy numpy-exercises numpy-library numpy-tutorial python python-3-6 python3 scipy2018

Last synced: 08 Nov 2024

https://github.com/ynikitenko/lena

Lena is an architectural framework for data analysis

analysis-framework analysis-pipeline data-analysis data-science

Last synced: 11 Nov 2024

https://github.com/martinthoma/bad-stats

Examples of how not to do statistics / visualizations

data-analysis statistics visualizations

Last synced: 09 Dec 2024

https://github.com/maksimekin/umd_data_challange_2020

Ocean cleanup data analysis project for the UMD Data Challenge 2020. Data Exploration for a Sustainable Planet.

cleanup competition data-analysis data-science folium geolocation machine-learning ocean planet pollution sklearn sustainability time-series trash umd

Last synced: 15 Dec 2024

https://github.com/martinthoma/shell-history-analysis

Analyze how you use your shell

data-analysis python shell

Last synced: 06 Dec 2024

https://github.com/franpog859/top-of-the-world

🌍🔝 Proof that your country is the top of the world using GeoTIFF images and a little bit of geometry. Data mining project

data data-analysis data-mining elevation geometry geotiff image-processing matplotlib nvector rasterio

Last synced: 12 Nov 2024

https://github.com/globeandmail/startr-cli

A command-line scaffolder for the startr R project template

data-analysis data-journalism data-visualization journalism r

Last synced: 27 Nov 2024

https://github.com/palewire/baseball-notebooks

Python notebooks exploring Major League Baseball data

baseball baseball-statistics data-analysis jupyter-notebook pandas python

Last synced: 18 Oct 2024

https://github.com/avinashkranjan/basic-data-analysis-and-visualization-in-python

📊 Some of the most important python tools in data science for Data Analysis and Data Visualization.

data-analysis data-science matplotlib matplotlib-pyplot numpy pandas plotly seabourne

Last synced: 13 Dec 2024

https://github.com/scottgriv/river-charts

🌊 📉 A Python, Django, Plotly, and Pandas web application that visualizes river data in real time, pulled using an API from the United States Geological Survey (USGS).

api charts data data-analysis data-visualization dataset django pandas plotly python usgs usgs-api visualization webapp

Last synced: 14 Dec 2024