Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/fbecerra/fbecerra.github.io

Source code for my website www.fernandobecerra.com

data-analysis data-science data-visualization dataviz interactive-visualizations

Last synced: 27 Oct 2024

https://github.com/banyc/dfsql

SQL REPL/lib for Data Frames

cli csv data-analysis jsonl ndjson repl sql

Last synced: 19 Nov 2024

https://github.com/jonperk318/machine-learning-analysis-of-hyperspectral-data

Using Non-negative Matrix Factorization (NMF) and Variational Autoencoder (VAE) machine learning architectures to analyze spatial and spectral features of hyperspectral cathodoluminescence (CL) spectroscopy images taken from hybrid inorganic-organic perovskite material

data-analysis data-science deep-neural-networks explained-variance hybrid-perovskite hyperspectral-image-classification machine-learning matplotlib nmf non-negative-matrix-factorization python pytorch scikit-learn semi-supervised-learning signal-processing solar-energy spectroscopy unsupervised-learning vae variational-autoencoder

Last synced: 06 Nov 2024

https://github.com/ivanildobarauna-dev/data-consumer-api

The "ETL Process for Currency Quotes Data" project is a complete solution dedicated to extracting, transforming, and loading (ETL) currency quote data. This project uses several advanced techniques and architectures to ensure the efficiency and robustness of the ETL process.

business-intelligence data-analysis data-analytics data-engineering data-pipeline data-visualization etl-pipeline python

Last synced: 19 Dec 2024

https://github.com/kalebers/data_streams_parametric_t-sne

Research on Parametric t-SNE for high- to low-dimensional data streams, published in 2021 by Kalebe Rodrigues Szlachta and Andre de Macedo Wlodkowski and supervised by Jean Paul Barddal, as part of the Computer Science undergraduate program at the Pontifical Catholic University of Parana (PUCPR)

classifier data-analysis data-science data-visualization machinelearning parametric parametric-tsne python tsne-algorithm tsne-visualization

Last synced: 21 Jan 2025

https://github.com/mrjxtr/tokyo_airbnb_analysis_project

Full project case study and analysis showing potential opportunities for starting an Airbnb business in Tokyo, Japan.

data-analysis data-cleaning data-science data-visualization pandas python3

Last synced: 06 Nov 2024

https://github.com/muzammil-13/mimlrepo

Data Analysis using Python Machine Learning Libraries

data-analysis data-science machine-learning numpy pandas python python-library

Last synced: 16 Jan 2025

https://github.com/marios-mamalis/mca-visualisation

A script for automatic visualisation of Multiple Correspondence Analysis (MCA) results from FactoMineR in 3 dimensions using Plotly (exported as html)

3d-scatterplots correspondence-analysis data-analysis factominer html mca multiple-correspondence-analysis plotly visualisation

Last synced: 13 Jan 2025

https://github.com/simoneas02/data-science

🐍 A planning study to become a data scientist and to improve my current skills. 🤘🏼🌻

data data-analysis data-science data-visualization deep-learning machine-learning pandas python3 r sql

Last synced: 06 Dec 2024

https://github.com/revogati/ecommerce_consumer_behaviour

This is a full data analytics project, from data cleaning, preparation, exploration, and interpretation of insights up to the presentation of findings and recommendations.

data-analysis data-exploration ecommerce jupyter-notebook python sql tableau-public visualization

Last synced: 07 Jan 2025

https://github.com/thisisashukla/survival-analysis

Hands-On Survival Analysis in Python

data-analysis data-science survival-analysis

Last synced: 28 Dec 2024

https://github.com/arnabsaha7/customer-churn_prediction---analysis

Predict customer churn using machine learning. This project employs a RandomForestClassifier to analyze customer data and determine the likelihood of churn. Explore the Jupyter Notebook for insights into the data and model, and contribute to the project's development.

customer-churn-prediction data-analysis machine-learning

Last synced: 12 Jan 2025

https://github.com/asifdotexe/timeseriesanalysis

This repository serves as a central hub for all of my projects related to time series analysis. Here, you'll find a collection of projects, code samples, and resources that explore various aspects of time series data and its analysis.

data-analysis feature-engineering jupyter-notebook pandas python time-series-analysis visualization

Last synced: 15 Jan 2025

https://github.com/asifdotexe/covidporfolioproject

This is a SQL + Tableau project on a real-world Covid-19 dataset, from the first recorded case to 2 March 2022 (i.e., my birthday XD)

dashboard data-analysis data-exploration data-visualization sql sql-server tableau

Last synced: 15 Jan 2025

https://github.com/nikoshet/exploratory-data-analysis-using-r

Exploratory Data Analysis using R: course project for the M.Sc. 'Data Science and Machine Learning' at NTUA

data data-analysis data-science eda exploratory-data-analysis ggplot2 r

Last synced: 03 Jan 2025

https://github.com/aravind-selvam/covid_dashboard

A dashboard I created from Covid death and vaccine data.

covid-19 data-analysis data-science data-visualization tableau tableau-public visualization

Last synced: 14 Jan 2025

https://github.com/invictusaman/socioeconomic-indicators-in-chicago-sql-python

This project shows how to create a database connection in a notebook, update a database using Python, and run Python programs and SQL queries together. It uses SQLite and a Chicago dataset for analysis.

data-analysis jupyter-notebook python sql sql-queries sqlite

Last synced: 12 Oct 2024

https://github.com/cosmoduende/r-arduino

Interoperability Data-IoT: How to send and receive data and take control of your Arduino, from R. How to establish interoperability between R and Arduino (Data and IoT) using a data flow between the two

arduino arduino-data arduino-dataflow arduino-serial arduino-serial-data arduino-serial-led arduino-uno data-analysis data-arduino data-cleaning data-iot data-visualization interoperability iot-rstudio r-analytics r-data-visualization r-iot rstudio-arduino serial-read serialport

Last synced: 27 Dec 2024

https://github.com/itzmeanjan/indian-railway

Exploring the Indian Railways timetable dataset, with ❤️

data-analysis data-visualization indian-railways matplotlib python python3 railway

Last synced: 05 Oct 2024

https://github.com/andreaschandra/who-suicides-statistics

Exploratory Data Analysis for Suicides using Python

data-analysis data-science eda python

Last synced: 19 Dec 2024

https://github.com/idaraabasiudoh/vehicle-co2emission_model

Predicts CO2 emissions from vehicle fuel consumption using a multiple linear regression model trained on sklearn, based on a dataset of engine sizes and corresponding CO2 emissions in Canada.

data-analysis jupyter-notebook machine-learning python3 scikit-learn

Last synced: 14 Jan 2025

https://github.com/iantomasinicola/portfoliodataanalyst

Data analysis project with Python, Microsoft SQL Server, and Excel

data-analysis excel python sql

Last synced: 02 Jan 2025

https://github.com/dcs-training/r-qgisintegratingspatialanalysis

This was an intermediate course of three sessions with a focus on developing skills in data visualisation, analysis and integration using both RStudio and QGIS. Go to the readme file

data-analysis data-visualisation data-wrangling gis qgis r spatial-analysis

Last synced: 10 Nov 2024

https://github.com/josechirif/reviews-and-satisfaction-analysis-of-airbnb-brazil-and-mexico-from-june-2010-to-february-2021

This project analyzes the reviews and satisfaction of customers who used Airbnb services. It also studies whether there are relationships between other variables.

data data-analysis data-visualization powerbi sql-server

Last synced: 23 Oct 2024

https://github.com/dcs-training/bayesian-statistics

Materials for the CDCS Introduction to Bayesian Statistics course. Go to the readme file

bayesian-statistics data-analysis r statistics

Last synced: 10 Nov 2024

https://github.com/ronylpatil/whatsapplib

WhatsApp Group Chat Analysis Python Package.

data-analysis open-source pypi-package python-library python-package

Last synced: 21 Jan 2025

https://github.com/dcs-training/effectivedatavisualisation

This repository hosts the material connected to a training course developed by Dave Elsmore (Edina) for CDCS on good data visualisation. Go to the readme file

data-analysis data-visualisation data-wrangling python

Last synced: 10 Nov 2024

https://github.com/llnl/hdtopology

High-dimensional topological data analysis library for NDDAV

analysis cpp data-analysis data-viz high-dimensional-data topological-data-analysis visualization

Last synced: 11 Nov 2024

https://github.com/dcs-training/from-spss-to-r-how-to-make-your-statistical-analysis-reproducible

Comfortable/aware of how to run your stats in SPSS? Curious to learn how to run them in R? You've come to the right place. Go to the readme file

data-analysis data-visualisation data-wrangling good-practices-digital-research r rmarkdown spss statistics

Last synced: 10 Nov 2024

https://github.com/techytushar/india-odi-analysis

Analysis of ODI cricket matches of Indian Team

cricket data-analysis data-science pandas plotting python3

Last synced: 07 Jan 2025

https://github.com/vishnu-t-r/data-analytics-portfolio-projects

This repository contains data analyst portfolio projects developed using various data analytics tools, including SQL, Python, Tableau, and Looker.

data data-analysis data-cleaning data-modeling data-visualization looker looker-studio python sql ssms tableau

Last synced: 10 Nov 2024

https://github.com/ac-gomes/data-engineering-with-databricks

A simple boilerplate for data engineering and data analysis training in Databricks.

data-analysis data-engineering databricks databricks-notebooks pyspark python unit-testing

Last synced: 09 Nov 2024

https://github.com/chaganti-reddy/evmarket-india

Electric Vehicle Market Segmentation Analysis in India

data-analysis data-science machine-learning market-segmentation pandas python

Last synced: 20 Jan 2025

https://github.com/trybnetic/tu7-acceleration-sleep-wake-classification

Supporting material for the paper "Discrimination of sleep and wake periods from a hip-worn raw acceleration sensor using recurrent neural networks"

accelerometer accelerometry actigraphy data-analysis sensors sleep

Last synced: 15 Jan 2025

https://github.com/pitmonticone/covid-italy

References for COVID-19 situation in Italy.

coronavirus covid-19 covid-19-italy data data-analysis documentation testing

Last synced: 22 Jan 2025

https://github.com/rubydamodar/the-ultimate-pandas-bootcamp

Welcome to the Pandas for Data Science repository! This course is designed to take you from beginner to proficient in using Pandas, the powerful data manipulation library in Python. Whether you're just starting your data science journey or looking to sharpen your skills, this repository contains all the resources

beginner-friendly csv-data data-analysis data-cleaning data-manipulation data-science data-visualization dataframe exploratory-data-analysis jupyter-notebook machine-learning matplotlib numpy pandas python python-pandas series statistical-analysis time-series titanic-dataset

Last synced: 18 Oct 2024

https://github.com/1ayanabil1/healthcare-machine-learning

Explore our open-source repository focused on healthcare machine learning. We've developed predictive models for cardiovascular disease, diabetes, breast cancer, and more. Our projects employ diverse machine learning algorithms and data science techniques, enhancing early detection, diagnosis, and patient outcomes.

data-analysis data-science deep-learning disease disease-detection disease-modeling disease-prediction eda healthcare-application heathcare jupyter-notebook machine-learning machine-learning-algorithms machinelearning-python python

Last synced: 07 Jan 2025

https://github.com/virajbhutada/cliquebait-digital-marketing-analysis-using-sql

This GitHub repository contains the CliqueBait Digital Marketing Analysis project, which uses SQL for comprehensive analysis of marketing campaigns, user engagement, product performance, and website interactions within the Clique Bait food app. The project offers actionable insights for optimizing marketing strategies in a competitive landscape.

campaign-website data-analysis data-extraction data-science digital-marketing food-store microsoft-excel mysql product-performance sql sql-database sql-project user-engagement website-analytics

Last synced: 10 Jan 2025

https://github.com/manmolecular/http-response-clustering

📉 Clustering of HTTP responses using k-means++ and the elbow method

data-analysis elbow-method elbow-plot jupyter k-means-plus-plus python3

Last synced: 16 Jan 2025

https://github.com/i4ds/ecallisto_ng

Ecallisto NG is a Python package tailored for interacting with Ecallisto data.

data-analysis data-visualization e-callisto ecallisto-international-network numpy pandas python spectrometer

Last synced: 09 Nov 2024

https://github.com/alexandregazagnes/unilasalle-public-resources

UniLaSalle-Public-Ressources: This public repository contains the notebooks and the data used for both 2nd Year - Practical Statistical Tests and 4th Year - Data Analysis with Python

data data-analysis data-analytics data-cleaning data-storytelling education educational exploratory-data-analysis python python3 r r-programming rstudio statistics visualization

Last synced: 11 Oct 2024

https://github.com/shipyardapp/postgresql-blueprints

Simplified blueprints for building data pipelines with PostgreSQL.

cli data-analysis data-engineering data-pipeline data-science database elt etl postgres postgresql

Last synced: 04 Dec 2024

https://github.com/abeltavares/nps_performance_analysis

Analyzing and Monitoring Net Promoter Score (NPS) Performance for Healthcare Companies using SQL and Power BI

customer-satisfaction dashboard data-analysis data-visualization healthcare net-promoter-score nps-analysis performance-monitoring power-bi sql

Last synced: 05 Jan 2025

https://github.com/BigBangData/TimesheetAnalysis

R shiny app to help analyze a bookkeeper's business - or anyone with a timesheet and some time.

bookkeeping data-analysis data-viz r-programming shiny-apps shiny-r timesheet-management

Last synced: 04 Dec 2024

https://github.com/shlokashah/student-depression-and-suicide-rate-prediction

https://shlokashah.github.io/Student-Depression-And-Suicide-Rate-Prediction/

data-analysis data-visualization machine-learning student suicide-rate-prediction

Last synced: 28 Dec 2024

https://github.com/elysian01/ml-eda-and-modelling-using-streamlit

Beautiful Web interface made using Streamlit for quick Exploratory Data Analysis and building classification models which are implemented from scratch.

data-analysis data-visualization eda exploratory-data-analysis knn-classification logistic-regression matplotlib ml-model-on-web ml-models naive-bayes-classifier pandas seaborn streamlit streamlit-webapp

Last synced: 07 Nov 2024

https://github.com/shlokashah/ipl-data-analysis

Data Analysis and Visualizations done on IPL dataset

data-analysis data-visualization pandas powerbi

Last synced: 28 Dec 2024

https://github.com/alieymsxxn/sql_project_data_job_analysis

This project explores top-paying jobs, in-demand skills, and where high demand meets high salary in data analytics.

data-analysis postgresql sql sqlite

Last synced: 28 Nov 2024

https://github.com/phollemans/cwutils

CoastWatch Utilities software for working with satellite data files from NOAA CoastWatch and elsewhere

cdat coastwatch-utilities data-analysis data-visualization install4j java noaa-coastwatch remote-sensing satellite-imagery

Last synced: 12 Nov 2024

https://github.com/c0deta1ker/matbase

MatBase provides access to an extensive database of material parameters, inelastic mean free paths (IMFP), photoionization binding energies, cross sections, and asymmetry parameters. Additionally, MatBase includes a suite of functions for users to load, process, model and fit their own data, making it an indispensable tool in the field.

cross-sections crystal-structure crystallography data-analysis data-fitting database electron imfp imfp-calculator-matlab material material-database matlab matlab-application matlab-gui matlab-toolbox pes-modelling photoelectron-spectroscopy photoionization simulation xps

Last synced: 30 Nov 2024

https://github.com/tushar2704/everyday-sql

Welcome to Everyday SQL Sheets – your go-to resource for everyday SQL cheat sheets, pro tips, interview questions, and more. Whether you're a beginner looking to learn SQL or an experienced developer seeking quick reference materials, this application has got you covered.

artificial-intelligence cheatsheet data-analysis data-science database mysql postgresql query-language sql sqlalchemy streamlit streamlit-tushar2704 tushar2704

Last synced: 27 Dec 2024

https://github.com/haloapping/malas-ngetik-clf

I'm too lazy to type, so I just made a project template for Kaggle competitions 😜. This template is specifically for classification tasks.

data-analysis exploratory-data-analysis feature-engineering kaggle kaggle-competition machine-learning python3 scikit-learn

Last synced: 06 Jan 2025

https://github.com/mindful-ai-assistants/movierevenueanalysis

🎬💰 Analyze movie companies' revenue, release strategies, and financial performance using statistical techniques for actionable insights. This project explores data on total revenue, number of releases, and lifetime gross to uncover patterns that can drive strategic decisions in the film industry.

correlation-analysis data-analysis data-science heatmap jupyter-notebook open-source python statistical-analysis statistical-analysis-and-hypothesis-testing statistics ttest

Last synced: 22 Jan 2025

https://github.com/emptymalei/mini-lab

Some code snippets used to explain stuff to myself in my personal data science wiki

data-analysis data-mining data-science data-visualization datascience

Last synced: 20 Dec 2024

https://github.com/programmer-rd-ai/moviedatascraper

Explore the cinematic universe with our IMDb web scraping project! Dive into movie data with ease, uncovering insights from cast to critical reviews. With dynamic visualizations and reliable data, let's journey through the world of movies like never before. Lights, camera, analysis!

beautifulsoup beautifulsoup4 data data-analysis jupyter-notebook matplotlib numpy pandas programming python python3 scraping seaborn software web

Last synced: 12 Jan 2025

https://github.com/grburgess/gbm_kitty

Database, reduce, and analyze GBM data without having to know anything. Curiosity killed the catalog.

3ml catalogue data-analysis fermi-science grbs pipelines

Last synced: 23 Nov 2024

https://github.com/fabienarcellier/qjoin

qjoin is a data manipulation library that provides simple and efficient joining and collection processing functionality

composable data-analysis developer-tools functools python

Last synced: 28 Nov 2024

https://github.com/jethronap/asylumdataku_website

Mini website for reporting analysis of Asylum Data @ DIKU

data-analysis docsify nlp

Last synced: 06 Jan 2025

https://github.com/CAIDA/submarine-cable-impact-analysis-public

This repository contains tools implemented for the PAM 2020 paper "Unintended consequences: Effects of submarine cable deployment on Internet routing" to collect and analyze data depicting the impact of the South-Atlantic Cable System (SACS) launch on Internet routing. This codebase can be extended to other use-cases of cable launches, failures, etc.

africa-americas africa-south-america bgp-data-analysis caida-ark-measurement-platform data-analysis historical-traceroutes impact internet-routing ripe-atlas-measurement-platform sacs-cable sail-cable submarine-cables

Last synced: 06 Nov 2024

https://github.com/freekatz/english-reading

Postgraduate entrance exam (kaoyan) English (10-19) dataset and related data analysis

data-analysis dataset

Last synced: 07 Dec 2024

https://github.com/kenvilar/data-analysis-using-python

Transforming location descriptions from analyzed CSV file data using Pandas with Python 3

bs4 data-analysis jupyter pandas python python3 requests xlrd

Last synced: 18 Nov 2024

https://github.com/anushadatta/airbnb-in-seattle

🏨 Understanding the Airbnb rental landscape in Seattle using data science.

airbnb data-analysis data-exploration data-visualization datascience sentiment-analysis

Last synced: 11 Dec 2024

https://github.com/rapidsurveys/oldr

An Implementation of the Rapid Assessment Method for Older People (RAM-OP)

assessment data-analysis epidata estimate odk older-people r ram-op ranalyticflow rapid-assessment

Last synced: 24 Dec 2024

https://github.com/juliusmarkwei/concrete-data

Data analysis, machine learning, model evaluation and optimization on the Concret_ Dataset

data-analysis data-science data-visualization ensemble-learning machine-learning modeling

Last synced: 01 Jan 2025

https://github.com/nhsdigital/sde_example_analysis

Example of what you can do in Databricks in the Secure Data Environment (SDE) using Python, SQL, and R.

data-analysis data-science databricks-notebooks machine-learning mlflow

Last synced: 23 Dec 2024

https://github.com/adirthaborgohain/community-data-analysis

Data and Visual Analysis on several different communities generated using Louvain Algorithm in Neo4j on the dblp dataset.

data-analysis lda python

Last synced: 11 Dec 2024

https://github.com/antononcube/wl-outlieridentifiers-paclet

Wolfram Language (aka Mathematica) paclet that provides outlier identifier functions.

data-analysis hampel outlier-detection outliers

Last synced: 15 Dec 2024

https://github.com/jshinm/web-scrapper

Web scraper used to extract NeuroData GitHub repo stats

data-analysis web-scraping

Last synced: 17 Dec 2024

https://github.com/ziaeemehr/itng_nest

NEST Simulator quick guides and examples, including adding a new model using NESTML

computational-neuroscience data-analysis nest-simulator neuroscience

Last synced: 01 Jan 2025

https://github.com/bkataru/physics-ia

Programs and files written for Astrostatistics for IB Physics IA. Topic: Visualizing and analyzing the habitable zones for 150,000 stars from the Hipparcos catalogue.

astronomical-algorithms astronomy astrophysics astrostatistics data-analysis data-science data-visualization matplotlib plotting

Last synced: 22 Dec 2024

https://github.com/uts58/international-student-job-insights-usa

Data-driven insights on job hunting for international students in the USA, analyzing listings, roles, and trends.

career-insights cpt data-analysis eb1 eb2 eb3 h1b handshake job-analytics job-trends jobs jupyter-notebook opt python work-visa

Last synced: 25 Dec 2024

https://github.com/quantumudit/uk-student-accommodation-analysis

This project focuses on scraping student properties related data from the UK Student Accommodation website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 26 Dec 2024

https://github.com/quantumudit/analyzing-goodreads-famous-quotes

This project focuses on scraping famous quotes and their related data from the GoodReads website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 26 Dec 2024

https://github.com/quantumudit/analyzing-quotes

This project focuses on scraping all the quotes and their related data from the "Quotes To Scrape" website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 26 Dec 2024

https://github.com/gjbex/python-dashboards

Repository that contains material for training sessions on creating dashboards using Python.

dash dashboard data-analysis data-exploration data-science data-visualization panel python streamlit training training-materials visualization

Last synced: 22 Nov 2024