Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/mrjxtr/tokyo_airbnb_analysis_project

Full project case study and analysis showing potential opportunities for starting an Airbnb business in Tokyo, Japan.

data-analysis data-cleaning data-science data-visualization pandas python3

Last synced: 06 Nov 2024

https://github.com/ronylpatil/whatsapplib

WhatsApp Group Chat Analysis Python Package.

data-analysis open-source pypi-package python-library python-package

Last synced: 21 Jan 2025

https://github.com/i4ds/ecallisto_ng

Ecallisto NG is a Python package tailored for interacting with Ecallisto data.

data-analysis data-visualization e-callisto ecallisto-international-network numpy pandas python spectrometer

Last synced: 09 Nov 2024

https://github.com/coumbacoulibaly/adventureworkscycles

Repository for Adventure Works Sample Database Analysis

adventureworks data-analysis data-analytics mssql-database mssqlserver sql ssms

Last synced: 17 Nov 2024

https://github.com/fabienarcellier/qjoin

qjoin is a data manipulation library that provides simple and efficient joining and collection processing functionality

composable data-analysis developer-tools functools python

Last synced: 28 Nov 2024

https://github.com/ivanildobarauna-dev/api-to-dataframe

Python library that simplifies obtaining data from API endpoints by converting them directly into Pandas DataFrames. This library offers robust features, including retry strategies for failed requests.

data-analysis data-analytics data-engineering library pypi-packages python

Last synced: 19 Dec 2024

https://github.com/openpmd/openpmd-ccd

A Python Module & LabView Bindings for Storing CCD Images with openPMD

ccd data-analysis database hdf5 open-data open-science openpmd

Last synced: 04 Jan 2025

https://github.com/a-r-j/npview

CLI tools for quickly inspecting CSV/TSV & NumPy (.npy) array files

cli csv data-analysis inspector npy numpy python tsv

Last synced: 17 Dec 2024

https://github.com/cworld1/da-learning

Some notes and code about CWorld learning Data Analysis

data-analysis data-science jupyter-book jupyter-notebook python r

Last synced: 23 Jan 2025

https://github.com/narius2030/sakila-datawarehouse-ssis

Implements a simple data warehouse to store Sakila data: creates data pipelines to extract, transform, and load data from the source into the warehouse, then retrieves data from the warehouse for exploration and analysis.

data-analysis data-integration data-modeling data-visualization excel microsoft-sql-server power-bi ssas ssis

Last synced: 14 Dec 2024

https://github.com/dcs-training/from-spss-to-r-how-to-make-your-statistical-analysis-reproducible

Comfortable/aware of how to run your stats in SPSS? Curious to learn how to run them in R? You've come to the right place. Go to the readme file

data-analysis data-visualisation data-wrangling good-practices-digital-research r rmarkdown spss statistics

Last synced: 10 Nov 2024

https://github.com/CAIDA/submarine-cable-impact-analysis-public

This repository contains tools implemented for the PAM 2020 paper "Unintended consequences: Effects of submarine cable deployment on Internet routing" to collect and analyze data depicting the impact of the South-Atlantic Cable System (SACS) launch on Internet routing. This codebase can be extended to other use-cases of cable launches, failures, etc.

africa-americas africa-south-america bgp-data-analysis caida-ark-measurement-platform data-analysis historical-traceroutes impact internet-routing ripe-atlas-measurement-platform sacs-cable sail-cable submarine-cables

Last synced: 06 Nov 2024

https://github.com/alieymsxxn/sql_project_data_job_analysis

This project explores top-paying jobs, in-demand skills, and where high demand meets high salary in data analytics.

data-analysis postgresql sql sqlite

Last synced: 28 Nov 2024

https://github.com/tushar2704/everyday-sql

Welcome to Everyday SQL Sheets – your go-to resource for everyday SQL cheat sheets, pro tips, interview questions, and more. Whether you're a beginner looking to learn SQL or an experienced developer seeking quick reference materials, this application has got you covered.

artificial-intelligence cheatsheet data-analysis data-science database mysql postgresql query-language sql sqlalchemy streamlit streamlit-tushar2704 tushar2704

Last synced: 27 Dec 2024

https://github.com/jethronap/asylumdataku_website

Mini website for reporting analysis of Asylum Data @ DIKU

data-analysis docsify nlp

Last synced: 06 Jan 2025

https://github.com/banyc/dfsql

SQL REPL/lib for Data Frames

cli csv data-analysis jsonl ndjson repl sql

Last synced: 19 Nov 2024

https://github.com/muzammil-13/mimlrepo

Data Analysis using Python Machine Learning Libraries

data-analysis data-science machine-learning numpy pandas python python-library

Last synced: 16 Jan 2025

https://github.com/arnabsaha7/customer-churn_prediction---analysis

Predict customer churn using machine learning. This project employs a RandomForestClassifier to analyze customer data and determine the likelihood of churn. Explore the Jupyter Notebook for insights into the data and model, and contribute to the project's development.

customer-churn-prediction data-analysis machine-learning

Last synced: 12 Jan 2025

https://github.com/rgalyeon/machine_learning_and_data_analysis

Machine Learning and Data Analysis specialization by Yandex and MIPT

coursera data-analysis data-science machine-learning mipt python yandex

Last synced: 14 Jan 2025

https://github.com/dcs-training/effectivedatavisualisation

This repository hosts the material connected to a training course developed by Dave Elsmore (Edina) for CDCS on good data visualisation. Go to the readme file

data-analysis data-visualisation data-wrangling python

Last synced: 10 Nov 2024

https://github.com/skylord0001/python-daily

Python - Basic, Apache - Conf, Black Stack Hub, Data analysis, Data Structure, Google Cloud, SQL system

apache-configuration data-analysis data-structure python-scripts python-sql

Last synced: 23 Nov 2024

https://github.com/dcs-training/bayesian-statistics

Materials for the CDCS Introduction to Bayesian Statistics course. Go to the readme file

bayesian-statistics data-analysis r statistics

Last synced: 10 Nov 2024

https://github.com/invictusaman/socioeconomic-indicators-in-chicago-sql-python

This project shows how to create a database connection in a notebook, update the database using Python, and run Python code and SQL queries together. It uses SQLite and a Chicago dataset for analysis.

data-analysis jupyter-notebook python sql sql-queries sqlite

Last synced: 12 Oct 2024

https://github.com/draym/covid19tracker

Coronavirus COVID-19 dashboard to track global cases

covid-19 covid19-tracker dashboard data-analysis

Last synced: 08 Dec 2024

https://github.com/dcs-training/r-qgisintegratingspatialanalysis

This was an intermediate course of three sessions with a focus on developing skills in data visualisation, analysis and integration using both R studio and QGIS. Go to the readme file

data-analysis data-visualisation data-wrangling gis qgis r spatial-analysis

Last synced: 10 Nov 2024

https://github.com/abeltavares/nps_performance_analysis

Analyzing and Monitoring Net Promoter Score (NPS) Performance for Healthcare Companies using SQL and Power BI

customer-satisfaction dashboard data-analysis data-visualization healthcare net-promoter-score nps-analysis performance-monitoring power-bi sql

Last synced: 05 Jan 2025

https://github.com/trybnetic/tu7-acceleration-sleep-wake-classification

Supporting material for the paper ''Discrimination of sleep and wake periods from a hip-worn raw acceleration sensor using recurrent neural networks''

accelerometer accelerometry actigraphy data-analysis sensors sleep

Last synced: 15 Jan 2025

https://github.com/iguptashubham/online-retail-sales

This Power BI dashboard, designed for marketing strategists, analyzes sales trends and customer behavior. It provides key insights empowering them to identify sales opportunities and optimize marketing campaigns, ultimately boosting business sales.

dashboard data data-analysis data-analysis-project data-analysis-project-powerbi data-analysis-python data-project data-science powerbi project

Last synced: 14 Jan 2025

https://github.com/revogati/ecommerce_consumer_behaviour

A full data analytics project, from data cleaning, preparation, and exploration through interpretation of insights to presentation of findings and recommendations.

data-analysis data-exploration ecommerce jupyter-notebook python sql tableau-public visualization

Last synced: 07 Jan 2025

https://github.com/programmer-rd-ai/moviedatascraper

Explore the cinematic universe with our IMDb web scraping project! Dive into movie data with ease, uncovering insights from cast to critical reviews. With dynamic visualizations and reliable data, let's journey through the world of movies like never before. Lights, camera, analysis!

beautifulsoup beautifulsoup4 data data-analysis jupyter-notebook matplotlib numpy pandas programming python python3 scraping seaborn software web

Last synced: 12 Jan 2025

https://github.com/yahia3200/become-an-independent-data-scientist

My final project for the Applied Plotting, Charting & Data Representation in Python Course

data-analysis data-science data-visualization matplotlib

Last synced: 22 Jan 2025

https://github.com/ac-gomes/data-engineering-with-databricks

A simple boilerplate for data engineering and data analysis training in Databricks.

data-analysis data-engineering databricks databricks-notebooks pyspark python unit-testing

Last synced: 09 Nov 2024

https://github.com/manmolecular/http-response-clustering

📉 Clustering of HTTP responses using k-means++ and the elbow method

data-analysis elbow-method elbow-plot jupyter k-means-plus-plus python3

Last synced: 16 Jan 2025

https://github.com/simoneas02/data-science

🐍 A planning study to become a data scientist and to improve my current skills. 🤘🏼🌻

data data-analysis data-science data-visualization deep-learning machine-learning pandas python3 r sql

Last synced: 06 Dec 2024

https://github.com/haritha1005/data-analysis-portfolio

This repository showcases my data analytics and data science skills through projects, fostering collaboration and community engagement

data-analysis data-visualization etl excel matplotlib numpy-library pandas powerbi-report python3 r scipy sql tableau

Last synced: 06 Dec 2024

https://github.com/vishnu-t-r/data-analytics-portfolio-projects

This repository contains data analyst portfolio projects developed using various data analytics tools such as SQL, Python, Tableau, and Looker.

data data-analysis data-cleaning data-modeling data-visualization looker looker-studio python sql ssms tableau

Last synced: 10 Nov 2024

https://github.com/emptymalei/mini-lab

Some code snippets used to explain stuff to myself in my personal data science wiki

data-analysis data-mining data-science data-visualization datascience

Last synced: 20 Dec 2024

https://github.com/techytushar/india-odi-analysis

Analysis of ODI cricket matches of Indian Team

cricket data-analysis data-science pandas plotting python3

Last synced: 07 Jan 2025

https://github.com/hongbo-wei/global-status-of-cc-security-certification

Data visualization of CC Security Certification using VUE, Django, and MySQL.

big-date common-criteria data-analysis data-visualisation data-visualization

Last synced: 14 Jan 2025

https://github.com/shipyardapp/postgresql-blueprints

Simplified blueprints for building data pipelines with PostgreSQL.

cli data-analysis data-engineering data-pipeline data-science database elt etl postgres postgresql

Last synced: 04 Dec 2024

https://github.com/thisisashukla/survival-analysis

Hands-On Survival Analysis in Python

data-analysis data-science survival-analysis

Last synced: 28 Dec 2024

https://github.com/haloapping/malas-ngetik-clf

I'm too lazy to type, so I just made a project template for Kaggle competitions 😜. This template is specifically for classification problems.

data-analysis exploratory-data-analysis feature-engineering kaggle kaggle-competition machine-learning python3 scikit-learn

Last synced: 06 Jan 2025

https://github.com/elysian01/ml-eda-and-modelling-using-streamlit

Beautiful Web interface made using Streamlit for quick Exploratory Data Analysis and building classification models which are implemented from scratch.

data-analysis data-visualization eda exploratory-data-analysis knn-classification logistic-regression matplotlib ml-model-on-web ml-models naive-bayes-classifier pandas seaborn streamlit streamlit-webapp

Last synced: 07 Nov 2024

https://github.com/iantomasinicola/portfoliodataanalyst

Data analysis project with Python, Microsoft SQL Server, and Excel

data-analysis excel python sql

Last synced: 02 Jan 2025

https://github.com/w-edward/youtube-keyword-popularity-analyzer

An effort to discover the top trending keywords on YouTube.

data-analysis node-js numpy python webscraping youtube-api

Last synced: 16 Jan 2025

https://github.com/idaraabasiudoh/vehicle-co2emission_model

Predicts CO2 emissions from vehicle fuel consumption using a multiple linear regression model trained on sklearn, based on a dataset of engine sizes and corresponding CO2 emissions in Canada.

data-analysis jupyter-notebook machine-learning python3 scikit-learn

Last synced: 14 Jan 2025

https://github.com/bradleyboehmke/uc-bana-6043

Additional resources for the UC BANA 6043 Statistical Computing course

data-analysis data-science data-visualization python

Last synced: 14 Oct 2024

https://github.com/ivanildobarauna-dev/data-consumer-api

The "ETL Process for Currency Quotes Data" project is a complete solution dedicated to extracting, transforming, and loading (ETL) currency quote data. It uses several advanced techniques and architectures to ensure the efficiency and robustness of the ETL process.

business-intelligence data-analysis data-analytics data-engineering data-pipeline data-visualization etl-pipeline python

Last synced: 19 Dec 2024

https://github.com/mindful-ai-assistants/movierevenueanalysis

🎬💰 Analyze movie companies' revenue, release strategies, and financial performance using statistical techniques for actionable insights. This project explores data on total revenue, number of releases, and lifetime gross to uncover patterns that can drive strategic decisions in the film industry.

correlation-analysis data-analysis data-science heatmap jupyter-notebook open-source python statistical-analysis statistical-analysis-and-hypothesis-testing statistics ttest

Last synced: 22 Jan 2025

https://github.com/alexandregazagnes/unilasalle-public-resources

UniLaSalle-Public-Ressources: this public repository contains the notebooks and data used for two courses: 2nd Year - Practical Statistical Tests, and 4th Year - Data Analysis with Python.

data data-analysis data-analytics data-cleaning data-storytelling education educational exploratory-data-analysis python python3 r r-programming rstudio statistics visualization

Last synced: 11 Oct 2024

https://github.com/aravind-selvam/covid_dashboard

A dashboard built from COVID death and vaccine data.

covid-19 data-analysis data-science data-visualization tableau tableau-public visualization

Last synced: 14 Jan 2025

https://github.com/omarsar/energy_stats

Analyzing energy production with Kibana Lens

data-analysis data-science data-visualization elasticsearch kibana

Last synced: 23 Jan 2025

https://github.com/c0deta1ker/matbase

MatBase provides access to an extensive database of material parameters, inelastic mean free paths (IMFP), photoionization binding energies, cross sections, and asymmetry parameters. Additionally, MatBase includes a suite of functions for users to load, process, model and fit their own data, making it an indispensable tool in the field.

cross-sections crystal-structure crystallography data-analysis data-fitting database electron imfp imfp-calculator-matlab material material-database matlab matlab-application matlab-gui matlab-toolbox pes-modelling photoelectron-spectroscopy photoionization simulation xps

Last synced: 30 Nov 2024

https://github.com/BigBangData/TimesheetAnalysis

R shiny app to help analyze a bookkeeper's business - or anyone with a timesheet and some time.

bookkeeping data-analysis data-viz r-programming shiny-apps shiny-r timesheet-management

Last synced: 04 Dec 2024

https://github.com/llnl/hdtopology

High-dimensional topological data analysis library for NDDAV

analysis cpp data-analysis data-viz high-dimensional-data topological-data-analysis visualization

Last synced: 11 Nov 2024

https://github.com/saranshbansal/spam-detection-analytics-tool

This tool reads chunks of SMS data from a CSV file and shows how different pre-implemented algorithms perform at identifying spam messages.

analytics data-analysis data-science data-visualization mysql spring-boot

Last synced: 05 Jan 2025

https://github.com/pitmonticone/covid-italy

References for COVID-19 situation in Italy.

coronavirus covid-19 covid-19-italy data data-analysis documentation testing

Last synced: 22 Jan 2025

https://github.com/kalebers/data_streams_parametric_t-sne

Research on parametric t-SNE for high- to low-dimensional data streams, published in 2021 by Kalebe Rodrigues Szlachta and Andre de Macedo Wlodkowski and supervised by Jean Paul Barddal, as part of the Computer Science degree at the Pontifical Catholic University of Parana (PUCPR)

classifier data-analysis data-science data-visualization machinelearning parametric parametric-tsne python tsne-algorithm tsne-visualization

Last synced: 21 Jan 2025

https://github.com/shlokashah/student-depression-and-suicide-rate-prediction

https://shlokashah.github.io/Student-Depression-And-Suicide-Rate-Prediction/

data-analysis data-visualization machine-learning student suicide-rate-prediction

Last synced: 28 Dec 2024

https://github.com/shlokashah/ipl-data-analysis

Data Analysis and Visualizations done on IPL dataset

data-analysis data-visualization pandas powerbi

Last synced: 28 Dec 2024

https://github.com/ikanurfitriani/project-data-analysis-python

This repository contains the results of learning data analysis using Python.

data-analysis data-analysis-project data-analysis-python python

Last synced: 26 Jan 2025

https://github.com/nikoshet/exploratory-data-analysis-using-r

Exploratory Data Analysis using R Course Project for M.Sc. 'Data Science and Machine Learning' in NTUA

data data-analysis data-science eda exploratory-data-analysis ggplot2 r

Last synced: 03 Jan 2025

https://github.com/kaustubhgupta/data-analysis-hub

This is where all my Data Analysis notebooks are present. All the notebooks are either fully explored and have an explanatory readme or a medium article has been published which is linked in the README.

data-analysis data-science google-play-store kaggle matplotlib pandas seaborn

Last synced: 27 Jan 2025

https://github.com/v-octal/hows_india_feeling

An application that displays India's region-wise Twitter sentiment on a map.

data-analysis data-visualization flask-restful leaflet nlp-machine-learning sentiment-analysis

Last synced: 30 Dec 2024

https://github.com/rubydamodar/the-ultimate-pandas-bootcamp

Welcome to the Pandas for Data Science repository! This course is designed to take you from beginner to proficient in using Pandas, the powerful data manipulation library in Python. Whether you're just starting your data science journey or looking to sharpen your skills, this repository contains all the resources

beginner-friendly csv-data data-analysis data-cleaning data-manipulation data-science data-visualization dataframe exploratory-data-analysis jupyter-notebook machine-learning matplotlib numpy pandas python python-pandas series statistical-analysis time-series titanic-dataset

Last synced: 18 Oct 2024

https://github.com/cosmoduende/r-holy-books-sentiment-data-analysis

What's the most positive or negative religion? Sentiment and data analysis of holy books with R: an analysis of religious dogmas by exploring their holy books (The Bible, The Quran, The Dhammapada, and The Book of Mormon).

bible book-of-mormon data-analysis data-analytics data-visualisation data-visualization dataviz dhammapada holy-scriptures quran religions-studies religious religious-studies sentiment-analysis sentiment-polarity sentimental-analysis text-analysis text-analytics text-mining text-mining-analysis

Last synced: 07 Nov 2024

https://github.com/cbg-ethz/scdna-pipe

Python data analysis pipeline for single cell copy number event history reconstruction

bioinformatics bioinformatics-pipeline data-analysis genomics python snakemake snakemake-workflows workflow

Last synced: 28 Jan 2025

https://github.com/ivanildobarauna-dev/data-pipeline-sync-ingest

The "ETL Process for Currency Quotes Data" project is a complete solution dedicated to extracting, transforming, and loading (ETL) currency quote data. It uses several advanced techniques and architectures to ensure the efficiency and robustness of the ETL process.

business-intelligence data-analysis data-analytics data-engineering data-pipeline data-visualization etl-pipeline python

Last synced: 26 Jan 2025

https://github.com/cosmoduende/r-marvel-vs-dc

DC Comics vs Marvel Comics - Exploratory Data Analysis and Data Visualization with R. Who has the smartest, strongest, fastest, or most powerful hero or villain? How to answer this and more questions with R

comics data-analysis data-analysis-r data-analytics data-visualization dataviz dc-characters dc-comics eda exploratory-analysis exploratory-data-analysis exploratory-data-visualizations marvel-characters marvel-comics marvel-vs-dc shdb superherodb superheroes superheros

Last synced: 07 Nov 2024

https://github.com/thealphadollar/messiah

Messiah: The Mighty Son Of God Is Here To Help You Through Times Of Calamity

azure backend data data-analysis flask frontend materialize natural-disasters

Last synced: 21 Dec 2024

https://github.com/aad99bxp/whatsapp-chat-analyzer

A project intended for business owners and managers to analyze WhatsApp chats between their customer care executives and their customers.

data-analysis heroku-deployment python3

Last synced: 22 Jan 2025

https://github.com/josechirif/reviews-and-satisfaction-analysis-of-airbnb-brazil-and-mexico-from-june-2010-to-february-2021

This project analyzes the reviews and satisfaction of customers who used Airbnb services. It also studies whether there is a relationship with other variables.

data data-analysis data-visualization powerbi sql-server

Last synced: 23 Oct 2024

https://github.com/tushar2704/store-demand-forecasting

This project predicts the sales demand for various items in different stores based on historical sales data. The objective is to develop a machine learning model that can provide accurate forecasts for future sales of each store-item combination.

artifi data-analysis data-science python sales-analysis sales-forecasting tushar2704

Last synced: 27 Dec 2024

https://github.com/rawsashimi1604/jobextract

Scrapes LinkedIn data. Conducts sentiment analysis on what traits and qualifications employers are looking for.

data data-analysis data-analytics data-cleaning linkedin mvc python webscraper

Last synced: 27 Dec 2024

https://github.com/1ayanabil1/healthcare-machine-learning

Explore our open-source repository focused on healthcare machine learning. We've developed predictive models for cardiovascular disease, diabetes, breast cancer, and more. Our projects employ diverse machine learning algorithms and data science techniques, enhancing early detection, diagnosis, and patient outcomes.

data-analysis data-science deep-learning disease disease-detection disease-modeling disease-prediction eda healthcare-application heathcare jupyter-notebook machine-learning machine-learning-algorithms machinelearning-python python

Last synced: 07 Jan 2025

https://github.com/yeisonmontoya1815/machine-learning_prediction_can_inflation

We aim to predict trends in the Canadian market basket using sentiment analysis techniques. Sentiment analysis involves analyzing text data to determine whether the sentiment expressed is positive, negative, or neutral.

algorithms-and-data-structures data data-analysis data-science data-visualization feature-engineering machine-learning matplotlib-pyplot numerical-analysis numpy pandas pipelines python sklearn structured-data super unsupervised-learning

Last synced: 06 Dec 2024

https://github.com/itzmeanjan/indian-railway

Exploring the Indian Railways timetable dataset, with ❤️

data-analysis data-visualization indian-railways matplotlib python python3 railway

Last synced: 05 Oct 2024