Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/md-emon-hasan/data_analytics_project

Data analytics tasks and solutions, featuring hands-on exercises for data cleaning, visualization, and analysis using Python libraries.

cars-dataset census-data covid19-data data-analysis london-house-price police-data weather-data

Last synced: 13 Jan 2025

https://github.com/md-emon-hasan/data-science

Data science tutorials, including data preprocessing, analysis, visualization, project deployment, machine learning and deep learning algorithms.

artificial-intelligence data-analysis data-engineering data-science deep-learning machine-learning-algorithms python

Last synced: 13 Jan 2025

https://github.com/maheshthedev/twitter-analysis

Analysis on Various Topics with Twitter Data

data-analysis twitter-analysis

Last synced: 20 Dec 2024

https://github.com/chaganti-reddy/case-study-market-segmentation

Market Segmentation Case Study Analysis using Clustering

case-study data-analysis machine-learning market-segmentation plotting

Last synced: 20 Jan 2025

https://github.com/adolbyb/data-science-python

An Introduction to Data Science and Data Visualization with the FAU Data Science and Machine Learning Club

data-analysis data-science data-visualization jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 20 Jan 2025

https://github.com/drcbeatz/aynm-data

Python scripts for data cleaning and processing for AYNM (Pandas/NumPy/Selenium/Azure Cognitive Services)

automation azure-cognitive-services csv data-analysis data-cleaning ipynb numpy ocr pandas python reverb selenium shopify webscraping xml

Last synced: 26 Jan 2025

https://github.com/ahammadshawki8/playing-with-pandas

🐼 Pandas is one of my favourite library in python. It is well-known for "Analyzing" data. Learn basics and beyond the basics of Pandas from this repository. 🤍🖤

beginner-friendly data-analysis favourite-library pandas python

Last synced: 28 Dec 2024

https://github.com/jubinjacob03/heartdiseaseclassify-ml

Heart Disease Dataset Analysis & Classification using ML models such as linear, support vector machine, k-means, k-nearest neighbors and logistic regression.

data-analysis data-science data-visualization ipython-notebook kaggle-dataset kmeans knn linear-regression logistic-regression machine-learning matplotlib python seaborn support-vector-machine

Last synced: 11 Oct 2024

https://github.com/unrndm/dataanalysis

artifacts and sollutions of homework for course "Data Analysis" in Magistrate of HSE during 2023-2024

2023-2024 data-analysis hse

Last synced: 05 Dec 2024

https://github.com/jhrcook/wagenmaker-data-analysis

Analysis of Registered Replication Report: Strack, Martin, & Stepper (1988) by Wagenmaker et al.

data-analysis r r-project statistics

Last synced: 13 Jan 2025

https://github.com/yash22222/tata-data-visualisation-virtual-internship

Data Visualisation: Empowering Business with Effective Insights Gain insights into leveraging data visualisations as a tool for making informed business decisions.

basics ceo charts cmo data-analysis data-interpretation data-science data-visualization graphs machine-learning mcq microsoft-excel microsoft-power-bi microsoft-word powerpoint-presentations python tableau tata tata-data-visualisation

Last synced: 05 Jan 2025

https://github.com/yash22222/data-analysis-with-python

This repository provides a practical introduction to data acquisition and analysis using Pandas. It covers loading datasets, exploring data, manipulating data, and gaining insights through statistical summaries. Ideal for beginners, it offers code examples and explanations to enhance your data manipulation skills using Pandas for Python.

binning data data-acquisition data-analysis data-binning data-cleaning data-formatting data-integration data-normalization data-preprocessing data-science data-transformation data-wrangling dataframe description numpy pandas pandas-dataframe python python3

Last synced: 05 Jan 2025

https://github.com/viseshrp/community_health_indicator

Android app to fetch,organize and represent NYC health data

android data-analysis data-visualization health

Last synced: 13 Jan 2025

https://github.com/wiseaidev/truth-guard

Analyzing a 79k Dataset of Misinformation and Fake News

data-analysis fastapi lstm machine-learning python supervised-learning

Last synced: 20 Dec 2024

https://github.com/carmoreno/analisisaccidentalidadbogota

Data Analysis about traffic accidents at Bogotá, Colombia.

data-analysis data-science jupyer-notebook matplotlib numpy pandas scikit-learn

Last synced: 05 Jan 2025

https://github.com/akshat0427/python_youtube_history

a bunch of data science operations performed on youtube history data

data-analysis data-science extracting-features

Last synced: 11 Jan 2025

https://github.com/mg380/ibm-applied-data-science-capstone

This Capstone is the 10th (final) course in IBM Data Science Professional Certificate specialization, and it actually summarises in the form of project all materials that have been learned during this specialization

capstone data data-analysis data-science datascience ibm machine-learning plotly python scikit-learn sql

Last synced: 10 Oct 2024

https://github.com/2003harsh/house-price-prediction-using-machine-learning

This project features a web app that predicts house prices using a linear regression model. Users can input details like location, square footage, bathrooms, and bedrooms through an HTML form. I've added a CI/CD pipeline with GitHub Actions, unit testing with pytest, and automated Docker containerization to improve deployment and robustness.

ci-cd data-analysis docker-image flask linear-regression machine-learning matplotlib mlops-workflow requests scikit-learn

Last synced: 10 Oct 2024

https://github.com/flyingfathead/neurograph-framework

A versatile tool for visualizing entropy loss in TensorFlow-based neural network training, providing insightful scatter plots with annotations.

data-analysis data-analysis-python data-visualization entropy graph graphs neural-network neural-networks neural-networks-visualization nn python python3 tensorflow tensorflow2 training visualization visualization-tools

Last synced: 11 Jan 2025

https://github.com/airscholar/data_analysis_with_ai

A repository showing how to use AI and ChatGPT for Data Analysis with Pandas and Python

chatgpt data-analysis gpt4 openai pandas pandasai python

Last synced: 13 Jan 2025

https://github.com/ayaanjawaid/google_playstore_data_analysis

This project provides an in-depth analysis of Google Play Store apps and user reviews, focusing on understanding app performance, user sentiment, and key trends in app categories. Using Python, I performed data cleaning, feature engineering, and exploratory data analysis (EDA) on app data and reviews.

data-analysis eda html numpy pandas-dataframe plotly python vizualisation

Last synced: 31 Oct 2024

https://github.com/willie-conway/datavista

A robust 🐍Python application for data analysis that provides a wide range of tools for 🔃loading, 🧹cleaning, and 🔃preprocessing data. It includes features for 📈statistical analysis, 👨🏿‍🔬hypothesis testing, 🦾machine learning, clustering, ⏳time series forecasting, and 📊data visualization, all designed to enhance your analytical workflow.

analytics big-data command-line data-analysis data-cleaning data-driven data-mining data-pipeline data-preprocessing data-science data-scientist data-visualization data-wrangling exploratory-data-analysis machine-learning pandas predictive-analytics python statistics visualization-tools

Last synced: 10 Jan 2025

https://github.com/jrschmidtt/csv-to-html

Convert csv file to html table in javascript.

body-parser csv data-analysis javascript nodejs oop

Last synced: 03 Jan 2025

https://github.com/sumitgirwal/procoder-public

"ProCoder", which is a web-based application providing massive open online courses for both professionals and students. It aims to offer a platform for learning coding skills online, accessible to anyone who is interested in learning programming or enhancing their coding knowledge. ProCoder provides courses on various programming languages, tools.

blog-platform bootstrap-4 chat-application css3 data-analysis django-crud django-project html5 javascript numpy-library pandas-library python3

Last synced: 06 Jan 2025

https://github.com/gmbeddard/ee152-realtime_embedded_systems-finalproject

An STM32-based implementation of the Pan-Tompkins algorithm for real-time QRS detection. Includes robust debugging tools, heart rate monitoring, and live ECG signal support via a python graphing script.

cpp-programming data-analysis ecg embedded-c freertos stm32

Last synced: 20 Jan 2025

https://github.com/pinedah/escom_development-of-applications-for-data-analysis

This repository is a personal collection of programs, exercises, and notes from the Development of Applications for Data Analysis course at Instituto Politécnico Nacional (IPN). As part of the Bachelor's in Data Science, the course focuses on developing practical skills in Python for data analysis.

data-analysis data-science data-visualization jupyter-notebook python python-data-analysis

Last synced: 19 Dec 2024

https://github.com/sunnybibyan/exploratory-data-analysis-eda

Welcome to the Titanic Dataset - Exploratory Data Analysis (EDA) project repository! This project aims to uncover insights from the Titanic dataset using Python and Jupyter Notebook. By analyzing key variables such as age, gender, and class, we aim to visualize relationships between passenger characteristics and survival rates.

data-analysis data-visualization jupyter-notebook python titanic-dataset

Last synced: 19 Dec 2024

https://github.com/mr-chang95/loan_data_visualization

Data Visualization Project for Udacity's Data Analyst Program. Using Python in Jupyter Notebook.

data-analysis data-visualization jupyter-notebook loans python udacity-data-analyst-nanodegree

Last synced: 26 Jan 2025

https://github.com/mindgamesnl/yanderestats

https://mindgamesnl.github.io/YandereStats/

data-analysis reporting-pipeline yandere yandere-sim

Last synced: 01 Jan 2025

https://github.com/haloapping/pisangijo

Kumpulan library dan framework untuk analisa data, data science, machine learning, deep learning dan masih banyak lagi berbasis bahasa pemrograman Python 🐍.

belajar data-analysis data-science deep-learning forecasting libraries machine-learning perkakas pustaka python3 recommender-system referensi tools

Last synced: 06 Jan 2025

https://github.com/famarks/grafarg

Grafarg is an interactive data analytics and graphical data visualization application. Grafarg being a progressive fork of Grafana 7.5.17 continues to be available under open source Apache 2.0 License

analytics charts data data-analysis data-science data-visualization grafana grafarg graph

Last synced: 18 Dec 2024

https://github.com/adagio/ivoox_episodes

iVoox Episodes: Scraping & Analysis

beautifulsoup4 data-analysis ivoox pandas python python3 scraping

Last synced: 27 Dec 2024

https://github.com/siddharthbadal/sql-case-studies-data-analysis

Data Analysis case studies on various databases using SQL

data-analysis sql sql-query sql-server sqlserver

Last synced: 26 Jan 2025

https://github.com/gursv/stocksage

Predict next day's close price for a stock like NSEI, NYA, HSI, IXIC, TWII, etc...!

data-analysis data-preprocessing data-science gridsearchcv machine-learning python3 random-forest-regressor stock-data stock-price-prediction streamlit

Last synced: 25 Jan 2025

https://github.com/ikanurfitriani/project-from-dqlab

This repository contains the results of my learning while at DQLab.

data-analysis data-science data-visualization database dqlab python python3 r sql

Last synced: 26 Jan 2025

https://github.com/shadan100/stroke-prediction-analysis

A web based application to predict whether a patient is likely to get stroke based on the input parameters like gender, age, various diseases, and smoking status. Each row in the data provides relevant information about the patient.

artificial-intelligence data-analysis data-science django django-framework jupyter-notebook machine-learning matplotlib pandas predictive-modeling python stroke-prediction web-application

Last synced: 11 Oct 2024

https://github.com/markmusic27/data-statistics-calculator

💣 This method (made in JavaScript / Python) can find the mean, median, mode, range, and standard deviation.

data-analysis standard-deviation statistics statistics-calculator

Last synced: 05 Jan 2025

https://github.com/noeyislearning/sharpe-ratio-amazon-facebook

Explore the Sharpe Ratio and its application to evaluate the performance of two tech giants: Amazon and Facebook.

amazon data-analysis data-science data-visualization facebook python3 sharpe-ratio

Last synced: 06 Dec 2024

https://github.com/noeyislearning/netflix-movie-analysis

Explore movie duration trends on Netflix and assess the impact of non-feature film genres in this data-driven analysis.

data-analysis data-science data-visualization datacamp-projects jupyter-notebook netflix-analysis python3

Last synced: 06 Dec 2024

https://github.com/noeyislearning/customer-shopping-trends

An invaluable resource for businesses aiming to optimize strategies and enhance customer satisfaction. Analyze customer attributes, purchase history, and preferences to make data-driven decisions.

business-analytics data-analysis data-science data-visualization jupyter-notebook matplotlib pandas python3 seaborn

Last synced: 06 Dec 2024

https://github.com/noeyislearning/intro-to-data-analysis

The repository teaches skills for cleaning, exploring, analyzing, and visualizing data in Python to gain insights and make data-driven decisions.

data-analysis jupyter-notebook lecture-notes python

Last synced: 06 Dec 2024

https://github.com/noeyislearning/cancer-linear-regression-model

The correlation between socioeconomic status and lung cancer incidence and mortality rates among low-income populations in the United States.

cancer-research data-analysis data-science data-visualization jupyter-notebook linear-regression-models matplotlib numpy python seaborn statsmodels

Last synced: 06 Dec 2024

https://github.com/noeyislearning/e-commerce-sales-analysis

E-Commerce Sales Analysis, repository contains code and analysis for an e-commerce transaction dataset from Kaggle. The goal is to uncover insights from the data that could help drive business strategy and decisions.

data-analysis data-science jupyter-notebook nextjs python typescript

Last synced: 06 Dec 2024

https://github.com/whis99/userfunnelanalysis

An ecommerce user funnel conversion data analysis with matplotlib & python.

data-analysis data-analysis-python data-analyst data-visualization google-colab jupyter-notebook matplotlib python

Last synced: 13 Jan 2025

https://github.com/githubuseraccountamazing/the-amari-project

a project in which I attempted to push some of the limits of stable-diffusion while taking some data along the way

ai ai-generated-images bash data-analysis machine-learning stable-diffusion textual-inversion

Last synced: 20 Jan 2025

https://github.com/karatechop/noaa-storm-database-data-analysis

Analysis of population health and economic consequences of events documented in the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database.

data-analysis knitr r rmarkdown

Last synced: 21 Jan 2025

https://github.com/valeriopagliarino/electronics-2021-unito-public

Data analysis and simulations for the course "Electronics laboratory" held at Physics Dep. - University of Turin, 2021

data-analysis electronics physics

Last synced: 06 Dec 2024

https://github.com/prernarohra/heart-disease-prediction

This project develops a machine learning model to predict heart disease risk based on symptoms and medical history. The model achieved the best accuracy with Logistic Regression, as it works well for binary classification problems.

artificial-intelligence data-analysis data-science dataset heartdisease-prediction machine-learning models

Last synced: 27 Dec 2024

https://github.com/csoren66/diabetics_prediction

Predicting that whether the patient has diabetes or not on the basis of the features we will provide to our machine learning model.

data-analysis machine-learning python svm

Last synced: 13 Jan 2025

https://github.com/sarincr/data-analytics-with-knime

Data Analytics with KNIME (Konstanz Information Miner), a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface and use of JDBC allows assembly of nodes blending different data sources, including preprocessing (ETL: Extraction, Transformation, Loading), for modeling, data analysis and visualization without, or with only minimal, programming.

ai artificial-intelligence artificial-intelligence-algorithms artificial-neural-networks data-analysis data-mining data-science data-structures data-visualization database datascience deep-learning machine-intelligence machine-learning machine-learning-algorithms machinelearning mining mining-software

Last synced: 21 Jan 2025

https://github.com/narius2030/sakila-datawarehouse-analysis

Implement a Hive data warehouse to store meaningful data, apply Machine Learning like Clustering or Regression for dealing with business problems

apache-hadoop apache-hive data-analysis etl-pipeline hiveql machine-learning statistics

Last synced: 14 Dec 2024

https://github.com/seekinginfiniteloop/fedcal

A feature-rich Python calendar that enables time series analyses of changes in federal workforce schedules and shifts in executive department funding status.

data-analysis data-science econometrics economic-data economics federal federal-government hr pandas pandas-library pandas-python pydata python

Last synced: 14 Oct 2024

https://github.com/vhtua/group4_data_analysis

Hierarchical Cluster Analysis: Movie Genres Preferences

data-analysis hierarchical-clustering r unsupervised-learning

Last synced: 10 Dec 2024

https://github.com/harshmule1/store-sales-analysis

Sales Analysis Using Power Bi

data-analysis powerbi

Last synced: 10 Dec 2024

https://github.com/rajshrestha86/police-brutality-data-analysis

In this project, we analyze the events after George Floyd’s death. The protests and riots across the United States and sentiments of news articles of three different news sources that have different political leaning. We will see how these media reacted after Floyd’s death and see the effect of media bias on the sentiments of news for #BlackLivesMatter and #AllLivesMatter movement. We will also see if there is a correlation between the police budget and the number of protests. This analysis will help us to see if there is really a need for defunding police to reduce police brutality and casualties. We will also see the correlation of partisan segregation and number of deaths to see if political preference has an effect on the number of deaths by police.

data-analysis matplotlib pandas python sentiment-analysis web-scraping

Last synced: 14 Dec 2024

https://github.com/cosmoduende/r-uber-trips-analyisis

Explore your activity on Uber with R: How to analyze and visualize your personal data history. Find out how you consume the Uber App using a copy of your data.

analisis-de-data data-analysis data-analytics data-science data-visualisation data-visualization data-viz eda flexdashboard ggmap ggplot2 mobility-as-a-service qmplot r-language r-programming ridesharing uber uber-data visualizacion-de-datos

Last synced: 27 Dec 2024

https://github.com/kumaranand05/suicide-rate-analysis

Analysis of Mortality data of WHO and visualization using Power BI

analytics data-analysis data-visualization mortality-rates powerbi python suicide-dataset suicide-rate

Last synced: 25 Dec 2024

https://github.com/cosmoduende/r-twitter

Explore your Twitter activity with R: Sentiment Analysis and Data Visualization. How to analyze your Twitter account (or any account), discover your habits and sentiments with the "rtweet" package and NLP.

data-analysis data-visualization lemmatization nlp nlp-library nlp-resources nltk nltk-library r-package r-programming r-studio rtweet stemming twitter twitter-api twitter-data twitter-data-analysis twitter-data-extraction twitter-sentiment-analysis udpipe

Last synced: 27 Dec 2024

https://github.com/fbraza/python-dataframe-skim

Get an extended statistic summary of your pandas DataFrame

data-analysis data-science dataframe pandas python3

Last synced: 26 Jan 2025

https://github.com/fbraza/paris_airbnb

Analysis of Paris AirBnB data using R and Shiny

analysis data data-analysis paris-airbnb r shiny

Last synced: 26 Jan 2025

https://github.com/alxrm/scent-of-literature

Russian literature sentiment analysis in terms of very small dataset

classification data-analysis sentiment-analysis sklearn tf-idf

Last synced: 06 Dec 2024

https://github.com/steciuk/ium-recommendation-system

Evaluation and comparison of 3 different recommendations models for web shopping service simulation.

data-analysis model-evaluation recomendation-system

Last synced: 06 Jan 2025

https://github.com/mr-vozhyk/karpov.courses-study

Часть заданий и проектов от karpov.courses

airflow data-analysis git python sql statistics

Last synced: 20 Dec 2024

https://github.com/priboy313/pandasflow

A set of custom python modules for friendly workflow on pandas

catboost data-analysis data-science pandas phik python scikit-learn shap

Last synced: 21 Dec 2024

https://github.com/rohithsaji97/open_gate_dip

An automatic gate opening system with an additional parking system (using Raspberry PI).

automated data-analysis digital-image-processing opencv python3 raspberry-pi-3 trained-models

Last synced: 29 Dec 2024

https://github.com/stastnypremysl/lsql-csv

lsql-csv is a tool for small CSV file data querying from a shell with short queries. It makes it possible to work with small CSV files like with a read-only relational databases. The tool implements a new language LSQL similar to SQL, specifically designed for working with CSV files in shell.

csv data-analysis data-processing haskell language linux-shell lsql lsql-csv new-language query-language relational-database sql unix-command unix-philosophy unix-shell

Last synced: 08 Dec 2024

https://github.com/ibensusan/wine-properties-assessment

Wine Properties Assessment using Microsoft Excel

data-analysis data-visualization excel

Last synced: 14 Dec 2024

https://github.com/yard1/linearordering

An R package. Provides various methods of linear ordering of data. Supports weights and positive/negative impacts.

data-analysis data-analysis-in-r data-analysis-r data-science r

Last synced: 19 Jan 2025

https://github.com/abhi18av/innovation-competition

Submission for a programming challenge

clojure clojurescript data-analysis

Last synced: 05 Jan 2025

https://github.com/mrjxtr/coffee_sales_analysis

Full data analytics process from data gathering, data processing, data visualization and reporting on a small coffee shop sales data.

dashboard data-analysis data-cleaning data-visualization kpi-report pandas python3 spreadsheet tableau-public

Last synced: 24 Dec 2024

https://github.com/kishlayjeet/zomato-data-exploration

In this project, we will be exploring a dataset containing information on various restaurants and their ratings, location, and other attributes.

data-analysis eda matplotlib numpy pandas zomato-data-exploration

Last synced: 24 Dec 2024

https://github.com/seankwarren/water-quality-analysis

An examination of water quality in the Atlanta watershed with a focus on identifying neglected areas and potential strategies for improving water quality monitoring

analytics data-analysis jupyter-notebook python

Last synced: 14 Dec 2024

https://github.com/sathyasris27/data-analysis-on-adult-smoking-patterns-in-the-uk

The aim of this analysis is to understand the smoking patterns among adults in the UK.

data data-analysis data-visualization python3

Last synced: 10 Jan 2025

https://github.com/gattiharishkumar/employee-attendance-leaves-analytics-dashboard

This project showcases a Power BI dashboard created to analyze employee attendance and leaves over a three-month period. The data was sourced from Excel datasets available on the Codebasics website.

dashboards data-analysis data-cleaning data-transformation data-visualization power-query-editor powerbi

Last synced: 29 Dec 2024

https://github.com/gattiharishkumar/blinkit-sales-analysis-dashboard

This project presents a comprehensive sales analysis dashboard for Blinkit, an Indian last-minute delivery app. The dashboard was created using Power BI and provides a detailed overview of the company's sales performance across various outlets and product categories.

dashboard data-analysis data-transformation data-visualization ms-excel-data-analytics power-query powerbi powerbi-visuals

Last synced: 29 Dec 2024

https://github.com/giordano-lucas/tesco-extension

Products clustering and interactive visualization

clustering data-analysis data-visualization tesco

Last synced: 02 Jan 2025