Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/gappeah/global-shipping-analytics-dashboard

This Tableau project provides a comprehensive visual analysis of global sales, shipping costs, and quality metrics across different regions and countries.

data data-analysis data-analyst data-visualization metrics tableau

Last synced: 07 Jan 2025

https://github.com/dsrodrigovieira/houserocketsales

This repository contains a project developed to practice data analysis skills using Python.

data-analysis data-visualization heroku kaggle-dataset python

Last synced: 30 Dec 2024

https://github.com/shibam120302/heart-disease-data-analysis-by-shibam

You can read more on the heart disease statistics and causes for self-understanding. This project covers manual exploratory data analysis

analysis data-analysis scraper

Last synced: 20 Nov 2024

https://github.com/mohamedhany99/human-voice-identifier-counter

An application developed in Kivy that identifies the users imported into the dataset using a trained support vector machine model. It has two features: importing new voices, and detection, which detects human voices and counts them.

android android-app android-application automation automation-framework data data-analysis data-mining data-science data-visualization datascience kivy kivy-framework machine-learning python

Last synced: 17 Nov 2024

https://github.com/virajbhutada/global-universities-success-analysis-powerbi-sql-excel

This capstone project conducts in-depth analysis using Power BI, SQL, and Excel to explore complex dynamics shaping global university success. Integrating data from diverse ranking systems and criteria, our aim is to unravel the factors influencing universities worldwide.

capstone capstoneproject data-analysis data-analytics data-insights data-science data-science-projects data-visualization excel exploratory-data-analysis mece mysql powerbi powerpoint sql

Last synced: 10 Jan 2025

https://github.com/hevalhazalkurt/exploring_the_data_of_lego_history

A data exploration project on LEGO history in Python with pandas, matplotlib etc. (WIP)

data data-analysis data-science data-visualization datascience datasets lego lego-history matplotlib pandas python python3

Last synced: 20 Nov 2024

https://github.com/manikantasanjay/time_series_data_analysis_on_stocks

Time Series Data Analysis project on daily stock prices of the following companies (Apple, Microsoft, Google, Amazon) over a span of 5 years.

data-analysis pandas stock time-series time-series-analysis

Last synced: 22 Dec 2024

https://github.com/2003harsh/house-price-prediction-using-machine-learning

This project features a web app that predicts house prices using a linear regression model. Users can input details like location, square footage, bathrooms, and bedrooms through an HTML form. I've added a CI/CD pipeline with GitHub Actions, unit testing with pytest, and automated Docker containerization to improve deployment and robustness.

ci-cd data-analysis docker-image flask linear-regression machine-learning matplotlib mlops-workflow requests scikit-learn

Last synced: 10 Oct 2024

https://github.com/raad07/sql_project-world_layoffs_dataset

This SQL project comprises data cleaning in the first part and exploratory data analysis (EDA) in the second part.

data-analysis database mysql sql

Last synced: 22 Nov 2024

https://github.com/mg380/ibm-applied-data-science-capstone

This capstone is the 10th (final) course in the IBM Data Science Professional Certificate specialization; it summarises, in the form of a project, all the material learned during the specialization.

capstone data data-analysis data-science datascience ibm machine-learning plotly python scikit-learn sql

Last synced: 10 Oct 2024

https://github.com/abhi18av/innovation-competition

Submission for a programming challenge

clojure clojurescript data-analysis

Last synced: 05 Jan 2025

https://github.com/hayatiyrtgl/data_analysis_project

Financial data analysis: preprocess, visualize, calculate technical indicators.

data-analysis data-analysis-python data-science dataframe numpy pandas python python3 stock-price-prediction talib trade-analysis

Last synced: 22 Dec 2024

https://github.com/akshat0427/python_youtube_history

A bunch of data science operations performed on YouTube history data.

data-analysis data-science extracting-features

Last synced: 11 Jan 2025

https://github.com/fx2y/datanarrate

[WIP] LLM-powered agent for adaptive data analysis across multiple sources. Uses natural language for complex queries, visualizations, and insights. Features autonomous planning, SQL/Elasticsearch generation, and AI storytelling. Built with LangChain, GPT-4, FastAPI, and React.

ai data-analysis data-visualization elasticsearch fastapi gpt-4 langchain machine-learning nlp react sql

Last synced: 15 Nov 2024

https://github.com/abhinavsharma07/fraud_analytics-credit_card_fraud_detection

The aim of this project is to predict fraudulent credit card transactions with the help of different machine learning models.

banking data-analysis decision-trees hyperparameter-optimization machine-learning-algorithms pipelines random-forest-classifier svm-classifier xgboost-classifier

Last synced: 10 Jan 2025

https://github.com/sumidcyber/dataviz-master

This Python application provides a user-friendly interface to load and visualize the contents of a CSV file. Users can choose from various types of graphs and perform analyses on the dataset.

data-analysis data-analysis-project data-analysis-python database databases python python3

Last synced: 22 Nov 2024

https://github.com/moindalvs/learn_eda_house_price_dataset

Data Set: House Prices: Advanced Regression Techniques. Exploratory Data Analysis on more than 80 features.

cardinality data-analysis data-science data-structures data-visualization missing-values

Last synced: 17 Nov 2024

https://github.com/lightbridge-ks/zoominterface

A Shiny app for data analysis of report files from the Zoom program.

data-analysis r shiny-apps zoom-class zoom-meetings

Last synced: 15 Nov 2024

https://github.com/sarincr/basics-of-julia-programming-language

Julia is a high-level, high-performance, dynamic programming language. While it is a general purpose language and can be used to write any application, many of its features are well-suited for high-performance numerical analysis and computational science.

data data-analysis data-mining data-science data-visualization dataanalysis dataanalytics datascience julia julia-language julia-library julia-package julialang machine-learning

Last synced: 20 Nov 2024

https://github.com/sarincr/data-analytics-with-knime

Data Analytics with KNIME (Konstanz Information Miner), a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface and use of JDBC allows assembly of nodes blending different data sources, including preprocessing (ETL: Extraction, Transformation, Loading), for modeling, data analysis and visualization without, or with only minimal, programming.

ai artificial-intelligence artificial-intelligence-algorithms artificial-neural-networks data-analysis data-mining data-science data-structures data-visualization database datascience deep-learning machine-intelligence machine-learning machine-learning-algorithms machinelearning mining mining-software

Last synced: 20 Nov 2024

https://github.com/kirkalyn13/opensignal_autogenerate_report

Script used to generate results/summary, including the trends of flagged provinces, from the raw Excel data file.

data-analysis data-science data-visualization matplotlib numpy pandas python

Last synced: 15 Nov 2024

https://github.com/vishal-038/real_estate_price_prediction

The Real Estate Price Prediction project aims to develop a machine learning model to predict house prices based on various features

data-analysis data-science data-visualization machine-learning python

Last synced: 22 Nov 2024

https://github.com/garciparedes/castile-and-leon-crops

Data Analysis of Castile and Leon Crops Area over the last years

castile-and-leon crops data-analysis data-science jupyter jupyter-notebook notebook spain

Last synced: 15 Nov 2024

https://github.com/karatechop/noaa-storm-database-data-analysis

Analysis of population health and economic consequences of events documented in the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database.

data-analysis knitr r rmarkdown

Last synced: 20 Nov 2024

https://github.com/jen-uis/la-crime-data-analysis

This repository contains project materials for the Fall 2023 MGT 256 class. This project was completed with assistance from Professor Adem Orsdemir.

business-analytics crime-data crime-data-analysis data-analysis knn la-crimes-from-2020 la-safe r r-markdown r-studio report-generation rmd united-states visualization

Last synced: 20 Nov 2024

https://github.com/gholamrezadar/favourite-youtube-channels

This program goes through your YouTube watch history and sorts channels based on how many of their videos you have watched!

data-analysis data-visualization python

Last synced: 30 Dec 2024

https://github.com/asifdotexe/quickvu

Quick VU: No-code data cleaning, analysis, and visualization tool built on Streamlit. Quickly clean, visualize, explore, and understand data relationships and correlations with ease. Perfect for analysts, business users, and anyone looking to gain data insights without writing a single line of code.

automation data-analysis data-cleaning data-visualization python3 streamlit-application toolkit

Last synced: 15 Nov 2024

https://github.com/yash-kavaiya/ai-analytics

This is a Streamlit app that uses Pandas and AI to perform data analytics on uploaded CSV files.

data-analysis generative-ai pandas streamlit

Last synced: 24 Dec 2024

https://github.com/asifdotexe/timeseriesanalysis

This repository serves as a central hub for all of my projects related to time series analysis. Here, you'll find a collection of projects, code samples, and resources that explore various aspects of time series data and its analysis.

data-analysis feature-engineering jupyter-notebook pandas python time-series-analysis visualization

Last synced: 15 Nov 2024

https://github.com/gholamrezadar/most-profitable-actors

Finds the actors with the highest box-office profit using the TMDB API.

crawling data-analysis tmdb

Last synced: 30 Dec 2024

https://github.com/mattdelaune/retail_rfm_analysis

Power BI multi-page report leveraging advanced data visualization for RFM analysis. Delivers deep analytical insights into customer behavior, engagement, and spending patterns, driving strategic business decisions.

data-analysis dax powerbi report rfm-analysis sales-data visualization

Last synced: 30 Dec 2024

https://github.com/shriram-vibhute/data-analysis

This repository offers a comprehensive collection of tools and scripts for data science, encompassing essential tasks such as data cleaning, wrangling, and aggregation. It includes practical examples and utilities for numerical computations with NumPy, data manipulation with Pandas, and effective data visualization techniques.

data-aggregation data-analysis data-visualization data-wrangling matplotlib numpy pandas

Last synced: 15 Nov 2024

https://github.com/aryansharma5/data-visualization-and-thorough-analysis

A comprehensive guide to data analysis and visualization.

data-analysis data-visualization

Last synced: 24 Nov 2024

https://github.com/wiseaidev/truth-guard

Analyzing a 79k Dataset of Misinformation and Fake News

data-analysis fastapi lstm machine-learning python supervised-learning

Last synced: 20 Dec 2024

https://github.com/danhenriquex/data-science-project

The main goal of this project was to apply the concepts of data visualization and analysis.

data-analysis data-science numpy pandas python

Last synced: 14 Jan 2025

https://github.com/solrikk/pictrace-web

PicTraceV2 is a highly efficient image matching platform that leverages computer vision using OpenCV, deep learning with TensorFlow and the ResNet50 model, asynchronous processing with aiohttp, and Selenium for browser automation. PicTraceV2 allows users to upload images directly or provide URLs, quickly scanning a vast database to find image

automation computer-vision data-analysis data-extraction deep-learning image-processing image-search machine-learning natural-language-processing opencv openpyxl pandas python selenium tensorflow web-scraping yandex yandex-api

Last synced: 09 Jan 2025

https://github.com/danhenriquex/final-project-ia

Artificial Intelligence Project - Analysis of sentiments of news that impact the value of shares.

data-analysis machine-learning supervised-learning

Last synced: 14 Jan 2025

https://github.com/cworld1/da-learning

Some notes and code about CWorld learning Database Analysis

data-analysis data-science jupyter-book jupyter-notebook python r

Last synced: 23 Nov 2024

https://github.com/cworld1/novel-analysis

A simple project for analyzing Chinese novels

data-analysis novel

Last synced: 23 Nov 2024

https://github.com/nirmalvatsyayan/data-analyst-nanodegree

Udacity data analyst nanodegree project submissions and learning

data-analysis numpy pandas python statistics udacity-data-analyst-nanodegree

Last synced: 12 Jan 2025

https://github.com/ac12644/fractz-ai-data-analyst

Analyze data and gain insights instantly with FRACTZ's AI Data Analyst. Flexible, fast analytics tailored to your needs.

ai data-analysis data-visualization

Last synced: 14 Jan 2025

https://github.com/revan-alqahmi/summarize-talabat-company-reviews

Natural Language Processing Project, which is a program that analyzes Arabic comments at Talabat Company and classifies them into positive, negative, and neutral using machine learning algorithms and natural language processing techniques.

artificial-intelligence data-analysis machine-learning-algorithms natural-language-processing python

Last synced: 29 Dec 2024

https://github.com/viseshrp/community_health_indicator

Android app to fetch, organize, and represent NYC health data

android data-analysis data-visualization health

Last synced: 13 Jan 2025

https://github.com/sandk21/detection_faux_billets

Algorithm for detecting counterfeit banknotes based on their geometric dimensions, and a web application for generating the predictions

data-analysis data-science data-visualization machine-learning pandas python scipy sklearn streamlit

Last synced: 07 Dec 2024

https://github.com/dannyben/datamix

DSL for manipulating tabular data

csv data data-analysis data-engineering gem ruby tabular-data

Last synced: 07 Dec 2024

https://github.com/johnsesana/eda-liquor-sales

Exploratory Data Analysis on Public Datasets

data-analysis data-visualization sql tableau-dashboards

Last synced: 16 Nov 2024

https://github.com/pferreirafabricio/data-immersion

🏊🏻‍♂️ Activities and exercises from 'Imersão Dados' event

data data-analysis data-science dataset jupiter-notebook python

Last synced: 14 Jan 2025

https://github.com/johnsesana/eda-video-game-sales

Exploratory Data Analysis on Public Datasets

data-analysis data-visualization excel

Last synced: 16 Nov 2024

https://github.com/aekanshd/crazytics-suicidesindia

Basic interpretation of the Suicides in India data-set using R.

data-analysis data-science graph india r suicides

Last synced: 15 Jan 2025

https://github.com/phomint/udacity_dataanalysis

All projects and activities

data-analysis python udacity-nanodegree

Last synced: 15 Jan 2025

https://github.com/yash22222/data-analysis-with-python

This repository provides a practical introduction to data acquisition and analysis using Pandas. It covers loading datasets, exploring data, manipulating data, and gaining insights through statistical summaries. Ideal for beginners, it offers code examples and explanations to enhance your data manipulation skills using Pandas for Python.

binning data data-acquisition data-analysis data-binning data-cleaning data-formatting data-integration data-normalization data-preprocessing data-science data-transformation data-wrangling dataframe description numpy pandas pandas-dataframe python python3

Last synced: 05 Jan 2025

https://github.com/nafisalawalidris/springforth-university-foodbank

Springforth University Food Bank: A collaborative initiative with UNESCO to address student food insecurity. Contains code and resources for the web application, data analysis, and insights into the prevalence and impact of food insecurity on academic performance.

academic-performance collaborative-initiative data-analysis data-visualization excel pivot-tables powerbi springforth-university-food-bank student-food-insecurity unesco

Last synced: 22 Nov 2024

https://github.com/numbersprotocol/dyda

Dynamic data pipeline framework

ai artificial-neural-networks data-analysis data-science

Last synced: 27 Dec 2024

https://github.com/yash22222/tata-data-visualisation-virtual-internship

Data Visualisation: Empowering Business with Effective Insights. Gain insights into leveraging data visualisations as a tool for making informed business decisions.

basics ceo charts cmo data-analysis data-interpretation data-science data-visualization graphs machine-learning mcq microsoft-excel microsoft-power-bi microsoft-word powerpoint-presentations python tableau tata tata-data-visualisation

Last synced: 05 Jan 2025

https://github.com/dina-hosny/explore-us-bike-share-data-project

Explore US Bike Share Data project - FWD Data Analysis Professional Track. In this project, I used Python to explore data related to bike share systems for three major cities in the United States and answer questions about it by computing descriptive statistics.

data-analysis data-science numpy pandas python

Last synced: 13 Jan 2025

https://github.com/phammings/sales-management-analysis

Sales management analysis and Power BI dashboard for a sample business request and user stories

data-analysis excel powerbi sql

Last synced: 15 Jan 2025

https://github.com/vkbo/osirisanalysis

MATLAB toolbox for analysing simulation results from Osiris 3

data-analysis matlab matlab-gui physics-simulation

Last synced: 16 Nov 2024

https://github.com/aravind-selvam/bikeshare-company-analysis

Capstone project of the Google Data Analytics Professional Certificate program, analyzing a bike-sharing company

analytics business-analytics business-intelligence data data-analysis data-visualization dataanalytics google-data-analytics postgresql sql sql-server

Last synced: 14 Jan 2025

https://github.com/prithivsakthiur/data-board

Data Boards - Visualization of various plots (Analysis)

data-analysis gradio huggingface keras mathplotlib pandas plots pyplot scikit-learn seaborn spaces

Last synced: 21 Dec 2024

https://github.com/zen204/airbnb_availability

A machine learning model that predicts Airbnb listing availability, utilizing feature engineering and supervised learning techniques to improve guest experience and optimize host management.

binary-classification data-analysis data-preprocessing data-visualization feature-engineering machine-learning matplotlib model-evaluation nlp pandas predictive-modeling python scikit-learn seaborn supervised-learning

Last synced: 03 Nov 2024

https://github.com/jhrcook/wagenmaker-data-analysis

Analysis of Registered Replication Report: Strack, Martin, & Stepper (1988) by Wagenmaker et al.

data-analysis r r-project statistics

Last synced: 13 Jan 2025

https://github.com/dual-points/dplearn

A Python package for data analysis.

data-analysis data-science python python-package

Last synced: 13 Nov 2024

https://github.com/iguptashubham/pizzahut-analysis-sql

Best dataset for data analysis. Pizza Hut data analysis done by Shubham Gupta in MySQL. The dataset was provided by a friend of mine who interned at Pizza Hut, where it was used for training and for asking questions. The data does not reveal anything about Pizza Hut and is safe to share.

data-analysis data-analytics database dataset datasets mysql mysql-database pizzahut

Last synced: 14 Jan 2025

https://github.com/mengyaohuang/data-manipulation-and-analysis

Data processing implementation with tools in Python

data-analysis nlp-machine-learning pandas-dataframe python

Last synced: 05 Dec 2024

https://github.com/prime-infinity/type-one

Software to visualize and analyze GitHub repos based on certain statistics such as stars, forks and issues

data-analysis data-visualization

Last synced: 24 Nov 2024

https://github.com/rayyan9477/youtube-spam-detection-with-flask-and-machine-learning

This is a web application built using Flask that detects spam comments on YouTube using a Naive Bayes classifier. It leverages techniques such as CountVectorizer for feature extraction and scikit-learn for machine learning. The application reads data from a CSV file and predicts whether a comment is spam or not.

data-analysis data-science machine-learning nlp-machine-learning spam-detection

Last synced: 10 Jan 2025

https://github.com/iguptashubham/ott-churn-eda-ml

Understanding why customers discontinue their subscriptions is crucial to optimizing the user experience, reducing churn, and maximizing customer lifetime value. This project uses a machine learning model to predict customer churn.

data-analysis data-analysis-project data-science data-science-portfolio data-science-projects data-visualization machine-learning python

Last synced: 14 Jan 2025

https://github.com/jubinjacob03/heartdiseaseclassify-ml

Heart Disease Dataset Analysis & Classification using ML models such as linear regression, support vector machine, k-means, k-nearest neighbors, and logistic regression.

data-analysis data-science data-visualization ipython-notebook kaggle-dataset kmeans knn linear-regression logistic-regression machine-learning matplotlib python seaborn support-vector-machine

Last synced: 11 Oct 2024