Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/pedrosfaria2/analisandopostshn

Project to analyze posts from the HackerNews community

analise-de-dados data-analysis datetime jupyter-notebook matplotlib python python3

Last synced: 05 Jan 2025

https://github.com/pedrosfaria2/fugascomhelicoptero

My first use of Jupyter Notebook in a project

analise-de-dados data-analysis jupyter-notebook matplotlib pandas python

Last synced: 05 Jan 2025

https://github.com/pedrosfaria2/analisetitulosnetflix

Study of the popularity of Netflix films on IMDB.

analise-de-dados data-analysis jupyter-notebook matplotlib numpy pandas python

Last synced: 05 Jan 2025

https://github.com/myles/notebooks

Some of my random Jupyter Notebooks.

data-analysis data-science jupyter-notebooks

Last synced: 16 Dec 2024

https://github.com/akshaypratapsingh09/zomato-blogs-all-links-dataset

Engineering, culture, and blog data gathered from Zomato's blogs for educational and learning purposes, to spread the problem-solving methodologies adopted by modern unicorns.

data-analysis dataset regex selenium webdriver zomato-data-analysis

Last synced: 20 Dec 2024

https://github.com/rainbowatcher/simple

Make data work easier, saving your working time

bigdata data-analysis etl

Last synced: 23 Dec 2024

https://github.com/malucor/livros

Python program that performs a data analysis of books, based on an Excel file.

analise-de-dados book books bookshelf data-analysis ipynb jupyter-notebook livro livros python

Last synced: 05 Jan 2025

https://github.com/tsbarr/toronto-open-data

Analysis of Toronto's open data initiatives. 🌆 Exploring Toronto's urban systems through data science 📊 Python-based analyses of public datasets 🔍 Focus on community impact and urban patterns 🎓 Academic rigour meets practical insights 🔄 Regularly updated with new analyses

api-integration civic-tech ckan-api data-analysis data-cleaning data-science data-visualization exploratory-data-analysis jupyter-notebook open-data pandas public-data python tableau toronto urban-analytics

Last synced: 19 Dec 2024

https://github.com/mysto-007/cyclistic-bike-share-analysis

Analyzed the dataset of the Cyclistic bike-share rental service as the capstone project for the Google Data Analytics Specialization.

bigquery data-analysis excel ms-sql-server sql tableau tableau-public

Last synced: 18 Jan 2025

https://github.com/prekshivyas/cis-595-big-data-analytics

Comprehensive real estate price prediction project, integrating socioeconomic indicators and property features.

data-analysis data-cleaning data-mining data-preprocessing data-science data-visualization data-wrangling exploratory-data-analysis web-scraping

Last synced: 05 Jan 2025

https://github.com/kwonnayeon/medium-post-projects

Contains the code and projects from my Medium posts. I share what I've learned through trial and error to help others tackle similar work smoothly.

data-analysis data-science data-visualization medium-articles python r-language sql

Last synced: 13 Jan 2025

https://github.com/manjit-baishya-datascience/flipkart-laptop-listing-eda

This project analyzes laptop price data from Flipkart using AutoScraper for web scraping. It includes data loading, EDA, cleaning, statistical analysis, and visualization. The goal is to derive insights for pricing strategies and market positioning. Explore the repository for detailed documentation and code.

data-analysis ecommerce-platform flipkart laptop python

Last synced: 13 Jan 2025

https://github.com/anuppm9917/super-store-sales-analysis-power-bi-project

Driven to learn which products, regions, categories, and customer segments a company should target or avoid, I searched for and selected a Kaggle dataset that matches a standard superstore's requirements.

data data-analysis data-visualization datacleansing excel exploratory-data-analysis jupyter-notebook numpy pandas plotly powerbi python3

Last synced: 26 Jan 2025

https://github.com/hrosicka/czechpopulationestimation

This GitHub repository contains Python code for data analysis and population prediction in the Czech Republic up to the year 2050. The code uses the Pandas and Matplotlib libraries.

data-analysis data-visualization matplotlib matplotlib-figures matplotlib-pyplot pandas pandas-dataframe pandas-library pandas-python python python3

Last synced: 26 Jan 2025

https://github.com/bala-1409/sales-forecasting-datascience-project

Develop a data science project using historical sales data to build a regression model that accurately predicts future sales. Preprocess the dataset, conduct exploratory analysis, select relevant features, and employ regression algorithms for model development. Evaluate model performance, optimize hyperparameters, and provide actionable insights.

data data-analysis data-science data-visualization datacleaning exploratory-data-analysis machine-learning-algorithms modelfitting prediction predictive-analytics predictive-modeling python3 regression-models salesforecast supervised-learning

Last synced: 27 Jan 2025

https://github.com/bala-1409/rafik-s-kitchen-data-analysis

This project analyzes the sales and expenses data of a famous fast-food restaurant. It focuses on gaining insights that will boost future sales and on business strategies to improve profit margins. Tools used: SQL, Python, Power BI, and MS Office tools.

business-analytics business-intelligence data-analysis data-analytics data-visualization eda exploratory-data-analysis ms-office powerbi-report powerpoint-presentations python sql-server

Last synced: 27 Jan 2025

https://github.com/jayita11/eda-student-exam-performance

This project performs Exploratory Data Analysis (EDA) and hypothesis testing on student performance data. It explores trends based on attributes like gender, race/ethnicity, parental education, lunch type, and test preparation course completion.

data-analysis eda hypothesis-testing matplotlib pandas python seaborn statsmodels student-performance-analysis

Last synced: 13 Jan 2025

https://github.com/jayita11/atliqo-bank-credit-card-launch-eda

This project involves exploratory data analysis and statistical testing for AtliQo Bank's new credit card launch. Key insights include targeting high-income occupations and the 18-25 age group. Recommendations focus on tailored marketing campaigns, education, and incentives to enhance credit card adoption and usage among young adults.

data-analysis hypothesis-testing matplotlib p-value pandas python seaborn statistics z-test

Last synced: 13 Jan 2025

https://github.com/jayita11/exploring-most-streamed-songs-for-last-four-decades-eda

Perform EDA to uncover trends in streaming patterns, likes, and artists over the last four decades.

data-analysis eda hypothesis-testing matplotlib most-streamed-songs pandas python seaborn

Last synced: 13 Jan 2025

https://github.com/anuppm9917/data-processing-and-csv-to-json-using-python-project

This project guides you through processing data from CSV to JSON format using Python. You'll learn to cleanse, validate, and transform data with pandas, numpy, csv, and json libraries, ensuring it's ready for POS system integration. This will help improve data integrity and streamline integration.

csv-files data data-analysis data-cleaning data-collection data-transformation data-validation python3 transformation

Last synced: 26 Jan 2025

https://github.com/oshinrathor/data-science-systems-and-analytics-projects

Dive into my Data Science Projects Repository, featuring a Spam SMS Classifier, NIA Dashboard, H1N1 Vaccine Prediction, and NYC Taxi Fare Prediction. Each project showcases my skills in data cleaning, exploratory analysis, modeling, and visualization, offering valuable insights and methodologies for data enthusiasts and practitioners.

dashboard data-analysis data-driven-decisions data-presentation data-science data-visualization dataexploration eda insights nia webanalytics

Last synced: 13 Jan 2025

https://github.com/abhroroy365/market_analysis

This project explores customer segmentation and market analysis in the context of online retail using an online retail dataset. By applying advanced analytics, we aim to uncover insights that can drive strategic decisions and enhance business performance.

clustering data data-analysis data-visualization kmeans-clustering machine-learning market-analysis python silhouette-analysis

Last synced: 02 Jan 2025

https://github.com/daniil-oberlev/profit-visualization

A Python project for profit analysis with interactive visualizations using pandas, matplotlib, seaborn, and hvplot.

data-analysis hvplot matplotlib pandas profit-analysis python sales seaborn visualization

Last synced: 05 Jan 2025

https://github.com/chaganti-reddy/weather-prediction-australia

Creating a fully-automated system that can use today's weather data for a given location to predict whether it will rain at the location tomorrow.

data-analysis logistic-regression machine-learning prediction-model python3

Last synced: 20 Jan 2025

https://github.com/brevex/code-complexity-data-analisis

Data collection that shows different complexity scores in an algorithmic dataframe.

code-analysis data-analysis data-science python

Last synced: 05 Dec 2024

https://github.com/chaganti-reddy/ai-prototype-customer-segmentation

Product-based artificial intelligence prototype model for customer segmentation in the e-commerce industry.

artificial-intelligence cluster-analysis customer-segmentation data-analysis machine-learning product-based prototype

Last synced: 20 Jan 2025

https://github.com/brevex/hotel-booking-demand-data-analysis

Data analysis in Python of demand for urban hotels and resorts showing their causes and relationships

data-analysis data-science hotel-booking-analysis kaggle python

Last synced: 05 Dec 2024

https://github.com/chaganti-reddy/house-price-prediction

Machine Learning Model creation for House Price Prediction

data-analysis deep-learning jupyter-notebook machine-learning python3 regression

Last synced: 20 Jan 2025

https://github.com/attmhd/salary_analysis

Salary Analysis by Undergraduate Major

data-analysis streamlit

Last synced: 05 Dec 2024

https://github.com/thecoderpinar/globalwarmingforecast

🌍 Global Warming Forecast Tool: an advanced tool for analyzing and forecasting climate trends using ARIMA and Prophet models, with interactive visualizations and scenario simulations.

arima climate-change data-analysis environmental-science forecasting global-warming machine-learning prophet streamlit time-series-analysis visualization

Last synced: 05 Dec 2024

https://github.com/jerinpious/movie-recommendation-system

A content-based movie recommendation system built using Python. The system processes movie data, extracts relevant features, and provides recommendations based on user preferences

content-based-recommendation data-analysis jupyter-notebook machine-learning pandas python streamlit

Last synced: 05 Dec 2024

https://github.com/jayita11/customer-engagement-insights-for-yelp-restaurant-business-success

This project analyzes Yelp restaurant data using SQLite, Python, and Tableau to explore user engagement, reviews, and ratings. It provides insights into restaurant success across cities, regions, and user behavior.

customer-engagement data-analysis interactive-visualizations json python ratings review sqlite3 tableau-dashboards-for-data-visualization yelp-restaurants

Last synced: 20 Dec 2024

https://github.com/shyamkumarnagilla/ai-powered-forecasting-for-agricultural-productivity

AI Powered Forecasting for Agricultural Productivity is a project that utilizes machine learning to predict crop yields and optimize farming practices. By harnessing historical and real-time data, this model empowers farmers with data-driven insights to enhance productivity and sustainability in agriculture.

data-analysis data-visualization deep-learning flask neural-network

Last synced: 05 Jan 2025

https://github.com/tj2904/lfb-callout-analysis

An investigation into London Fire Brigade's callout data.

data-analysis decsion-tree kmeans lfb-incidents london-fire-brigade pandas python seaborn

Last synced: 28 Dec 2024

https://github.com/bryanfks-dev/klempoken-analysis

Analysis and forecasting model for Klempoken MSMEs

big-data-analytics data-analysis data-forecast data-visualization

Last synced: 14 Dec 2024

https://github.com/alexgenovese/react-charts-covid-19-data

Examples on COVID-19 data using different library charts: G2, G2Plot, Plotly, ApexCharts

data-analysis data-science data-visualization react reactjs

Last synced: 28 Dec 2024

https://github.com/bineet-ratna-shakya/data-science-salary-analysis

Analyzing a dataset containing salaries of data science professionals from 2020 to 2023.

data-analysis data-science data-visualization jupyter numpy pandas python

Last synced: 11 Oct 2024

https://github.com/datalopes1/desafio_delivery

Challenge from the Universidade dos Dados subscription club (Clube de Assinaturas), designed to simulate the real demands placed on a data analyst

data-analysis jupyter python

Last synced: 11 Oct 2024

https://github.com/jhrcook/protein-language-models

Experimenting with protein language model predictions

data-analysis protein-language-model variant-effect-prediction

Last synced: 13 Jan 2025

https://github.com/marwan-ahmed-23/text-sentiment-analysis-api

A lightweight Python project for analyzing the sentiment of textual data using the TextBlob library. This project provides a simple and effective way to measure the polarity and subjectivity of any given text.

data-analysis machine-learning python python-project sentiment-analysis text-analysis text-mining

Last synced: 05 Jan 2025

https://github.com/yash22222/literacy-exploration-analysis

Delve into India's literacy landscape through data analysis. Uncover regional disparities, high/low literacy states & gender imbalances.

csv data-analysis data-visualization government-data india literacy literacy-analysis states

Last synced: 05 Jan 2025

https://github.com/yash22222/data-analysis-on-real-time-social-media-comments

EngageInsight analyzes user interactions in comment data. It provides insights through visualizations created using Python libraries like Pandas and Matplotlib. The project aims to uncover patterns and trends in user engagement. The visualizations provide an overview of comment lengths and the frequency of different types of replies.

data-analysis data-cleaning-and-preprocessing data-visualization matplotlib pandas pattern-recognition real-time-social-media-data seaborn trend-analysis

Last synced: 05 Jan 2025

https://github.com/yash22222/cinesphere-crafting-personalized-movie-experiences

"CineSphere" is a groundbreaking project developing a personalized movie recommendation engine. By analyzing user preferences and viewing history, CineSphere suggests movies tailored to individual tastes, revolutionizing the movie-watching experience.

cinesphere data-analysis imdb machine-learning movie-recommendation-engine movie-recommendation-system movielens real-time

Last synced: 05 Jan 2025

https://github.com/yash22222/pwc-power-bi-virtual-case-experience

The Power BI PwC Virtual Case Experience is an exciting and educational program designed to provide participants with hands-on exposure to Power BI, a prominent business intelligence and data visualization tool, within the context of consulting at PwC.

business-analyst business-analytics business-intelligence dashboard data-analysis data-analyst data-analytics dax microsoft-power-bi powerbi powerbi-dashboards powerbi-visuals pwc

Last synced: 05 Jan 2025

https://github.com/sunsided/esc2024

Exploratory Data Analysis on the ESC 2024 results

csv data-analysis eurovision-song-contest scraping

Last synced: 20 Dec 2024

https://github.com/luminati-io/walmart-dataset-samples

A sample dataset of over 1000 Walmart products, extracted using the Bright Data API, ideal for consumer market insights and competitor analysis.

api data-analysis dataset walmart walmart-scraper web-scraping

Last synced: 23 Jan 2025

https://github.com/nilayhangarge/data-analysis-with-python

This repository provides a practical introduction to data acquisition and analysis using Pandas. It covers loading datasets, exploring data, manipulating data, and gaining insights through statistical summaries. Ideal for beginners, it offers code examples and explanations to enhance your data manipulation skills using Pandas for Python.

data-acquisition data-analysis data-analytics data-binning data-cleaning data-engineering data-fundamentals data-insights data-integration data-preprocessing data-science data-wrangling numpy pandas python

Last synced: 06 Jan 2025

https://github.com/rickcontreras/modelos1

Classification model to predict student performance on the Saber Pro tests (Pruebas Saber Pro) in Colombia. Includes exploratory data analysis, preprocessing, and machine learning models.

classification colombia data-analysis data-science education educational-assessment exploratory-data-analysis jupyter-notebook machine-learning python saber-pro scikit-learn student-performance

Last synced: 10 Oct 2024

https://github.com/szymon-budziak/real_estate_house_prices_prediction

Predicting real estate house prices using various machine learning algorithms, including data exploration, preprocessing, model training, and evaluation.

data-analysis data-preprocessing data-science eda jupyter-notebook machine-learning matplotlib numpy optuna pandas predictive-modeling price-prediction python random-forest regression scikit-learn seaborn

Last synced: 10 Oct 2024

https://github.com/chrispsang/customerchurnanalysis

Predicting customer churn using a RandomForestClassifier with detailed EDA, model evaluation, and visualization. Includes a Tableau dashboard for interactive insights.

customerchurn data-analysis data-visualization datapreprocessing machine-learning python scikit-learn tableau

Last synced: 10 Oct 2024

https://github.com/mosalem149/pythonutilities

A collection of Python scripts for common utility tasks including file manipulation, word counting, longest word detection, and grade categorization. Perfect for quick and easy solutions to everyday programming problems.

data-analysis educational-tools file-io file-manipulation grade-calculation python text-analysis text-processing utility word-counting

Last synced: 06 Jan 2025

https://github.com/alphatwirl/qtwirl

qtwirl (quick-twirl), one-function interface to AlphaTwirl

alphatwirl data-analysis data-frame pandas r root-cern

Last synced: 20 Jan 2025

https://github.com/sebastianurdaneguibisalaya/enfermedades-fissal

Holistic analysis of care provided for rare and orphan diseases and transplants covered by FISSAL in Peru.

data-analysis data-visualization python

Last synced: 06 Jan 2025

https://github.com/sebastianurdaneguibisalaya/colocaciones-de-credito-fondo-mivivienda-peru

I explore the credit placements of Fondo MIVIVIENDA S.A. between 2018 and 2022, using a dataset downloaded from Peru's National Open Data Portal (Portal Nacional de Datos Abiertos). 🏠

data-analysis jupyter-notebook python

Last synced: 06 Jan 2025

https://github.com/satyacoder29/crowdfunding-in-sql

Crowdfunding is a method of raising funds for projects or causes by collecting small contributions from a large group of people, usually through online platforms. It enables individuals, startups, and nonprofits to secure funding, offering rewards or recognition in exchange, and helps bring ideas to life without traditional financing.

data-analysis data-cleaning database-management mysql-database quries sql sql-functions sql-server views

Last synced: 28 Dec 2024

https://github.com/al-ogr/sf_pr1_job_analysis_hh

SkillFactory DataScience PROJECT-1. Analysis of résumés from HeadHunter

data-analysis data-science ipynb plotly python

Last synced: 20 Jan 2025

https://github.com/vavarm/data-analysis-french-electric-automobile-infrastructure

Data analysis, built with R Shiny and Python, of the French electric vehicle and charging station infrastructure

data-analysis data-science data-visualization factominer geojson ggplot2 plotly python r rshiny

Last synced: 20 Jan 2025

https://github.com/nikbarb810/covid_growth_rate_390.51

Exploring Covid Growth Rate of European Population using genetic data analysis

bioinformatics data-analysis r rcpp

Last synced: 01 Jan 2025

https://github.com/gurpreet17/uc-davis-sql-for-data-science-specialization

Completed the SQL Basics for Data Science Specialization from the University of California, Davis, gaining proficiency in Data Analysis, SQL, Apache Spark, and Delta Lake.

apache-spark bigdata data-analysis data-science delta-lake sqlite

Last synced: 28 Dec 2024

https://github.com/sferez/gradient_descent

Multiple Linear Regression, Gradient Descent with Python

data-analysis data-science gradient-descent linear-regression python

Last synced: 13 Jan 2025

https://github.com/jethronap/jstat-gui

Web-based GUI application for data analysis

data-analysis data-visualization java jstat mongodb

Last synced: 06 Jan 2025

https://github.com/kashirin-alex/thither.direct-onamove

An Android skeleton example application for using data from the Thither.Direct platform in mobile applications

android-application data data-analysis data-structures data-visualization mobile-development mobility query research-data-management

Last synced: 19 Dec 2024

https://github.com/rahulsm20/car-data

A data analytics project that involves analyzing a car dataset that includes information on various car brands, years, prices, mileage, and fuel types, in order to gain insights into the car market.

data-analysis data-analytics matplotlib numpy pandas python

Last synced: 06 Jan 2025

https://github.com/rahulsm20/storedata

A data analysis project aimed at analyzing the sales data of the super store and providing useful insight into customer preferences.

data-analysis matplotlib numpy pandas python streamlit

Last synced: 06 Jan 2025

https://github.com/inevolin/multivariate-data-analysis

Showcases of modern multivariate & multidimensional data analysis in industrial and high-tech settings.

analytics data-analysis data-science data-visualization javascript

Last synced: 11 Jan 2025

https://github.com/mr-chang95/udacity_movie_project

Movie Data Analysis and Visualization Project for Udacity's Data Analyst Program. Using Python in Jupyter Notebook.

data-analysis data-visualization jupyter-notebook movie python

Last synced: 26 Jan 2025

https://github.com/mr-chang95/datascience_airbnb

Data Science Project for Udacity's Data Scientist Program. Using Python in Jupyter Notebook.

airbnb data-analysis data-science data-visualization jupyter-notebook numpy pandas python sklearn

Last synced: 26 Jan 2025

https://github.com/mr-chang95/sf_data_visualization

In this personal project, I am interested in examining all of the active businesses in the San Francisco Bay Area while performing some simple data visualizations, mainly on categorical variables.

business data-analysis data-visualization jupyter-notebook pandas python san-francisco

Last synced: 26 Jan 2025

https://github.com/vubacktracking/freecodecamp-data-analysis-with-python

Five projects from the Data Analysis with Python course on freeCodeCamp

data-analysis freecodecamp freecodecamp-project python

Last synced: 06 Jan 2025