An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/kgotsosm/fcc-data-analysis

Notebooks created for the Data Analysis Course on freeCodeCamp

data-analysis data-visualization matplotlib pandas seaborn

Last synced: 17 Apr 2026

https://github.com/royungar/sql_chicago_data_analysis_project

SQL-based data analysis project using SQLite, pandas, and Jupyter SQL magic commands. Analyzes crime, school, and census data from Chicago to explore socioeconomic patterns using filtering, joins, aggregation, and subqueries.

aggregation census-data chicago crime-data data-analysis data-engineering education-data ibm jupyter-notebook pandas sql sqlite subqueries

Last synced: 04 Jun 2026

https://github.com/alinababer/covid19-timeseries-cases-and-deaths-forecasting-

This study is based on confirmed cases and deaths collected from Pakistan. Results demonstrate the promising potential of time-series models in forecasting COVID-19 cases and highlight their superior performance compared to the LSTM. We apply AI-based forecasting models such as ARIMA, LSTM, Prophet, and VAR.

arima covid-19 data-analysis data-science data-visualization fbprophet forecasting lstm rnn time-series var vectorautoregression

Last synced: 19 Jun 2026

https://github.com/edprice25/us-states-analysis

Presents a series of visualizations for folks looking to relocate to more affordable areas in the US. Click on my link below to see a full analysis.

data-analysis jupyter-notebook matplotlib pandas python us-states

Last synced: 04 Jul 2025

https://github.com/royungar/automotive_sales_insights_dashboard

Data visualization project analyzing automotive sales, recalls, and customer sentiment using IBM Cognos Analytics. Features KPIs, treemaps, heatmaps, and advanced visual storytelling techniques.

automotive-industry business-intelligence cognos-analytics csv customer-sentiment dashboard data-analysis data-engineering data-visualization eda excel heatmap ibm kpi recall-analysis sales-data treemap

Last synced: 04 Jun 2026

https://github.com/davidmalko87/steam-library-exporter

Python script to export your Steam game library to CSV — playtime, genres, reviews, metacritic scores, prices, tags & estimated owners via Steam Web API + Store API + SteamSpy

csv-export data-analysis game-data metacritic playtime-tracker python steam steam-api steam-games steam-library steamspy

Last synced: 04 Apr 2026

https://github.com/santos-k/fashion-recommender-dashboard

The project is a neural network-based fashion recommendation system built using Python. The model used for this system is ResNet50, a deep learning model for image recognition. The training data was scraped from Flipkart and totals 65,000 images.

ann cnn dash dashboard data-analysis data-science deep-learning eda gcp heroku kera machine-learning nueral-networks plolty python tensorflow

Last synced: 04 Apr 2026

https://github.com/sevilaymuni/project-no.3-seaborn-plots

Pandas and Seaborn Mediated Comprehensive Analysis on Differentiated Thyroid Cancer

data-analysis data-structures data-visualization mathplotlib pandas python seaborn

Last synced: 18 Apr 2026

https://github.com/krzysikd/apartment-prices-in-poland-analysis-and-visualization

Data Analyst portfolio project that involves cleaning, transforming, and visualizing data to create an insightful dashboard. The project uses SSIS for ETL processes, SSMS for database management and queries, and Power BI for data visualization, focusing on the analysis of rental and sales apartment prices in Poland.

data-analysis data-cleaning data-visualizations powerbi sql sqlserver ssis

Last synced: 04 Feb 2026

https://github.com/sanam2405/ahs

This contains the analysis of the results of the AHS Madhyamik Examination 2022.

data-analysis data-visualization jupyter-notebook python

Last synced: 18 Apr 2026

https://github.com/yuvrajsaraogi/sales-prediction-using-python

Sales prediction involves estimating future product sales based on factors like advertising spend, target audience, and platform. Businesses rely on data scientists to forecast sales and optimize advertising costs. Machine learning in Python can be used for this task.

data data-analysis data-science data-visualization machine-learning matplotlib natural-language-processing numpy pandas prediction python sales-prediction-using-python sql

Last synced: 19 Apr 2026

https://github.com/prangonghose/wikipedia-blocking-policies

This study investigates the relationship between editors’ disruptive behavior and regulation policies on English Wikipedia, focusing on the Blocking Policy page. The study collects and analyzes data from 2004 to 2022 using the Wikipedia API, page statistics, and keyword extraction.

data-analysis data-visualization matplotlib open-source pandas python3 seaborn

Last synced: 18 Apr 2026

https://github.com/rajeev2806/retail-order-data-analysis

Dataset downloaded via the Kaggle API, followed by data cleaning and analysis.

data-analysis data-cleaning postgresql

Last synced: 18 Apr 2026

https://github.com/vasulab/knightshock

Shock tube experiment planning and data analysis package.

cantera data-analysis matplotlib numpy shock-tube

Last synced: 18 Jul 2025

https://github.com/borjamome/soho_cholera

Cholera deaths in the Soho District (London)

data-analysis data-visualization london r

Last synced: 04 Sep 2025

https://github.com/bpkaur/exploring-the-evolution-of-linux

This project explores the evolution of the Linux kernel by identifying the top 10 contributors and visualizing commits over the years.

data-analysis data-science datacamp ipynb-jupyter-notebook python3

Last synced: 21 Feb 2026

https://github.com/master-helix/ibm-data-analyst-certification-stock-analysis-project

This is a mini-project repository for my IBM certification, involving stock analysis and plotting for Tesla and GameStop.

analytics data data-analysis data-visualization ibm matplotlib pandas python web-scraping

Last synced: 09 May 2026

https://github.com/sarveshdhond/top_25_cad_stocks

In this project I used Python, JupyterLab, and pandas to import a data set from the Yahoo stocks website. I imported the top 25 most active Canadian stocks on 12 July 2024. This project demonstrates skills in Python, web scraping, and pandas.

data-analysis pandas-dataframe python webscraping

Last synced: 01 Apr 2025

https://github.com/mtimma001/clinical-trial-data-tool

Clinical Trial Data Analysis Tool is a Flask-based web app for healthcare professionals to manage and analyze clinical trial data. It features full CRUD functionality, interactive visualizations (Plotly/Matplotlib), a responsive Bootstrap UI, MySQL database integration, and Heroku deployment for accessible, scalable use.

bootstrap5 clinical-trials crud data-analysis data-visualization flask healthcare heroku mysql pandas plotly python

Last synced: 05 Apr 2026

https://github.com/al-ghaly/prosper-loans-analysis

A statistical analysis project that analyzes a finance company’s loan data using Python packages (pandas, NumPy, seaborn, matplotlib).

data-analysis matplotlib numpy pandas python python-data-analysis seaborn statistical-analysis statistics

Last synced: 18 Apr 2026

https://github.com/jordanconallluthaiswright/purchase-behaviour-data-analysis

This project analyzes Black Friday purchase behavior for Company XYZ, uncovering trends by gender, age, and location. Using data cleaning, statistical analysis, and visualization, it evaluates spending patterns, confidence intervals, and category preferences to provide actionable insights for optimizing marketing strategies and targeting.

business-analytics data-analysis jupyter-notebook python

Last synced: 18 Apr 2026

https://github.com/vl1507/data_science_pro_course

Course "Data Analyst PRO (PRO DA-6)"

da data-analysis data-science ds jupyter-notebook machine-learning ml pro-da python

Last synced: 18 Apr 2026

https://github.com/andersoncrs/prediccion-del-precio-de-vehiculos-un-enfoque-con-regresion-lineal-y-regularizacion

This project aims to predict the price of used vehicles using linear regression and Lasso regularization techniques. Through data analysis and processing, it builds an accurate and interpretable predictive model based on the most relevant features of each vehicle.

data-analysis data-exploration lasso-regression machine-learning polinomial-regression regularization-methods

Last synced: 03 Jul 2025

https://github.com/mumtaz4118/nlp-course

Programming Assignments and Lectures for Stanford's CS 224: Natural Language Processing with Deep Learning

course data data-analysis data-analytics data-science data-visualization deep-learning education machine-learning natural-language-processing neural-network transfer-learning

Last synced: 24 Nov 2025

https://github.com/mksingh431/free-data-science-courses

Data science is a rapidly growing tech field that’s transforming business decision-making. To break into this field, you need the right skills. Fortunately, top institutions like Harvard and IBM offer free online courses. These courses cover everything from basic programming to advanced machine learning.

course data data-analysis data-science data-visualization free freecou python

Last synced: 19 Apr 2026

https://github.com/rodriguesl1/analise-ibovespa-fiap

Forecasting model for the IBOVESPA index using time-series techniques. The project includes exploratory analysis, seasonal decomposition, stationarity tests, and modeling with Prophet, AutoARIMA, and other statistical models to support investment decisions.

autoarima b3 brasil data-analysis economia finance forecasting ibovespa pandas prophet python statsmodels time-series

Last synced: 19 Apr 2026

https://github.com/robertochiosa/automatic-powerpoint-report-rmd

Automatically generate good-looking PowerPoint presentations from a CSV dataset

data-analysis data-science medium medium-article python r

Last synced: 19 Apr 2026

https://github.com/kheriberto/linear_regression_ecommerce

Simple project showcasing how to build a linear regression model with scikit-learn

data-analysis jupyter-notebook linear-regression pandas python scikit-learn seaborn

Last synced: 19 Apr 2026

https://github.com/vyjayanthipolapragada/data_analytics_medical_appointments

Analyzing a data set of medical appointments to draw insights about patient no-shows

data-analysis data-analytics data-cleaning data-visualization data-wrangling jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 19 Apr 2026

https://github.com/mlucifer27/bilateral-visualization

Streamlit app that visualizes bilateral relationship scores between 100 countries from 1945 to 2024. It supports interactive heatmaps, network graphs, pairwise comparisons, and more.

d3blocks data-analysis data-visualization plotly-python python streamlit

Last synced: 04 Jun 2026

https://github.com/akash-v7/telecom_customer_churn_prediction

A machine learning project to predict customer churn in the telecom industry using data analysis and classification models. The project includes data preprocessing, exploratory data analysis (EDA), model building, and insights to help telecom companies improve customer retention strategies.

data-analysis data-science data-visualization jupyter-notebook machine-learning predictive-modeling python

Last synced: 20 Apr 2026

https://github.com/montanaz0r/suicide-rate-analysis

Testing the significance of the correlation between the suicide rate and the number of psychiatrists and psychologists working in the mental health sector

analysis correlation data data-analysis data-science jupyter-notebook jupyter-notebooks matplotlib numpy pandas psychology python python-3 seaborn statistics suicide-rate

Last synced: 20 Apr 2026

https://github.com/namratha2301/carprice_analysisandprediction

This project analyzes factors influencing vehicle prices using a dataset of various attributes, including Engine capacity, Power, Mileage, and Seating capacity.

data-analysis data-visualization exploratory-data-analysis machine-learning pandas predictive-modeling random-forest-classifier regression scikit-learn seaborn

Last synced: 20 Apr 2026

https://github.com/jerinpious/movie-recommendation-system

A content-based movie recommendation system built using Python. The system processes movie data, extracts relevant features, and provides recommendations based on user preferences

content-based-recommendation data-analysis jupyter-notebook machine-learning pandas python streamlit

Last synced: 20 Apr 2026

https://github.com/anjaliwork20/moodify

Mood-based music recommendation system that considers a user's emotional state to recommend songs, genres, artists, and playlists using machine learning

artificial-intelligence cnn-keras cnn-model convolutional-neural-networks data data-analysis data-science data-structures data-visualization database deep-learning machine-learning machine-learning-algorithms python recommended song songs

Last synced: 20 Apr 2026

https://github.com/axsk/hotscrap

Parse hotslogs data to assist in picking

clojure data-analysis scraper

Last synced: 04 Jun 2026

https://github.com/xre22zax/roller-coaster

Explore award-winning wood and steel coasters from 2013-2018 Golden Ticket Awards & Captain Coaster, all powered by Python and interactive visualizations.

analytics data-analysis data-visualization pandas python python-lambda python3 visualization

Last synced: 20 Apr 2026

https://github.com/sarthakmishraa/bike_rental_predictor

Bike Sharing Dataset: This dataset contains the hourly and daily counts of rental bikes between 2011 and 2012 in the Capital Bikeshare system, with the corresponding weather and seasonal information.

data-analysis machine-learning python xgboost

Last synced: 20 Apr 2026

https://github.com/wtbates99/pandas-monday

Python library that provides seamless integration between pandas DataFrames and Monday.com boards. Easily read Monday.com board data into pandas DataFrames with support for subitems, pagination, and column filtering. Built with the Monday.com GraphQL API.

api-wrapper data-analysis data-integration dataframe graphql monday pandas productivity-tools python

Last synced: 20 Apr 2026

https://github.com/salfaris/toy-data-analysis

Random toy data projects. For my portfolio data projects, see linked website

data-analysis

Last synced: 20 Apr 2026

https://github.com/theveryhim/frequent-item-sets-and-lsh

An exercise in finding frequent item sets and similar items using the PySpark framework

big-data data-analysis frequent-itemset-mining locality-sensitive-hashing pyspark text-processing

Last synced: 03 Jul 2025

https://github.com/dulajkavinda/pandas-exploring-data-ml

🐼 Exploring data with pandas library.

data-analysis machine-learning pandas python

Last synced: 09 May 2026

https://github.com/docuvesta/la-mer-skincare-chicago-duty-free-analysis

Comparing La Mer product selection, availability and pricing from 3 different purchase locations ✈️

analytics cremedelamer data-analysis data-analytics data-science data-visualization lamer luxury plotly python seaborn skincare

Last synced: 21 Apr 2026

https://github.com/danpoynor/pet-shelter-data-analysis-notebook

Demonstration of skills analyzing data from a pet shelter. The CSV data contains tables detailing the incoming and outgoing animals and I use my knowledge of Pandas to gather and present the requested information.

csv data-analysis data-cleaning data-science jupyter-notebook matplotlib numpy pandas pet-shelter tabular-data

Last synced: 21 Apr 2026

https://github.com/nxion/sql-data-warehouse-project

Building a modern data warehouse with MS SQL Server, ETL processes, data modeling, and analytics.

data data-analysis data-analytics data-engineering data-lakehouse data-warehouse datalake datascience etl etl-job medallion-architecture ms mssql sql sql-query sql-server

Last synced: 05 Jun 2026

https://github.com/maddieemihle/home_sales

A PySpark-powered analysis of real estate trends using home sales data. This project explores average prices by year, room configuration, and property features, while demonstrating SparkSQL, caching, and partitioning techniques in a scalable data pipeline—all within Google Colab

apache-spark caching data-analysis googlecolab parquet pyspark sparksql

Last synced: 21 Apr 2026

https://github.com/mhuwaimel/data-analysis-of-students-results-in-qiyas

Analysis of student performance data from Qiyas (قياس), the Saudi Arabian National Center for Assessment

data-analysis jupyter-notebook python

Last synced: 22 Apr 2026

https://github.com/robinmillford/optimizing-treatment-plans-through-data-analysis

The primary focus was on understanding customer health, treatment, and associated charges over multiple years.

data-analysis data-visualization healthcare mysql powerbi sql

Last synced: 22 Apr 2026

https://github.com/thinogueiras/jornada-python

Jornada Python - Hashtag Programação.

data-analysis data-science inteligencia-artificial python rpa

Last synced: 22 Apr 2026

https://github.com/satvikpraveen/rsvp_case_study

A comprehensive IMDB dataset analysis using SQL. Includes database setup, advanced queries, and actionable insights. Organized with files for database creation, queries, and solutions. Features an Entity-Relationship Diagram (ERD), executive summary, and SQL scripts. Perfect for SQL workflows and business intelligence in the film industry.

aggregate-functions business-intelligence common-table-expressions data-analysis data-driven-decisions data-querying database-design entity-relationship-diagram imdb-dataset relational-database sql subqueries-and-joins

Last synced: 11 Jan 2026

https://github.com/kgotsosm/epl-analysis

Preparing data for machine learning algorithms to predict English Premier League match winners.

data-analysis data-cleaning data-modeling

Last synced: 22 Apr 2026

https://github.com/devexpress-examples/web-forms-pivot-grid-export-additional-captions-header-or-footer

This example illustrates how to add a custom header to the document exported to PDF in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 22 Apr 2026

https://github.com/floffah/my-listening

Various ways to analyse your Spotify extended streaming history data

convex data-analysis listening-history spotify

Last synced: 23 Apr 2026

https://github.com/soumya-thoutam/covid-19-impact-on-u.s.-states-and-colleges

COVID-19 analysis and its impact on United States colleges and states using SQL and Tableau.

covid-19 dashboard data-analysis data-visualization dataset sql sql-server tableau

Last synced: 04 Sep 2025

https://github.com/tranngoca5039/bigquery-a5y

📊 Streamline your data analysis with bigquery-a5y, a powerful tool for optimizing BigQuery performance and improving query efficiency.

analytics api big-data bigquery cloud-computing data-analysis data-integration data-management data-pipeline data-visualization data-warehouse google-cloud machine-learning serverless sql

Last synced: 05 Jun 2026

https://github.com/thc1006/nycu_timtable_crawler

🎓 NYCU Course Data Crawler & Timetable System | 國立陽明交通大學課程爬蟲與選課系統 - Python web scraper for course schedules, syllabi & educational data analysis. Crawls 18K+ courses with 98% success rate. Features: interactive timetable, JSON API, Google Colab support, batch processing, resume capability.

academic course course-selection crawler data-analysis education educational-data google-colab json-api nycu open-data python schedule student-tools syllabus taiwan timetable university web-automation web-scraping

Last synced: 24 Apr 2026

https://github.com/strixion/demoversion_ai

The demo version of StrixionAI

ai csv data-analysis data-analytics json python txt

Last synced: 24 Apr 2026

https://github.com/datalopes1/bank_marketing

This project is based on the Bank Marketing dataset found in the UC Irvine Machine Learning Repository and made available by S. Moro, R. Laureano, and P. Cortez.

data-analysis data-science data-visualization eda python

Last synced: 24 Apr 2026

https://github.com/marvinmarnold/oipm_stop_search

OIPM's analysis on Stop & Search (frisk) activity by the New Orleans Police Department.

data-analysis frisk new-orleans oipm police search stop

Last synced: 22 Jul 2025

https://github.com/yuvrajsaraogi/-iris-flower-classification

The iris flower has three species (setosa, versicolor, and virginica), which differ in their measurements. Assume that you have measurements of iris flowers labeled by species; the task is to train a machine learning model that learns from these measurements and classifies the species.

classification data data-analysis data-science data-visualization flower flower-classification iris iris-classification iris-flower iris-flower-classification knn knn-classification machine-learning machine-learning-algorithms ml natural-language-processing nlp python

Last synced: 24 Apr 2026

https://github.com/muthukumar0908/youtube-data-harvesting-and-warehousing-using-sql-mongodb-and-streamlit

Creates a simple and intuitive user interface using Streamlit. Data is retrieved and extracted from YouTube using an API key and stored in a database.

data-analysis mongodb-atlas python sqldatabase streamlit-webapp youtube-api

Last synced: 24 Apr 2026

https://github.com/manisharora96/data-analysis-of-smartwatch

The project is structured with sample data, step-by-step Jupyter notebooks, and modular Python scripts for automated analysis

data-analysis data-visualization jupyter-notebook python smartwatch-analysis

Last synced: 24 Apr 2026

https://github.com/adarshpheonix2810/fake-job-post-detection

This project focuses on detecting fake job posts using machine learning. Fake job advertisements are often created to scam individuals by stealing personal information or money.

data-analysis deep-learning joblib machine-learning nlp-machine-learning numpy pandas python scikit-learn tkinter

Last synced: 12 Apr 2026

https://github.com/edwinrlambert/emomap-sentiment-analysis

Analyzes public sentiment related to specific locations in a city (e.g., parks, transit stations, restaurants, neighborhoods) using geo-tagged social media posts, reviews, and comments. The goal is to visualize how people feel across different areas and times.

data-analysis jupyter-notebook python sentiment-analysis

Last synced: 24 Apr 2026

https://github.com/angelmtenor/idafc

Udacity's Intro to Data Analysis

data-analysis

Last synced: 20 Jun 2026

https://github.com/drill-n-bass/ovh-project

The goal of this task is to prepare a statistical analysis of a set of data from disks.

anaconda analysis data-analysis data-analysis-python jupyter-notebook matplotlib-python pandas python3 seaborn-plots

Last synced: 09 May 2026

https://github.com/sdley/cas_pratiques_a_rendre

Practical data processing exercises with Python.

data-analysis pandas python

Last synced: 09 May 2026

https://github.com/ismielabir/pycsvsummarizer

A lightweight tool to summarize CSV files using various features.

csv data-analysis data-summary python

Last synced: 25 Apr 2026

https://github.com/mehmetkahya0/gallstone_dataset_analysis_project

Gallstone Disease (Gallstone-1) Dataset Analysis (https://archive.ics.uci.edu/dataset/1150/gallstone-1)

analysis analytics data data-analysis data-science data-visualization database graph matplotlib python

Last synced: 25 Apr 2026

https://github.com/tmoulik/bikeshare-python

Analysis of Bikeshare data from three major cities

data-analysis data-visualization python udacity-nanodegree

Last synced: 25 Apr 2026

https://github.com/pranav016/exploratory-data-analysis-of-sp500-dataset

This is a data analysis that I performed on the S&P 500 dataset, answering a few questions through data visualization techniques.

data-analysis

Last synced: 30 Oct 2025

https://github.com/ddihora1604/iit_patna

A multifaceted project involving applying ML models like Ridge Classifier, RNN, RIDOR, Rotation Forest and RUSBoost, integrating SMOTE for class balancing, and handling diverse datasets including those for seating arrangement tasks.

data-analysis data-visualization datamodelling machine-learning-algorithms python

Last synced: 25 Apr 2026

https://github.com/edwinrlambert/investigating-netflix-movies

Demonstrates data analysis and visualization techniques for Netflix movies using Python in a Jupyter notebook. This is a DataCamp project.

data-analysis data-analysis-python netflix python

Last synced: 25 Apr 2026

https://github.com/chandansoren/customer-personality-analysis

Predict how different customer segments will respond to a particular product or service.

data-analysis data-visualization python

Last synced: 26 Apr 2026