Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/adrianycmc/introducaoadatascience

Explorando dados: Utilizando Python, Pandas e o Colaboratory do Google.

data-analysis data-science jupyter pandas python

Last synced: 12 Feb 2025

https://github.com/noturlee/imdb-dataanalysis

A data model that predicts the IMDb rating of a movie based on features like genre, director, and actors. Using regression techniques to tackle this problem.

data-analysis data-cleaning data-modeling data-science data-visualization

Last synced: 14 Feb 2025

https://github.com/bretsw/subreddits-over-time

Study of the r/Teachers and r/education subreddits over time

data-analysis dataset reddit

Last synced: 06 Feb 2025

https://github.com/savinrazvan/degrees

A program that computes the "degrees of separation" between two actors by identifying the sequence of movies connecting them, inspired by the Six Degrees of Kevin Bacon game. Uses IMDb-based datasets for actors, movies, and their relationships.

actor-connections ai data-analysis degrees-of-separation educational-project graph-theory imdb movie-database python six-degrees-of-kevin-bacon

Last synced: 10 Jan 2025

https://github.com/savinrazvan/heredity

An AI that assesses the likelihood of genetic traits in individuals using a Bayesian Network to analyze family genetic data, modeling genetic inheritance and mutations to infer probabilities of gene presence and trait expression.

ai bayesian-network biological-data-analysis data-analysis educational-project family-genetics genetic-inheritance genetic-traits heredity mutation-modeling probability-calculation python

Last synced: 10 Jan 2025

https://github.com/noodleslove/house-of-representative-analysis-i

This project uses public data about the stock trades made by members of the US House of Representatives.

data-analysis data-science eda kaggle-dataset matplotlib-pyplot pandas python stocks-trading

Last synced: 28 Jan 2025

https://github.com/al-ghaly/power-bi-dashboard

A dashboard to analyze data specializations job market.

dashboard data-analysis powerbi

Last synced: 22 Jan 2025

https://github.com/nysportsfan/Gun-Violence-in-the-US

This repository contains all the relevant files for my first capstone project as part of the Springboard Data Science Career Track.

data-analysis data-science data-visualization machine-learning python3 statistics

Last synced: 16 Nov 2024

https://github.com/jubinjacob03/heartdiseaseclassify-ml

Heart Disease Dataset Analysis & Classification using ML models such as linear, support vector machine, k-means, k-nearest neighbors and logistic regression.

data-analysis data-science data-visualization ipython-notebook kaggle-dataset kmeans knn linear-regression logistic-regression machine-learning matplotlib python seaborn support-vector-machine

Last synced: 12 Feb 2025

https://github.com/sumidcyber/dataviz-master

This Python application provides a user-friendly interface to load and visualize the contents of a CSV file. Users can choose from various types of graphs and perform analyses on the dataset.

data-analysis data-analysis-project data-analysis-python database databases python python3

Last synced: 22 Jan 2025

https://github.com/virajbhutada/coursera-google-data-analytics-capstone

A repository containing the Capstone project for the Google Data Analytics Professional Certificate, focusing on analyzing FitBit fitness tracker usage data to derive insights relevant to Bellabeat, a wellness technology company.

coursera data-analysis data-visualization fitbit fitness-tracker google google-data-analytics-capstone-project python-programming

Last synced: 10 Jan 2025

https://github.com/sarincr/basics-of-julia-programming-language

Julia is a high-level, high-performance, dynamic programming language. While it is a general purpose language and can be used to write any application, many of its features are well-suited for high-performance numerical analysis and computational science.

data data-analysis data-mining data-science data-visualization dataanalysis dataanalytics datascience julia julia-language julia-library julia-package julialang machine-learning

Last synced: 21 Jan 2025

https://github.com/as16082023/hotel-booking-analysis-eda-

Exploratory Data Analysis on hotel booking data using Python

data-analysis data-visualization exploratory-data-analysis jupyter-notebook python

Last synced: 15 Feb 2025

https://github.com/as16082023/restaurant-order-analysis

Analyzing order data to identify the most and least popular menu items and types of cuisine

data-analysis maven-analytics mysql restaurant-order sql

Last synced: 15 Feb 2025

https://github.com/as16082023/music-store-analysis

This project involves analyzing music store data using SQL queries in MySQL workbench to enhance decision-making, identify trends, and understand customer behavior

data-analysis music-store-analysis mysql sql

Last synced: 15 Feb 2025

https://github.com/viper373/163-buff

爬取网易BUFF平台CS:GO武器皮肤交易数据

163 arima crawler-python csgo data-analysis prediction python

Last synced: 05 Feb 2025

https://github.com/hfxbse/dhbw-data-analysis

Exploratory data analysis R notebook for the module T3INF4333 "Grundlagen Data Science" held in 2024 by Lothar B. Blum at the DHBW Stuttgart.

data-analysis data-science dhbw dhbw-stuttgart ggplot2 r r-notebook

Last synced: 12 Feb 2025

https://github.com/ivanildobarauna-dev/currency-quote

Complete solution for extracting currency pair quotes data with comprehensive testing, parameter validation, flexible configuration management, Hexagonal Architecture, CI/CD pipelines, code quality tools, and detailed documentation.

data-analysis data-analytics data-engineering library pypi-packages python

Last synced: 12 Feb 2025

https://github.com/azmainadel/twitter-data-neo4j

Playing with graph database on a large dataset of twitter data.

data-analysis data-visualization neo4j-database snap

Last synced: 12 Feb 2025

https://github.com/discdiver/new-belgium-ratings

Find the most popular New Belgium beers of all time!

beautifulsoup data-analysis pandas python seaborn webscraping

Last synced: 10 Jan 2025

https://github.com/emredurukn/data-analysis

Example notebooks for analyzing data

data-analysis data-visualization python

Last synced: 10 Jan 2025

https://github.com/ifibla/adsdb-project

Algorithms, Data Structures and Databases Project

data-analysis data-engineering python

Last synced: 28 Jan 2025

https://github.com/abhi-lab2/ipl-data-analysis

IPL data analysis for future predictions

data-analysis data-science python

Last synced: 19 Feb 2025

https://github.com/seabbs/explorebcgonoutcomes

Analysis to explore the association of BCG vaccination and TB outcomes.

bcg data-analysis regression rstats tuberculosis

Last synced: 01 Jan 2025

https://github.com/rayyan9477/household-transactions-analysis-and-clustering

This project involves analyzing household transaction data to gain insights into spending patterns and behaviors. The analysis includes data cleaning, exploratory data analysis (EDA), clustering using K-Means, and visualization of customer segments.

customer-segmentation data-analysis data-cleaning data-science exploratory-data-analysis kmeans-clustering machine-learning

Last synced: 10 Jan 2025

https://github.com/rayyan9477/youtube-spam-detection-with-flask-and-machine-learning

This is a web application built using Flask that detects spam comments on YouTube using a Naive Bayes classifier. It leverages techniques such as CountVectorizer for feature extraction and scikit-learn for machine learning. The application reads data from a CSV file and predicts whether a comment is spam or not.

data-analysis data-science machine-learning nlp-machine-learning spam-detection

Last synced: 10 Jan 2025

https://github.com/rayyan9477/coin-detection-project

This Coin Detection Project leverages machine learning techniques to identify coins using a dataset from Kaggle. Key libraries utilized include OpenCV for image processing, TensorFlow for model training, and Pandas for data manipulation. The project also employs NumPy for numerical operations and Matplotlib for visualization.

computer-vision data-analysis data-science data-visualization machine-learning notebook python

Last synced: 10 Jan 2025

https://github.com/rayyan9477/multiple-disease-prediction-system

This repository contains a Multiple Disease Prediction System leveraging machine learning techniques for accurate predictions. It utilizes Python, Pandas, Scikit-learn, and Flask for data preprocessing, model building, and web deployment. Explore the project and connect on LinkedIn for collaborations.

data-analysis data-science machine-learning python streamlit

Last synced: 10 Jan 2025

https://github.com/rayyan9477/diamond-price-forecasting

This is a comprehensive machine learning project focused on predicting diamond prices. Using a dataset of diamond attributes, the project implements various machine learning models to forecast prices. Key features include data preprocessing, exploratory data analysis (EDA), and model training with algorithms such as Linear Regression, Decision Tree

data-analysis data-science decision-trees eda linear-regression machine-learning

Last synced: 10 Jan 2025

https://github.com/aravind-selvam/bikeshare-company-analysis

Google Data Analytics Professional Certificate program's Capstone project, of a bike sharing company

analytics business-analytics business-intelligence data data-analysis data-visualization dataanalytics google-data-analytics postgresql sql sql-server

Last synced: 14 Jan 2025

https://github.com/shelton-beep/predicting-gpa-using-lifestyle-factors

Predicting student GPA using lifestyle factors like study habits, sleep, and stress levels. A machine learning model built to help students and educators understand the impact of lifestyle choices on academic performance.

data-analysis data-preprocessing data-science feature-engineering gpa-prediction machine-learning model-interpretability predictive-modeling python regression-analysis student-performance xgboost

Last synced: 29 Jan 2025

https://github.com/magnaopus1/synthron-cfd-trader-pro

SYNTHRON CFD Trader PRO is a cutting-edge trading platform featuring raw, custom-designed machine learning models. From reinforcement learning for dynamic strategies to predictive analytics, sentiment analysis, and optimization techniques, it empowers trading across stocks, forex, indices, commodities, futures, and crypto with precision.

ai backtesting cfd commodities data-analysis data-science data-structures forex futures indices machine-learning trading

Last synced: 05 Feb 2025

https://github.com/akash1070/data-science-virtual-internship-by-anz

Exploratory data analysis and prediction of annual salary for customers from the dataset provided by ANZ.

data-analysis data-science predictive-analytics presentation-slides

Last synced: 29 Jan 2025

https://github.com/antononcube/wl-quantileregression-paclet

Wolfram Language (aka Mathematica) paclet that provides various Quantile Regression functions.

data-analysis machine-learning quantile-regression time-series time-series-analysis

Last synced: 08 Feb 2025

https://github.com/jatin-mehra119/bike-rentals-dataset

This repository focuses on optimizing bike rental availability during peak hours and days using machine learning techniques. Leveraging publicly available data from the UCI Machine Learning Repository, it includes scripts for data preprocessing, model training, and visualization, along with detailed observations and results.

data-analysis data-science ensemble-model pandas scikitlearn-machine-learning

Last synced: 17 Jan 2025

https://github.com/juliusmarkwei/titanic-data-analysis

Data analysis, data visualization, feature scaling, feature transformation, model selection and model optimization.

data-analysis data-science data-visualization linear-regression model-selection regression

Last synced: 01 Jan 2025

https://github.com/juliusmarkwei/iris-dataset-analysis

Data analysis, data visualization and model training using the popular Iris Dataset

data-analysis data-visualisation linear-regression machine-learning

Last synced: 01 Jan 2025

https://github.com/shreeparab1890/flipkart-laptops-analysis-eda

This ipython notebook is the Exploratory data analysis (EDA) of the Laptops listed on Flipkart.

data-analysis eda exploratory-data-analysis matplotlib-pyplot numpy pandas plotly

Last synced: 01 Jan 2025

https://github.com/tim-hub/python-course

A new Python Course, a new trial to offer MOOC style learning resources and content for python learners

data-analysis learning python

Last synced: 23 Jan 2025

https://github.com/ultrasage-danz/weather-data-analysis

Weather Data Analysis notebook project. Created using Google collab

collaboration data-analysis data-science dataset google google-colab-notebook project

Last synced: 30 Jan 2025

https://github.com/vkbo/osirisanalysis

Matlab toolbox for analysing simulation results from Osiris 3

data-analysis matlab matlab-gui physics-simulation

Last synced: 16 Nov 2024

https://github.com/leosimoes/datascienceacademy-powerbi-3.0

Projetos do curso Microsoft Power BI Para Data Science Versão 3.0 da DataScienceAcademy. Dashboards para diversos casos de negócios.

business-intelligence dashboards data-analysis data-visualization microsoft-power-bi

Last synced: 30 Jan 2025

https://github.com/leosimoes/uerj-tcc-analisador-dados

Trabalho de conclusão de curso (TCC) em Engenharia de Computação. Aplicativo Web para preparação e análise de dados, criação de gráficos e modelos de regressão linear e logistica.

computer-engineer data-analysis data-science data-visualization linear-logistic linear-regression python streamlit

Last synced: 30 Jan 2025

https://github.com/aritrakar/statpy

A simple package containing some functions for analysing Gaussian and Binomial distributions. Created for the Udacity AWS MLE Foundations 2021 course.

data-analysis python statistics

Last synced: 01 Jan 2025

https://github.com/leosimoes/udacity-starbucks

Project 3 of the Udacity Machine Learning Engineer Nanodegree Program. Data analysis and machine learning application to Starbukcs data.

aws-iam aws-s3 aws-sagemaker data-analysis data-science machine-learning python

Last synced: 30 Jan 2025

https://github.com/prime-infinity/type-one

Software to visualize and analyze GitHub repos based on certain statistics such as stars, forks and issues

data-analysis data-visualization

Last synced: 24 Nov 2024

https://github.com/mayankyadav23/air-bnb-data-analysis

Data analysis and insights from NYC Airbnb listings, focusing on key metrics such as host performance, neighborhood trends, pricing, and customer reviews. Comprehensive documentation of ETL processes and analytical methodologies is provided. Perfect for understanding Airbnb dynamics and decision-making in the NYC market.

advanced-excel business-intelligence data-analysis data-analytics data-visualization power-bi ppt

Last synced: 10 Jan 2025

https://github.com/vara-co/python-api-challenge

Weather and Perfect Vacationing Spots Worldwide, by using APIs

api apis data-analysis data-visualization hvplot jupyter-notebook matplotlib pandas vacation weather

Last synced: 02 Feb 2025

https://github.com/allanotieno254/employee-performance-tracker-excel-

An Excel-based tool to track and evaluate employee performance, compliance, and skills assessments with summary statistics and visual charts

compliance-tracker data-analysis employee-performance-analysis excel human-resources

Last synced: 17 Feb 2025

https://github.com/allanotieno254/road-accident-data-analysis-dashboard-using-excel

This repository contains the Road Accident Data Analysis Dashboard, a comprehensive Excel-based tool designed to provide in-depth analysis and visualization of road accident data.

dashboards-excel data-analysis excel kpi visualization

Last synced: 17 Feb 2025

https://github.com/anushadatta/airbnb-in-seattle

🏨 Understanding the Airbnb rental landscape in Seattle using data science.

airbnb data-analysis data-exploration data-visualization datascience sentiment-analysis

Last synced: 05 Feb 2025

https://github.com/alphonsg/bimana

Package for performing automated bio-image analysis tasks.

bioimage-analysis bioinformatics data-analysis deep-learning edge-detection image-analysis image-processing

Last synced: 13 Feb 2025

https://github.com/johnsesana/eda-liquor-sales

Exploratory Data Analysis on Public Datasets

data-analysis data-visualization sql tableau-dashboards

Last synced: 17 Jan 2025

https://github.com/quantumudit/sales-statistical-analysis

This project focuses on a statistical analysis (using SQL queries) of various key metrics that impacts the overall sales of a certain fictitious store.

data-analysis postgresql sales-analysis sql statistics

Last synced: 17 Feb 2025

https://github.com/jmssnr/shuffle-kit

shuffle-kit: model and analyze playing card shuffles in Python

data-analysis playing-cards python shuffle statistics

Last synced: 02 Jan 2025

https://github.com/wrighang/shipping-data-analysis

Independent Project: Transit time trends analysis following a major shipping process change.

data-analysis matplotlib numpy pandas python

Last synced: 23 Jan 2025

https://github.com/manikantasanjay/time_series_data_analysis_on_stocks

Time Series Data Analysis project on Daily Stock Prices of the following companies(Apple, Microsoft, Google, Amazon) for a span of 5 years.

data-analysis pandas stock time-series time-series-analysis

Last synced: 14 Feb 2025

https://github.com/titanscouting/tra-superscript

The Red Alliance data analysis package

data-analysis frc-scouting hacktoberfest python

Last synced: 22 Nov 2024

https://github.com/0xpr03/clantool

CF Management & Data Analysis Tool, crawler backend in rust

backend-server crawler data-analysis rust

Last synced: 02 Jan 2025

https://github.com/nomadsdev/sys-moninsight

System Monitoring and Analysis Tool is a utility for real-time performance tracking. It logs CPU, memory, and disk usage, provides visual graphs, and offers performance recommendations. Perfect for optimizing system efficiency.

automation cpu-usage data-analysis data-visualization disk-usage matplotlib memory-usage performance-analysis performance-optimization psutil python real-time-monitoring resource-management sys-moninsight system-metrics

Last synced: 02 Jan 2025

https://github.com/2kabhishek/pyramen

Data Analysis for Ramen 🍜💹

csv data data-analysis fun python report

Last synced: 12 Jan 2025

https://github.com/arhcoder/base-hackathon-2022

💸 Sistema que analiza las facturas de compra-venta de una empresa de importaciones y exportaciones, y crea una base de conocimiento con la que crea sugerencias de abastecimiento para las empresas clientes de Banco BASE, con el fin de ahorrarles dinero.

algorithms bank companies data-analysis decision-making exportation hackaton importation javascript mysql python suggestions

Last synced: 10 Jan 2025

https://github.com/ssreeramj/youtube_channels_analysis

This web app gives a detailed analysis of the videos uploaded in a particular youtube channel.

data-analysis heroku pandas python streamlit youtube

Last synced: 02 Jan 2025

https://github.com/draym/swmanager

Web-app to help you in your daily life raids in SpacesWars thanks to game statistics and data management

dashboard-application data-analysis data-visualization game-data game-utility

Last synced: 10 Jan 2025

https://github.com/jfjlaros/spreadscript

SpreadScript: Use a spreadsheet as a function.

automation command-line data-analysis evaluation function interface spreadsheet

Last synced: 12 Jan 2025

https://github.com/pheithar/socialdata_madridcentral

Social data and visualization course at DTU - 2022. Effectiveness of Madrid Central

data-analysis data-visualization jupyer-notebook madrid python

Last synced: 29 Jan 2025

https://github.com/colburncodes/se_pudding_2023

This project is a React app designed to showcase research conducted by a team of data scientists and data analysts. The app is utilizing React and React-Chartjs-2

chartjs-2 data-analysis data-science data-visualization react-chartjs-2 reactjs

Last synced: 22 Jan 2025

https://github.com/sathyasris27/statistical-analysis-on-rehoming-time-for-different-dog-breeds-in-animal-shelter

The aim of this project is given a collection of records documenting the stray, unwanted, or neglected dogs sent to animal shelters to be rehomed, we analyse their rehoming patterns based on their breeds.

data-analysis r statistical-analysis statistical-inference statistical-models

Last synced: 10 Jan 2025

https://github.com/sathyasris27/time-series-and-spectral-analysis-

The aim of this project involves the analyses the data, removing trends and seasonal effects, identifying the underlying process, understanding the dominant frequencies, and using the residuals to make predictions.

data-analysis data-visualization forecasting r spectral-analysis time-series-analysis

Last synced: 10 Jan 2025

https://github.com/gauravcodepro/numpy-builder

A numpy shell builder to extract and how to use the numpy across the arrays.I am putting the entire manual for those who like to search immediately rather than looking here and there.

bash-prompt bash-script bash-scripting data-analysis data-mining data-science numpy numpy-arrays shell-prompt shell-script

Last synced: 02 Jan 2025

https://github.com/raad07/sql_project-world_layoffs_dataset

This is a SQL project which comprises the Data Cleaning in the first part and Exploratory Data Analysis (EDA) in the second part.

data-analysis database mysql sql

Last synced: 22 Jan 2025

https://github.com/faisal-khann/diwali-sales-analysis

The "Diwali Sales Analysis" project aims to analyze the sales data during the Diwali festival period to uncover insights and trends that can help improve marketing strategies and sales performance in the future

csv data-analysis eda jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 29 Jan 2025

https://github.com/lobooooooo14/badwords-pt-br

💬 Wordlist com palavrões em pt-BR para análise de dados, filtros, ou texto considerado "evitável"

badword-filter badwords brasil data-analysis filter filter-lists filterlist portugues portuguese text-analysis wordlist

Last synced: 30 Jan 2025

https://github.com/phammings/sales-management-analysis

Sales management analysis and Power BI dashboard for sample business request and user stories

data-analysis excel powerbi sql

Last synced: 15 Jan 2025

https://github.com/yonatanadam/film-success-prediction

Analyzing Hollywood movie success based on genre, target audience, and runtime using machine learning

data-analysis ipynb machine-learning

Last synced: 13 Feb 2025

https://github.com/ryanfranklin237/data-visualization-python

A tool that allows you to visualize data from a csv or excel file in a graph or charts form

data-analysis data-science data-visualization matplotlib pandas-dataframe python

Last synced: 10 Jan 2025

https://github.com/ryanfranklin237/data-cleansing

A group of python scripts that clean large data sets by removing duplicate data, putting data in correct formats, and removing redundant cells

data-analysis data-cleaning data-science extract-transform-load pandas-dataframe python

Last synced: 10 Jan 2025