Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/casualcomputer/sql.mechanic

Functions that generate SQL queries that summarize high-dimensional tables stored in various databases (e.g., Microsoft SQL Server, Netezza, DB2, Postgres, Oracle, MySQL).

data-analysis data-quality-checks data-science database mysql netezza oracle postgres quality-control r sql sql-server

Last synced: 04 Dec 2024

https://github.com/prithivsakthiur/data-board

Data Boards: visualization of various plots (analysis)

data-analysis gradio huggingface keras mathplotlib pandas plots pyplot scikit-learn seaborn spaces

Last synced: 13 Feb 2025

https://github.com/jimbrig/EDA

Exploratory Data Analysis R Package and Shiny App

data-analysis data-visualization eda r shiny

Last synced: 04 Dec 2024

https://github.com/thecoderpinar/reta

🍃 Explore the world of renewable energy production, analyze historical data, and predict sustainable energy trends. Join us on the journey to a greener future!

arima clean-energy data-analysis data-science data-visualization energy-future forecasting-models innovation renewable-energy sustainability time-series

Last synced: 09 Feb 2025

https://github.com/thealphadollar/messiah

Messiah: The Mighty Son Of God Is Here To Help You Through Times Of Calamity

azure backend data data-analysis flask frontend materialize natural-disasters

Last synced: 13 Feb 2025

https://github.com/nafiealhilaly/analyze-coderhub-sa

A simple web app to analyze and explore coderhub.sa API data. This project was my first real React app.

backend data-analysis eda frontend python react reactjs

Last synced: 08 Feb 2025

https://github.com/zrkhadija/data-analysis-for-financial-time-series

In this notebook, we performed data analysis on financial time series data from Yahoo Finance for the US market. We examined seasonality, trends, stationarity, and other aspects such as outliers and correlations.

autocorrelation correlation-analysis data-analysis financial-analysis time-series-analysis timeseries-forecasting visualization

Last synced: 09 Feb 2025

https://github.com/hvignolo87/ortex-programming-challenge

Coding challenges required for the Python Developer and Data Engineer job positions.

challenge data-analysis finance pandas python scripting sql sqlalchemy

Last synced: 02 Jan 2025

https://github.com/yogeshnile/nifty50-index-time-series-analysis

In this repo I analyzed five years of Nifty50 data, from 01-04-2015 to 31-03-2020. The data was downloaded from the official NSE website.

data-analysis matplotlib nifty numpy pandas plotly python3 time-series-analysis

Last synced: 10 Jan 2025

https://github.com/mindlessmuse666/client-data-analysing-tool

Industrial internship project: a data analysis tool built with Python (backend, PyQt6 frontend), Pandas, Matplotlib, and SQLite. The application lets users load data in CSV format, filter it, visualize key metrics with charts, and generate reports.

data-analysis desktop-application matplotlib pandas pyqt6 pyqt6-desktop-application python sqlite student-project

Last synced: 23 Dec 2024

https://github.com/zelosleone/finncorr

A .NET Core financial analysis tool/API for calculating correlations between time series data with interactive visualizations powered by ML.NET and Plotly.js.

aspnet-core correlation-analysis csv-parser data-analysis dotnet financial-analysis machine-learning ml-net plotly rest-api statistical-analysis swagger time-series visualization

Last synced: 06 Feb 2025

https://github.com/tirendazacademy/data-sets

Data sets for the Tirendaz Akademi YouTube channel

data-analysis dataset

Last synced: 01 Jan 2025

https://github.com/virajbhutada/tableau-data-vizzes

Engage with a growing collection of Tableau dashboards covering financial trends, HR analytics, streaming service insights, real estate dynamics, and more. Meticulously crafted for valuable insights, this repository continues to expand with new and compelling visualizations.

business-analytics data-analysis data-visualization hr-analytics industry-trends netflix performance-metrics stock-market-analysis strategic-analytics tableau visual-insights

Last synced: 10 Jan 2025

https://github.com/jimbrig/eda

Exploratory Data Analysis R Package and Shiny App

data-analysis data-visualization eda r shiny

Last synced: 23 Jan 2025

https://github.com/juliusmarkwei/concrete-data

Data analysis, machine learning, model evaluation, and optimization on the Concrete dataset

data-analysis data-science data-visualization ensemble-learning machine-learning modeling

Last synced: 01 Jan 2025

https://github.com/quantumudit/analyzing-gamerevolution-games

This project focuses on scraping video game data from the GameRevolution website, performing the necessary transformations on the scraped data, and then analyzing and visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 17 Feb 2025

https://github.com/verbasik/yandex.practicum.datascience

Portfolio of Data Science projects completed as part of professional retraining at Yandex.Practicum. Includes studies in finance, real estate, film distribution, and other areas, using statistics, machine learning, and data analysis.

data-analysis data-science machine-learning yandex-praktikum

Last synced: 10 Jan 2025

https://github.com/grburgess/gbm_kitty

Database, reduce, and analyze GBM data without having to know anything. Curiosity killed the catalog.

3ml catalogue data-analysis fermi-science grbs pipelines

Last synced: 23 Jan 2025

https://github.com/milind220/hk-air-quality-analysis

My final project for a statistics and data analysis course. Whew that was a lot of graphs!

data-analysis jupyter-notebook numpy pandas python python3 scipy seaborn statistics

Last synced: 03 Jan 2025

https://github.com/timzatko/fifa-19-dataset-machine-learning

Player value prediction and game position classification on the FIFA 19 dataset.

data-analysis fifa19 machine-learning scikit-learn

Last synced: 03 Jan 2025

https://github.com/seyedhosseinzadeh/ws_tm

Weather web scraping and a time series model to predict temperature, humidity, and barometric pressure

data-analysis deep-learning lstm-model machine-learning prediction prediction-model weather web-scraping

Last synced: 10 Jan 2025

https://github.com/olow304/goboard

Python Data Analysis Dashboard using Public Dataset, Django

dashboard dashboard-templates data-analysis data-science django jupyter-notebook machine-learning python sklearn

Last synced: 04 Jan 2025

https://github.com/ehtisham-sadiq/ai-pioneers-datascience-arena

This repository is dedicated to the AI Amigos team's participation in the Artificial Intelligence (AI) competition with a focus on Data Science.

artificial-intelligence competition data-analysis data-science data-visualization machine-learning model-building model-evaluation numpy pandas python3 supervised-learning unsupervised-learning

Last synced: 11 Jan 2025

https://github.com/vandita2020/merra2_nasa_wind_speed_analysis

In this study, we aim to explore the vulnerability of power grids in the south-east region of the USA with the help of data analysis tools and machine learning algorithms

data-analysis data-science machine-learning-algorithms python

Last synced: 11 Jan 2025

https://github.com/deep-diver/data-analysis-on-titanic

Applying data analysis to the Titanic data sheet

data-analysis titanic-data

Last synced: 05 Feb 2025

https://github.com/worst001/note_machine_learning

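A collection of machine learning materials and handbooks, including mathematical foundations, example implementations of machine learning models, and neural networks.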

ai data-analysis deep-learning development guide learning machine-learning markdown mkdocs note notebook

Last synced: 12 Jan 2025

https://github.com/mwoss/mlflow-stock-market-example

Stock market prediction: a machine learning pipeline using MLflow.

anaconda data-analysis databricks example lstm mlflow python stock-market stock-price-prediction tutorial

Last synced: 24 Jan 2025

https://github.com/mynenik/xyplot-32

Extensible Plotting and Data Analysis Program for 32-bit x86 GNU/Linux

cpp data-analysis data-manipulation data-visualization forth linux-app motif xwindows

Last synced: 24 Jan 2025

https://github.com/martincastroalvarez/django-data-analytics

Data Analytics, PnL, LTV & retention analysis with Django

analytics beautifulsoup4 d3 d3js data-analysis django ltv rest-api visualization

Last synced: 14 Feb 2025

https://github.com/thennen/py-ivtools

A package for flexible and reproducible measurement and analysis of current-voltage characteristics of electronic devices.

current-voltage data-analysis data-visualization electrical-engineering emerging-technology instrumentation measurements

Last synced: 24 Jan 2025

https://github.com/depressioncenter/data-and-design-core

Code developed by the EFDC Data and Design Core team to support mental health research.

data-analysis data-science efdc inference r statistical-analysis umich

Last synced: 25 Jan 2025

https://github.com/narius2030/sakila-datawarehouse-ssis

Implement a simple data warehouse to store Sakila data; create data pipelines to extract, transform, and load data from source to warehouse; retrieve data from the warehouse for exploration and analysis.

data-analysis data-integration data-modeling data-visualization excel microsoft-sql-server power-bi ssas ssis

Last synced: 07 Feb 2025

https://github.com/ziaeemehr/itng_nest

NEST Simulator quick guides and examples, including adding a new model using NESTML

computational-neuroscience data-analysis nest-simulator neuroscience

Last synced: 07 Feb 2025

https://github.com/tathithienthanh/datamining-banking-dataset

Implement some learned data mining techniques and predict if the client will subscribe to a term deposit

apriori association-rules classification clustering data-analysis data-mining data-processing google-colab ipynb kmeans naive-bayes py python scikit-learn svm visualization

Last synced: 25 Jan 2025

https://github.com/atxtechbro/flightradar24

Advanced Python application leveraging the power of APIs and the pandas library to retrieve and perform in-depth analysis of flight data from Flightradar24. It uncovers insights such as the most common departure and arrival cities, contributing to the field of aviation data science.

api-integration aviation-data data-analysis data-science data-visualization flightradar24-api pandas-library python requests-library web-scraping

Last synced: 25 Jan 2025

https://github.com/rajshrestha86/police-brutality-data-analysis

In this project, we analyze the events after George Floyd’s death: the protests and riots across the United States, and the sentiment of news articles from three news sources with different political leanings. We look at how these media reacted after Floyd’s death and at the effect of media bias on the sentiment of news about the #BlackLivesMatter and #AllLivesMatter movements. We also check whether there is a correlation between police budgets and the number of protests, which helps assess whether defunding the police is really needed to reduce police brutality and casualties. Finally, we examine the correlation between partisan segregation and the number of deaths to see whether political preference has an effect on the number of deaths by police.

data-analysis matplotlib pandas python sentiment-analysis web-scraping

Last synced: 07 Feb 2025

https://github.com/zachlagden/spotify-listening-analyzer

A comprehensive Python tool for analyzing your Spotify listening history data.

analytics data-analysis pandas python spotify-web-api spotipy

Last synced: 07 Feb 2025

https://github.com/chaitanyac22/house-price-prediction-project-for-a-us-based-housing-company

The goal of this project is to garner data insights using data analytics to purchase houses at a price below their actual value and flip them on at a higher price. This project aims at building an effective regression model using regularization (i.e. advanced linear regression: Ridge and Lasso regression) in order to predict the actual values of prospective housing properties and decide whether to invest in them or not.

advanced-linear-regression business-analytics data-analysis data-cleaning data-manipulation data-visualization exploratory-data-analysis feature-engineering lasso-regression linear-regression machine-learning model-building model-evaluation prediction-model python3 regularization rfe ridge-regression statistics

Last synced: 01 Feb 2025

https://github.com/bkataru/physics-ia

Programs and files written for Astrostatistics for IB Physics IA. Topic: Visualizing and analyzing the habitable zones for 150,000 stars from the Hipparcos catalogue.

astronomical-algorithms astronomy astrophysics astrostatistics data-analysis data-science data-visualization matplotlib plotting

Last synced: 15 Feb 2025

https://github.com/poga/dat-ipynb-demo

Use an IPython notebook to analyze data in a dat archive

dat data-analysis distributed jupyter-notebook

Last synced: 08 Feb 2025

https://github.com/muzammil-13/data_analysis-inmakes

A data-driven project that leverages machine learning to predict Bitcoin price trends. Using historical Bitcoin data, this analysis provides 30-day price forecasts through advanced statistical modeling.

data-analysis data-science machine-learning numpy pandas python python-library

Last synced: 15 Feb 2025

https://github.com/lafayettegabe/g2m-insight-for-cab-investment-firm

📊 Exploratory Data Analysis (EDA) on multiple datasets related to the cab industry in the US, to provide actionable insights and recommendations to a private firm looking to invest in the market. The analysis includes data cleaning, transformation, visualization, and hypothesis testing.

big-data data data-analysis data-science data-visualization eda gotomarket

Last synced: 08 Feb 2025

https://github.com/joanacmbarros/ardm-website

Website to support the R in Pharma 2023 workshop on the ARDM

analysis-results automation clinical-data data-analysis data-model r-in-pharma

Last synced: 09 Feb 2025

https://github.com/rapidsurveys/oldr

An Implementation of the Rapid Assessment Method for Older People (RAM-OP)

assessment data-analysis epidata estimate odk older-people r ram-op ranalyticflow rapid-assessment

Last synced: 16 Feb 2025

https://github.com/invictusaman/socioeconomic-indicators-in-chicago-sql-python

This project shows how to create a database connection in a notebook, update the database using Python, and run Python code and SQL queries together. It uses SQLite and the Chicago dataset for analysis.

data-analysis jupyter-notebook python sql sql-queries sqlite

Last synced: 16 Feb 2025

https://github.com/virajbhutada/movie-rental-store-analytics-sql-powerbi-excel

Dive into the DVD rental industry with my Capstone project, Movie Rental Analytics. Analyzing the Sakila DVD Rental Store Database, I extract insights through exploratory data analysis (EDA) and Power BI visualizations. Findings inform strategies for optimizing film inventory, enhancing business operations, and customer experiences.

business-intelligence capstone-project customer-behavior-analysis data-analysis data-science excel exploratory-data-analysis film-ratings mece movie-database movie-rental mysql powerbi powerbi-visuals revenue-analysis sql sql-database

Last synced: 10 Jan 2025

https://github.com/walidalsafadi/titanic-disaster

In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e., name, age, gender, socio-economic class, etc.).

data-analysis data-science decision-trees eda gradient-boosting knearest-neighbors machine-learning-algorithms naive-bayes random-forest titanic-kaggle titanic-survival-prediction

Last synced: 22 Jan 2025

https://github.com/ronaldkanyepi/python-sreamlit-duplicate-records-finder-remover

A duplicate-record remover for CSV, Excel, or TXT files, based on a single column or multiple columns.

css data-analysis data-visualization datascience python streamlit

Last synced: 04 Jan 2025

https://github.com/leosimoes/uerj-tcc-analisador-dados

Final course project (TCC) in Computer Engineering. Web application for data preparation and analysis, chart creation, and linear and logistic regression models.

computer-engineer data-analysis data-science data-visualization linear-logistic linear-regression python streamlit

Last synced: 30 Jan 2025

https://github.com/pheithar/socialdata_madridcentral

Social data and visualization course at DTU - 2022. Effectiveness of Madrid Central

data-analysis data-visualization jupyer-notebook madrid python

Last synced: 29 Jan 2025

https://github.com/bishtrishu/pizza_sales_data_analysis_sql

This project is a comprehensive data analysis of pizza sales, aimed at uncovering key insights and trends to inform business decisions. Using a combination of SQL, Python, and data visualization tools, the project analyzes sales data to understand customer preferences, peak sales periods, and the most popular pizza types.

cloud data data-analysis data-science data-visualization dataanalytics database mysql oracle-database

Last synced: 04 Jan 2025

https://github.com/seabbs/explorebcgonoutcomes

Analysis to explore the association of BCG vaccination and TB outcomes.

bcg data-analysis regression rstats tuberculosis

Last synced: 01 Jan 2025

https://github.com/rayyan9477/household-transactions-analysis-and-clustering

This project involves analyzing household transaction data to gain insights into spending patterns and behaviors. The analysis includes data cleaning, exploratory data analysis (EDA), clustering using K-Means, and visualization of customer segments.

customer-segmentation data-analysis data-cleaning data-science exploratory-data-analysis kmeans-clustering machine-learning

Last synced: 10 Jan 2025

https://github.com/rayyan9477/youtube-spam-detection-with-flask-and-machine-learning

This is a web application built using Flask that detects spam comments on YouTube using a Naive Bayes classifier. It leverages techniques such as CountVectorizer for feature extraction and scikit-learn for machine learning. The application reads data from a CSV file and predicts whether a comment is spam or not.

data-analysis data-science machine-learning nlp-machine-learning spam-detection

Last synced: 10 Jan 2025

https://github.com/nelsonkariuki/dataanalysis

This project involves data analysis of video game sales from https://www.kaggle.com/gregorut/videogamesales/download

data-analysis data-visualization python

Last synced: 10 Jan 2025

https://github.com/rayyan9477/coin-detection-project

This Coin Detection Project leverages machine learning techniques to identify coins using a dataset from Kaggle. Key libraries utilized include OpenCV for image processing, TensorFlow for model training, and Pandas for data manipulation. The project also employs NumPy for numerical operations and Matplotlib for visualization.

computer-vision data-analysis data-science data-visualization machine-learning notebook python

Last synced: 10 Jan 2025

https://github.com/rayyan9477/multiple-disease-prediction-system

This repository contains a Multiple Disease Prediction System leveraging machine learning techniques for accurate predictions. It utilizes Python, Pandas, Scikit-learn, and Flask for data preprocessing, model building, and web deployment. Explore the project and connect on LinkedIn for collaborations.

data-analysis data-science machine-learning python streamlit

Last synced: 10 Jan 2025

https://github.com/agustinmusanti/delitosencaba-proyectofinal-dataanalytics-coderhouse

In this repository I present my final project for the "Data Analytics" course at Coderhouse.

data-analysis excel powerbi

Last synced: 15 Feb 2025

https://github.com/rayyan9477/diamond-price-forecasting

This is a comprehensive machine learning project focused on predicting diamond prices. Using a dataset of diamond attributes, the project implements various machine learning models to forecast prices. Key features include data preprocessing, exploratory data analysis (EDA), and model training with algorithms such as Linear Regression, Decision Tree

data-analysis data-science decision-trees eda linear-regression machine-learning

Last synced: 10 Jan 2025

https://github.com/mynenik/xyplot-win32

XYPLOT Plotting and Data Analysis Program for 32-bit Windows

cpp data-analysis data-manipulation data-visualization forth mfc windows-app

Last synced: 24 Jan 2025

https://github.com/oguzgn/budget-checker-for-campaign-budget-allocation

This project focuses on modeling campaign performance data for Looker, helping determine which campaigns to scale up or cut back. It aggregates metrics over the last 7 and 30 days, providing actionable insights for budget optimization and performance improvement.

budget-allocation budget-controller budget-management calculated-fields campaign-analytics data-analysis data-modeling looker-studio sql

Last synced: 07 Feb 2025

https://github.com/kaz-yos/distributed

Comparison of Privacy-Protecting Analytic and Data-sharing Methods: a Simulation Study (Pharmacoepidemiol Drug Saf 2018)

data-analysis epidemiology statistics

Last synced: 11 Jan 2025

https://github.com/ivanildobarauna-dev/currency-quote

Complete solution for extracting currency pair quotes data with comprehensive testing, parameter validation, flexible configuration management, Hexagonal Architecture, CI/CD pipelines, code quality tools, and detailed documentation.

data-analysis data-analytics data-engineering library pypi-packages python

Last synced: 12 Feb 2025

https://github.com/manikantasanjay/time_series_data_analysis_on_stocks

Time Series Data Analysis project on the daily stock prices of four companies (Apple, Microsoft, Google, Amazon) over a span of 5 years.

data-analysis pandas stock time-series time-series-analysis

Last synced: 14 Feb 2025

https://github.com/magnaopus1/synthron-cfd-trader-pro

SYNTHRON CFD Trader PRO is a cutting-edge trading platform featuring raw, custom-designed machine learning models. From reinforcement learning for dynamic strategies to predictive analytics, sentiment analysis, and optimization techniques, it empowers trading across stocks, forex, indices, commodities, futures, and crypto with precision.

ai backtesting cfd commodities data-analysis data-science data-structures forex futures indices machine-learning trading

Last synced: 05 Feb 2025

https://github.com/alexandrelamarre/fission

Data analytics & Structured streaming optimized for the Edge

data-analysis data-engineering rust structured-data unstructured-data

Last synced: 11 Jan 2025

https://github.com/ganesh2409/cricket-player-performance

This repository contains a comprehensive project focused on analyzing cricket player performance using various datasets, including batting, bowling, and match results. The project involves data preprocessing, feature engineering, and model training to predict and evaluate player performance scores. It includes detailed scripts for data analysis

cricket-performance-analysis data-analysis machine-learning sports-analytics

Last synced: 11 Jan 2025

https://github.com/shriram-vibhute/digit_classification

This project demonstrates various machine learning techniques for classifying handwritten digits from the MNIST dataset. It covers data preprocessing, model training, evaluation, and advanced classification strategies.

classification data-analysis data-visualization machine-learning matplotlib numpy pandas sk-learn

Last synced: 15 Jan 2025

https://github.com/leosimoes/udacity-starbucks

Project 3 of the Udacity Machine Learning Engineer Nanodegree Program. Data analysis and machine learning applied to Starbucks data.

aws-iam aws-s3 aws-sagemaker data-analysis data-science machine-learning python

Last synced: 30 Jan 2025

https://github.com/as16082023/coffee-bean-sales-analysis

Analyzing coffee bean sales data to optimize consumer targeting, product offerings, and strategic marketing in the coffee industry.

coffee-bean-sales dashboard data-analysis data-visualization ms-excel

Last synced: 15 Feb 2025

https://github.com/shriram-vibhute/data-analysis

This repository offers a comprehensive collection of data analysis techniques using NumPy, Pandas, Matplotlib, and Seaborn.

data-aggregation data-analysis data-visualization data-wrangling matplotlib numpy pandas seaborn

Last synced: 15 Jan 2025

https://github.com/ysayaovong/stockroom_management

The Stockroom Management project is a comprehensive tool that automates and simplifies the process of managing inventory in stockrooms. By incorporating features like real-time updates, report generation, and low-stock alerts, it helps businesses save time, reduce errors, and optimize their inventory operations.

business-applications data-analysis data-visualization database-management inventory-control inventory-management logistics sql warehouse warehouse-management

Last synced: 30 Jan 2025