Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/cosmoduende/r-uber-trips-analyisis

Explore your activity on Uber with R: How to analyze and visualize your personal data history. Find out how you consume the Uber App using a copy of your data.

analisis-de-data data-analysis data-analytics data-science data-visualisation data-visualization data-viz eda flexdashboard ggmap ggplot2 mobility-as-a-service qmplot r-language r-programming ridesharing uber uber-data visualizacion-de-datos

Last synced: 27 Dec 2024

https://github.com/kumaranand05/suicide-rate-analysis

Analysis of WHO mortality data, with visualization in Power BI

analytics data-analysis data-visualization mortality-rates powerbi python suicide-dataset suicide-rate

Last synced: 25 Dec 2024

https://github.com/monddavila/online-retail-data-analysis

Online Retail Exploratory Data Analysis with Python

data-analysis jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 14 Jan 2025

https://github.com/rodrigojunqueiradev/exploracao-e-limpeza-de-dados

Repository used for studies on "Data Exploration and Cleaning" (Exploração e Limpeza de Dados), following the book "Projetos de Ciência de Dados com Python" as a guide

data-analysis data-engineering data-science data-visualization datascience matplotlib matplotlib-pyplot numpy pandas python python-3 python3

Last synced: 20 Jan 2025

https://github.com/rodrigojunqueiradev/python-exercises

Repository for storing exercises written in Python

data-analysis data-science data-structures data-visualization database math pandas pandas-python python python-3 python3 sql statistics

Last synced: 20 Jan 2025

https://github.com/devbigboy/excel-power-query-get-transform

Power Query is a feature in Excel that allows you to quickly import data from multiple sources and easily clean, transform, and reshape it to suit your needs.

data-analysis data-science excel

Last synced: 27 Dec 2024

https://github.com/devbigboy/excel-advanced-formulas-and-functions

How to develop your own style of working with formulas and functions. Next, Oz covers a variety of formulas, such as the XLOOKUP/VLOOKUP and INDEX functions, counting and statistical functions, text functions, and date/time, array, math, and information functions.

data-analysis excel

Last synced: 27 Dec 2024

https://github.com/karlyndiary/global-electronics-retailer-sales-and-customer-insights

Developed an analysis using Python, SQL, and Excel to examine revenue, customer demographics, and profit drivers for a Global Electronics Retailer. The findings aim to enhance business strategies and improve overall performance.

dashboard data-analysis data-cleaning-and-preprocessing data-pipeline data-visualization etl microsoft-excel microsoft-sql-server python sql

Last synced: 13 Dec 2024

https://github.com/josafary-ds/curso_dnc

Repository for storing study files and projects from the DNC Data Scientist (Cientista de Dados) course

data-analysis data-science data-visualization machine-learning powerbi python

Last synced: 20 Jan 2025

https://github.com/archie-cm/credit_risk_model_vix_id-x_partners

The project's objective is to reduce the company's losses from bad loans by up to 30% by building a machine learning system that helps automate loan assessments.

credit-risk data-analysis data-visualization machine-learning scorecard

Last synced: 20 Jan 2025

https://github.com/archie-cm/a-b-testing-mobile-games

This project examines what happens when the first gate in the game is moved from level 30 to level 40. When a player installed the game, they were randomly assigned to either gate30 or gate40.

abtesting data-analysis python retention-rate

Last synced: 20 Jan 2025
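
For a gate30 versus gate40 experiment like the one described above, one common way to compare retention between the two randomly assigned groups is a two-proportion z-test. The sketch below is a minimal illustration under that assumption, not the repository's own method; the function name `two_proportion_ztest` and the player counts are hypothetical and made up for the example.

```python
import math


def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one retention rate."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that the groups do not differ
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts for illustration only, not results from the repository:
# retained players out of installs for the gate30 and gate40 groups.
z, p = two_proportion_ztest(success_a=450, n_a=1000, success_b=400, n_b=1000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value would suggest that the difference in retention is unlikely to be due to random assignment alone; the threshold to use is a judgment call for the analyst.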

https://github.com/banyc/dfplot

Summarize a data frame by plotting. `cargo install --git https://github.com/Banyc/dfplot.git`.

csv data-analysis plotly plotting statistics

Last synced: 20 Jan 2025

https://github.com/banyc/csv_logger

Long-term logger for data analysis

csv data-analysis logging

Last synced: 20 Jan 2025

https://github.com/mae776569/weratedogs-wrangling

Wrangling WeRateDogs Twitter data to create interesting and trustworthy analyses and visualizations

data-analysis data-science data-visualization tweets twitter-api

Last synced: 27 Dec 2024

https://github.com/agustin-caceres/proyecto-data-analyst

Data Analyst project on telecommunications services in Argentina

business-analytics business-intelligence data-analysis data-visualization database postgresql python streamlit

Last synced: 11 Nov 2024

https://github.com/kunalkumar2001/sales-project-using-excel-and-sql

Comprehensive sales analysis using SQL, Excel, and PowerPoint to uncover insights on top-sellers, peak times, and branch performance.

data-analysis data-analytics excel mssql sql

Last synced: 27 Dec 2024

https://github.com/multitagging/benchmarks

Provides benchmarks to test the MultiTagging framework

benchmarks data-analysis ethereum smart-contracts vulnerabilities

Last synced: 11 Oct 2024

https://github.com/cecoeco/sas_certificate

My code from Coursera's SAS programming specialization

data-analysis sas

Last synced: 27 Dec 2024

https://github.com/jasoncobra3/whatsapp_chat_analyzer

WhatsApp Chat Analyzer is a powerful tool that provides insightful analytics from your WhatsApp conversations. Whether you're curious about your chatting habits, want to analyze group dynamics, or need to extract meaningful data from your conversations, this tool has got you covered!

data-analysis data-science data-visualization machine-learning streamlit streamlit-webapp whatsapp-chat whatsapp-chat-analyzer

Last synced: 18 Dec 2024

https://github.com/percival33/machine-learning-engineering

University project on enhancing a fictional music streaming service by developing machine learning models that generate popular playlists

data-analysis data-science machine-learning python

Last synced: 22 Nov 2024

https://github.com/dionixius7/titanic-disaster-ml-model

This project predicts the survival of Titanic passengers using the Kaggle Titanic Disaster Dataset. The dataset contains passenger information such as age, gender, and class. Several machine learning algorithms are applied to predict each passenger's chances of survival accurately.

data-analysis data-science data-visualization eda knn-classifier machine-learning neural-network python scikit-learn svm tensorflow titanic-kaggle titanic-survival-prediction

Last synced: 18 Jan 2025

https://github.com/devlucho/modelos-predictivos

Predictive models using the Linear Regression, Logistic Regression, and Decision Tree algorithms.

data-analysis jupyter-notebook python3

Last synced: 19 Dec 2024

https://github.com/its-kanii/predictive-maintenance-for-healthcare-equipment

Predictive Maintenance for Healthcare Equipment utilizes machine learning to analyze operational metrics and predict equipment failures. This project leverages a dataset of usage hours, temperature, and maintenance history to enhance equipment reliability and reduce downtime.

data-analysis data-science failure-prediction feature-engineering healthcare-equipment jupyter-notebook machine-learning predictive-maintenance python time-series-analysis

Last synced: 19 Dec 2024

https://github.com/lucycatherine/healthinsuranceproject

This repository contains a machine learning project that analyzes the factors influencing health insurance charges, such as age, smoking status, and medical conditions.

data-analysis data-science data-visualization jupyter-notebook machine-learning python

Last synced: 19 Dec 2024

https://github.com/mohnish88/e-commerce-data-analysis

I analyzed sales data to identify trends and patterns, which significantly enhanced decision-making processes. Additionally, I created interactive visualizations to present these insights clearly and effectively, facilitating better understanding and communication of the data's implications.

data-analysis data-cleaning jupyter-notebook pandas plotly python python-library sales sales-analysis visulaization

Last synced: 20 Jan 2025

https://github.com/mh0386/motorcycle_data_analysis

Data analysis applied to a motorcycle dataset.

data-analysis

Last synced: 27 Dec 2024

https://github.com/rakeshkanneeswaran/project-titanic-machine-learning-from-disaster

The Titanic Survival Prediction project uses a Decision Tree algorithm combining both regression and classification to predict passenger survival.

data-analysis data-science data-visualization decision-tree-classifier decision-trees supervised-machine-learning

Last synced: 12 Jan 2025

https://github.com/junpenglao/jaefa

Just Another Eye-movement Filtering Algorithm

data-analysis eye-movement-data eye-tracking

Last synced: 13 Dec 2024

https://github.com/farzeennimran/fashion-mnist-dataset-classification-using-neural-network

Implementation of a Multi-layer Perceptron classifier with hyperparameter tuning and k-fold cross-validation employing GridSearchCV for classifying images on the Fashion MNIST dataset 👗👚👖

artificial-intelligence data-analysis data-mining data-science dataset deep-learning fashion-mnist-dataset gridsearchcv hyperparameter-tuning kfold-cross-validation machine-learning multilayer-perceptron-network neural-network numpy pandas python sklearn

Last synced: 26 Dec 2024

https://github.com/soumya-thoutam/covid-19-impact-on-u.s.-states-and-colleges

Analysis of COVID-19 and its impact on U.S. states and colleges using SQL and Tableau.

covid-19 dashboard data-analysis data-visualization dataset sql sql-server tableau

Last synced: 11 Jan 2025

https://github.com/fabioassuncao/desafio-tecnico-iede

This project was developed as part of the IEDE Technical Challenge (Desafio Técnico IEDE).

challenge cloudflare data-analysis docker laravel php

Last synced: 26 Jan 2025

https://github.com/ssoehdata/sql_for_data_science_specialization_course

Materials and certifications from the SQL for Data Science course

data-analysis data-science database databricks postgresql sql sqlite

Last synced: 26 Jan 2025

https://github.com/tushar2704/sql-query

This repository is designed to help you strengthen your SQL query skills by providing a collection of common and interview-based SQL queries for practice.

artificial-intelligence data-analysis data-engineering data-science database database-management database-schema relational-databases sql sql-database sql-query tushar2704

Last synced: 27 Dec 2024

https://github.com/tushar2704/loan-limits-by-country

This project aims to leverage a diverse dataset encompassing economic indicators, demographic factors, and credit history to establish a predictive model. By establishing appropriate loan limits, financial institutions can enhance risk management, ensure responsible lending, and promote financial inclusivity.

artificial-intelligence data-analysis data-science loan project tushar2704

Last synced: 27 Dec 2024

https://github.com/tushar2704/hiring-process-analytics

In this project, I analyze hiring process data to gain insights from records of previous hires within a multinational company. By analyzing this data, I aim to uncover valuable trends and information about the company's hiring process, which can contribute to informed decisions and improvements in the future.

data-analysis data-cleaning data-science data-wrangling excel tushar2704

Last synced: 27 Dec 2024

https://github.com/tushar2704/instagram-user-analytics

This project revolves around the exploration and analysis of user engagement patterns on the popular social media platform, Instagram. By delving into user data and interaction metrics, this project aims to provide valuable insights into user behavior, content performance, and trends.

artificial-intelligence data-analysis data-science instagram project tushar2704

Last synced: 27 Dec 2024

https://github.com/tushar2704/imdb-movie-analysis

This project extracts meaningful insights, trends, and patterns from the data, shedding light on various aspects of the movie industry. By leveraging this analysis, filmmakers, studios, and enthusiasts can gain valuable information to inform decision-making, understand audience preferences, and contribute to the creation of successful movies.

artificial-intelligence data-analysis data-science imdb project tushar2704

Last synced: 27 Dec 2024

https://github.com/tushar2704/employee-distribution

This repository contains valuable insights and visualizations derived from an extensive HR dataset spanning from 2000 to 2020, with over 22,000 rows.

data-analysis data-visualization excel postgresql powerbi sql tushar2704

Last synced: 27 Dec 2024

https://github.com/tushar2704/consumables_sales_dashboard

Welcome to the Consumable Sales Dashboard, a powerful and intuitive data visualization tool built using Power BI. This dashboard offers a comprehensive view of sales data for consumable products, allowing you to quickly and easily analyze performance and identify trends.

dashboard data-analysis data-analytics data-science excel postgresql powerbi streamlit-tushar2704 tushar2704

Last synced: 27 Dec 2024

https://github.com/ahmedkhaled404/data-cleaning-and-eda-layoffs-mysql

This project involves cleaning a dataset containing information about layoffs from companies around the world.

data data-analysis data-cleaning data-preprocessing datacleaning eda exploratory-data-analysis mysql sql

Last synced: 12 Jan 2025

https://github.com/giatraskon/sandbox.bio-solutions

Bash scripts replicating the commands from sandbox.bio's interactive bioinformatics tutorials, organized by categories such as Data Exploration, File Formats, Quality Control, and Data Analysis.

bam-files bash bed-files bioinformatics bioinformatics-workflows command-line-tools computational-biology data-analysis data-exploration data-wrangling fasta-files fastq-files file-formats genomic-data quality-control sandbox-bio sandbox-bio-tutorials sequence-alignment unix-shell variant-calling

Last synced: 13 Dec 2024

https://github.com/edoaltamura/rotational-ksz-macsis

Repository for supplementary material from my publication on the rotational kinetic SZ effect in MACSIS

cosmology data-analysis galaxy-clusters high-performance-computing hydrodynamics

Last synced: 05 Jan 2025

https://github.com/greenpau/esqrunner

Run Elasticsearch queries and create metrics based on the results of those queries against an Elasticsearch database.

data-analysis elasticsearch query-builder querydsl

Last synced: 26 Jan 2025

https://github.com/meinhere/ta-pendat

Final project for the Data Mining course: patient trauma classification using the Naive Bayes method

data-analysis data-mining python

Last synced: 25 Dec 2024

https://github.com/meinhere/dicoding-analisis-data

Data analysis submission on an e-commerce theme, with a Streamlit app

data-analysis data-mining e-commerce python streamlit

Last synced: 25 Dec 2024

https://github.com/brownred/python-and-sql

Python and SQL (PostgreSQL & MySQL) for data analysis.

data-analysis databases python3 sql

Last synced: 26 Jan 2025

https://github.com/aleskandro/r-hadoop-madreduce-examples

Many examples of using R with Hadoop for MapReduce, with and without libraries such as rhadoop/rhipe - [email protected] - Advanced Programming Languages

data-analysis hadoop mapreduce r

Last synced: 27 Dec 2024

https://github.com/nemat-al/multivariate_data_analysis

Tasks for Multivariate Data Analysis Course @ ITMO University

data-analysis multivariate-analysis python

Last synced: 23 Jan 2025

https://github.com/jendives2000/regressions

A linear regression analysis to determine the strength of the relationship between the number of reviews and sales for a retail company.

data-analysis linear-regression pearson-correlation-coefficient regression

Last synced: 26 Jan 2025
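
The strength of a linear relationship such as the one between review counts and sales is commonly summarized by the Pearson correlation coefficient alongside a least-squares line. The sketch below shows both computations in plain Python; it is an illustration under assumptions, not the repository's code, and the helper names `pearson_r` and `fit_line` as well as the sample numbers are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Made-up, perfectly linear sample data for illustration only.
reviews = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

r = pearson_r(reviews, sales)
slope, intercept = fit_line(reviews, sales)
print(f"r = {r:.3f}, sales = {slope:.2f} * reviews + {intercept:.2f}")
```

Values of r near 1 or -1 indicate a strong linear relationship, while values near 0 indicate a weak one; real data would rarely be as clean as this sample.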

https://github.com/matteofasulo/cdc-finf

Project for Fundamentals of Computer Science

data-analysis data-science data-visualization numpy pandas python python3

Last synced: 20 Jan 2025

https://github.com/sakan811/honkai-star-rail-a-few-fun-insights-with-data-analysis

The project offers insights into the stats of all Honkai Star Rail characters available as of the given date.

data data-analysis data-science data-visualization game honkai honkai-star-rail honkai-starrail webscraping webscraping-data webscraping-selenium

Last synced: 05 Jan 2025

https://github.com/sakan811/stress-pattern-occurrence-in-english-words

This project is intended to provide English learners with data that lets them make a data-driven guess when they encounter words whose stress placement they are unsure of.

data-analysis data-visualization english english-language english-learning language powerbi powerbi-report powerbi-visuals

Last synced: 05 Jan 2025

https://github.com/rahul-jha98/restauranttrends.stats-backend

Application that scrapes the Zomato Dataset and enables the user to visualise the results.

data-analysis data-extraction firebase-storage web-scraping zomato-api

Last synced: 19 Jan 2025

https://github.com/elzasimoes/challenge-xplab

Challenge for data analysis course.

data-analysis data-science jupyter-notebook python

Last synced: 05 Jan 2025

https://github.com/souravsuvarna/whatsapp-chat-analyzer-and-visualizer-web-application

The WhatsApp chat analyzer and visualizer uses NLP algorithms to analyze chat data, tracking usage patterns and presenting insights through visually appealing charts and graphs. It helps users understand communication patterns and behaviors on WhatsApp.

data-analysis data-science data-visualization python python3 streamlit

Last synced: 05 Jan 2025

https://github.com/nikitalpopov/evotor_champ

Solution for the Evotor data challenge

data-analysis data-science python scikit-learn

Last synced: 25 Jan 2025

https://github.com/emanoelcampos/python-onemonth

This repository contains educational materials and projects developed during a Python course offered by OneMonth. It covers Python basics, intermediate concepts, web development with Flask, and data analysis with pandas. The course is structured into weeks, each focusing on a different aspect of Python programming and its applications.

data-analysis flask jupyter-notebook onemonth python python3

Last synced: 27 Dec 2024

https://github.com/surbhi242singh/pizza_sales_project

Used SQL to analyze pizza sales data

data-analysis mysql pizza-sales sql

Last synced: 26 Jan 2025

https://github.com/marcosvbras/udacity-nd109-project-titanic

Data analysis project for the Udacity Nanodegree course Artificial Intelligence Programming with Python.

data-analysis data-analyst-nanodegree data-science jupyter-notebook machine-learning python udacity

Last synced: 05 Dec 2024

https://github.com/mmfava/qualesuapergunta-scripts-base-2015-2018

This repository contains R scripts used during my biostatistics consulting work. The scripts cover various statistical analyses and served as the basis for analyses that were carried out. They are not the scripts from the consulting or advisory engagements themselves.

analytics data-analysis r

Last synced: 20 Jan 2025

https://github.com/mmfava/analises-papers

Base scripts for some papers published between 2019 and 2021.

data-analysis r

Last synced: 20 Jan 2025

https://github.com/mmfava/significados-aulas-biologia-quasiexp-2019

Repository of the analyses carried out for the paper "Construção de significados em aulas práticas de laboratório de biologia: uma avaliação por delineamento quase-experimental".

data-analysis r statistics

Last synced: 20 Jan 2025

https://github.com/shivshah19/movie-recommendation-system

This Movie Recommendation System is designed to provide personalized movie recommendations based on user preferences.

cosine-similarity data-analysis machine-learning pandas python streamlit

Last synced: 19 Dec 2024

https://github.com/silveirinhajuan/rotinapy

RotinaPy: Simplify your daily life and maximize productivity with an integrated app for task management, study tracking, flashcards, and more. Built with Streamlit and Python.

data-analysis flashcards llm-integration llm-ui machine-learning ollama productivity python streamlit study study-project study-tracker task-management task-manager

Last synced: 19 Dec 2024

https://github.com/victorlcastro-dsa/coping_struggles_prediction

Repository for predicting coping struggles based on mental health data. Includes analysis, visualization, and modeling using machine learning. Results reach 86.58% accuracy with a Voting Classifier.

classification-algorithm data-analysis data-science data-visualization machine-learning-algorithms problem-solving project-based-learning python

Last synced: 23 Jan 2025

https://github.com/virajbhutada/article-clustered-recommendation-system-ml

This project aims to redefine content discovery by delivering personalized article recommendations tailored to individual user preferences. We use advanced machine learning techniques like PCA and K-means clustering to analyze user behavior and article characteristics to provide highly accurate recommendations.

anaconda article-recommendation clustering-algorithm data-analysis data-science keras-tensorflow machine-learning machine-learning-algorithms ml-models numpy pandas plotly python scikit-learn scipy

Last synced: 15 Oct 2024

https://github.com/dataforgeopenaihub/steam-sales-analysis

This repository features an ETL pipeline for retrieving, processing, validating, and ingesting game metadata and sales data from SteamSpy and Steam APIs. Data is stored in a MySQL database on Aiven Cloud and visualized using Tableau dashboards for insightful analysis of gaming trends and sales performance.

data-analysis data-engineering data-pipepline data-warehousing games mysql-database python steam-api tableau typer-cli

Last synced: 10 Oct 2024

https://github.com/jnyambok/epl_dashboard

English Premier League Dashboard summarizing match data from 2009-2024

data-analysis data-science gcp powerbi

Last synced: 26 Jan 2025

https://github.com/jnyambok/google-data-analytics-capstone-project-bella-beat-fitness-company

This documentation follows the optional Capstone project provided by the Google Data Analytics Course. It goes through the six stages of data analysis: Ask, Prepare, Process, Analyze, Share, and Act. The Capstone project was carried out in R, a data-centric, accessible programming language used to organize, modify, and clean data frames and to create insightful data visualizations. Let's get into it!

analytics data-analysis python

Last synced: 26 Jan 2025

https://github.com/ginga1402/youtube_analysis

Exploratory Data Analysis on YouTube data

college-project data-analysis pandas-python

Last synced: 10 Dec 2024

https://github.com/enayar478/nomad_machine_learning_dash_app

An interactive Machine Learning app built with Dash and Plotly, developed as part of the Data Analytics Bootcamp at Le Wagon Bordeaux. It allows users to visualize data, make real-time predictions, and explore various model insights.

analytics cachetools dash dashboard-application data-analysis data-science deployment gunicorn interactive-visualization machine-learning pandas plotly plotly-dash prediction-model python python3 render scikit-learn web-application

Last synced: 22 Jan 2025

https://github.com/drill-n-bass/ovh-project

The goal of this task is to prepare a statistical analysis of a set of data from disks.

anaconda analysis data-analysis data-analysis-python jupyter-notebook matplotlib-python pandas python3 seaborn-plots

Last synced: 27 Dec 2024

https://github.com/drill-n-bass/dealavo-project

Cartesian product from a dictionary to a list of dictionaries, and faster ways to find an index than the `index` method.

data-analysis data-analysis-python matplotlib pandas python python3 random timeit

Last synced: 27 Dec 2024
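
Both ideas in the description above can be sketched with the Python standard library. The example assumes the dictionary maps each key to a list of candidate values, and it uses a precomputed position map as one possible faster alternative to `list.index`; the repository may take a different approach, and the names `dict_product` and `build_index` are hypothetical.

```python
from itertools import product


def dict_product(options):
    """Expand {key: [values...]} into one dict per combination of values."""
    keys = list(options)
    value_lists = (options[key] for key in keys)
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


def build_index(items):
    """Map each item to the position of its first occurrence.

    After this one O(n) pass, each lookup is an average O(1) dict access,
    whereas list.index scans the list on every call.
    """
    positions = {}
    for i, item in enumerate(items):
        positions.setdefault(item, i)  # keep the first occurrence, like list.index
    return positions


grid = dict_product({"color": ["red", "blue"], "size": ["S", "M", "L"]})
positions = build_index(["a", "b", "c", "b"])
print(len(grid), grid[0], positions["b"])
```

The position map pays off when many lookups are made against the same list; for a single lookup, `list.index` is simpler and avoids building the dictionary.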

https://github.com/satvikpraveen/rsvp_case_study

A comprehensive IMDB dataset analysis using SQL. Includes database setup, advanced queries, and actionable insights. Organized with files for database creation, queries, and solutions. Features an Entity-Relationship Diagram (ERD), executive summary, and SQL scripts. Perfect for SQL workflows and business intelligence in the film industry.

aggregate-functions business-intelligence common-table-expressions data-analysis data-driven-decisions data-querying database-design entity-relationship-diagram imdb-dataset relational-database sql subqueries-and-joins

Last synced: 19 Dec 2024