Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/dcs-training/pca-2023

PCA workshop. This repo contains the code and files used in the practical part of the workshop, together with the slides (ppt) for the training.

data-analysis data-visualisation data-wrangling r statistics

Last synced: 07 Jan 2025

https://github.com/dcs-training/introtodatabases

This repository hosts the material for a training developed by Dave Elsmore (Edina) for CDCS. Go to the readme file

data-analysis data-wrangling databases sql

Last synced: 07 Jan 2025

https://github.com/dcs-training/decode-winterschool

In here you can find material on cluster analysis, data wrangling, and network analysis. Go to the readme file for more info

data-analysis data-visualisation data-wrangling gephi network-analysis python r statistics

Last synced: 07 Jan 2025

https://github.com/dcs-training/r-visualisation-and-stats

This repository contains material from an 8-class course on data visualisation and statistics with R

data-analysis data-visualisation data-wrangling intro-to-programming r statistics

Last synced: 07 Jan 2025

https://github.com/mecha-aima/demographic-analyzer

This project uses pandas to process census data from a CSV file and draws useful results from it through filtering and calculations

data-analysis data-science pandas

Last synced: 03 Jan 2025

https://github.com/manisharora96/data-analysis-of-smartwatch

The project is structured with sample data, step-by-step Jupyter notebooks, and modular Python scripts for automated analysis

data-analysis data-visualization jupyter-notebook python smartwatch-analysis

Last synced: 21 Jan 2025

https://github.com/myke003/data-analysis-projects

This repository serves as a collection of all my projects.

data-analysis jupyter-notebook powerbi

Last synced: 21 Jan 2025

https://github.com/haonamnguyen/data-science-job-analysis

Evaluate the factors influencing salary trends in the data science industry, including experience levels, job titles, employment types, company sizes, and remote work arrangements, to help HR teams and hiring managers make data-driven decisions regarding compensation packages and recruitment strategies.

data-analysis data-science data-visualization jupyter-notebook python

Last synced: 21 Jan 2025

https://github.com/matteofasulo/cdc-finf

Project for Fundamentals of Computer Science

data-analysis data-science data-visualization numpy pandas python python3

Last synced: 20 Jan 2025

https://github.com/ayeshathoi/simulation-sessional-412

Simulation of SSQS, Inventory System, Transient State, PERT, Monte Carlo algorithm, etc.

data-analysis excel inventory-system monte-carlo python simulation ssqs triangle-distributions

Last synced: 02 Jan 2025

https://github.com/buildwithlal/introduction-to-data-science-in-python-coursera

Introduction to Data Science in Python, part of the Applied Data Science using Python Specialization from the University of Michigan, offered by Coursera

data-analysis matplotlib numpy pandas

Last synced: 02 Jan 2025

https://github.com/ksharma67/eda-on-ipl

This Python notebook analyzes IPL matches from 2008 to 2020 using Python packages such as pandas, matplotlib, and seaborn.

data-analysis data-science eda matplotlib numpy pandas python seaborn

Last synced: 25 Dec 2024

https://github.com/elishah-john/happiness-report-2019

Analysis of "Happiness Report 2019" using Python.

data-analysis data-visualization educational jupyter-notebook python

Last synced: 21 Jan 2025

https://github.com/mohamed3nan/udacity

Udacity Data Analysis Nanodegree Program

data-analysis data-visualization numpy pandas python

Last synced: 13 Jan 2025

https://github.com/mumtaz4118/nlp-course

Programming Assignments and Lectures for Stanford's CS 224: Natural Language Processing with Deep Learning

course data data-analysis data-analytics data-science data-visualization deep-learning education machine-learning natural-language-processing neural-network transfer-learning

Last synced: 28 Dec 2024

https://github.com/mumtaz4118/amazon-iphone-12-data-scrapped

Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.

data-analysis data-extraction data-science data-scraping html mark-up python

Last synced: 28 Dec 2024

https://github.com/demon-2-angel/product-customer-acquisition-analysis-using-behaviour

The database encompasses eight tables with varied attributes and rows. Key analyses include product restocking needs, top VIP customers' contributions, and an average customer profit of $39,039.59. Recommendations emphasize strategic marketing to new customers and incentives for existing VIP clients based on acquisition costs and profit insights.

customer-products customer-segmentation data-analysis database sqlite

Last synced: 14 Dec 2024

https://github.com/hadson0/chess-live-ratings-data

A study project focused on web scraping the live chess ratings from chess.com, with data analysis and visualization on nearly 5000 players in the classical world ranking.

beautifulsoup chess data-analysis data-visualization numpy pandas python seaborn web-scraping

Last synced: 21 Jan 2025

https://github.com/tmmvn/analytics-notebooks

A bunch of data analytics notebooks made while testing out JetBrains Datalore

ai algorithms data-analysis datalore elements-of-ai helsinki-university-mooc python

Last synced: 28 Dec 2024

https://github.com/nerooc/device-downtime-detection

Repository for a project from the course "Sztuczne Sieci Neuronowe" (Artificial Neural Networks)

data-analysis detection-model recurrent-neural-networks

Last synced: 26 Jan 2025

https://github.com/zen204/renewable-energy-usage-v-electricity-access

Interactive data visualization project created for COSI 116A: Introduction to Information Visualization at Brandeis University (Fall 2024). The project showcases data-driven insights using advanced visualization techniques and user interactivity. Hosted on GitHub Pages.

d3js data-analysis data-visualization electricity github-pages html-css-javascript information-visualization interactive python renewable-energy tableau web-development

Last synced: 30 Dec 2024

https://github.com/bishopce16/school_district_analysis

The school board requested an analysis on the various performance metrics for the school district.

data-analysis jupyter-notebook numpy pandas python visual-studio-code

Last synced: 06 Jan 2025

https://github.com/spaghettifunk/gvb

Analysis of GVB in Amsterdam

data-analysis public-transportation

Last synced: 02 Jan 2025

https://github.com/darkdk123/house-valuation-model

A challenge project from a boot camp: building an ML model to predict house prices in Boston, Massachusetts, from multiple parameters using multivariable regression.

data-analysis data-science data-visualization matplotlib-pyplot multivariate-regression predictive-modeling statistics

Last synced: 28 Dec 2024

https://github.com/darkdk123/handwashing-discovery-analysis

A guided project from a boot camp analysing the original data used in the discovery of viruses and hand washing by Dr. Ignaz Semmelweis at Vienna General Hospital in the 1840s.

data-analysis data-science data-visualization matplotlib-pyplot numpy pandas plotly-python python seaborn-plots

Last synced: 28 Dec 2024

https://github.com/vbhvsingh0/nflteam_corr_population

The goal of this project is to find the correlation between NFL teams' win-loss records and the population of their cities.

data-analysis data-cleaning-and-preprocessing data-manipulation-with-pandas numpy-library pandas-python pearson-correlation python3

Last synced: 14 Jan 2025

https://github.com/bishopce16/pyber_analysis

The purpose of this project was to complete an exploratory analysis and create visualizations of the 2019 ride sharing data from PyBer.

data-analysis data-visualization jupyter-notebook matplotlib pandas python

Last synced: 06 Jan 2025

https://github.com/vbhvsingh0/coulombic_dyn_formaltetra

The Python code simulates a formaldehyde tetra-cation molecule using Coulombic forces

data-analysis physics-simulation python shell-scripting

Last synced: 14 Jan 2025

https://github.com/vbhvsingh0/matplotlib__egs

The code here consists of Matplotlib examples

data-analysis matplotlib-pyplot numpy-library pandas-python python3

Last synced: 14 Jan 2025

https://github.com/noeyislearning/credit-card-fraud

This project analyzes credit card transactions in the western U.S. to detect fraud. It utilizes a dataset with detailed transaction info including date, time, merchant, purchase type, amount, customer location, and fraud status. The goal is to enhance security measures and ensure customer trust.

credit-card-fraud data-analysis data-science jupyter-notebook python3

Last synced: 06 Dec 2024

https://github.com/matheusafonseca/c111

This repository stores and organizes the code developed in the course C111 - Análise de Dados (Data Analysis), offered by the Instituto Nacional de Telecomunicações (INATEL).

data-analysis matplotlib numpy pandas python

Last synced: 02 Jan 2025

https://github.com/anjaliwork20/moodify

Mood-based music recommendation system that uses machine learning to recommend songs, genres, artists, and playlists based on a user's emotional state

artificial-intelligence cnn-keras cnn-model convolutional-neural-networks data data-analysis data-science data-structures data-visualization database deep-learning machine-learning machine-learning-algorithms python recommended song songs

Last synced: 06 Jan 2025

https://github.com/mihaildoman/formula-one-insights-with-python-and-sql

Using Python and SQL to analyze the achievements of drivers and constructors throughout Formula One's history.

data data-analysis data-visualization formula-one formula1

Last synced: 26 Jan 2025

https://github.com/mihaildoman/formula-one-insights-with-tableau

Using Tableau to analyze the achievements of drivers and constructors throughout Formula One's history.

data data-analysis data-visualization formula-one formula1

Last synced: 26 Jan 2025

https://github.com/touradbaba/multi-page_dash_application

This repository contains a Multi-Page Dash Application designed to provide interactive visualizations of geo-spatial data, focusing on population and GDP. The app offers insights into demographic and economic trends through interactive maps and various types of charts. It is built with Python, using Plotly and Dash, and is deployed on Heroku.

dash dashboard data-analysis data-visualization exploratory-data-analysis heroku-deployment plotly pythonanywhere

Last synced: 21 Jan 2025

https://github.com/apfirebolt/numpy-and-pandas-examples

Some examples and sample datasets to learn numpy, pandas and other data science libraries in Python

data-analysis jupyter-notebook numpy pandas python

Last synced: 25 Jan 2025

https://github.com/roland045/smart_fluid_sedimentation_tester

Control program for custom developed smart fluid sedimentation tester system

arduino data-analysis instrumentation measurement sensor

Last synced: 06 Jan 2025

https://github.com/stoll-jonathan/sorting_algorithm_analyzer

C++ program which analyses the performance of different sorting algorithms on a dataset of random numbers

bubble-sort data-analysis insertion-sort merge-sort sorting-algorithms

Last synced: 14 Dec 2024

https://github.com/alan-oliveir/state-of-data-2022

In this project I analyze the distribution of salary ranges for junior-level professionals in the roles of data analyst, data scientist, and data engineer.

data-analysis jupyter-notebook pandas-python seaborn-python

Last synced: 13 Jan 2025

https://github.com/chrispsang/healthcare-dataanalysis

Analyze synthetic patient data to identify trends, improve healthcare delivery, and predict patient outcomes using machine learning models. Includes data exploration, preprocessing, model building, and visualizations.

data-analysis data-science data-visualization healthcare jupyter-notebook machine-learning python

Last synced: 25 Dec 2024

https://github.com/badranalyst/e-commerce-customer-analysis-data-science-foundations-case-study

This case study explores e-commerce customer data through data exploration, pre-processing, and splitting. It includes model building and training to analyze customer behavior. Python libraries like Pandas, NumPy, Matplotlib, and Seaborn are used for the analysis and model development.

data-analysis data-science dataset eda exploratory-data-analysis machine-learning matplotlib ml model-building model-training numpy pandas pre-processing python seaborn

Last synced: 30 Dec 2024

https://github.com/mumtaz4118/scraping-medium-and-data-analytics

The file DataExtraction.py extracts information from the JSON files scraped by the scraper medium_scrapper_post.py. To extract information from JSON files scraped by medium_scrapper_tag_archive.py (which scrapes the tags archive), use Data_Extraction_Archive_Tags.py

data data-analysis data-analytics data-extraction data-preprocessing data-science data-scraping deep-learning machine-learning python

Last synced: 28 Dec 2024

https://github.com/badranalyst/restaurant-reviews-sentiment-analysis-nlp-case-study

This project analyzes restaurant reviews using Natural Language Processing (NLP) for sentiment analysis. It covers data exploration, pre-processing (NLTK text cleaning), model building, prediction, and deployment. The goal is to predict sentiment from reviews using Python libraries such as Pandas, NumPy, Matplotlib, and Seaborn.

data-analysis data-science eda exploratory-data-analysis matplotlib-pyplot model model-building numpy pandas pre-processing predictive-modeling python seaborn

Last synced: 30 Dec 2024

https://github.com/badranalyst/titanic-survival-prediction-full-data-science-project-classification

This project predicts Titanic survivors using classification models. It includes data cleaning, pre-processing, exploratory data analysis (EDA), categorical feature conversion, model building, and evaluation. Python libraries like Pandas, NumPy, Matplotlib, and Seaborn are used to analyze and predict survival outcomes.

classification data-analysis data-science eda exploratory-data-analysis machine-learning matplo matplotlib-pyplot ml model numpy pandas predictive-modeling python seaborn

Last synced: 30 Dec 2024

https://github.com/jendives2000/regressions

A linear regression analysis to determine the strength of the relationship between the number of reviews and sales for a retail company.

data-analysis linear-regression pearson-correlation-coefficient regression

Last synced: 26 Jan 2025

https://github.com/mysftz/numerical-methods-in-matlab

Multiple MATLAB scripts from several data analysis assignments.

data-analysis data-science matlab university university-assignment

Last synced: 26 Dec 2024

https://github.com/leeway64/lwwordcounter

C++ application that analyzes the frequency of words in a text file

bson cmake conan cpp data-analysis json json-schema text-analysis ubjson

Last synced: 02 Jan 2025

https://github.com/chdre/data-analyzer

A small package to analyze and preprocess data.

data-analysis python

Last synced: 25 Jan 2025

https://github.com/danielrosehill/data-projects-index

Data apps and datasets deployed to Streamlit Community Cloud, Hugging Face, and elsewhere.

data-analysis data-science data-visualization

Last synced: 20 Jan 2025

https://github.com/shrutiijoshi/coffee_sales

This project aims to analyze coffee sales data to identify key trends, patterns, and factors influencing sales performance.

data-analysis microsoft-excel

Last synced: 18 Jan 2025

https://github.com/ryuzen6/kaggle-series

This is a series of Machine Learning/Deep Learning Models made for practice.

artificial-intelligence data-analysis data-science deep-learning machine-learning python3

Last synced: 20 Jan 2025

https://github.com/edwinrlambert/investigating-netflix-movies

Demonstrates data analysis and visualization techniques for Netflix movies using Python in a Jupyter notebook. This is a DataCamp project.

data-analysis data-analysis-python netflix python

Last synced: 18 Jan 2025

https://github.com/mugambi645/exploring-ebay-car-sales-data

Exploring the eBay car sales dataset

car-sales data-analysis numpy pandas

Last synced: 28 Jan 2025

https://github.com/sieunnnn/data-lab

A repository for practicing data analysis.

data-analysis jupyter-notebook python

Last synced: 13 Jan 2025

https://github.com/riju18/data-analysis-and-visualizaton

Complex data analysis for data science: clustering, data preparation, complex calculations, joins, cross-over, and more.

data-analysis data-mining data-science data-visualization powerbi tableau

Last synced: 28 Jan 2025

https://github.com/shoebjoarder/superstore

A Dash app to analyze the Superstore dataset.

dashboard data-analysis data-visualization python-3

Last synced: 15 Dec 2024

https://github.com/rahulsm20/insurance-data

A data analytics project dealing with risk assessment and its effects in health insurance.

data-analysis data-analytics machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 06 Jan 2025

https://github.com/rahulsm20/trackbyte

A full-stack web application that helps users keep track of their playlist and provides analytics based on their music taste. Built using React, Node.js, Express.js, MySQL and Bootstrap.

bootstrap data-analysis expressjs mysql nodejs reactjs sql

Last synced: 06 Jan 2025

https://github.com/albertomorini/policesviolence

Repository for the project of the course Data Science (Fondamenti di Scienza dei Dati) at UniUD.

data-analysis data-science data-visualization r

Last synced: 17 Jan 2025

https://github.com/vubacktracking/freecodecamp-data-analysis-with-python

Five projects from the Data Analysis with Python course on freeCodeCamp

data-analysis freecodecamp freecodecamp-project python

Last synced: 06 Jan 2025

https://github.com/luminati-io/Target-dataset-samples

A sample dataset of over 1000 Target products, extracted using the Bright Data API, ideal for brand reputation, tracking inventory, and optimizing prices.

api data-analysis data-mining datasets target web-scraper web-scraping

Last synced: 06 Nov 2024

https://github.com/luminati-io/Amazon-dataset-samples

A sample dataset of over 1,000 Amazon product listings, extracted using the Bright Data API, perfect for competitive analysis, market trends, and eCommerce insights.

amazon api data-analysis data-science dataset ecommerce products web-scraping

Last synced: 06 Nov 2024

https://github.com/luminati-io/Indeed-dataset-samples

A sample dataset of over 1000 Indeed job listings, extracted using the Bright Data API, ideal for market analysis and growth.

api data-analysis datasets indeed jobs web-scraping

Last synced: 06 Nov 2024

https://github.com/luminati-io/Walmart-dataset-samples

A sample dataset of over 1000 Walmart products, extracted using the Bright Data API, ideal for consumer market insights and competitor analysis.

api data-analysis dataset walmart walmart-scraper web-scraping

Last synced: 06 Nov 2024

https://github.com/luminati-io/Shopee-dataset-samples

A sample dataset of over 1000 Shopee products, extracted using the Bright Data API, ideal for pricing optimization, gap analysis, and market strategy refinement.

api data-analysis data-mining datasets products shopee web-scraping

Last synced: 06 Nov 2024

https://github.com/luminati-io/Airbnb-dataset-samples

A sample dataset of over 1000 Airbnb listings, extracted using the Bright Data API, ideal for competitor tracking, brand reputation, and market analysis.

airbnb airbnb-listings api data-analysis datasets web-scraper web-scraper-api web-scraping

Last synced: 06 Nov 2024

https://github.com/edwinrlambert/exploring-airbnb-market-trends

Dive into NYC's Airbnb market trends through detailed analysis of listings data, including prices, types, and review dates. This is a DataCamp project.

airbnb data-analysis jupyter-notebook market-trends python

Last synced: 18 Jan 2025

https://github.com/lunarwhite/lake-george-viz

Lake George data analysis and visualization, ANU COMP1730/6730

data-analysis python

Last synced: 26 Dec 2024

https://github.com/mr-chang95/sf_data_visualization

In this personal project, I am interested in examining all of the active businesses in the San Francisco Bay Area while performing some simple data visualizations, mainly on categorical variables.

business data-analysis data-visualization jupyter-notebook pandas python san-francisco

Last synced: 26 Jan 2025

https://github.com/mr-chang95/datascience_airbnb

Data Science Project for Udacity's Data Scientist Program. Using Python in Jupyter Notebook.

airbnb data-analysis data-science data-visualization jupyter-notebook numpy pandas python sklearn

Last synced: 26 Jan 2025