Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/terrelbrinkley/r-projects

Data Analyst & Visualization Projects

data-analysis data-science data-visualization

Last synced: 10 Nov 2024

https://github.com/ygalvao/uow_ai_final_project

This was my Final Project for the Artificial Intelligence Diploma program of The University of Winnipeg - Professional, Applied and Continuing Education (PACE).

data-analysis data-analytics dbscan elections k-means k-means-clustering machine-learning som som-clustering

Last synced: 12 Nov 2024

https://github.com/preetesh21/spotme

This repository is using the web-based API provided by Spotify to retrieve data and then analyse it.

api data-analysis

Last synced: 08 Nov 2024

https://github.com/fmind/malpop

Rank the popularity of malware applications by their occurrence on VirusTotal

data-analysis malware popularity ranking virustotal

Last synced: 06 Nov 2024

https://github.com/andimashkulli/vpms

Vehicle Parking Management System for Gjon Buzuku Gymnasium

backend-api data-analysis databases frontend-react mongodb nodejs software

Last synced: 31 Oct 2024

https://github.com/marielachirinosr/bellabeat-wellness-data-trends

Analyzing smart device data for insights on user activity patterns to optimize interventions for better health outcomes.

data data-analysis data-visualization pandas python python3 tableau tableau-public

Last synced: 07 Nov 2024

https://github.com/marielachirinosr/hotel-data-analysis

Pandas & Matplotlib Learning Analysis. Repository featuring data analysis projects using Pandas and Matplotlib libraries

data data-analysis matplotlib pandas python

Last synced: 07 Nov 2024

https://github.com/nikbarb810/covid_growth_rate_390.51

Exploring Covid Growth Rate of European Population using genetic data analysis

bioinformatics data-analysis r rcpp

Last synced: 08 Nov 2024

https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020

Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).

bigquery data data-analysis data-visualization python sql tableau

Last synced: 07 Nov 2024

https://github.com/wilfordaf/dataanalyst-test

Test task for Junior Data Analyst position

data-analysis pandas python trading-data

Last synced: 12 Nov 2024

https://github.com/nikbarb810/motif_detection_in_r

Motif Detection for TFBS in Glycolysis and Glyconeogenesis pathways

bioinformatics data-analysis null-hypothesis pwm r

Last synced: 08 Nov 2024

https://github.com/weybsonalves/prevendo-o-atrito-de-clientes

Project in which I walk through the stages of the data science life cycle in order to predict customer churn for a bank's credit card service.

data-analysis data-science data-visualization machine-learning python

Last synced: 16 Nov 2024

https://github.com/gattiharishkumar/blinkit-sales-analysis-dashboard

This project presents a comprehensive sales analysis dashboard for Blinkit, an Indian last-minute delivery app. The dashboard was created using Power BI and provides a detailed overview of the company's sales performance across various outlets and product categories.

dashboard data-analysis data-transformation data-visualization ms-excel-data-analytics power-query powerbi powerbi-visuals

Last synced: 07 Nov 2024

https://github.com/gattiharishkumar/employee-attendance-leaves-analytics-dashboard

This project showcases a Power BI dashboard created to analyze employee attendance and leaves over a three-month period. The data was sourced from Excel datasets available on the Codebasics website.

dashboards data-analysis data-cleaning data-transformation data-visualization power-query-editor powerbi

Last synced: 07 Nov 2024

https://github.com/victorlcastro-dsa/pbl-datacamp

This repository features projects from DataCamp's Project-Based Learning (PBL) courses, showcasing practical applications of data analysis, machine learning, and visualization. Explore real-world datasets and interactive results that highlight the skills gained through hands-on learning.

data-analysis data-science data-visualization datacamp-projects hypothesis-testing machine-learning project-based-learning

Last synced: 14 Nov 2024

https://github.com/jhrcook/protein-language-models

Experimenting with protein language model predictions

data-analysis protein-language-model variant-effect-prediction

Last synced: 13 Nov 2024

https://github.com/linguini1/coopscraper

Scrapes the co-op job board provided by Carleton for jobs on my shortlist, then saves the jobs to a CSV file so that I can manipulate them with Excel.

csv data-analysis python selenium webscraper webscraping

Last synced: 07 Nov 2024

https://github.com/linguini1/tangerineanalyzer

Command line tool for analyzing transactions in CSV format provided by Tangerine Banking. Transactions can be downloaded in CSV format on your Tangerine account.

analysis analytics argparse banking cli command-line command-line-tool csv data-analysis data-analytics finance pandas python tangerine transactions

Last synced: 07 Nov 2024

https://github.com/linguini1/edueval

The BorealisAI Let's Solve It mentorship project: summarizing student feedback submissions on their professor into one cohesive paragraph for faculty consideration during performance reviews.

ai data data-analysis data-science machine-learning machinelearning nlp python pytorch sentiment-analysis

Last synced: 07 Nov 2024

https://github.com/itrauco/data-dirtying-tool

A simple command-line tool to generate dirty data and do common data things in Google Cloud.

data data-analysis data-engineering data-ops data-pipeline data-science data-visualization data-wrangling dirty-data google-cloud machine-learning

Last synced: 10 Nov 2024

https://github.com/edumoraes1/comissao-reduzida

Audience segmentation built with SQL for Enjoei's new reduced-commission feature.

bq data-analysis salesforce sql

Last synced: 12 Oct 2024

https://github.com/an0n1mity/spamclassifiereval

A repository for evaluating the misclassification rate of spam classification models using a threshold-based approach.

data-analysis machine-learning natural-language-processing python-programming spam-classification text-classification

Last synced: 07 Nov 2024

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 12 Oct 2024

https://github.com/tanaybhadula/twitter-trends-dashboard

An interactive dashboard that visualizes current Twitter trends by country and globally. It collects data for over 60 countries using the Python Tweepy library, processes it, and visualizes it as bar charts and pie charts using the Plotly Dash framework.

dash dashboard data-analysis data-visualization plotly python trends twitter

Last synced: 12 Nov 2024

https://github.com/chitranjan806/greyatom_learning_repo

A Collection of Projects, Tasks and Challenges as part of Data Science Masters - Transition Program at GreyAtom.

data-analysis data-science greyatom python3

Last synced: 10 Nov 2024

https://github.com/jakobzmrzlikar/pca-on-genomes

An analysis of human genome mutations from different populations.

data-analysis genome-analysis pca-analysis

Last synced: 07 Nov 2024

https://github.com/karencofre/riesgorelativo-lookerstudio

Data analysis and predictive analysis project in Looker Studio and Google Colab.

bigquery data-analysis data-science machine-learning matplotlib python sklearn sql

Last synced: 13 Oct 2024

https://github.com/jakobzmrzlikar/trg-dela

Data analysis of student job offers.

data-analysis ipython-notebook web-scraping

Last synced: 07 Nov 2024

https://github.com/navp7/hr_analysis_excel

This project utilizes Microsoft Excel to conduct a comprehensive analysis of HR data, focusing on identifying the various reasons for employee attrition and evaluating job satisfaction

dashboards data-analysis excel visualization

Last synced: 07 Nov 2024

https://github.com/valyaevgeorgiy/r_basic

Working with the basics of the R environment, and thereby learning a new programming language directly tied to data analysis and to building graphs and charts.

coding data data-analysis r rstudio

Last synced: 07 Nov 2024

https://github.com/harkishen-singh/agriculture-ds

An agriculture-based M.Tech data science project that predicts crop growth from previous years' records.

data-analysis pandas python

Last synced: 07 Nov 2024

https://github.com/xre22zax/roller-coaster

Explore award-winning wood and steel coasters from 2013-2018 Golden Ticket Awards & Captain Coaster, all powered by Python and interactive visualizations.

analytics data-analysis data-visualization pandas python python-lambda python3 visualization

Last synced: 06 Nov 2024

https://github.com/bhaskaracharjee/student-results-analysis

Analyzing student results to uncover insights

data-analysis student-results

Last synced: 06 Nov 2024

https://github.com/willie-conway/datavista

A robust 🐍Python application for data analysis that provides a wide range of tools for 🔃loading, 🧹cleaning, and 🔃preprocessing data. It includes features for 📈statistical analysis, 👨🏿‍🔬hypothesis testing, 🦾machine learning, clustering, ⏳time series forecasting, and 📊data visualization, all designed to enhance your analytical workflow.

analytics big-data command-line data-analysis data-cleaning data-driven data-mining data-pipeline data-preprocessing data-science data-scientist data-visualization data-wrangling exploratory-data-analysis machine-learning pandas predictive-analytics python statistics visualization-tools

Last synced: 11 Nov 2024

https://github.com/yandexdataschool/ml-sweights-experiments

Experiments for the "Machine Learning on data with sPlot background subtraction" paper

data-analysis high-energy-physics machine-learning statistics

Last synced: 06 Nov 2024

https://github.com/pseudomanifold/us-inauguration-speeches

Data & feature extraction for U.S. inauguration speeches

data-analysis data-science inauguration politics speech speeches

Last synced: 06 Nov 2024

https://github.com/netesf13d/expt-sequence-analysis

Data processing, analysis and visualization package for atomic physics experiments in the single-atom regime.

cold-atoms data-analysis data-visualization optical-tweezers

Last synced: 11 Nov 2024

https://github.com/amyanchen/sf-airbnb

Exploratory Data Analysis of San Francisco Airbnb listings

data-analysis data-science data-visualization r rmarkdown statistics

Last synced: 07 Nov 2024

https://github.com/aymane-maghouti/sentiment-analysis-for-jumia-reviews-and-smartphone-price-prediction-system

The project focuses on customer sentiment analysis for Jumia, aiding informed online decisions. It collects and analyzes product comments to determine sentiments and implements a decision-making algorithm. Additionally, it includes a product price prediction system using regression techniques.

beutifulsoup data-analysis data-cleaning data-collection data-preprocessing data-scraping data-visualization eda falsk machine-learning python web-application

Last synced: 16 Nov 2024

https://github.com/lucs1590/agidatatest

This is a repository with data analysis and data science tests.

data-analysis data-science python test

Last synced: 13 Nov 2024

https://github.com/jayita11/exploring-most-streamed-songs-for-last-four-decades-eda

Perform EDA to uncover trends in streaming patterns, likes, and artists over the last four decades.

data-analysis eda hypothesis-testing matplotlib most-streamed-songs pandas python seaborn

Last synced: 13 Nov 2024

https://github.com/jayita11/eda-student-exam-performance

This project performs Exploratory Data Analysis (EDA) and hypothesis testing on student performance data. It explores trends based on attributes like gender, race/ethnicity, parental education, lunch type, and test preparation course completion.

data-analysis eda hypothesis-testing matplotlib pandas python seaborn statsmodels student-performance-analysis

Last synced: 13 Nov 2024

https://github.com/edumoraes1/republicacao-produtos

SQL query written to automate sending push notifications via Salesforce

bq data-analysis salesforce sql

Last synced: 12 Oct 2024

https://github.com/edumoraes1/journey_active_users

User-base segmentation via SQL for the active sellers journey

bq data-analysis salesforce sql

Last synced: 12 Oct 2024

https://github.com/danmadeira/algoritmos-estatistica-python

Demonstration of statistics algorithms in Python

algorithms data-analysis data-science python statistics

Last synced: 10 Nov 2024

https://github.com/jayita11/atliqo-bank-credit-card-launch-eda

This project involves exploratory data analysis and statistical testing for AtliQo Bank's new credit card launch. Key insights include targeting high-income occupations and the 18-25 age group. Recommendations focus on tailored marketing campaigns, education, and incentives to enhance credit card adoption and usage among young adults.

data-analysis hypothesis-testing matplotlib p-value pandas python seaborn statistics z-test

Last synced: 13 Nov 2024

https://github.com/hyperentangledqubit/shellplot

shellplot -- Generate plot(s) directly from terminal via matplotlib or ggplot2 (plotnine)!

data-analysis ggplot2 graphics matplotlib plotnine plotting pyplot terminal

Last synced: 11 Nov 2024

https://github.com/inevolin/multivariate-data-analysis

Showcases of modern multivariate & multidimensional data analysis in industrial and high-tech settings.

analytics data-analysis data-science data-visualization javascript

Last synced: 12 Nov 2024

https://github.com/quantumudit/groceries-basket-analysis

This project performs market basket analysis using Power BI and Python to reveal associations between grocery items. It involves transforming raw transaction data into a processed dataset, creating interactive Power BI reports, and generating key insights through Python, enabling data-driven decision-making.

data-analysis data-visualization pandas powerbi python

Last synced: 06 Nov 2024

https://github.com/mirokeimioniemi/classifying-software-pirates

Exploring the factors driving people into software piracy by training two machine learning models to predict whether a person with certain characteristics and sentiments is likely to possess any pirated software or not using a dataset collected via a survey targeting users of music production software.

data-analysis data-science decision-tree-classifier logistic-regression machine-learning piracy python software-piracy survey

Last synced: 12 Nov 2024

https://github.com/lunarwhite/lake-george-viz

Simple data analysis and visualization of Lake George with Python.

data-analysis python

Last synced: 06 Nov 2024

https://github.com/silianpan/python-data-analysis-course

python data analysis course of drotion-lega

data-analysis jupyter-notebook panda

Last synced: 06 Nov 2024

https://github.com/aisurjyasamantaray/-optimizing-target-s-brazilian-operations-insights-from-order-processing-pricing-and-payment-trends-

This project offers an in-depth analysis of consumer behavior, logistical performance, and payment preferences within the e-commerce sector. By examining order costs, delivery times, and payment methods, businesses can uncover valuable insights into operational efficiency and customer preferences.

bigquery consumer-insights data-analysis database sql target

Last synced: 12 Oct 2024

https://github.com/luminati-io/Airbnb-dataset-samples

A sample dataset of over 1000 Airbnb listings, extracted using the Bright Data API, ideal for competitor tracking, brand reputation, and market analysis.

airbnb airbnb-listings api data-analysis datasets web-scraper web-scraper-api web-scraping

Last synced: 06 Nov 2024

https://github.com/luminati-io/Shopee-dataset-samples

A sample dataset of over 1000 Shopee products, extracted using the Bright Data API, ideal for pricing optimization, gap analysis, and market strategy refinement.

api data-analysis data-mining datasets products shopee web-scraping

Last synced: 06 Nov 2024

https://github.com/luminati-io/Walmart-dataset-samples

A sample dataset of over 1000 Walmart products, extracted using the Bright Data API, ideal for consumer market insights and competitor analysis.

api data-analysis dataset walmart walmart-scraper web-scraping

Last synced: 06 Nov 2024

https://github.com/luminati-io/Indeed-dataset-samples

A sample dataset of over 1000 Indeed job listings, extracted using the Bright Data API, ideal for market analysis and growth.

api data-analysis datasets indeed jobs web-scraping

Last synced: 06 Nov 2024

https://github.com/luminati-io/Amazon-dataset-samples

A sample dataset of over 1,000 Amazon product listings, extracted using the Bright Data API, perfect for competitive analysis, market trends, and eCommerce insights.

amazon api data-analysis data-science dataset ecommerce products web-scraping

Last synced: 06 Nov 2024

https://github.com/luminati-io/Target-dataset-samples

A sample dataset of over 1000 Target products, extracted using the Bright Data API, ideal for brand reputation, tracking inventory, and optimizing prices.

api data-analysis data-mining datasets target web-scraper web-scraping

Last synced: 06 Nov 2024

https://github.com/lopez86/datascienceexamples

Examples of various data science & data analysis topics using various sources of data.

data-analysis data-science pandas scikit-learn tutorial visualization

Last synced: 06 Nov 2024

https://github.com/lopez86/rust-mlearn

Machine Learning Tools in Rust

data-analysis data-science machine-learning rust

Last synced: 06 Nov 2024

https://github.com/motapinto/agent-based-simulation-conquest

Agent-based simulation model of the Battlefield Conquest game mode

agent-based-simulation data-analysis jade java sajas swing

Last synced: 06 Nov 2024

https://github.com/ireneflorez/nypd-mvc

Analysis of NYPD Motor Vehicle Collisions

basemap data-analysis folium jupyter-notebook matplot pandas python

Last synced: 12 Nov 2024

https://github.com/chrispsang/customerchurnanalysis

Predicting customer churn using a RandomForestClassifier with detailed EDA, model evaluation, and visualization. Includes a Tableau dashboard for interactive insights.

customerchurn data-analysis data-visualization datapreprocessing machine-learning python scikit-learn tableau

Last synced: 10 Oct 2024

https://github.com/szymon-budziak/real_estate_house_prices_prediction

Predicting real estate house prices using various machine learning algorithms, including data exploration, preprocessing, model training, and evaluation.

data-analysis data-preprocessing data-science eda jupyter-notebook machine-learning matplotlib numpy optuna pandas predictive-modeling price-prediction python random-forest regression scikit-learn seaborn

Last synced: 10 Oct 2024

https://github.com/lintangwisesa/ujian_analyticsvisualization_jcds07

Exam question guide for Data Analytics & Visualization, Job Connector Data Science batch 7

data-analysis data-science data-visualisation exam

Last synced: 11 Nov 2024

https://github.com/khanovico/python-stock-analyzer

A web app implemented in Python with several data science frameworks, enabling online stock trend analysis.

amcharts-js-charts data-analysis data-visualization flask javascript pandas python scikit-learn

Last synced: 03 Nov 2024

https://github.com/dataforgeopenaihub/steam-sales-analysis

This repository features an ETL pipeline for retrieving, processing, validating, and ingesting game metadata and sales data from SteamSpy and Steam APIs. Data is stored in a MySQL database on Aiven Cloud and visualized using Tableau dashboards for insightful analysis of gaming trends and sales performance.

data-analysis data-engineering data-pipepline data-warehousing games mysql-database python steam-api tableau typer-cli

Last synced: 10 Oct 2024

https://github.com/shuddha2021/stellar-candidate-selector

A sophisticated candidate selection algorithm leveraging multi-criteria analysis and machine learning to identify top software engineering candidates. This tool features flexible filtering, score adjustment, and detailed visualizations to streamline the recruitment process.

candidate-selection data-analysis data-visualization machine-learning pandas plotting-in-python python python-data-analysis recruitment scikit-learn

Last synced: 06 Nov 2024

https://github.com/danpoynor/data-analysis-spotify-songs-2010-2019

Spotify data analysis for songs between 2010 and 2019 using Jupyter Notebooks including pandas and Seaborn plots.

data-analysis jupyter-notebook matplotlib pandas-dataframe python3 seaborn-plots spotify

Last synced: 16 Nov 2024

https://github.com/m0saan/python-for-data-analysis

Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney.

data-analysis data-science ipython-notebook machine-learning matplotlib numpy pandas python

Last synced: 16 Nov 2024

https://github.com/quocduyenanhnguyen/roi-modeling-and-analysis-of-sports-dataset

In this project, you will find my ROI model for retirement savings and a PowerPoint presentation of that model, as well as my data analysis and visualization of a Sports Ticket Sales dataset, which concludes with a group-written PDF report.

data-analysis data-visualization microsoft-excel rate-of-return-modeling sports-ticket-sales-dataset

Last synced: 08 Nov 2024

https://github.com/chingu-voyages/v47-tier3-team-30

An easily accessible tool for calculating electricity-related carbon emissions, along with insights for reducing environmental impact. | Voyage-47 | https://chingu.io/ | Twitter: https://twitter.com/ChinguCollabs

carbon-emissions carbon-footprint data-analysis data-engineering data-science

Last synced: 14 Nov 2024

https://github.com/rickcontreras/modelos1

Classification model for predicting student performance on the Saber Pro tests in Colombia. Includes exploratory data analysis, preprocessing, and machine learning models.

classification colombia data-analysis data-science education educational-assessment exploratory-data-analysis jupyter-notebook machine-learning python saber-pro scikit-learn student-performance

Last synced: 10 Oct 2024

https://github.com/themihirmathur/uber-data-analytics

The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).

bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python

Last synced: 12 Oct 2024

https://github.com/leeway64/lwwordcounter

C++ application that analyzes the frequency of words in a text file

bson cmake conan cpp data-analysis json json-schema text-analysis ubjson

Last synced: 08 Nov 2024

https://github.com/samuelsoaress/wkd-default-reduction

Reducing the default rate from 35% to 25% or less with machine learning techniques

data-analysis data-exploration data-science machine-learning-algorithms

Last synced: 10 Nov 2024

https://github.com/mysftz/statistical-analysis

An in-depth review of statistical analysis in Python from datasets.

data-analysis python python3 statistics university university-project

Last synced: 06 Nov 2024

https://github.com/mysftz/statistics-analysis

A Python statistical analysis of a dataset and probability.

data-analysis matplotlib python python3 statistical-analysis

Last synced: 06 Nov 2024

https://github.com/mysftz/numerical-methods-in-matlab

MATLAB scripts from multiple data analysis assignments.

data-analysis data-science matlab university university-assignment

Last synced: 06 Nov 2024

https://github.com/saitoxu/data-analysis-workspace

Docker image for data analysis

data-analysis docker python

Last synced: 10 Nov 2024