Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/katiesaund/tidy_tuesday

A weekly data project in R from the R4DS online learning community

data-analysis data-visualization datascience plot r rstats tidytuesday

Last synced: 14 Oct 2024

https://github.com/adrianlardies/feelms_predict_by_emotion

Feelms is a mood-based movie recommendation app that uses collaborative filtering and machine learning to suggest films based on your emotions. Built with Streamlit and powered by AWS, Feelms personalizes each user's experience through simulated interactions and tailored predictions. AWS, Feelms brings

aws-ec2 aws-rds data-analysis data-science machine-learning python streamlit

Last synced: 31 Oct 2024

https://github.com/akshaypratapsingh09/zomato-blogs-all-links-dataset

Engineering / Culture / Blogs Data gathered for Educational and Learning purposes from Zomato's Blogs and spreading the better problem solving Methodologies adapted by Modern Unicorns

data-analysis dataset regex selenium webdriver zomato-data-analysis

Last synced: 01 Nov 2024

https://github.com/prangonghose/wikipedia-blocking-policies

This study investigates the relationship between editors’ disruptive behavior and regulation policies on English Wikipedia, focusing on the Blocking Policy page. The study collects and analyzes data from 2004 to 2022 using the Wikipedia API, page statistics, and keyword extraction.

data-analysis data-visualization matplotlib open-source pandas python3 seaborn

Last synced: 20 Oct 2024

https://github.com/erickchacon/day2day

Functions that can be useful in the day-to-day data analysis. It comprehends functions to find paths for projects, make summaries of databases inside folder and so on.

data-analysis exploratory-data-analysis simulation spatial-analysis

Last synced: 17 Nov 2024

https://github.com/wadeChriestenson/Main_Application

A Django application to host my personal resume.

data-analysis data-visualization django plotly python ui-design

Last synced: 23 Oct 2024

https://github.com/thijswillemmoens/historical_document_analysis

Trying to do some Data Science with OpenAI and LLMs.

data-analysis llama2 ollama-api openai openai-api python

Last synced: 08 Nov 2024

https://github.com/jayita11/healthcare-management-optimization-analysis-and-visualization

This project analyzes healthcare data from 2019 to May 2024, optimizing patient care, resource allocation, and financial management. Insights include billing trends, blood bank management, doctor performance, and medication demand, supported by excel,interactive Tableau dashboards and SQL analysis.

data-analysis excel healthcare interactive-dashboards mysql sql tableau-dashboards

Last synced: 14 Oct 2024

https://github.com/ashwin331133/sql-healthcare-data

This repository contains SQL queries designed to analyze health care data. The queries focus on patient demographics, encounter costs, and flu shot statistics, aiming to provide insights into patient behavior and financial impacts. The datasets include information on patient encounters, flu shots, and hospital admissions.

data-analysis mysql sql

Last synced: 14 Oct 2024

https://github.com/touchesir/twitter_physicalactivity

Companion Data / Analysis for "Monitoring Physical Activity Levels using Social Media Data"

data-analysis twitter

Last synced: 15 Oct 2024

https://github.com/lit26/novel-corona-virus-2019

Data Analysis for Novel Corona Virus 2019

analysis coronavirus-case data-analysis sir-model

Last synced: 15 Oct 2024

https://github.com/lit26/data_jobs_analyzing

Data analysis for data jobs

data-analysis topic-modeling

Last synced: 15 Oct 2024

https://github.com/pedrosfaria2/analisandopostshn

Projeto para analisar as postagens da comunidade HackerNews

analise-de-dados data-analysis datetime jupyter-notebook matplotlib python python3

Last synced: 09 Nov 2024

https://github.com/farzeennimran/fashion-mnist-dataset-classification-using-neural-network

Implementation of a Multi-layer Perceptron classifier with hyperparameter tuning and k-fold cross-validation employing GridSearchCV for classifying images on the Fashion MNIST dataset 👗👚👖

artificial-intelligence data-analysis data-mining data-science dataset deep-learning fashion-mnist-dataset gridsearchcv hyperparameter-tuning kfold-cross-validation machine-learning multilayer-perceptron-network neural-network numpy pandas python sklearn

Last synced: 07 Nov 2024

https://github.com/saymyname1337/bachelor-s-thesis

Bachelor's thesis of a student of the MPEI of Shevts G. V.

data-analysis ml python

Last synced: 05 Nov 2024

https://github.com/shyamkumarnagilla/ai-powered-forecasting-for-agricultural-productivity

AI Powered Forecasting for Agricultural Productivity is a project that utilizes machine learning to predict crop yields and optimize farming practices. By harnessing historical and real-time data, this model empowers farmers with data-driven insights to enhance productivity and sustainability in agriculture.

data-analysis data-visualization deep-learning flask neural-network

Last synced: 09 Nov 2024

https://github.com/kheriberto/knn_project

This is a simple project that uses dummie data to practice and demonstrate my knowledge of the KNN algorithm.

data-analysis knn-classifier numpy python scikit-learn seaborn

Last synced: 18 Oct 2024

https://github.com/vishal-verma-96/capstone_project_by_skill_academy

Capstone Project by skill Academy- Exploratory Analysis, Visualization and Prediction of Used Car Prices. Deploying the highest-scoring model with Streamlit web app

data-analysis data-science jupyter-notebook machine-learning machine-learning-algorithms matplotlib numpy pandas python3 regression-algorithms scikit-learn seaborn

Last synced: 18 Oct 2024

https://github.com/eesunmoon/genai_cor-recom

[Project] Outfit Coordination Recommender System using KoAlpaca

data-analysis fine-tuning generative-ai huggingface keyword llm numpy pandas python python3 selenium

Last synced: 18 Oct 2024

https://github.com/bpkaur/exploring-the-evolution-of-linux

This project explores the evolution of the Linux kernel by finding top 10 contributors and visualization of commits over the years.

data-analysis data-science datacamp ipynb-jupyter-notebook python3

Last synced: 13 Nov 2024

https://github.com/tusharpandey003/chat_analysis

Analysis of group chat with respect to individual member of group

chat-analysis chat-analyzer data-analysis data-science streamlit whatsapp whatsapp-chat whatsapp-web

Last synced: 14 Oct 2024

https://github.com/i-e-b/dynamictimewarp

A quick C# implementation of https://jeremykun.com/2012/07/25/dynamic-time-warping/

data-analysis pattern-matching working

Last synced: 14 Oct 2024

https://github.com/jimmymugendi/email-sms-spam_classifier

Email SMS Spam Classifier is a cutting-edge machine learning solution designed to combat spam messages in email and SMS communications. Leveraging advanced Natural Language Processing (NLP) techniques, the system preprocesses text data by tokenizing, removing stopwords, and stemming, ensuring the most accurate classification results.

data-analysis data-visualization machine-learning-algorithms pandas seaborn-plots sklearn-library

Last synced: 16 Nov 2024

https://github.com/nitins17/tableauvisualizations

Visualizations I created while learning to work with Tableau

data-analysis data-science data-visualization tableau visualization

Last synced: 28 Oct 2024

https://github.com/mxagar/data_science_udacity

My personal notes, code and projects of the Udacity Data Science Nanodegree.

dashboard data-analysis data-engineering data-science machine-learning-pipelines

Last synced: 05 Nov 2024

https://github.com/pedrosfaria2/analisetitulosnetflix

Estudo de popularidade dos filmes da Netflix no IMDB.

analise-de-dados data-analysis jupyter-notebook matplotlib numpy pandas python

Last synced: 09 Nov 2024

https://github.com/manish506/loan-approval-prediction

Explore predictive modeling in this project by applying classification techniques to a loan approval dataset. Analyze and preprocess the data, then use models like K-Nearest Neighbors, Random Forest, SVC, and Logistic Regression to predict loan outcomes. Gain insights into approval factors and enhance prediction accuracy.

classification classification-models data-analysis data-science jupyter-notebook loan-approval-prediction machine-learning predictive-analytics predictive-modeling project python

Last synced: 31 Oct 2024

https://github.com/billgewrgoulas/hypothesis-testing-on-37-seasons-of-nba

First assignment for the course Data Mining @CSE.UOI

data-analysis data-science numpy scipy seaborn statistics

Last synced: 16 Nov 2024

https://github.com/risdorn/restaurant-delivery-platforms-analysis-bdm-project

This project analyzes restaurant delivery platforms to understand customer preferences, industry competition, and expansion opportunities. Conducted as part of the BDM project from IITM, it includes descriptive stats, distribution, correlation, regression, and geospatial analysis using multiple datasets.

data-analysis data-visualization jupyter-notebook kaggle

Last synced: 31 Oct 2024

https://github.com/pedrosfaria2/fugascomhelicoptero

Meu primeiro uso do Jupyter Notebook em um projeto

analise-de-dados data-analysis jupyter-notebook matplotlib pandas python

Last synced: 09 Nov 2024

https://github.com/ct83/become-a-data-analyst-udacity

This repository contains all of the code, projects and reports that I wrote as I pursued my Udacity - Data Analyst NanoDegree.

data-analysis data-analysis-python data-analyst data-visualisation data-visualization-project datascience python udacity udacity-data-analyst-nanodegree

Last synced: 13 Nov 2024

https://github.com/roland045/smart_fluid_sedimentation_tester

Control program for custom developed smart fluid sedimentation tester system

arduino data-analysis instrumentation measurement sensor

Last synced: 10 Nov 2024

https://github.com/mxagar/eda_fe_summary

An 80/20 guide for Data Processing: Data Cleaning, Exploratory Data Analysis, Feature Engineering, Feature Selection.

data-analysis data-cleaning data-modeling data-science data-visualization eda exploratory-data-analysis feature-engineering feature-selection machine-learning pandas

Last synced: 05 Nov 2024

https://github.com/elrf3lipes/ramon-s_portfolio

I'm passionate about Cloud and DevOps, and for the moment I'm posting some of my work and personal projects here to showcase that. If its useful for you, feel free to integrate or contribute!

api-integration biopython clinical-trials data-analysis data-extraction data-parsing django docker entrez ipython medline-xml pandas pubmed-parser requests rest-api

Last synced: 12 Oct 2024

https://github.com/cecoeco/sas_certificate

my code from Coursera's SAS programming specialization

data-analysis sas

Last synced: 07 Nov 2024

https://github.com/salfaris/toy-data-analysis

Random toy data projects. For my portfolio data projects, see linked website

data-analysis

Last synced: 15 Nov 2024

https://github.com/mxagar/space_exploration

This repository is a collection of mini-projects and tutorials related to space images and geo-spatial data.

data-analysis deep-learning geospatial machine-learning

Last synced: 05 Nov 2024

https://github.com/reddyprasade/r-program

R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.

data-analysis data-science r-programming

Last synced: 12 Oct 2024

https://github.com/lulloooo/bizdata-nexus

Collection of my Business & Data Analysis projects, from professional/academic endeavors to passion-driven explorations 📊

business-analysis data-analysis economics etl excel finance mysql python r risk-analysis

Last synced: 28 Oct 2024

https://github.com/gracysapra/r-in-data-science

This repository contains essential guides for data analysis using R, covering topics like data preparation, data reshaping, and data visualization. Each file focuses on fundamental techniques to manipulate, clean, and visualize data effectively using R programming.

data-analysis data-preparation data-reshaping data-science data-visualization data-visualizations ggplot r r-for-data-science

Last synced: 28 Oct 2024

https://github.com/jhrcook/checkplease

Analysis of an immune checkpoint-blockade screen.

bayesian-statistics data-analysis pymc3 python python3 r

Last synced: 13 Nov 2024

https://github.com/rijul007/market-basket-analysis-using-r

Market Basket Analysis using association rules, leveraging R’s powerful tools for data-driven retail strategies.

data-analysis data-science r

Last synced: 28 Oct 2024

https://github.com/datastalker/survival-cox

This repository contains an R script for performing survival analysis on breast cancer surgery data from the University of Chicago's Billings Hospital. The analysis includes Kaplan-Meier estimation and Cox Proportional Hazards modeling to assess patient survival.

breast-cancer-prediction cox-model data-analysis data-science data-visualization epidemiology kaplan-meier r survival-analysis

Last synced: 28 Oct 2024

https://github.com/dimits-ts/sport-repression-repl-study

A replication Study for the recent paper "International Sports Events and Repression in Autocracies: Evidence from the 1978 FIFA World Cup" paper.

data-analysis jupyter regression-models replication-study statistical-analysis

Last synced: 07 Nov 2024

https://github.com/wardenkenny/data-analyst-portfolio

A repository I have created to show and explore data analytics.

data-analysis excel r spreadsheets sql tableau

Last synced: 28 Oct 2024

https://github.com/se7en69/rna-seq-data-processing-and-analysis-pipeline

This pipeline automates essential steps for RNA-Seq data analysis, including quality control, read trimming, alignment to a reference genome, and coverage quantification. It leverages tools like FastQC, fastp, STAR, and bedtools to ensure high-quality results, with MultiQC reports providing an overview at each stage.

bioinformaitcs-scripting bioinformatics bioinformatics-pipeline data-analysis linux scripts shell

Last synced: 31 Oct 2024

https://github.com/dimits-ts/visualization-assignments

Visualizing and analyzing results from the PISA-2018 competitions with regards to Greek performance and gender gap.

data-analysis data-visualization interactive-graphs presentation-slides r-language tableau

Last synced: 07 Nov 2024

https://github.com/dimits-ts/college_analysis

A statistical study about US college admissions, featuring a full report in LaTeX.

anova data-analysis exploratory-data-analysis linear-regression statistics

Last synced: 07 Nov 2024

https://github.com/saiteja-talluri/data-analytics-assignement

Report on World Happiness Data (Data Analysis and Visualisation of the data)

data-analysis data-visualization ipynb-jupyter-notebook

Last synced: 05 Nov 2024

https://github.com/mysto-007/cyclistic-bike-share-analysis

Analyzed the dataset of Cyclistic Rental Service as the Capstone project for Google Data Analytics SpecializationAnalyzed the dataset of Cyclistic bike-share (Capstone project for Google Data Analytics Specialization)

bigquery data-analysis excel ms-sql-server sql tableau tableau-public

Last synced: 12 Oct 2024

https://github.com/navp7/pizzasales_powerbi

This project involves creating a comprehensive sales performance dashboard using Power BI to visualize and analyze the sales data of an Italian pizza company.

data-analysis ms-sql-server ms-word powerbi visualization

Last synced: 12 Oct 2024

https://github.com/shridhar1504/sql-projects

The repository contains Structured Query Language (SQL) Scripts. The Multiple SQL scripts for various projects which includes data cleaning, data pre-processing, data processing, data transformation and insights gaining through Query Language.

data-analysis data-mining data-science data-transformation eda etl-framework microsoft-sql-server query-language sql sql-server sql-server-management-studio sqlqueries

Last synced: 12 Oct 2024

https://github.com/bala-1409/sql-projects

The repository contains Structured Query Language (SQL) Scripts. The Multiple SQL scripts for various projects which includes data cleaning, data pre-processing, data processing, data transformation and insights gaining through Query Language.

data-analysis data-mining data-science data-transformation database eda etl-framework exploratory-data-analysis microsoft-sql-server query-language sql sql-server sql-server-database sql-server-management-studio

Last synced: 12 Oct 2024

https://github.com/karlyndiary/adidas-sales-analysis

Analyzing Adidas' sales performance and profitability across US retailers by exploring regional trends, product performance, and sales channels, using Python for data cleaning, SQL for querying, and Excel for visualizations.

adidas-sales-analysis adidas-sales-dashboard dashboard data-analysis data-cleaning data-pipeline data-visualization etl excel-dashboard microsoft-excel microsoft-sql-server python

Last synced: 12 Oct 2024

https://github.com/singhs05/global-youtube-trends

Understand the impact of Likes, comments, dislikes on the video consumption for the videos that were trending.

data-analysis mssqlserver query sql

Last synced: 12 Oct 2024

https://github.com/krzysikd/apartment-prices-in-poland-analysis-and-visualization

Data Analyst portfolio project that involves cleaning, transforming, and visualizing data to create an insightful dashboard. The project uses SSIS for ETL processes, SSMS for database management and queries, and Power BI for data visualization, focusing on the analysis of rental and sales apartment prices in Poland.

data-analysis data-cleaning data-visualizations powerbi sql sqlserver ssis

Last synced: 12 Oct 2024

https://github.com/samruddhi3012/rfm-sales-analysis

Hi there! In this project I have performed Sales Analysis (RFM Analysis) using SQL and Tableau.

data-analysis data-visualization mssqlserver rfm-analysis segmentation tableau

Last synced: 12 Oct 2024

https://github.com/shubham200137/spotify-listening-habits-analytics

Spotify Listening Habits Analytics is a project aimed at analyzing personalized Spotify listening habits and music trends. It involves Exploratory Data Analysis (EDA) with Python Pandas, data processing using SQL Server, and creating visualizations with Power BI. The goal is to uncover insights into listening patterns, track popularity, and artist.

data-analysis data-visualization exploratory-data-analysis jupyter-notebook pandas power-bi-dashboard sqlserver

Last synced: 12 Oct 2024

https://github.com/aavishkarmahajan/sql

SQL code assignments and practice questions from SQL courses, SQL data analysis

data-analysis sql sql-server

Last synced: 12 Oct 2024

https://github.com/kheriberto/linear_regression_ecommerce

Simple project showcasing crafting a linear regression model with SciKit Learn

data-analysis jupyter-notebook linear-regression pandas python scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/kmbuki/uk_police_data

R programming - Using open data about crime and policing in England, Wales and Northern Ireland.

data-analysis data-visualization r

Last synced: 17 Nov 2024

https://github.com/sreekar0101/-movie-recommendation-system-using-python

The Movie Recommendation System is designed to suggest personalized movie recommendations by analyzing extensive datasets containing movie details and credits.ultilizes python libraries numpy pandas and scikit learn.The system achieved a 15% improvement in accuracy compared to the baseline model by identifying key factors that influence user choice

data-analysis data-visualization numpy-library pandas-dataframe scikit-learn seaborn-python

Last synced: 13 Oct 2024

https://github.com/aldrinjenson/smart-qa

Query any structured data and find relations using natural language

data-analysis llm nlp sql

Last synced: 01 Nov 2024

https://github.com/kheriberto/logistic_regression_project

A project that analyses dummie data from an advertising company using logistic regression

data-analysis logistic-regression pandas python scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/ngangawairimu/linear-regression-

This project builds a linear regression model in Python to predict outcomes and derive insights from feature data. It covers data cleaning, feature analysis, and model evaluation, showcasing predictive modeling techniques using scikit-learn, pandas, and visualization libraries.

data-analysis linear-regression machine-learning predictive-modeling python scikit-learn

Last synced: 31 Oct 2024

https://github.com/kahleryasla/car-sales-data-analysis

This project retrieves and cleans car sales data from a Google Sheets document, and performs various analyses on the data, including calculating the average price of cars for each year and the standard deviation of the prices of each brand. The cleaned and modified data is then saved to an Excel file.

data data-analysis data-cleaning data-manipulation excel google-sheets pandas pandas-python

Last synced: 13 Nov 2024

https://github.com/gabboraron/biostatisztika_es_alkalmazasai

"A statisztika a matematika azon ága, melynek feladata, hogy eszközt adjon a politikusok kezébe, mellyel tetszőleges állítás és annak ellentéte is tudományos alapon igazolható"

biostatistics data-analysis data-visualization r statistics statistics-course

Last synced: 01 Nov 2024

https://github.com/timkong21/aws-batch-processing

Big data analysis with AWS services, filtering the Wikiticker dataset with Apache Spark on Amazon EMR, storing data in S3, cataloging with AWS Glue, and querying with Amazon Athena. This end-to-end pipeline exemplifies handling and analyzing big data in the cloud.

apache-spark aws aws-athena aws-emr aws-glue aws-s3 big-data cloud-computing data-analysis data-processing

Last synced: 16 Nov 2024

https://github.com/noeldevelops/stem-degrees-analysis-cpp

C++ Data Analysis, I/O - takes an external data file for processing, performs some statistical analysis, and displays the results in the console

cpp data-analysis

Last synced: 16 Nov 2024

https://github.com/maazie-khan/austin-housing-insights-powerbi

Worked with a real estate dataset, we will build a tool to evaluate trends and drivers of house prices around Austin, Texas.

dashboard data-analysis data-science data-visualization database powerbi

Last synced: 13 Oct 2024

https://github.com/maazie-khan/power-bi-projects

Welcome to my personal Power BI portfolio repository! Here you will find a collection of Power BI projects and dashboards that demonstrate my skills and expertise in data visualization, business intelligence, and analytics using Power BI.

dashboard data-analysis data-science data-visualization database excel powerbi

Last synced: 13 Oct 2024

https://github.com/puspacempaka/hackerrank-sql-challenges-intermediate

This repository features solutions to various intermediate-level SQL challenges from HackerRank. It includes efficient SQL queries, problem-solving techniques, and well-documented scripts. Explore these solutions to understand different SQL problems and enhance your skills.

challenges data-analysis database hackerrank-solutions queries sql sql-intermediate-level

Last synced: 13 Oct 2024