Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/tmmvn/analytics-notebooks

A collection of data analytics notebooks created while testing out JetBrains Datalore

ai algorithms data-analysis datalore elements-of-ai helsinki-university-mooc python

Last synced: 07 Nov 2024

https://github.com/jbn/vaquero

A Python library for iterative and interactive data wrangling at laptop-scale.

data data-analysis data-cleaning data-mining dirty-data elt etl etl-framework

Last synced: 13 Nov 2024

https://github.com/chrispsang/healthcare-dataanalysis

Analyze synthetic patient data to identify trends, improve healthcare delivery, and predict patient outcomes using machine learning models. Includes data exploration, preprocessing, model building, and visualizations.

data-analysis data-science data-visualization healthcare jupyter-notebook machine-learning python

Last synced: 06 Nov 2024

https://github.com/thisisashukla/survival-analysis

Hands-On Survival Analysis in Python

data-analysis data-science survival-analysis

Last synced: 07 Nov 2024

https://github.com/kheriberto/pandas_and_seabron_project

In this project, I showcase my ability to use pandas and seaborn to shape, transform, and plot data.

data-analysis pandas python seaborn

Last synced: 10 Nov 2024

https://github.com/kwonnayeon/medium-post-projects

Contains the code and projects from my Medium posts. I share what I've learned through trial and error to help others tackle similar work smoothly.

data-analysis data-science data-visualization medium-articles python r-language sql

Last synced: 13 Nov 2024

https://github.com/dataforgeopenaihub/steam-sales-analysis

This repository features an ETL pipeline for retrieving, processing, validating, and ingesting game metadata and sales data from SteamSpy and Steam APIs. Data is stored in a MySQL database on Aiven Cloud and visualized using Tableau dashboards for insightful analysis of gaming trends and sales performance.

data-analysis data-engineering data-pipepline data-warehousing games mysql-database python steam-api tableau typer-cli

Last synced: 10 Oct 2024

https://github.com/themihirmathur/uber-data-analytics

The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).

bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python

Last synced: 12 Oct 2024

https://github.com/sunsided/esc2024

Exploratory Data Analysis on the ESC 2024 results

csv data-analysis eurovision-song-contest scraping

Last synced: 02 Nov 2024

https://github.com/banyc/csv_logger

Long-term logger for data analysis

csv data-analysis logging

Last synced: 19 Nov 2024

https://github.com/darkdk123/handwashing-discovery-analysis

A guided bootcamp project analysing the original data used in the discovery of viruses and hand washing by Dr. Ignaz Semmelweis at Vienna General Hospital in the 1840s.

data-analysis data-science data-visualization matplotlib-pyplot numpy pandas plotly-python python seaborn-plots

Last synced: 07 Nov 2024

https://github.com/pranav016/exploratory-data-analysis-of-google-app-store-dataset

This is a data analysis done on the Google app store dataset to answer a few questions related to the data through data visualization techniques.

data-analysis

Last synced: 05 Nov 2024

https://github.com/darkdk123/house-valuation-model

A bootcamp challenge project to create an ML model that predicts house prices in Boston, Massachusetts, from multiple parameters using multivariable regression.

data-analysis data-science data-visualization matplotlib-pyplot multivariate-regression predictive-modeling statistics

Last synced: 07 Nov 2024

https://github.com/pranav016/exploratory-data-analysis-of-sp500-dataset

This is a data analysis that I performed on the S&P 500 dataset to answer a few questions through data visualization techniques.

data-analysis

Last synced: 05 Nov 2024

https://github.com/vidyadnina/cyclistic-sql-tableau-project

Trip data analysis for a bike-sharing service company using SQL and Tableau.

bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql

Last synced: 12 Oct 2024

https://github.com/thesfinox/fit-the-data

Data analysis using Wolfram Mathematica

analysis data data-analysis lab mathematica wolfram wolfram-mathematica

Last synced: 06 Nov 2024

https://github.com/thesfinox/mltools

A collection of simple tools for data science and machine learning projects.

ai data-analysis data-science data-visualization logging machine-learning matplotlib neural-network python toolbox

Last synced: 06 Nov 2024

https://github.com/spaghettifunk/gvb

Analysis of GVB in Amsterdam

data-analysis public-transportation

Last synced: 08 Nov 2024

https://github.com/marcogdepinto/olympichistoryanalysis

Python visual analysis of the Olympic Games history. Kaggle gold medal with 15000+ views, 200+ upvotes and 100+ comments.

data-analysis data-science jupyter-notebook olympic-games python seaborn

Last synced: 10 Nov 2024

https://github.com/madeiradata/microsoft-data-analysts-club

Open-source Repository of Useful Scripts and Solutions for Microsoft Data Analysts

data-analysis data-visualization microsoft-data-analysis powerbi powerbi-report

Last synced: 06 Nov 2024

https://github.com/banyc/dfplot

Summarize a data frame by plotting. `cargo install --git https://github.com/Banyc/dfplot.git`.

csv data-analysis plotly plotting statistics

Last synced: 19 Nov 2024

https://github.com/findmyway/dataframe-in-julia

A quick introduction to DataFrames in Julia for users coming from Python

data-analysis dataframe julia jupyter-notebook

Last synced: 12 Oct 2024

https://github.com/ljadhav25/django-data-analyzer

Django Data Analyzer is a web application built using the Django framework, designed to streamline data analysis tasks. Users can upload CSV files containing data for analysis. The application utilizes the powerful data manipulation capabilities of Python libraries like pandas and numpy to perform various analyses on the uploaded data.

data-analysis data-visualization django-application matplotlib numpy pandas python seaborn

Last synced: 10 Nov 2024

https://github.com/emanoelcampos/python-onemonth

This repository contains educational materials and projects developed during a Python course offered by OneMonth. It covers Python basics, intermediate concepts, web development with Flask, and data analysis with pandas. The course is structured into weeks, each focusing on a different aspect of Python programming and its applications.

data-analysis flask jupyter-notebook onemonth python python3

Last synced: 07 Nov 2024

https://github.com/mlund2k/project-1-baseball-performance-vs.-attendance

Project assets for my first exploratory data analysis: Baseball Performance vs. Attendance.

bigquery data-analysis data-cleaning data-visualization excel rstudio sql tableau tidyverse

Last synced: 12 Oct 2024

https://github.com/ksharma67/eda-on-ipl

This Python notebook analyzes IPL matches from 2008 to 2020 using Python packages such as pandas, matplotlib, and seaborn.

data-analysis data-science eda matplotlib numpy pandas python seaborn

Last synced: 06 Nov 2024

https://github.com/buildwithlal/introduction-to-data-science-in-python-coursera

Introduction to Data Science in Python, part of the Applied Data Science with Python Specialization from the University of Michigan, offered on Coursera

data-analysis matplotlib numpy pandas

Last synced: 08 Nov 2024

https://github.com/faezeh-gholamrezaie/visual-google-scholar-search

A Python script that searches Google Scholar for specific keywords and visually presents the results in various chart formats, enabling researchers to analyze trends and insights in academic literature.

academic academic-research academic-trends ai ai-research bibliometrics data-analysis data-visualization google-scholar publication-analysis python research-trends scholarly scholarly-data word-cloud

Last synced: 07 Nov 2024

https://github.com/virajbhutada/article-clustered-recommendation-system-ml

This project aims to redefine content discovery by delivering personalized article recommendations tailored to individual user preferences. We use advanced machine learning techniques like PCA and K-means clustering to analyze user behavior and article characteristics to provide highly accurate recommendations.

anaconda article-recommendation clustering-algorithm data-analysis data-science keras-tensorflow machine-learning machine-learning-algorithms ml-models numpy pandas plotly python scikit-learn scipy

Last synced: 15 Oct 2024

https://github.com/vaishnavipaithane/bellabeat-data-analysis-case-study

This capstone project was completed as part of the Google Data Analytics Professional Certificate course.

bigquery data-analysis sql tableau

Last synced: 12 Oct 2024

https://github.com/maazie-khan/olympics-data-enigeering

Worked with Azure Data Factory, Databricks, Data Lake Storage, and Synapse Analytics to build an ETL pipeline for processing and analyzing Olympic Games data from Kaggle.

azure big-data data-analysis dataengineering devops pipeline

Last synced: 14 Oct 2024

https://github.com/an4pdm/relatorio-de-vendas

This project was built with the tools offered by Power BI in order to improve my knowledge of ETL. The data used came from the Kaggle website.

data-analysis data-visualization database etl powerbi

Last synced: 13 Nov 2024

https://github.com/dionixius7/titanic-disaster-ml-model

This project predicts the survival of Titanic passengers using the Kaggle Titanic Disaster dataset. The dataset contains passenger information such as age, gender, and class. Several machine learning algorithms are applied to predict passengers' survival chances accurately.

data-analysis data-science data-visualization eda knn-classifier machine-learning neural-network python scikit-learn svm tensorflow titanic-kaggle titanic-survival-prediction

Last synced: 17 Nov 2024

https://github.com/mrendiks/analyst-data-survey-monkey

Learn how to analyze data from a SurveyMonkey dataset using Excel and Python

data-analysis ipynb-jupyter-notebook python

Last synced: 13 Nov 2024

https://github.com/the-pinbo/dimensionalityredux-pca-vs-autoencoders

Comparative study of PCA and Autoencoders for effective dimensionality reduction, assessed through PSNR and SSIM metrics.

autoencoder-mnist autoencoders data-analysis dimensionality-reduction image-compression mnist neural-networks pca psnr ssim

Last synced: 06 Nov 2024

https://github.com/abidshafee/google.colaboratory_projects

This repository contains a collection of interactive Python notebooks (ipynb) from some of my projects on Data Science, Machine Learning (ML), and Natural Language Processing (NLP).

colaboratory data-analysis data-science lstm machine-learning nlp statistics time-series

Last synced: 06 Nov 2024

https://github.com/nickenshidqia/sql-for-financial-data-analysis

Design SQL queries to generate accurate and timely financial reports including Profit and Loss statements, Balance Sheets, and Cash Flow statements

azure-data-studio data-analysis finance microsoft-sql-server sql

Last synced: 17 Nov 2024

https://github.com/alinenog/desenvolve_gb_2022

Grupo Boticário's Desenvolve 2022 training program in the data field

data-analysis data-science googlesheet machine-learning numpy pandas python

Last synced: 06 Nov 2024

https://github.com/datalopes1/desafio_delivery

A challenge from the Universidade dos Dados subscription club (Clube de Assinaturas) that simulates the real demands placed on a data analyst

data-analysis jupyter python

Last synced: 11 Oct 2024

https://github.com/alexgenovese/react-charts-covid-19-data

Examples of COVID-19 data visualized with different charting libraries: G2, G2Plot, Plotly, ApexCharts

data-analysis data-science data-visualization react reactjs

Last synced: 07 Nov 2024

https://github.com/bineet-ratna-shakya/data-science-salary-analysis

Analyzing a dataset containing salaries of data science professionals from 2020 to 2023.

data-analysis data-science data-visualization jupyter numpy pandas python

Last synced: 11 Oct 2024

https://github.com/ramapinnimty/udacity-mlfoundation-nanodegree

This is a repository containing solutions to the assignments that are a part of the Udacity Machine Learning Foundation Nanodegree program.

assignments data-analysis python3 statistics udacity-machine-learning-nanodegree

Last synced: 08 Nov 2024

https://github.com/zimmi48/nixpkgs-issues

Analysis on nixpkgs issue lifetime.

data-analysis github-api nixpkgs

Last synced: 06 Nov 2024

https://github.com/itsmeyogesh22/people-s-analytics-case-study

Part of Danny Ma's virtual apprenticeship online program, "People's Analytics Case Study" aims to demonstrate the practical use of various SQL concepts such as materialized views, snapshot data, and historical data.

danny-ma data-analysis dataanalysis datascience mssqlserver pgadmin4 postgresql snapshot-data sqlserver t-sql

Last synced: 17 Nov 2024

https://github.com/prakashjha1/new-analysis-using-llm-locally

An interactive news analysis tool built with Streamlit and local LLMs. This app allows users to analyze and gain insights from the latest news articles using advanced language models, all running locally. Explore trends, sentiment, and key topics with an intuitive interface.

artificial-intelligence data-analysis data-science llms ollama python streamlit

Last synced: 12 Oct 2024

https://github.com/baguilar6174/python-jupyter-notebooks

Explore data analysis projects with Python, Jupyter and more tools. Discover stunning visualizations and reveal meaningful information in datasets to make informed decisions.

data-analysis jupyter-notebook kaggle pandas python

Last synced: 07 Nov 2024

https://github.com/shreeparab1890/chat-analyzer

A data analysis project that analyzes WhatsApp chats.

data-analysis numpy pandas python

Last synced: 08 Nov 2024

https://github.com/shreeparab1890/flipkart-laptops-analysis-eda

This IPython notebook contains an exploratory data analysis (EDA) of the laptops listed on Flipkart.

data-analysis eda exploratory-data-analysis matplotlib-pyplot numpy pandas plotly

Last synced: 08 Nov 2024

https://github.com/shreeparab1890/indian-elections-2019-analysis-eda

This IPython notebook contains an exploratory data analysis (EDA) of the Indian Lok Sabha Elections 2019.

data data-analysis data-science data-visualization eda exploratory-data-analysis matplotlib numpy pandas plotly python python3 visualization

Last synced: 08 Nov 2024

https://github.com/shreeparab1890/unicorns-of-india-till-sep-2022-analysis-eda

This IPython notebook contains an exploratory data analysis (EDA) of the unicorns of India up to September 2022.

analysis data-analysis eda exploratory-data-analysis matplotlib-pyplot numpy pandas plotly

Last synced: 08 Nov 2024

https://github.com/tj2904/lfb-callout-analysis

An investigation into the London Fire Brigade's callout data.

data-analysis decsion-tree kmeans lfb-incidents london-fire-brigade pandas python seaborn

Last synced: 07 Nov 2024

https://github.com/astropenguin/optimap

Optimized integrated intensity map method for spectral cubes

astronomy data-analysis data-science python python3 radio-astronomy spectral-cubes

Last synced: 05 Nov 2024

https://github.com/madhursinghbhadoriya/data_analysis_fifa-players

Processed important information and characteristic traits in a Jupyter Notebook using NumPy, Matplotlib, Pandas, etc.

analysis data-analysis data-science graphs jupyter-notebook pandas python

Last synced: 05 Nov 2024

https://github.com/madhursinghbhadoriya/data_analysis_sales_insights_using_tableau

Performed data cleaning using MySQL, and data analysis and ETL in Tableau. Created an interactive dashboard with significant information about sales insights and profit and revenue analysis.

data-analysis data-visualization dataanalysis etl mysql tableau-dashboards tableau-desktop

Last synced: 05 Nov 2024

https://github.com/josafary-ds/curso_dnc

Repository for storing study files and projects from the DNC Data Scientist course

data-analysis data-science data-visualization machine-learning powerbi python

Last synced: 19 Nov 2024

https://github.com/alinababer/data-science-and-insight-agent-rag-llama3-lava-llm

Data-Science-and-Insight-Agent-RAG-LLama3-Lava-LLM-Django-WebApplication is an advanced AI-driven chatbot designed to assist with data science, document analysis, and image interpretation. This repository contains the data science agent of the project.

artificial-neural-networks classifcation data-analysis data-engineering data-visualization datascience large-language-models llama2 lstm machine-learning python random-forest regression

Last synced: 12 Oct 2024

https://github.com/changyeop-yang/study-datasciencefoundation

Big data science and analytics play a major role in this decade. Several challenges remain: how to clean and prepare your data for analysis, how to perform basic visualization of your data, how to model your data, how to curve-fit your data, and finally, how to present your findings and wow the audience.

data-analysis ios kyungpook-national-university swift

Last synced: 06 Nov 2024

https://github.com/tsffarias/my-books

Exploratory analysis of my dataset 'All_the_Books_I_read', which contains all the books I've read

books data-analysis python tableau

Last synced: 05 Nov 2024

https://github.com/virajbhutada/diamond-price-estimator

This project develops a predictive model to estimate diamond prices based on characteristics like carat, cut, color, and clarity. It covers data preprocessing, feature engineering, model selection, training, and evaluation. The final product is a web app where users can input diamond attributes to get accurate and instant price predictions.

cross-validation css data-analysis data-science-projects data-visualization eda feature-engineering html hyperparameter-tuning jupyter-notebooks machine-learning ml-algorithms model-deployment model-selection performance-optimization predictive-modeling python python-app user-interface

Last synced: 11 Nov 2024

https://github.com/virajbhutada/hr-analytics-excel-sql-tableau-powerbi

Explore a comprehensive HR Analytics portfolio showcasing data analysis and visualization skills. Featuring dashboards in Power BI, Excel, and Tableau, along with SQL queries for deeper insights. A holistic view of expertise in HR analytics, data visualization, and database management. Let's dive into the game of data insights!

data-analysis data-management data-visualization excel hr-analytics interactive-dashboards portfolio-project postgresql powerbi powerbi-visuals sql sql-queries tableau tableau-public

Last synced: 11 Nov 2024

https://github.com/greenpau/esqrunner

Run Elasticsearch queries and create metrics based on the query results in an Elasticsearch database.

data-analysis elasticsearch query-builder querydsl

Last synced: 13 Oct 2024

https://github.com/aniketmondal/dataanalysis

Contains cleaning, transformation, and exploratory analysis of various data sets using Python Pandas, NumPy, re, random, etc.

analysis data-analysis data-science pandas python

Last synced: 07 Nov 2024

https://github.com/tomijuarez/lemmatisation

Lemmatisation fully implemented in Java.

algorithms data-analysis data-science java-8 lemmatization oop

Last synced: 05 Nov 2024

https://github.com/abdelrahmanbayoumi/titanic-machine-learning-from-disasters

Given a training set listing passengers who did or did not survive the Titanic disaster, can our model determine, from a test dataset that does not contain survival information, whether the passengers in that test dataset survived?

data-analysis data-science data-visualization machine-learning pandas

Last synced: 05 Nov 2024

https://github.com/cassiofb-dev/fide-rating-analysis

The plot speaks for itself

chess data-analysis fide hans rating

Last synced: 07 Nov 2024

https://github.com/airscholar/data_analysis_with_ai

A repository showing how to use AI and ChatGPT for Data Analysis with Pandas and Python

chatgpt data-analysis gpt4 openai pandas pandasai python

Last synced: 14 Nov 2024

https://github.com/malucor/analise_exploratoria_dados

A Python program that performs an exploratory data analysis of logistics data.

analise-de-dados analise-exploratoria analise-exploratoria-de-dados data-analysis ebac exploratory-data-analysis python

Last synced: 09 Nov 2024

https://github.com/camara94/data_analyse_series_temporelles

In this tutorial, we answer the following questions: 1. Read the Microsoft data using the **Pandas Data reader** package 2. Get the **maximum price** of the stock from **2017 to 2022** 3. What is the **date of the highest price** of the stock? 4. What is the **date of the lowest price** of the stock?

data-analysis data-analysis-python data-science data-structures-and-algorithms data-visualization serie series-forecasting

Last synced: 05 Nov 2024

https://github.com/kishlayjeet/zomato-data-exploration

In this project, we will be exploring a dataset containing information on various restaurants and their ratings, location, and other attributes.

data-analysis eda zomato-data-exploration

Last synced: 06 Nov 2024

https://github.com/manikantasanjay/time_series_data_analysis_on_stocks

Time series data analysis project on the daily stock prices of the following companies (Apple, Microsoft, Google, Amazon) over a span of 5 years.

data-analysis pandas stock time-series time-series-analysis

Last synced: 05 Nov 2024

https://github.com/aleskandro/r-hadoop-madreduce-examples

Many examples of using R with Hadoop for MapReduce, with and without libraries such as rhadoop/rhipe - [email protected] - Advanced Programming Languages

data-analysis hadoop mapreduce r

Last synced: 07 Nov 2024

https://github.com/malucor/livros

A Python program that analyzes data about books from an Excel file.

analise-de-dados book books bookshelf data-analysis livro livros python

Last synced: 09 Nov 2024