An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/sevdanurgenc/r-programming-for-data-science-lecture-notes

This repo holds the course contents of the R Programming for Data Science training, to be given to Sigorta Bilgi ve Gözetim Merkezi in cooperation with Academy Peak Information Technologies Training and Consultancy, 21-23 March 2023.

data-analysis data-science data-visualization r r-programming r-programming-projects

Last synced: 11 Oct 2025

https://github.com/martinthoma/shell-history-analysis

Analyze how you use your shell

data-analysis python shell

Last synced: 24 Apr 2025

https://github.com/sayakpaul/analysis-of-college-database-of-2017-passouts

Contains my analysis of a database containing information about the students of an engineering college.

data-analysis data-visualization matplotlib python-3

Last synced: 12 Jun 2025

https://github.com/cafali/pathscan

PathScan exports information about the contents of directories and hard drives. With a single click, you can create a complete list of all files and paths within a specific folder or across an entire hard drive.

backup command-line data data-analysis data-migration data-mining data-recovery directory folder-management folders forensics hard-drive keyword-extraction logging pathfinding recovery string-search tools utility windows

Last synced: 10 Oct 2025

https://github.com/patilni3/project_sql

Data Analysis using SQL

census-data data-analysis sql

Last synced: 08 Oct 2025

https://github.com/spacetelescope/stistools

Tools for HST/STIS.

astronomy data-analysis hst stis

Last synced: 10 Oct 2025

https://github.com/iondv/report

IONDV. Framework: the Report module builds analytical reports.

analytics businessintelligence css data data-analysis data-visualization iondv iondv-module reporting

Last synced: 19 Oct 2025

https://github.com/aglove2189/appias

Machine learning workflow toolkit ✨🦋✨

appias data-analysis data-science machine-learning pandas python sklearn workflow

Last synced: 19 Oct 2025

https://github.com/mahshaaban/intro_data_r

A gentle introduction to data analysis in R

data-analysis image-analysis qpcr-analysis r

Last synced: 28 Apr 2025

https://github.com/theengineeringworld/numpy-data-science

NumPy Data Science Essential Training Course. Part of the YouTube course offered by TheEngineeringWorld.

data-analysis data-science numpy numpy-exercises numpy-library numpy-tutorial python python-3-6 python3 scipy2018

Last synced: 09 Oct 2025

https://github.com/afsalashyana/whatsapp-chat-analyzer

Analyze WhatsApp chats with beautiful graphs. Written in JavaFX

data-analysis data-visualization javafx javafx-14 javafx-application whatsapp

Last synced: 04 Sep 2025

https://github.com/darsan-in/rumour-monger-spotter

Rumour Monger Spotter is a prototype developed during a national-level cyber hackathon to identify false information on Twitter. Using the Google Fact Check API and a Multinomial Naive Bayes classifier, the tool analyzes tweet content to assess the likelihood of misinformation. Despite a development window of less than 24 hours, the project won a t

ai data-analysis fact-checking hackathon india naive-bayes national-competition natural-language-processing prototype real-time-analysis social-media text-classification tweet-content twitter

Last synced: 12 Oct 2025

https://github.com/lucasbotang/real_estate_management_data_analysis

Data analysis for real estate management

data-analysis excel mysql tableau

Last synced: 06 Oct 2025

https://github.com/fatihilhan42/data-science-projects

This repo contains beginner-to-upper-level projects in the field of data science. I will add the projects I complete in this field to the repo every day, in the hope that they will be useful to those who, like me, are interested in data science and are just starting out...

data-analysis data-engineering data-mining data-science data-structures data-visualization database datascience fatihilhan fortytwo fortytwofficial jupyter-notebook python

Last synced: 11 Oct 2025

https://github.com/1ayanabil1/healthcare-machine-learning

Explore our open-source repository focused on healthcare machine learning. We've developed predictive models for cardiovascular disease, diabetes, breast cancer, and more. Our projects employ diverse machine learning algorithms and data science techniques, enhancing early detection, diagnosis, and patient outcomes.

data-analysis data-science deep-learning disease disease-detection disease-modeling disease-prediction eda healthcare-application heathcare jupyter-notebook machine-learning machine-learning-algorithms machinelearning-python python

Last synced: 28 Apr 2025

https://github.com/neutrinoceros/gpgi

A lightweight Python library for efficient in-RAM particle deposition on rectilinear, unrefined grids.

data-analysis grid particles performance

Last synced: 22 Apr 2025

https://github.com/sondosaabed/introduction-to-data-analysis-with-pandas-and-numpy

Learning the data analysis process of questioning, wrangling, exploring, analyzing, and communicating data. Working with data in Python using libraries like NumPy and pandas.

data-analysis data-analyst-nanodegree data-wrangling numpy pandas python

Last synced: 09 Apr 2025

https://github.com/globeandmail/startr-cli

A command-line scaffolder for the startr R project template

data-analysis data-journalism data-visualization journalism r

Last synced: 23 Apr 2025

https://github.com/efharkin/ez-ephys

Easy IO, inspection, and manipulation of electrophysiological data.

data-analysis electrophysiology neurophysiology neuroscience patch-clamp python

Last synced: 14 Jan 2026

https://github.com/itzmeanjan/indian-railway

Exploring Indian Railways time table dataset, with ❤️

data-analysis data-visualization indian-railways matplotlib python python3 railway

Last synced: 17 Oct 2025

https://github.com/taylorteixeira/projeto-ed-satc

This project was developed to demonstrate Data Engineering features and practices through the efficient integration of infrastructure, ingestion, processing, and analysis of data at large scale.

azure data-analysis databricks mongodb terraform

Last synced: 09 Oct 2025

https://github.com/sondosaabed/data-visualization-with-matplotlib-and-seaborn

Learning to apply sound design and data visualization principles to the data analysis process. Also learning how to use analysis and visualizations to tell a story with data.

data-analysis data-analyst-nanodegree data-visualization matplotlib python seaborn seaborn-plots

Last synced: 09 Apr 2025

https://github.com/dcs-training/datavisualisationwithr

Data Visualisation with R Workshop (delivered by the Centre in December 2020). This workshop focuses on visualising your data. Go to the readme file.

data-analysis data-visualisation data-wrangling r

Last synced: 25 Apr 2025

https://github.com/sushantdhumak/traffic-forecasting-using-iot-sensor-data

Demonstrates how to utilize XGBoost for traffic forecasting using data gathered from IoT sensors, highlighting its efficiency in processing complex datasets and delivering accurate predictions.

data-analysis data-visualization exploratory-data-analysis feature-engineering feature-importance feature-selection gridsearchcv hyperparameter-optimization hyperparameter-tuning iot random-search xgboost-regression

Last synced: 26 Mar 2025

https://github.com/jackfiszr/pl2xl

Nodejs-polars wrapper with `readExcel` and `writeExcel` methods.

data-analysis data-science deno excel excel-reader excel-writer nodejs polars

Last synced: 21 Jan 2026

https://github.com/dcs-training/digital-method-of-the-month

In this repository you will find the documents we produced to support the discussion in our Digital Methods of the Month. These documents will help you orient yourself if you want to pick up the method in your research. Go to the readme file.

3d-data data-analysis data-visualisation data-wrangling geographical-data gis good-practices-digital-research machine-learning network-analysis open-research preregistration statistics text-analysis

Last synced: 28 Oct 2025

https://github.com/sonigarima/donation-management-system

A donation management system for NGOs and Donors. The project is designed for Cognizance IITR 2021 - Salesforce Codathon.

data-analysis donation-management reactjs

Last synced: 07 Sep 2025

https://github.com/dsnchz/solid-g6

A SolidJS component library for graph visualization, powered by @antv/g6

analysis data-analysis data-visualization graph graph-visualization node-ui solidjs visualization

Last synced: 13 Oct 2025

https://github.com/paezha/edashop

An open educational resource to teach a workshop on Exploratory Data Analysis in R

data-analysis exploratory-data-analysis open-educational-resources package r rstats workshop-materials

Last synced: 18 Mar 2025

https://github.com/maksimekin/umd_data_challange_2020

Ocean cleanup data analysis project for the UMD Data Challenge 2020. Data Exploration for a Sustainable Planet.

cleanup competition data-analysis data-science folium geolocation machine-learning ocean planet pollution sklearn sustainability time-series trash umd

Last synced: 05 Jul 2025

https://github.com/waveform80/structa

A small utility for analyzing data structures (e.g. JSON files)

csv data-analysis data-visualization datajournalism datawrangling json yaml

Last synced: 06 Sep 2025

https://github.com/chaitanyac22/house-price-prediction-project-for-a-us-based-housing-company

The goal of this project is to use data analytics to find houses priced below their actual value, so they can be purchased and resold at a higher price. The project builds a regression model with regularization (i.e. advanced linear regression: Ridge and Lasso regression) to predict the actual values of prospective housing properties and decide whether to invest in them.

advanced-linear-regression business-analytics data-analysis data-cleaning data-manipulation data-visualization exploratory-data-analysis feature-engineering lasso-regression linear-regression machine-learning model-building model-evaluation prediction-model python3 regularization rfe ridge-regression statistics

Last synced: 03 Jul 2025

https://github.com/srinivasrm/mutual-funds-analysis-and-prediction

In this project I performed analysis and prediction of 1-, 3-, and 5-year returns for 1064 mutual funds in India. I scraped the data from the most visited website for mutual fund investments. I tested several regression models: linear model, SGD Regressor, Random Forest Regressor, Decision Tree Regressor, Ridge, MLP Regressor, and linear model (Lasso). I then selected the best-performing model, performed hyperparameter tuning, and deployed an interactive application that generates the visualization and emails it to the user's email address.

beautifulsoup data-analysis data-base data-cleaning data-science deployment etl finanace frontend funds machine-learning mutual mutual-funds pgsql python scikit-learn sql streamlit web webapplication

Last synced: 27 Oct 2025

https://github.com/johnsell620/sentiment-analysis-goodreads-reviews

Document-level sentiment analysis of book reviews scraped from the Goodreads website. Technologies used include TensorFlow, Spark, HDFS, Sqoop, Scrapy, and D3.js.

data-analysis data-visualization recurrent-neural-networks web-scraping

Last synced: 30 Apr 2025

https://github.com/richiejp/jdp

Automatically collect and normalise data, then run algorithms on it.

automation-framework data-analysis suse-qa

Last synced: 02 Jan 2026

https://github.com/yash22222/tata-data-visualisation-virtual-internship

Data Visualisation: Empowering Business with Effective Insights. Gain insights into leveraging data visualisations as a tool for making informed business decisions.

basics ceo charts cmo data-analysis data-interpretation data-science data-visualization graphs machine-learning mcq microsoft-excel microsoft-power-bi microsoft-word powerpoint-presentations python tableau tata tata-data-visualisation

Last synced: 22 Jul 2025

https://github.com/quantumudit/consumer-goods-sales-analysis

This project focuses on analyzing and visualizing the consumer goods sales in the United States between 2015-2016 using Python & Power BI.

data-analysis data-visualization database jupyter-notebook python sqlite

Last synced: 01 Nov 2025

https://github.com/c0deta1ker/matbasex

MatBaseX is an all-in-one database and analytical tool for photoelectron spectroscopy (PES) analysis, focused on materials and their X-ray interactions. It offers features like a Materials Properties Database, IMFP & XPS Sensitivity Factor Calculator, and PES N-Layer Simulations & Curve Fitting utilities. Explore its powerful capabilities today!

cross-sections crystal-structure crystallography data-analysis data-fitting database electron imfp imfp-calculator-matlab material material-database matlab matlab-application matlab-gui matlab-toolbox pes-modelling photoelectron-spectroscopy photoionization simulation xps

Last synced: 01 Jul 2025

https://github.com/tuliosg/cdp

Repository for the course "Ciência de Dados para Pesquisa" (Data Science for Research).

data-analysis data-manipulation data-science data-visualization google-colab jupyter-notebook python

Last synced: 14 Jul 2025

https://github.com/cosmoduende/r-ufo-sightings

Are we alone in the universe? - Data Analysis and Data Visualization of UFO sightings with R. How to analyze and visualize data of UFO sightings of the last century in the USA and the rest of the world with R language.

data-analysis data-analytics data-science data-visualisation data-visualization data-visualizations dataviz ovni ovni-dataset r-code r-language r-programming r-stats ufo ufo-analysis ufo-dataset ufo-sighting ufo-sightings

Last synced: 13 May 2025

https://github.com/accurat/react-dataviz

⚛📊🚀 React components to build powerful interactive data visualizations

d3 data-analysis data-visualization react react-components

Last synced: 19 Jun 2025

https://github.com/trainingbypackt/splunk-7-essentials-elearning

Build an elaborate Splunk enterprise environment that will extract powerful insights from your machine-generated big data

data-analysis eventgen indexing machine-learning splunk sub-search visualization

Last synced: 10 Apr 2025

https://github.com/i10mm/gpt-arxiv-fetcher

Revolutionize your research with our GitHub repository, where GPT meets arXiv API for seamless access and analysis of the latest academic papers!

artificial data-analysis intelligence llm machine-learning

Last synced: 14 Jul 2025

https://github.com/franpog859/top-of-the-world

🌍🔝 Proof that your country is the top of the world using GeoTIFF images and a little bit of geometry. Data mining project

data data-analysis data-mining elevation geometry geotiff image-processing matplotlib nvector rasterio

Last synced: 30 Apr 2025

https://github.com/avinashkranjan/basic-data-analysis-and-visualization-in-python

📊 Some of the most important Python tools in data science for Data Analysis and Data Visualization.

data-analysis data-science matplotlib matplotlib-pyplot numpy pandas plotly seabourne

Last synced: 30 Oct 2025

https://github.com/asifdotexe/sentimentscoringmodel

This project focuses on performing sentiment analysis on Amazon reviews using natural language processing (NLP) techniques. It includes various steps, from data exploration and preprocessing to building and evaluating sentiment models.

data-analysis data-visualization natural-language-processing sentiment-analysis

Last synced: 28 Oct 2025

https://github.com/super-lou/exstat

🌾 R package to provide an efficient and simple solution to aggregate and analyze the stationarity of time series

climate climate-change climate-data data-analysis data-visualization environment hydrology inrae low-water mann-kendall mann-kendall-tests r stationarity-test statistics time-series

Last synced: 13 Apr 2025

https://github.com/irfanchahyadi/ml-notes

Complete personal notes on performing data analysis, preprocessing, and training ML models.

data-analysis machine-learning plotting python

Last synced: 11 Jul 2025

https://github.com/rikard-helgegren/leverage_analysis_tool

Analyst tool for portfolio construction. How can leveraged certificates be used to increase returns in a portfolio while keeping the risk as low as possible? Use the tool and find out.

cpp data-analysis investment kivy-framework python3

Last synced: 12 Apr 2025

https://github.com/pythondeveloper6/store-sales-eda

Simple EDA with some insights on store sales

data-analysis eda matplotlib numpy pandas seaborn

Last synced: 11 Apr 2025

https://github.com/louis-heraut/exstat

🌾 R package to provide an efficient and simple solution to aggregate and analyze the stationarity of time series

climate climate-change climate-data data-analysis data-visualization environment hydrology inrae low-water mann-kendall mann-kendall-tests r stationarity-test statistics time-series

Last synced: 16 Jun 2025

https://github.com/kellyjadams/spotify-data-analyze

A serverless data pipeline that logs my Spotify listening history to BigQuery using Cloud Run, then visualizes trends with Looker Studio. Built with Python, Flask, Docker, and GCP.

data-analysis data-engineering

Last synced: 07 May 2025

https://github.com/scottgriv/river-charts

🌊 📉 A Python, Django, Plotly, and Pandas web application that visualizes river data in real time, pulled using an API from the United States Geological Survey (USGS).

api charts data data-analysis data-visualization dataset django pandas plotly python usgs usgs-api visualization webapp

Last synced: 12 Aug 2025

https://github.com/pepe-god/dataprophet

Extracts citizens' identity information from MySQL, builds a family network based on TC ID No., and exports it to CSV

101m 109m adres data-analysis data-extraction database-connector family-tree genealogy gsm hsys identity mysql-database python-script pyton

Last synced: 13 Jul 2025

https://github.com/negativenagesh/spam-ham_email_detection_machine_learning

This project focuses on classifying spam/ham emails using machine learning algorithms such as LGR, NB, RF, and DT. Based on the accuracy and precision scores, I chose logistic regression for the classification. The frontend is built with Streamlit.

app data-analysis data-cleaning data-engineering data-science data-visualization data-visualizations jupyter-notebook logistic-regression machine-learning modeling naive-bayes-classifier nlp python

Last synced: 12 Apr 2025

https://github.com/rhenkin/visxhclust

A Shiny app and functions for visual exploration of hierarchical clustering.

clustering data-analysis data-science r r-package r-shiny rstats shiny-apps

Last synced: 02 Apr 2025

https://github.com/sushant1827/traffic-forecasting-using-iot-sensor-data

Demonstrates how to utilize XGBoost for traffic forecasting using data gathered from IoT sensors, highlighting its efficiency in processing complex datasets and delivering accurate predictions.

data-analysis data-visualization exploratory-data-analysis feature-engineering feature-importance feature-selection gridsearchcv hyperparameter-optimization hyperparameter-tuning iot random-search xgboost-regression

Last synced: 17 Jul 2025

https://github.com/ynikitenko/lena

Lena is an architectural framework for data analysis

analysis-framework analysis-pipeline data-analysis data-science

Last synced: 30 Apr 2025

https://github.com/quantumudit/movie-ratings-analysis

This project focuses on analyzing and finding correlations between the audience and critic ratings for some of the popular movies released between 2009-2011 using Python & Power BI

data-analysis data-visualization jupyter-notebook power-bi python

Last synced: 01 Nov 2025

https://github.com/orkunaktas/sofascore-webscraping

⚽️I scraped the shot data of the Fenerbahçe - Adana Demirspor match from Sofascore⚽️

beautifulsoup data-analysis football-analytics football-data selenium webscraping

Last synced: 28 Oct 2025

https://github.com/sowinskibraeden/schedulegeneratorapp

The Desktop Application for my schedule-generator algorithm, allowing users to easily interact with the algorithm and its variables to generate schedules as documents for students individually as well as the master timetable

algorithm csv data-analysis dataclasses python-docx python-typing python311 xlsxwriter

Last synced: 09 Jul 2025

https://github.com/coderjolly/player-market-value-prediction

Intense speculation surrounds all major player transfers today, and an important part of negotiations is predicting a fair market price for a player. This project predicts a player's market value using data provided in CSV format.

data-analysis data-visualization decision-tree-regression machine-learning xgboost-regression

Last synced: 22 Jun 2025

https://github.com/saksham-joshi/sentiment_analyzer

Analyze the sentiment of a text stored in a string or file and understand the reason why your blogs and posts are not ranking up.

data-analysis data-analytics python sentiment-analyser sentiment-analysis sentiment-analysis-without-nltk

Last synced: 22 Aug 2025

https://github.com/gxjansen/user-analysis-with-r-google-analytics

Analyzing user behavior of an E-commerce website with R and (mainly) Google Analytics Data

analytics analytics-api conversion-rate-optimization data-analysis ecommerce google google-analytics r

Last synced: 27 Mar 2025

https://github.com/stimulsoft/stimulsoft.dashboards.php

Dashboards.PHP is a complete software package for designing and viewing dashboards. Includes the JS data analysis engine, dashboard designer, and viewer. Supports PHP 5, PHP 7, and PHP 8.

charts dashboard-builder dashboards data-analysis data-grid data-visualization datatable dynamic-dashboard interactive-dashboards live-data mysql-data php php-bi-tools php-dashboard php-kpi php7 php8 pivot-tables sql-datasources statistics

Last synced: 14 Oct 2025

https://github.com/thecoderpinar/earthquake_prediction_analysis_project

🌍 Welcome to the Earthquake Prediction Analysis Project! 🚀 This project aims to predict earthquake magnitudes using LSTM neural networks and analyze seismic data. Explore, analyze, and forecast earthquakes with ease! 📈🔮

analysis data-analysis data-science earthquake-prediction geocoding geology lstm lstm-neural-networks machine-learning matlab matlab-deep-learning open-source time-series visualization

Last synced: 16 Aug 2025

https://github.com/apoorvalal/lalrutils

Misc utility functions in R for personal use.

data-analysis r r-package

Last synced: 04 Oct 2025

https://github.com/jonzeolla/lab-securitydataanalysis

An introductory lab to Security Data Analysis (using Apache Metron (incubating)).

apache-metron data-analysis lab metron security

Last synced: 03 Jul 2025

https://github.com/lisa-ho/three-investigators

Repository for scraping and analysing fan data on a German audio drama called 'Die Drei Fragezeichen' (the three investigators).

data-analysis data-viz datawrapper python webscraping

Last synced: 25 Oct 2025

https://github.com/alexeyev/hse-spb-bigdata-python-fall2016

Materials for the course on programming and data analysis tools taught at the St. Petersburg campus of HSE University (NRU HSE) in autumn 2016

course-materials data-analysis numpy pandas python scikit-learn sklearn

Last synced: 26 Feb 2025

https://github.com/robinmillford/cortex-ai-multi-model-insights-hub

Cortex AI: Multi-Model Insights Hub is an advanced platform that leverages cutting-edge AI to empower your research, analysis, and data exploration. By integrating multiple Large Language Models (LLMs) with a sophisticated Retrieval-Augmented Generation (RAG) system

article-extractor chatbot data-analysis data-visualization deepseek-chat deepseek-r1 llama3 llm pdf-document-processor rag streamlit-webapp summarizer vector-database

Last synced: 28 Oct 2025