An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/astrodynamic/retailanalitycs-in-postgresql

Develop a SQL script to create a database with tables, views, roles, and functions. Form personalized offers to increase average check, frequency of visits, and cross-selling.

bd csv data-analysis data-export data-input data-manipulation data-validation database-management functions git margin offers postgresql retail role-permission-management selling sql transaction tsv views

Last synced: 06 Apr 2026

https://github.com/asifdotexe/sentimentscoringmodel

This project focuses on performing sentiment analysis on Amazon reviews using natural language processing (NLP) techniques. It includes various steps, from data exploration and preprocessing to building and evaluating sentiment models.

data-analysis data-visualization natural-language-processing sentiment-analysis

Last synced: 28 Oct 2025

https://github.com/johnsell620/sentiment-analysis-goodreads-reviews

Document-level sentiment analysis of book reviews scraped from the Goodreads website. Technologies used include TensorFlow, Spark, HDFS, Sqoop, Scrapy, and D3.js.

data-analysis data-visualization recurrent-neural-networks web-scraping

Last synced: 30 Apr 2025

https://github.com/i10mm/gpt-arxiv-fetcher

Revolutionize your research with our GitHub repository, where GPT meets arXiv API for seamless access and analysis of the latest academic papers!

artificial data-analysis intelligence llm machine-learning

Last synced: 14 Jul 2025

https://github.com/ynikitenko/lena

Lena is an architectural framework for data analysis

analysis-framework analysis-pipeline data-analysis data-science

Last synced: 30 Apr 2025

https://github.com/quantumudit/movie-ratings-analysis

This project focuses on analyzing and finding correlations between the audience and critic ratings for some of the popular movies released between 2009 and 2011 using Python & Power BI.

data-analysis data-visualization jupyter-notebook power-bi python

Last synced: 19 Apr 2026

https://github.com/richiejp/jdp

Automatically collect and normalise data, then run algorithms on it.

automation-framework data-analysis suse-qa

Last synced: 02 Jan 2026

https://github.com/pepe-god/dataprophet

Extracts citizens' identity information from MySQL, creates a family network based on TC ID No., and exports it to CSV.

101m 109m adres data-analysis data-extraction database-connector family-tree genealogy gsm hsys identity mysql-database python-script pyton

Last synced: 13 Jul 2025

https://github.com/ficaan/data-analysis-with-python-2023_2024-mooc.fi

These are all the solutions for exercises from Data Analysis with Python 2023/2024, a course offered by the University of Helsinki, Finland.

data-analysis machine-learning mooc-fi programming python

Last synced: 08 Jun 2026

https://github.com/orkunaktas/sofascore-webscraping

⚽️I scraped the shot data of the Fenerbahçe - Adana Demirspor match from Sofascore⚽️

beautifulsoup data-analysis football-analytics football-data selenium webscraping

Last synced: 28 Oct 2025

https://github.com/negativenagesh/spam-ham_email_detection_machine_learning

This project focuses on classifying spam/ham emails using machine learning algorithms such as LGR, NB, RF, and DT. Based on the accuracy and precision scores, I chose logistic regression for the classification, and I used Streamlit for the frontend.

app data-analysis data-cleaning data-engineering data-science data-visualization data-visualizations jupyter-notebook logistic-regression machine-learning modeling naive-bayes-classifier nlp python

Last synced: 12 Apr 2025

https://github.com/saksham-joshi/sentiment_analyzer

Analyze the sentiment of a text stored in a string or file and understand the reason why your blogs and posts are not ranking up.

data-analysis data-analytics python sentiment-analyser sentiment-analysis sentiment-analysis-without-nltk

Last synced: 22 Aug 2025

https://github.com/thecoderpinar/earthquake_prediction_analysis_project

🌍 Welcome to the Earthquake Prediction Analysis Project! 🚀 This project aims to predict earthquake magnitudes using LSTM neural networks and analyze seismic data. Explore, analyze, and forecast earthquakes with ease! 📈🔮

analysis data-analysis data-science earthquake-prediction geocoding geology lstm lstm-neural-networks machine-learning matlab matlab-deep-learning open-source time-series visualization

Last synced: 16 Aug 2025

https://github.com/jimbrig/lossrunAnalyzer

R Package and Shiny App to Analyze Insurance Lossruns

actuarial data-analysis data-mining data-science insurance r record-linkage risk-management shiny

Last synced: 30 Jul 2025

https://github.com/franpog859/top-of-the-world

🌍🔝 Proof that your country is the top of the world using GeoTIFF images and a little bit of geometry. Data mining project

data data-analysis data-mining elevation geometry geotiff image-processing matplotlib nvector rasterio

Last synced: 16 Feb 2026

https://github.com/rikard-helgegren/leverage_analysis_tool

Analyst tool for portfolio construction. How can leveraged certificates be used to increase returns in a portfolio while keeping the risk as low as possible? Use the tool and find out.

cpp data-analysis investment kivy-framework python3

Last synced: 12 Apr 2025

https://github.com/yash22222/tata-data-visualisation-virtual-internship

Data Visualisation: Empowering Business with Effective Insights. Gain insights into leveraging data visualisations as a tool for making informed business decisions.

basics ceo charts cmo data-analysis data-interpretation data-science data-visualization graphs machine-learning mcq microsoft-excel microsoft-power-bi microsoft-word powerpoint-presentations python tableau tata tata-data-visualisation

Last synced: 22 Jul 2025

https://github.com/alexeyev/hse-spb-bigdata-python-fall2016

Materials for a course on programming and data analysis tools, taught at the St. Petersburg campus of the National Research University Higher School of Economics (HSE) in fall 2016.

course-materials data-analysis numpy pandas python scikit-learn sklearn

Last synced: 07 Apr 2026

https://github.com/csparpa/last.fm-stats

Exercise on Last.fm data aggregation

data-analysis exercise lastfm lastfm-api python

Last synced: 21 May 2026

https://github.com/coderjolly/player-market-value-prediction

There is intense transfer speculation surrounding all major player transfers today. An important part of negotiations is predicting the fair market price for a player. Therefore, we predict the market value of a player using the data provided in CSV format.

data-analysis data-visualization decision-tree-regression machine-learning xgboost-regression

Last synced: 22 Jun 2025

https://github.com/scottgriv/river-charts

🌊 📉 A Python, Django, Plotly, and Pandas web application that visualizes river data in real time, pulled using an API from the United States Geological Survey (USGS).

api charts data data-analysis data-visualization dataset django pandas plotly python usgs usgs-api visualization webapp

Last synced: 12 Aug 2025

https://github.com/srinivasrm/mutual-funds-analysis-and-prediction

In this project I analyzed and predicted 1-, 3-, and 5-year returns for 1064 mutual funds in India. I scraped the data from the most-visited website for mutual fund investments. I tested several regression models: linear model, SGD Regressor, Random Forest Regressor, Decision Tree Regressor, Ridge, MLP Regressor, and linear model (Lasso). I then selected the best-performing model, performed hyperparameter tuning, and deployed an interactive application that can generate the visualization and email it to the user's email address.

beautifulsoup data-analysis data-base data-cleaning data-science deployment etl finanace frontend funds machine-learning mutual mutual-funds pgsql python scikit-learn sql streamlit web webapplication

Last synced: 27 Oct 2025

https://github.com/gxjansen/user-analysis-with-r-google-analytics

Analyzing user behavior of an E-commerce website with R and (mainly) Google Analytics Data

analytics analytics-api conversion-rate-optimization data-analysis ecommerce google google-analytics r

Last synced: 27 Mar 2025

https://github.com/nicucalcea/raise

An R library that uses ChatGPT / GPT to generate data

chatgpt chatgpt-api chatgpt-app data-analysis gpt gpt-35-turbo openai openai-chatgpt parsing r

Last synced: 05 Mar 2025

https://github.com/rhenkin/visxhclust

A Shiny app and functions for visual exploration of hierarchical clustering.

clustering data-analysis data-science r r-package r-shiny rstats shiny-apps

Last synced: 02 Apr 2025

https://github.com/lisa-ho/three-investigators

Repository for scraping and analysing fan data on a German audio drama called 'Die Drei Fragezeichen' (the three investigators).

data-analysis data-viz datawrapper python webscraping

Last synced: 25 Oct 2025

https://github.com/mch-fauzy/data-science

Repository containing portfolio of data science and machine learning projects. Presented in the form of iPython Notebooks

data-analysis data-science data-visualization ipython-notebooks machine-learning natural-language-processing portfolio

Last synced: 24 Sep 2025

https://github.com/accurat/react-dataviz

⚛📊🚀 React components to build powerful interactive data visualizations

d3 data-analysis data-visualization react react-components

Last synced: 19 Jun 2025

https://github.com/baci-ak/b-vista

Interactive EDA tool to explore pandas DataFrames — via Python, notebooks & Docker

analytics data-analysis data-science data-visualization dataframe docker eda flask flexible ipython jupyter notebook pandas python react visualization

Last synced: 22 Jul 2025

https://github.com/artdgn/pages

Auto-updating dashboards about COVID-19 https://artdgn.github.io/pages

covid-19 data-analysis modeling

Last synced: 17 Jan 2026

https://github.com/juangesino/behaviouraleconomics

All the files and data for the experiment performed during the course Behavioural Economics @ University of Amsterdam

behavioral-economics behavioural-economics data-analysis economics game-theory statistics

Last synced: 28 Oct 2025

https://github.com/ac-gomes/data-engineering-with-databricks

A simple boilerplate for data engineering and data analysis training in Databricks.

data-analysis data-engineering databricks databricks-notebooks pyspark python unit-testing

Last synced: 30 Apr 2025

https://github.com/mirdan08/crafty

Data analysis project I've developed for the web scraping course.

blockchain data-analysis webscraping

Last synced: 30 Jul 2025

https://github.com/magnaopus1/synthron-cfd-trader-pro

SYNTHRON CFD Trader PRO is a cutting-edge trading platform featuring raw, custom-designed machine learning models. From reinforcement learning for dynamic strategies to predictive analytics, sentiment analysis, and optimization techniques, it empowers trading across stocks, forex, indices, commodities, futures, and crypto with precision.

ai backtesting cfd commodities data-analysis data-science data-structures forex futures indices machine-learning trading

Last synced: 30 Apr 2025

https://github.com/alhankeser/citibike-analysis

Extracting and Transforming Citi Bike Data for Analysis

citibike data-analysis data-science data-visualization etl sql

Last synced: 25 Jan 2026

https://github.com/mohammadkarbalaee/python-for-data-analysis-book

All the practice exercises and code I write while reading the book Python for Data Analysis.

data-analysis data-science python

Last synced: 27 Mar 2025

https://github.com/thecoderpinar/credit-card-fraud-detection-project

This project focuses on the detection of credit card fraud using various data science and machine learning techniques. The dataset includes a record of credit card transactions over a specific period, with the goal of accurately identifying fraudulent activities. 🚀✨

anamoly-detection classification-algorithms credit-card-transactions data-analysis data-preprocessing data-science data-visualization fraud-detection machine-learning python

Last synced: 30 Apr 2025

https://github.com/bdslab-upv/dashi

A flexible and powerful Python toolkit for dataset shift analysis and characterization, providing supervised and unsupervised evaluation of temporal and multi-source data shifts, visualization tools, and statistical insights for data integrity and model performance monitoring

data-analysis data-science dataset-shift python temporal-analysis

Last synced: 13 Dec 2025

https://github.com/storopoli/r_scripts

A couple of handy R scripts that I use on a daily basis for scientific research.

data-analysis data-science data-visualization r scientific

Last synced: 08 Jul 2025

https://github.com/nragland37/event-optimization-tool

R-based Shiny application that maps availability and identifies optimal engagement times to enhance participation within an organization

data-analysis data-cleaning data-preparation heatmap r shiny shiny-app tidyverse

Last synced: 02 Feb 2026

https://github.com/nikolas-virionis/polynomial-regression

Python package that analyses the given datasets and finds the best regression representation: either a polynomial of the smallest possible degree, to stay reliable without overfitting, or another model such as an exponential or a logarithm.

data-analysis exponential-regression flexibility logarithmic-regression logistic-regression polynomial-regression python sinusoisdal-regression statistics

Last synced: 06 Apr 2026

https://github.com/elkronos/anovatoolbox

This GitHub repository contains a collection of functions for performing various statistical analyses and generating visualizations. The functions are designed to work with different types of data and provide comprehensive outputs for data analysis.

anova anova-model data-analysis r statistics

Last synced: 17 Mar 2025

https://github.com/lafayettegabe/nlp-resume-extraction

📝 NER (Named Entity Recognition) project aimed at solving the problem of manually shortlisting resumes by automating the process. This project proposes using NLP techniques and an NER model to classify and extract relevant entities from resumes, such as person name, college name, academic information, relevant experience, skill set, etc.

big-data data data-analysis data-science eda ner nlp resume-extractor

Last synced: 03 Apr 2025

https://github.com/rikulauttia/ai-commercial-decisionmaking

AI-Driven Large Dataset Analysis & Commercial Decision-Making: Research on predictive analytics, machine learning strategies, and real-world business applications [Python, TensorFlow, PyTorch] 🤖📊

artificial-intelligence big-data business-intelligence business-strategy commercial-decision-making data-analysis data-science decision-making deep-learning machine-learning neural-networks predictive-analytics python research thesis

Last synced: 03 Mar 2026

https://github.com/valeriopagliarino/tcf-2021-unito-public

Exam project of the course "Computing Techniques for Physics" - Università degli Studi di Torino - Physics department - 2021

cern-root data-analysis geant4-simulation monte-carlo-simulation object-oriented-programming physics

Last synced: 27 Mar 2025

https://github.com/bradleyboehmke/uc-bana-6043

Additional resources for the UC BANA 6043 Statistical Computing course

data-analysis data-science data-visualization python

Last synced: 10 Jul 2025

https://github.com/priyanka7411/dataspark-electronics-retail-analytics

DataSpark is a data analysis project using Python, SQL, and Power BI to analyze global electronics retail sales, focusing on customer behavior, sales performance, product profitability, and store performance to optimize sales strategies.

analytics-providers business-intelligence customer-segmentation data data-analysis electronics-industry global-sales pandas powerbi powerbi-visuals product-profitability python retail-analytics sales-performance sql store-analysis visualization

Last synced: 10 Jul 2025

https://github.com/manikantasanjay/loan_repayment_regression_project

Prediction Of Loan Repayment using Sequential Neural Networks on Lending Club Dataset.

data-analysis data-visualization lending-club loan-repayment matplotlib numpy pandas-library seaborn tensorboard tensorflow

Last synced: 04 Apr 2025

https://github.com/super-lou/card

🎴 Card of Analyse and Diagnostic in R for a user-friendly experience of data aggregation with EXstat

aggregation climate-change climate-data climate-science data-analysis data-science diagnostic environment environment-variables hydrology hydrology-statistical inrae r statistics tools user-friendly

Last synced: 13 Apr 2025

https://github.com/abhiksark/udacity-dataanalyst-nanodegree

All the code I submitted during the Udacity Data Analyst Nanodegree course.

data-analysis data-visualization matplotlib pandas python seaborn statistics udacity udacity-nanodegree

Last synced: 07 Apr 2026

https://github.com/jpvt/data_science

Portfolio with my Data Science Projects.

data-analysis data-science deep-learning machine-learning portfolio xgboost

Last synced: 19 Jun 2025

https://github.com/varunbanka/data-insights

Data Insights is a user-friendly tool for analyzing large CSV files. Its advanced analytics helps uncover hidden patterns and trends, making it perfect for data scientists and analysts.

artificial-intelligence automation data-analysis data-science dataanalysis datahive numpy pandas python

Last synced: 22 Jun 2025

https://github.com/gabriel-dp/mineirando_github

Project to mine and analyze public GitHub data. Practical work of the Social Network Mining and Analysis subject at UFSJ

data-analysis data-mining github ufsj

Last synced: 07 Mar 2026

https://github.com/cjunwon/youtube-data-analysis

End-to-end YouTube data analysis project using the YouTube Data API, MySQL, AWS, and Flask

aws-rds data-analysis datapipeline flask nlp pandas python shell sql vader-sentiment-analysis youtube youtube-api

Last synced: 17 Feb 2026

https://github.com/benjamindpb/wikidata-preprocessing

Wikidata dump preprocessing & analysis of georeferenced entities

data-analysis preprocessing wikidata wikidata-dump

Last synced: 15 Jul 2025

https://github.com/andreantonacci/eu2019

Social Network Analysis of Twitter Topic-Network Structures during the 2019 European Elections

data-analysis election-analysis european-elections gephi latex master-thesis social-network-analysis

Last synced: 17 Jul 2025

https://github.com/romac/adaproject

🔬 Project proposal for the Applied Data Analysis course at EPFL

data-analysis

Last synced: 31 Dec 2025

https://github.com/DataHerb/dataherb-python

Python Package for DataHerb: create, search, and load datasets.

data data-analysis data-mining database dataset python

Last synced: 08 May 2025

https://github.com/omarelgabry/insights.py

A Python package for reading, storing, & analyzing data from Public Data APIs

data-analysis

Last synced: 14 Jul 2025

https://github.com/michaelnabil230/laravel-analytics

A Laravel package to retrieve pageviews and other data from Database

data-analysis data-structures database laravel php

Last synced: 25 Jan 2026

https://github.com/tanaylab/naryn

Native Access medical record Retriever for high Yield aNalytics

data-analysis medical-records

Last synced: 20 Jul 2025

https://github.com/louis-heraut/AEAG_toolbox

🛠️ R toolbox to provide a simple way of interacting with all the code necessary to carry out hydrological stationarity analysis for the Agence de l'Eau Adour-Garonne (AEAG)

climate climate-change climate-data data-analysis data-visualization environment hydrology inrae low-water mann-kendall mann-kendall-tests r stationarity-test statistics

Last synced: 30 Oct 2025

https://github.com/crafterkolyan/applied-statistical-data-analysis

Applied statistical data analysis course. Faculty of Computational Mathematics and Cybernetics (CMC), Moscow State University. Spring 2020

autoexec-scripts autotest data-analysis github-actions statistics university

Last synced: 17 Jun 2025

https://github.com/farfarfun/fundata

Data processing toolkit that provides data cleaning, transformation, and analysis functions

data-analysis data-processing farfarfun numpy pandas python

Last synced: 17 Feb 2026

https://github.com/yashika-malhotra/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction for a retail brand in South America, and provided insights and recommendations to grow its user base and improve its services

bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql

Last synced: 18 Apr 2026

https://github.com/cosmoduende/r-holy-books-sentiment-data-analysis

What's the most positive or negative religion? Sentiment and Data Analysis of Holy Books with R. Analysis of religious dogmas by exploring their Holy Books (The Bible, The Quran, The Dhammapada, and The Book of Mormon) with R

bible book-of-mormon data-analysis data-analytics data-visualisation data-visualization dataviz dhammapada holy-scriptures quran religions-studies religious religious-studies sentiment-analysis sentiment-polarity sentimental-analysis text-analysis text-analytics text-mining text-mining-analysis

Last synced: 07 Mar 2026

https://github.com/tideland/go-cells

Light-weight event-processing based on the idea of meshed cells with different pluggable behaviors

cep data-analysis data-stream event-processing events golang

Last synced: 05 Apr 2025

https://github.com/jaybird1291/anki-llm-review-stats-exporter

Export your Anki review history (revlog) as JSONL so you can analyze it with an LLM (ChatGPT, Claude, local models, etc.) without using any API.

anki anki-addon chatgpt data-analysis data-export jsonl llm review-stats revlog statistics

Last synced: 24 Dec 2025

https://github.com/zackakil/hot-shot-basketball-tracker

Mini web app for displaying basketball practice metrics.

basketball chartjs data-analysis html sport visualization

Last synced: 28 Jan 2026

https://github.com/louis-heraut/aeag_toolbox

🛠️ R toolbox to provide a simple way of interacting with all the code necessary to carry out hydrological stationarity analysis for the Agence de l'Eau Adour-Garonne (AEAG)

climate climate-change climate-data data-analysis data-visualization environment hydrology inrae low-water mann-kendall mann-kendall-tests r stationarity-test statistics

Last synced: 08 Apr 2026

https://github.com/quantumudit/analyzing-cleanaway-services

This project focuses on scraping all the service locations across Australia and their associated attributes from the "Cleanaway" website, performing necessary transformations on the scraped data, and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 25 Jan 2026

https://github.com/quantumudit/analyzing-suez-services

This project focuses on scraping all the service locations across Australia & New Zealand and their associated attributes from the "Suez" website, performing necessary transformations on the scraped data, and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 18 May 2026

https://github.com/forhadulislam/sna-project

A project for Social Network Analysis. Analyzed Yahoo query logs.

data-analysis data-mining social-network

Last synced: 13 May 2026

https://github.com/gher-uliege/seadatacloud

Tools and interfaces for working with the DIVA interpolation software.

data-analysis data-visualization interpolation nco netcdf ocean-sciences oceanography

Last synced: 30 Mar 2025

https://github.com/al-ghaly/airline-company-data-warehouse

Data Warehouse modeling, design, implementation, and analysis for an Airline Company.

data-analysis data-warehousing database-modeling sql-server

Last synced: 14 Apr 2025