An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/phanxuanquang/askdb

Interact with your relational databases using natural language, and even more

csharp dapper data-analysis database dotnet gemini genai generative-ai llm llms sql windows winui winui3

Last synced: 05 Apr 2025

https://github.com/ujjwalkarn/xda

R package for exploratory data analysis

data-analysis data-science exploratory-data-analysis r

Last synced: 26 Apr 2025

https://github.com/abhiamishra/ggshakeR

An analysis and visualization R package that works with publicly available soccer data

analysis data-analysis data-visualization football-analytics library machine-learning plotting r soccer soccer-analytics visualization

Last synced: 13 Nov 2024

https://github.com/bccp/nbodykit

Analysis kit for large-scale structure datasets, the massively parallel way

astrophysics clustering cosmology data-analysis large-scale-structure mpi mpi4py parallel-computing python

Last synced: 29 Nov 2024

https://github.com/Nesvilab/philosopher

PeptideProphet, PTMProphet, ProteinProphet, iProphet, Abacus, and FDR filtering

bioinformatics data-analysis go mass-spectrometry ms-data proteomics

Last synced: 19 Apr 2025

https://github.com/imsanjoykb/data-science-regular-bootcamp

Regular practice on Data Science, Machine Learning, and Deep Learning, solving ML project problems and analytical issues, and regularly boosting my knowledge. The goal is to help learners with learning resources in the Data Science field.

artificial-intelligence data-analysis data-science data-science-notebook data-science-projects data-visualization database-connection deep-learning etl-pipeline etl-process feature-engineering machine-learning mysql-database neural-network numpy pandas postgresql python python-automation sqlite

Last synced: 14 Feb 2025

https://github.com/innat/ML-Resource

A concise resource repository for machine learning

data-analysis data-science deep-learning kaggle machine-learning python spark

Last synced: 29 Apr 2025

https://github.com/deanmarchiori/analysis-flow

Data Analysis Workflows & Reproducibility Learning Resources

data-analysis reproducibility reproducible-data-science reproducible-science tooling workflow

Last synced: 04 Dec 2024

https://github.com/basedosdados/analises

📊 Repository of simple, replicable code for the published analyses.

data-analysis data-visualization open-source

Last synced: 05 Apr 2025

https://github.com/spectrochempy/spectrochempy

SpectroChemPy is a framework for processing, analyzing and modeling spectroscopic data for chemistry with Python

chemistry data-analysis datasets ftir ftir-data-analysis infrared nmr nmr-data nmr-spectroscopy processing python raman raman-spectra raman-spectroscopy spectroscopy uv-vis

Last synced: 06 Apr 2025

https://github.com/aershov24/machine-learning-ds-interview-questions

🔴 1704 Machine Learning, Data Science & Python Interview Questions (ANSWERED) To Kill Your Next ML & DS Interview. Get All Answers + PDFs on MLStack.Cafe. Post your ML Jobs 👉

algorithms-and-data-structures data-analysis data-science interview-practice interview-preparation interview-questions machine-learning machine-learning-algorithms machinelearning

Last synced: 12 Mar 2025

https://github.com/sanmeet007/logger

Logger is a Flutter-based Android app that enables you to view and export call logs in CSV or JSON format and perform lightweight on-device analysis.

andriod call-data-record-analysis call-logs csv-export csv-import data-analysis flutter json-export open-source

Last synced: 06 Apr 2025

https://github.com/dcwuser/metanumerics

Meta.Numerics is a library for advanced numerical computing on the .NET platform. It offers an object-oriented API for statistical analysis, advanced functions, Fourier transforms, numerical integration and optimization, and matrix algebra.

csharp-library data-analysis dotnet math math-library matrix matrix-algebra matrix-factorization matrix-library matrix-multiplication numerical-analysis numerical-integration numerical-optimization numerics optimization scientific-computing special-functions statistical-analysis statistical-tests statistics

Last synced: 09 Apr 2025

https://github.com/tidypyverse/tidypandas

A grammar of data manipulation for pandas inspired by tidyverse

data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse

Last synced: 11 Apr 2025

https://github.com/talegari/tidypandas

A grammar of data manipulation for pandas inspired by tidyverse

data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse

Last synced: 03 Mar 2025

https://github.com/sciruby/daru-view

daru-view is for easy and interactive plotting in web applications & IRuby notebooks. daru-view is a plugin gem for the existing daru gem.

charts daru daru-view data-analysis data-visualization graphs iruby-notebook nanoc plot-library rails ruby sinatra

Last synced: 24 Jan 2025

https://github.com/SciRuby/daru-view

daru-view is for easy and interactive plotting in web applications & IRuby notebooks. daru-view is a plugin gem for the existing daru gem.

charts daru daru-view data-analysis data-visualization graphs iruby-notebook nanoc plot-library rails ruby sinatra

Last synced: 27 Mar 2025

https://github.com/coorsaa/shinymlr

shiny-mlr: Integration of the mlr package into shiny

data-analysis data-visualization machine-learning mlr r r-package shiny shiny-apps

Last synced: 16 Mar 2025

https://github.com/Coorsaa/shinyMlr

shiny-mlr: Integration of the mlr package into shiny

data-analysis data-visualization machine-learning mlr r r-package shiny shiny-apps

Last synced: 04 Dec 2024

https://github.com/pragunbhutani/dbt-llm-agent

LLM based AI Agent to automate Data Analysis for dbt projects

agent agentic-ai ai ai-data-analysis data-analysis data-analyst dbt llm

Last synced: 05 May 2025

https://github.com/synthesized-io/fairlens

Identify bias and measure fairness of your data

bias data data-analysis data-science fairness pandas python statistics

Last synced: 15 Nov 2024

https://github.com/firmai/business-analytics-and-mathematics-python-book

Advanced Business Analytics and Mathematics with Python (by @firmai)

analytics business data-analysis data-science mathematics python

Last synced: 19 Nov 2024

https://github.com/stanfordnlp/edu-convokit

Edu-ConvoKit: An Open-Source Framework for Education Conversation Data

data data-analysis data-science education language natural-language-processing

Last synced: 15 Apr 2025

https://github.com/jadianes/data-journalism

Data journalism and easy to replicate notebooks using Python, R, and Web visualisations

data-analysis data-journalism data-visualisation data-visualization exploratory-data-analysis notebook

Last synced: 23 Feb 2025

https://github.com/jepegit/cellpy

extract and tweak data from electrochemical tests of cells

battery chemistry data-analysis electrochemistry opensource physics

Last synced: 14 Nov 2024

https://github.com/leerob/facebook-data-analyzer

📊Python script to analyze the contents of your Facebook data export

beautifulsoup data-analysis facebook python

Last synced: 30 Apr 2025

https://github.com/easonlai/azure_openai_langchain_sample

This repository contains various examples of how to use LangChain to interact in natural language with a large language model (LLM) from Azure OpenAI Service.

azure-openai azure-openai-api azure-openai-service csv data-analysis langchain langchain-python openai python python3

Last synced: 26 Apr 2025

https://github.com/woz-u/DS-Student-Resources

Data Science Student Companion Notebooks and Data Lake

data-analysis data-science data-visualization machine-learning nosql python r sql statistics

Last synced: 27 Nov 2024

https://github.com/gher-uliege/divand.jl

DIVAnd performs an n-dimensional variational analysis of arbitrarily located observations

data-analysis earth-observation eosc-hub interpolation julia ocean-sciences oceanography smoothing-splines spatial-data-analysis toolbox

Last synced: 05 Apr 2025

https://github.com/ndleah/8-week-sql-challenge

#8WeekSQLChallenge by Danny Ma.

data-analysis data-science sql

Last synced: 02 Mar 2025

https://github.com/kianweelee/Edator

A python package that performs exploratory data analysis for users. Additionally, it generates 3 types of output files (cleaned CSV, plots and a text report).

data-analysis data-science exploratory-data-analysis

Last synced: 15 Nov 2024

https://github.com/capitalone/dataCompareR

dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.

compare-data data data-analysis data-science r

Last synced: 04 Dec 2024

https://github.com/thoughtspile/hippotable

👩🏻‍🔬📊 Lightweight data analysis in your browser

csv dashboard data-analysis data-science javascript table visualization

Last synced: 18 Dec 2024

https://github.com/impetus/jumbune

Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,

aiops apm cluster-monitoring data-analysis data-quality developer-tools devops-tools hadoop hadoop-cluster hadoop-monitor hadoop-monitoring monitoring-tool optimization-framework yarn yarn-hadoop-cluster

Last synced: 10 Apr 2025

https://github.com/yusufcinarci/data-science-projects

This repo contains beginner-to-upper-level projects in the field of data science. I will host the projects I complete in this field here every day, in the hope that they will be useful to those who, like me, are interested in data science and will just start...

data-analysis data-science data-science-projects jupyter jupyter-notebook python

Last synced: 10 Apr 2025

https://github.com/paezha/spatial-analysis-r

Open Educational Resource for teaching spatial data analysis and statistics with R

data-analysis open-educational-resource r r-package r-spatial rstats spatial-data-analysis spatial-statistics statistics

Last synced: 09 Apr 2025

https://github.com/njanakiev/folderstats

Python module that collects detailed statistics from a folder structure

data-analysis filesystem pandas python statistics

Last synced: 19 Dec 2024

https://github.com/pixelspark/warp

Convert and analyze large data sets at light speed, on Mac and iOS.

big-data data-analysis mysql postgresql rethinkdb sqlite

Last synced: 19 Nov 2024

https://github.com/apachecn/pandas-cookbook-code-notes

:book: Pandas Cookbook source code with annotations

code data-analysis notes pandas python

Last synced: 02 May 2025

https://github.com/alanderex/pydata-pandas-workshop

Material for my PyData Jupyter & Pandas Workshops, I'm also available for personal in-house trainings on request

data-analysis jupyter-notebook pandas visualisation workshop

Last synced: 14 Apr 2025

https://github.com/AllenInstitute/openscope_databook

OpenScope databook: a collaborative, versioned, data-centric collection of foundational analyses for reproducible systems neuroscience 🐁🧠🔬🖥️📈

dandi-archive data-analysis data-visualization nwb python reproducible-research visualization

Last synced: 01 May 2025

https://github.com/randyzwitch/streamlit-embedcode

Streamlit component for embedding code snippets such as GitHub gists, CodePen snippets, Gitlab snippets, etc.

data-analysis data-science data-visualization python streamlit streamlit-component

Last synced: 10 Feb 2025

https://github.com/renumics/sliceguard

A library for detecting problematic data segments in structured and unstructured data with few lines of code.

data-analysis data-cleaning data-curation data-exploration data-science data-visualization deep-learning eda exploratory-data-analysis machine-learning python visualization

Last synced: 16 Mar 2025

https://github.com/rickiepark/hg-da

Code repository for the book <혼자 공부하는 데이터 분석 with 파이썬> (Self-Study Data Analysis with Python)

data-analysis data-science data-visualization machine-learning matplotlib numpy pandas scikit-learn scipy

Last synced: 06 Apr 2025

https://github.com/tatevkaren/tatevkaren-data-science-portfolio

Data Science Portfolio of Tatev Karen Aslanyan including Case Studies and Research Projects that I have completed that solve business problems or introduce new products. Case Study papers, codes, and additional resources are all included.

blog case-study computer-science data-analysis data-science deep-learning econometrics machine-learning papers portfolio portfolio-website statistics

Last synced: 10 Apr 2025

https://github.com/b0o/apple-autofill-domains

Apple's allowed autofill domains

apple data-analysis github-actions web-scraping

Last synced: 25 Mar 2025

https://github.com/dask-contrib/dask-awkward

Native Dask collection for awkward arrays, and the library to use it.

columnar-format dask data-analysis data-science data-structure jagged-array python ragged-array

Last synced: 12 Apr 2025

https://github.com/cvjena/libmaxdiv

Implementation of the Maximally Divergent Intervals algorithm for Anomaly Detection in multivariate spatio-temporal time-series.

anomalydetection anomalydiscovery data-analysis data-mining datamining machine-learning machine-learning-library machinelearning time-series timeseries

Last synced: 04 Apr 2025

https://github.com/404notf0und/FXY

Security-Scenes-Feature-Engineering-Toolkit, Continuous Integration. A tool for featurizing security data.

data-analysis data-mining feature-engineering machine-learning security security-scenes

Last synced: 21 Nov 2024

https://github.com/staircase-dev/staircase

A powerful data analysis package based on mathematical step functions. Strongly aligned with pandas.

analysis data-analysis data-structures library numpy pandas python step-function stepfunction

Last synced: 04 Apr 2025

https://github.com/airoldilab/sgd

An R package for large scale estimation with stochastic gradient descent

big-data data-analysis gradient-descent r statistics

Last synced: 10 Apr 2025

https://github.com/404notf0und/fxy

Security-Scenes-Feature-Engineering-Toolkit, Continuous Integration. A tool for featurizing security data.

data-analysis data-mining feature-engineering machine-learning security security-scenes

Last synced: 12 Apr 2025

https://github.com/jmwoloso/pychattr

Python Channel Attribution (pychattr) - A Python implementation of the excellent R ChannelAttribution library

channel-attribution data-analysis data-science machine-learning python python-channel-attribution rpy2 wrapper

Last synced: 13 Nov 2024

https://github.com/antononcube/mathematicavsr

Example projects, code, and documents for comparing Mathematica with R.

comparison data-analysis data-science machine-learning mathematica r time-series

Last synced: 09 Apr 2025

https://github.com/DistrictDataLabs/cultivar

Multidimensional data explorer and visualization tool.

data-analysis data-exploration data-management visualization

Last synced: 15 Apr 2025

https://github.com/VUKOZ-OEL/3d-forest

Visualization, processing and analysis of Lidar point clouds, mainly focused on forest environment. New version of 3D Forest. Process files with terabytes of data. Edit new point attributes. Simple addition of new features by plugins.

3d classification cpp cross-platform data-analysis desktop-application editor forest gui interactive-visualization las laser-scanning lidar opengl plugins point-cloud qt scientific-computing segmentation tree

Last synced: 14 Nov 2024