Projects in Awesome Lists tagged with massive-datasets
A curated list of projects in awesome lists tagged with massive-datasets.
https://github.com/polardb/polardbx-sql
PolarDB-X is a cloud-native distributed SQL database designed for high-concurrency, massive-storage, and complex-query scenarios.
cloud-native distributed-transactions enterprise-class high-availability high-concurrency horizontal-scaling htap massive-datasets mysql relational-database
Last synced: 14 May 2025
https://github.com/helmholtz-analytics/heat
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
array-api data-analytics data-processing data-science distributed gpu hpc machine-learning massive-datasets mpi mpi4py multi-gpu multi-node-cluster numpy parallelism python pytorch tensors
Last synced: 15 May 2025
https://github.com/polardb/polardbx
PolarDB-X is a cloud-native distributed SQL database designed for high-concurrency, massive-storage, and complex-query scenarios.
cloud-native distributed-transactions enterprise-class high-availability high-concurrency horizontal-scaling htap massive-datasets mysql relational-databases
Last synced: 06 Apr 2025
https://github.com/federicobruzzone/anti-money-laundering
The project is based on the analysis of the "IBM Transactions for Anti Money Laundering" dataset published on Kaggle. The task is to implement a model which predicts whether or not a transaction is illicit, using the attribute "Is Laundering" as a label to be predicted.
machine-learning machine-learning-algorithms massive-datasets pyspark
Last synced: 07 May 2025
https://github.com/federicobruzzone/algorithms-for-massive-datasets
This repository contains a LaTeX file that generates a PDF of comprehensive notes for the course "Algorithms for Massive Datasets".
algorithms deep-learning linkanalysis massive-datasets recommender-system unimi
Last synced: 22 Feb 2025
https://github.com/manuparra/hadoop-statistics
Calculate statistical measures of one column in big-data datasets with this simple Hadoop application
avg bigdata hadoop java massive-datasets max min standardeviation
Last synced: 18 Feb 2025
https://github.com/sabaudian/amd_market_basket_analysis
Algorithms for Massive Datasets (AMD) -- Market-basket analysis project
frequent-itemsets mapreduce market-basket-analysis massive-datasets pyspark python python-3 spark
Last synced: 21 Feb 2025
https://github.com/sj22032003/massive-data-streaming-nodejs
Stream, parse, manipulate, and transform extremely large data (for example, 1 GB or 1 TB) in Node.js without blocking the process, overflowing memory, or creating bottlenecks, while keeping peak performance. The results are also shown in a UI with the help of web streams.
advance-nodejs buffers massive-datasets node-js stream transform
Last synced: 24 Feb 2025