An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with massive-datasets

A curated list of projects in awesome lists tagged with massive-datasets.

https://github.com/polardb/polardbx-sql

PolarDB-X is a cloud-native distributed SQL database designed for high-concurrency, massive-storage, and complex-query scenarios.

cloud-native distributed-transactions enterprise-class high-availability high-concurrency horizontal-scaling htap massive-datasets mysql relational-database

Last synced: 14 May 2025

https://github.com/polardb/polardbx

PolarDB-X is a cloud-native distributed SQL database designed for high-concurrency, massive-storage, and complex-query scenarios.

cloud-native distributed-transactions enterprise-class high-availability high-concurrency horizontal-scaling htap massive-datasets mysql relational-databases

Last synced: 06 Apr 2025

https://github.com/federicobruzzone/anti-money-laundering

The project is based on the analysis of the "IBM Transactions for Anti Money Laundering" dataset published on Kaggle. The task is to implement a model which predicts whether or not a transaction is illicit, using the attribute "Is Laundering" as a label to be predicted.

machine-learning machine-learning-algorithms massive-datasets pyspark

Last synced: 07 May 2025

https://github.com/federicobruzzone/algorithms-for-massive-datasets

This repository contains the LaTeX source that generates a PDF of comprehensive notes for the course "Algorithms for Massive Datasets".

algorithms deep-learning linkanalysis massive-datasets recommender-system unimi

Last synced: 22 Feb 2025

https://github.com/manuparra/hadoop-statistics

Calculate statistical measures (average, minimum, maximum, standard deviation) of one column in big-data datasets with this simple Hadoop application.

avg bigdata hadoop java massive-datasets max min standardeviation

Last synced: 18 Feb 2025
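
The kind of per-column statistics this entry describes can be computed in a single pass without holding the column in memory. The sketch below is not taken from the repository (which is a Java Hadoop application); it is a minimal Python illustration using Welford's online algorithm, and the names `ColumnStats` and `column_stats` are hypothetical.

```python
import math
from typing import Iterable, NamedTuple


class ColumnStats(NamedTuple):
    """Hypothetical result type for this sketch."""
    count: int
    minimum: float
    maximum: float
    mean: float
    stddev: float  # population standard deviation


def column_stats(values: Iterable[float]) -> ColumnStats:
    """Compute min, max, mean and standard deviation in one pass.

    Uses Welford's online update, so only a few running numbers are
    kept regardless of how many values the iterable yields.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    minimum = math.inf
    maximum = -math.inf
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
        minimum = min(minimum, x)
        maximum = max(maximum, x)
    if count == 0:
        raise ValueError("column_stats() requires at least one value")
    return ColumnStats(count, minimum, maximum, mean, math.sqrt(m2 / count))
```

Because the function accepts any iterable, it can be fed a generator that parses one column from a file line by line, for example `column_stats(float(row.split(",")[2]) for row in open("data.csv"))`, where the file name and column index are placeholders.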

https://github.com/sabaudian/amd_market_basket_analysis

Algorithms for Massive Datasets (AMD): market-basket analysis project

frequent-itemsets mapreduce market-basket-analysis massive-datasets pyspark python python-3 spark

Last synced: 21 Feb 2025

https://github.com/sj22032003/massive-data-streaming-nodejs

Stream, parse, manipulate, and transform extremely large data (for example 1 GB or 1 TB) in Node.js at peak performance, without blocking the process, overflowing memory, or creating a bottleneck. The results are also shown in a UI with the help of Web Streams.

advance-nodejs buffers massive-datasets node-js stream transform

Last synced: 24 Feb 2025
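
The idea behind this entry, processing a large input in bounded chunks rather than loading it whole, is language-agnostic. The sketch below is not the repository's Node.js code; it is a minimal Python illustration, and the names `read_lines` and `transform`, the 64 KiB default chunk size, and the uppercase transformation are all assumptions made for the example.

```python
import io
from typing import BinaryIO, Iterator


def read_lines(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield complete lines from a binary stream, one chunk at a time.

    Memory use is bounded by one chunk plus the longest single line.
    """
    pending = b""  # bytes of a line that is not finished yet
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last newline is complete; keep the rest.
        *lines, pending = pending.split(b"\n")
        yield from lines
    if pending:
        yield pending  # final line without a trailing newline


def transform(src: BinaryIO, dst: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Uppercase each line of src into dst; return the number of lines."""
    count = 0
    for line in read_lines(src, chunk_size):
        dst.write(line.upper() + b"\n")
        count += 1
    return count
```

With real files, `src` and `dst` would be opened in binary mode (`open(path, "rb")` and `open(path, "wb")`), so the same loop handles inputs far larger than available memory.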