https://github.com/vbharadwaj-bk/fast_tensor_leverage

Code for the Neurips 2023 Paper "Fast Exact Leverage Score Sampling from Khatri-Rao Products"
https://github.com/vbharadwaj-bk/fast_tensor_leverage

khatri-rao leverage-score randomized-algorithm sparse-tensors

Last synced: 5 months ago
JSON representation

Code for the Neurips 2023 Paper "Fast Exact Leverage Score Sampling from Khatri-Rao Products"

Host: GitHub
URL: https://github.com/vbharadwaj-bk/fast_tensor_leverage
Owner: vbharadwaj-bk
Created: 2022-10-08T21:39:29.000Z (almost 4 years ago)
Default Branch: main
Last Pushed: 2024-10-23T21:21:49.000Z (over 1 year ago)
Last Synced: 2025-04-12T23:54:45.037Z (over 1 year ago)
Topics: khatri-rao, leverage-score, randomized-algorithm, sparse-tensors
Language: Jupyter Notebook
Homepage: https://arxiv.org/pdf/2301.12584.pdf
Size: 249 MB
Stars: 5
Watchers: 3
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Fast Exact Leverage Score Sampling from Khatri-Rao Products with Applications to Tensor Decomposition
This repository contains the code for
the paper [Fast Exact Leverage Score Sampling
from Khatri-Rao Products](https://arxiv.org/pdf/2301.12584),
to appear at Neurips 2023 (main conference, poster).

## What can I do with it?
You can draw samples from the Khatri-Rao product of matrices
with tens of millions of rows each, provided the rank is at most
100-200 (and you have enough memory to store the inputs). These
samples are drawn according to the exact distribution of leverage
scores on the rows of the Khatri-Rao product, which have attractive
properties for randomized linear least-squares.

On LBNL Perlmutter CPU nodes, our code can draw 50,000 samples from
a Khatri-Rao product with $\approx 10^{21}$ total rows and 32 columns
in under 0.33 seconds.

Here's a smaller test case that can run on a laptop:
```python
import numpy as np
import cppimport.import_hook
from cpp_ext.als_module import Sampler

if __name__=='__main__':
# Take J=10,000 samples from the KRP of N=4 matrices,
# each with dimensions I x R, where I = 1000 and R = 8.
I, N, J, R = 1000, 4, 10000, 8
matrices = [np.random.normal(size=(I, R)) for i in range(N)]

sampler = Sampler(matrices, J, R, "efficient")
samples = np.zeros((N, J), dtype=np.uint64)
sampler.KRPDrawSamples(N+1, samples)
```

## How do I reproduce the figures in the paper from the data?
The raw data is stored in `json` format in the folder
`submission_data`. These files were output directly by the
scripts in the `benchmarks` folder. You can reproduce the
figures using the material in the `plotting` folder. From
within the folder, run

```shell
[zsh]> python generate_images.py
```
which will output all figures in PDF format to
`plotting/paper_images`. This script will also print tables
used in the work to `STDOUT`. Alternatively,
use the Jupyter file `generate_images.ipynb` to generate
the plots inside a notebook.

If you would like to generate the data yourself, use the
scripts in the `benchmarks` folder. The accuracy and runtime
benchmarks are easy to reproduce, but you may need a machine
with several hundred gigabytes of RAM to reproduce
experiments involving the largest sparse tensors
(on our benchmark platform, each LBNL Perlmutter
node has 512GB of RAM).

## Details
This repository contains two implementations of the data
structure in the paper. The first, available in the
folder `reference_implementation`, is written entirely in
Python with Numpy. It is slow, but the structure of the
code matches the pseudocode in our paper closely. You
can use it to verify the correctness of our algorithm
or pick apart how the data structure works. You only need
Python, Numpy, and Matplotlib to test the reference
implementation - see the README inside the folder for
more instructions.

The second implementation is a fast version written
in C++, with Pybind11 Python bindings
and compatibility with Numpy for easy
testing. All benchmarks were conducted
with the fast version, and all instructions detail
how to build the fast version.

This repository contains a copy of `json.hpp`
from Niels Lohmann's repository
. It is published
under an MIT license.

## Requirements
You need GCC 11.2+, OpenMP, and an install of the BLAS
and LAPACK. This code has been tested with OpenBLAS, and
we make no guarantees for Intel MKL or other BLAS
implementations. If you decide to use MKL, either a)
try using the Intel C++ compiler instead of GCC or b) if you use GCC,
link MKL in sequential mode. We have observed segfaults
when OpenMP threading is enabled in conjunction with
MKL (when compiled with GCC).

We strongly recommended that you install Intel
Thread Building Blocks (TBB), but this is not
required. We rely on Intel TBB for fast parallel
sorting during sparse tensor decomposition.

Our C++ code could stand alone, but right now, you
need Python as well to run it.

## Building our code

### Step 0: Clone the Repository
Clone the repository and `cd` into it.
```shell
[zsh]> git clone https://github.com/vbharadwaj-bk/fast_tensor_leverage.git
[zsh]> cd fast_tensor_leverage
```

### Step 1: Install Python packages
Install Python dependencies with the following command:
```shell
[zsh]> pip install -r requirements.txt
```
We rely on the Pybind11 and cppimport packages. We
use the HDF5 format to store sparse tensors, so
you need the h5py package if you want to perform
sparse tensor decomposition.

### Step 2: Configure the compile and runtime environments
Within the repository, run the following command:
```shell
[zsh]> python configure.py
```
This will create two files in the repository root:
`config.json` and `env.sh`. Edit the configuration
JSON file with include / link flags for your LAPACK
install. If you have TBB installed (strongly
recommended for good performance), fill in the appropriate
entries. If you do not have TBB installed (or if you
are using a compiler that automatically links the BLAS),
set those JSON entries to empty lists.

The file `env.sh` sets up the runtime environment,
and must be called every time you start a new shell
to run our code. First, set the variables CC
and CXX to your C and C++ compilers. The C++ extension
module is compiled with
these when it is imported by Python at runtime.

### Step 3: Test the code
You're ready to test! The C++ extension
compiles automatically the first time you run
the code, and is not compiled subsequently. Run
the following code:
```shell
[zsh]> python compare_distributions.py
```
If all goes well, you should see a new graph in
the `plotting` folder called
`distribution_comparison_test_generated.png`
(ignore the file `distribution_comparison.png`, which
is the control image for the README).
The graph should look like this:
![Distribution Comparison](plotting/distribution_comparison.png)

This image compares the true probability
distribution of leverage scores of the
Khatri-Rao product
$U_1 \odot U_2 \odot U_3$ (all
chosen as $8 \times 8$ matrices to make
the graph readable) to a
histogram of samples taken by our sampler.
There is excellent agreement between the
distribution that our sampler draws from
and the true leverage score distribution.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/vbharadwaj-bk/fast_tensor_leverage

Awesome Lists containing this project

README