https://github.com/labbeti/torchoutil


# torchoutil


Collection of functions and modules to help development in PyTorch.

## Installation
```bash
pip install torchoutil
```

The main requirement is **[PyTorch](https://pytorch.org/)**.

To check if the package is installed and show the package version, you can use the following command:
```bash
torchoutil-info
```
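
The same check can be done from Python. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and is not part of `torchoutil`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string, or None when the package is not installed.
print(installed_version("torchoutil"))
```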

## Examples

`torchoutil` functions and modules can be used like their `torch` counterparts. The conventional import alias for `torchoutil` is `to`.

### Label conversions
`torchoutil` supports **multiclass** label conversions between probabilities, class indices, class names, and one-hot encodings.

```python
import torchoutil as to

probs = to.as_tensor([[0.9, 0.1], [0.4, 0.6]])
names = to.probs_to_name(probs, idx_to_name={0: "Cat", 1: "Dog"})
# ["Cat", "Dog"]
```
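
For readers who want to see what such a conversion involves, here is a minimal sketch in plain PyTorch. It assumes the predicted class is simply the most probable one per row; it is an illustration of the idea, not `torchoutil`'s implementation.

```python
import torch

idx_to_name = {0: "Cat", 1: "Dog"}
probs = torch.as_tensor([[0.9, 0.1], [0.4, 0.6]])

# Most probable class index for each row.
indices = probs.argmax(dim=-1)
# Map each index to its class name.
names = [idx_to_name[i] for i in indices.tolist()]
# One-hot encoding of the same predictions.
onehot = torch.nn.functional.one_hot(indices, num_classes=len(idx_to_name))

print(names)  # ['Cat', 'Dog']
```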

This package also supports **multilabel** label conversions between probabilities, lists of class indices, lists of class names, and multi-hot encodings.

```python
import torchoutil as to

multihot = to.as_tensor([[1, 0, 0], [0, 1, 1], [0, 0, 0]])
indices = to.multihot_to_indices(multihot)
# [[0], [1, 2], []]
```
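
The multi-hot conversion above can be reproduced in plain PyTorch, which may help clarify the expected formats. This is only a sketch of the idea under the assumption that any non-zero entry marks an active class; the variable names are illustrative.

```python
import torch

multihot = torch.as_tensor([[1, 0, 0], [0, 1, 1], [0, 0, 0]])

# For each row, collect the positions of the non-zero entries.
indices = [torch.nonzero(row, as_tuple=True)[0].tolist() for row in multihot]

# Reverse direction: rebuild the multi-hot matrix from the index lists.
num_classes = multihot.shape[1]
rebuilt = torch.zeros(len(indices), num_classes, dtype=torch.long)
for i, row_indices in enumerate(indices):
    rebuilt[i, torch.as_tensor(row_indices, dtype=torch.long)] = 1

print(indices)  # [[0], [1, 2], []]
```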

### Typing

```python
import torch
from torchoutil import Tensor2D

x1 = torch.as_tensor([1, 2])
print(isinstance(x1, Tensor2D)) # False
x2 = torch.as_tensor([[1, 2], [3, 4]])
print(isinstance(x2, Tensor2D)) # True
```

```python
import torch
from torchoutil import SignedIntegerTensor

x1 = torch.as_tensor([1, 2], dtype=torch.int)
print(isinstance(x1, SignedIntegerTensor)) # True

x2 = torch.as_tensor([1, 2], dtype=torch.long)
print(isinstance(x2, SignedIntegerTensor)) # True

x3 = torch.as_tensor([1, 2], dtype=torch.float)
print(isinstance(x3, SignedIntegerTensor)) # False
```
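
The README does not say how these `isinstance` checks are implemented. One common Python technique for this kind of check is a metaclass that overrides `__instancecheck__`. The sketch below shows that technique with a hypothetical `Matrix` type; it makes no claim about `torchoutil`'s internals.

```python
import torch


class _NDimMeta(type):
    """Metaclass that makes isinstance() check the number of dimensions."""

    def __instancecheck__(cls, instance: object) -> bool:
        return isinstance(instance, torch.Tensor) and instance.ndim == cls.ndim


class Matrix(metaclass=_NDimMeta):
    """Hypothetical marker type matching any 2-dimensional tensor."""

    ndim = 2


print(isinstance(torch.zeros(3), Matrix))     # False
print(isinstance(torch.zeros(3, 4), Matrix))  # True
```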

### Padding

```python
import torch
import torchoutil as to

x1 = torch.rand(10, 3, 1)
x2 = to.pad_dim(x1, target_length=5, dim=1, pad_value=-1)
# x2 has shape (10, 5, 1)
```
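
For comparison, the same result can be obtained with `torch.nn.functional.pad`, which requires spelling out the padding for every trailing dimension. The sketch assumes the padding is appended at the end of the dimension.

```python
import torch
import torch.nn.functional as F

x = torch.rand(10, 3, 1)
target_length = 5
missing = target_length - x.shape[1]

# F.pad takes (left, right) pairs starting from the LAST dimension:
# (0, 0) leaves dim 2 untouched, (0, missing) appends to the end of dim 1.
padded = F.pad(x, (0, 0, 0, missing), mode="constant", value=-1.0)

print(padded.shape)  # torch.Size([10, 5, 1])
```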

```python
import torch
import torchoutil as to

tensors = [torch.rand(10, 2), torch.rand(5, 3), torch.rand(0, 5)]
padded = to.pad_and_stack_rec(tensors, pad_value=0)
# padded has shape (3, 10, 5): each tensor is padded to (10, 5), then all are stacked
```

### Masking

```python
import torchoutil as to

x = to.as_tensor([3, 1, 2])
mask = to.lengths_to_non_pad_mask(x, max_len=4)
# Row i contains x[i] True values, marking the non-padding positions
# tensor([[True, True, True, False],
#         [True, False, False, False],
#         [True, True, False, False]])
```
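
A non-padding mask of this kind comes down to comparing each position index with the sequence length. Here is a minimal plain-PyTorch sketch of that comparison, given as an illustration rather than as the library's code.

```python
import torch

lengths = torch.as_tensor([3, 1, 2])
max_len = 4

# Position j of row i is True when j < lengths[i].
positions = torch.arange(max_len)
mask = positions.unsqueeze(0) < lengths.unsqueeze(1)

print(mask.shape)  # torch.Size([3, 4])
```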

```python
import torchoutil as to

x = to.as_tensor([1, 2, 3, 4])
mask = to.as_tensor([True, True, False, False])
result = to.masked_mean(x, mask)
# result contains the mean of the values marked as True: 1.5
```
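
The arithmetic behind a masked mean is short enough to write out. The sketch below uses float inputs and plain PyTorch operations, and assumes at least one mask entry is `True`.

```python
import torch

x = torch.as_tensor([1.0, 2.0, 3.0, 4.0])
mask = torch.as_tensor([True, True, False, False])

# Zero out the masked-off values, then divide by the number of kept values.
result = (x * mask).sum() / mask.sum()

print(result)  # tensor(1.5000)
```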

### Other tensor manipulations

```python
import torchoutil as to

x = to.as_tensor([1, 2, 3, 4])
result = to.insert_at_indices(x, indices=[0, 2], values=5)
# result contains tensor with inserted values: tensor([5, 1, 2, 5, 3, 4])
```

```python
import torchoutil as to

perm = to.randperm(10)
inv_perm = to.get_inverse_perm(perm)

x1 = to.rand(10)
x2 = x1[perm]
x3 = x2[inv_perm]
# inv_perm holds the indices that undo perm: indexing x2 with it recovers x1, so x3 == x1
```
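
One way to compute an inverse permutation in plain PyTorch is `torch.argsort`, since sorting a permutation returns the position of each value. This sketch shows the property being relied on; it is not necessarily how `get_inverse_perm` works internally.

```python
import torch

perm = torch.randperm(10)

# argsort of a permutation yields its inverse: inv_perm[perm[i]] == i.
inv_perm = torch.argsort(perm)

x1 = torch.rand(10)
x2 = x1[perm]      # shuffled
x3 = x2[inv_perm]  # original order restored
```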

### Pre-compute datasets to pickle or HDF files

The following example pre-computes spectrograms for the torchaudio `SPEECHCOMMANDS` dataset using the `pack_dataset` function:

```python
from torchaudio.datasets import SPEECHCOMMANDS
from torchaudio.transforms import Spectrogram
from torchoutil import nn
from torchoutil.utils.pack import pack_dataset

speech_commands_root = "path/to/speech_commands"
packed_root = "path/to/packed_dataset"

dataset = SPEECHCOMMANDS(speech_commands_root, download=True, subset="validation")
# dataset[0] is a tuple containing the waveform and other metadata

class MyTransform(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.spectrogram_extractor = Spectrogram()

    def forward(self, item):
        waveform = item[0]
        spectrogram = self.spectrogram_extractor(waveform)
        return (spectrogram,) + item[1:]

pack_dataset(dataset, packed_root, MyTransform())
```

Then you can load the pre-computed dataset using `PackedDataset`:
```python
from torchoutil.utils.pack import PackedDataset

packed_root = "path/to/packed_dataset"
packed_dataset = PackedDataset(packed_root)
packed_dataset[0] # == first transformed item, i.e. transform(dataset[0])
```
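
The general idea of packing (run an expensive transform once, store the results, read them back later) can be sketched with the standard library alone. `pack_items` and `load_item` are hypothetical helpers written for this illustration; they do not reflect how `pack_dataset` or `PackedDataset` lay out files on disk.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable, Sequence


def pack_items(dataset: Sequence[Any], root: Path, transform: Callable[[Any], Any]) -> int:
    """Apply `transform` once to every item and save each result as a pickle file."""
    root.mkdir(parents=True, exist_ok=True)
    for i, item in enumerate(dataset):
        with open(root / f"{i}.pkl", "wb") as file:
            pickle.dump(transform(item), file)
    return len(dataset)


def load_item(root: Path, index: int) -> Any:
    """Read back one pre-computed item without re-running the transform."""
    with open(root / f"{index}.pkl", "rb") as file:
        return pickle.load(file)


# Toy dataset of (value, label) tuples; the transform squares the value.
toy_dataset = [(1, "a"), (2, "b"), (3, "c")]
with tempfile.TemporaryDirectory() as tmp:
    toy_root = Path(tmp) / "packed"
    num_packed = pack_items(toy_dataset, toy_root, lambda item: (item[0] ** 2,) + item[1:])
    first = load_item(toy_root, 0)
    last = load_item(toy_root, 2)
```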

## Extra requirements
`torchoutil` also provides additional modules when certain packages are already installed in your environment.
All extras can be installed with `pip install torchoutil[extras]`.

- If `tensorboard` is installed, the function `load_event_file` can be used. It manually loads all the data contained in a TensorBoard event file.
- If `numpy` is installed, the classes `NumpyToTensor` and `ToNumpy` and their related functions can be used. They are meant for composing dynamic transforms in a `Sequential` module.
- If `h5py` is installed, the function `pack_to_hdf` and the class `HDFDataset` can be used to pack datasets to HDF files and read them back. Variable-length sequences of data are supported.
- If `pyyaml` is installed, the functions `to_yaml` and `load_yaml` can be used.
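
To see which of the optional packages listed above are present in an environment, a small standard-library check is enough. The helper `is_available` is a hypothetical name used for this sketch and is not a `torchoutil` function.

```python
import importlib.util


def is_available(module_name: str) -> bool:
    """Return True if `module_name` can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


# Import names of the optional packages.
# Note that the `pyyaml` distribution is imported as `yaml`.
for name in ("tensorboard", "numpy", "h5py", "yaml"):
    print(f"{name}: {'available' if is_available(name) else 'missing'}")
```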

## Contact
Maintainer:
- [Étienne Labbé](https://labbeti.github.io/) "Labbeti": [email protected]