https://github.com/labbeti/torchoutil
Collection of functions and modules to help development in PyTorch.
deep-learning pytorch utilities
- Host: GitHub
- URL: https://github.com/labbeti/torchoutil
- Owner: Labbeti
- License: mit
- Created: 2024-01-24T14:28:23.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-29T13:15:50.000Z (9 months ago)
- Last Synced: 2024-10-29T15:56:44.266Z (9 months ago)
- Topics: deep-learning, pytorch, utilities
- Language: Python
- Homepage: https://pypi.org/project/torchoutil/
- Size: 854 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Citation: CITATION.cff
# torchoutil
Collection of functions and modules to help development in PyTorch.
## Installation
```bash
pip install torchoutil
```

The main requirement is **[PyTorch](https://pytorch.org/)**.
To check if the package is installed and show the package version, you can use the following command:
```bash
torchoutil-info
```

## Examples
`torchoutil` functions and modules can be used like their `torch` counterparts. The conventional import alias for `torchoutil` is `to`.
### Label conversions
Supports **multiclass** label conversions between probabilities, class indices, class names and one-hot encoding.

```python
import torchoutil as to

probs = to.as_tensor([[0.9, 0.1], [0.4, 0.6]])
names = to.probs_to_name(probs, idx_to_name={0: "Cat", 1: "Dog"})
# ["Cat", "Dog"]
```

This package also supports **multilabel** label conversions between probabilities, class multi-indices, class multi-names and multihot encoding.
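For intuition, here is a rough plain-Python sketch of what these two kinds of conversion compute, using nested lists in place of tensors. The helper names `plain_probs_to_names` and `plain_multihot_to_indices` are hypothetical and are not part of `torchoutil`; this is not the library's implementation.

```python
def plain_probs_to_names(probs, idx_to_name):
    """Map each row of class probabilities to the name of its most likely class."""
    names = []
    for row in probs:
        # Index of the largest probability in the row (multiclass: one class per row).
        best_index = max(range(len(row)), key=lambda i: row[i])
        names.append(idx_to_name[best_index])
    return names


def plain_multihot_to_indices(multihot):
    """Map each multihot row to the list of positions holding a non-zero flag."""
    # Multilabel: a row may select zero, one or several classes.
    return [[i for i, flag in enumerate(row) if flag] for row in multihot]


names = plain_probs_to_names([[0.9, 0.1], [0.4, 0.6]], {0: "Cat", 1: "Dog"})
indices = plain_multihot_to_indices([[1, 0, 0], [0, 1, 1], [0, 0, 0]])
print(names)    # ['Cat', 'Dog']
print(indices)  # [[0], [1, 2], []]
```

With `torchoutil`, the multihot case looks like this: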
```python
import torchoutil as to

multihot = to.as_tensor([[1, 0, 0], [0, 1, 1], [0, 0, 0]])
indices = to.multihot_to_indices(multihot)
# [[0], [1, 2], []]
```

### Typing
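The classes in this section make `isinstance` depend on a property of the value (number of dimensions, dtype) rather than on its Python class alone. One common way to get that behaviour in Python is a metaclass that defines `__instancecheck__`. The sketch below shows the mechanism with plain objects; `FakeArray`, `Array2D` and `_NDimMeta` are hypothetical names, and this is an assumption about the general technique, not a description of how `torchoutil` implements it.

```python
class _NDimMeta(type):
    """Metaclass whose isinstance() check compares the number of dimensions."""

    def __instancecheck__(cls, instance):
        # An object "is an instance" when its ndim matches the class attribute.
        return getattr(instance, "ndim", None) == cls.ndim


class Array2D(metaclass=_NDimMeta):
    """Marker type matching any object that reports ndim == 2."""

    ndim = 2


class FakeArray:
    """Stand-in for a tensor: only stores a number of dimensions."""

    def __init__(self, ndim):
        self.ndim = ndim


print(isinstance(FakeArray(1), Array2D))  # False
print(isinstance(FakeArray(2), Array2D))  # True
```

The `torchoutil` classes are used the same way, directly on tensors: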
```python
import torch
from torchoutil import Tensor2D

x1 = torch.as_tensor([1, 2])
print(isinstance(x1, Tensor2D))  # False
x2 = torch.as_tensor([[1, 2], [3, 4]])
print(isinstance(x2, Tensor2D))  # True
```

```python
import torch
from torchoutil import SignedIntegerTensor

x1 = torch.as_tensor([1, 2], dtype=torch.int)
print(isinstance(x1, SignedIntegerTensor))  # True

x2 = torch.as_tensor([1, 2], dtype=torch.long)
print(isinstance(x2, SignedIntegerTensor))  # True

x3 = torch.as_tensor([1, 2], dtype=torch.float)
print(isinstance(x3, SignedIntegerTensor))  # False
```

### Padding
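Padding a dimension means appending a fill value until a target length is reached; padding and stacking applies that to every item so that all items end up with one common shape. As a minimal sketch of the idea on one-level Python lists (the helpers `plain_pad` and `plain_pad_and_stack` are hypothetical and only illustrate the concept, not the library's code):

```python
def plain_pad(seq, target_length, pad_value):
    """Return a copy of seq right-padded with pad_value up to target_length."""
    missing = max(target_length - len(seq), 0)
    return list(seq) + [pad_value] * missing


def plain_pad_and_stack(seqs, pad_value):
    """Pad every sequence to the longest length so the result is rectangular."""
    max_len = max((len(seq) for seq in seqs), default=0)
    return [plain_pad(seq, max_len, pad_value) for seq in seqs]


padded_row = plain_pad([1, 2, 3], target_length=5, pad_value=-1)
batch = plain_pad_and_stack([[1, 2], [3], [], [4, 5]], pad_value=0)
print(padded_row)  # [1, 2, 3, -1, -1]
print(batch)       # [[1, 2], [3, 0], [0, 0], [4, 5]]
```

The `torchoutil` functions below do this on tensors, along a chosen dimension or across all dimensions: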
```python
import torch
import torchoutil as to

x1 = torch.rand(10, 3, 1)
x2 = to.pad_dim(x1, target_length=5, dim=1, pad_value=-1)
# x2 has shape (10, 5, 1)
```

```python
import torch
import torchoutil as to

tensors = [torch.rand(10, 2), torch.rand(5, 3), torch.rand(0, 5)]
padded = to.pad_and_stack_rec(tensors, pad_value=0)
# each tensor is padded to shape (10, 5) before stacking, so padded is expected to have shape (3, 10, 5)
```

### Masking
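A non-padding mask marks, for each sequence, which positions hold real data, and a masked mean averages only the marked values. The following plain-Python sketch shows both computations on lists; `plain_lengths_to_mask` and `plain_masked_mean` are hypothetical helper names used for illustration, not `torchoutil` functions.

```python
def plain_lengths_to_mask(lengths, max_len):
    """Row i holds lengths[i] True values followed by False for padding."""
    return [[position < length for position in range(max_len)] for length in lengths]


def plain_masked_mean(values, mask):
    """Average only the values whose mask entry is True."""
    kept = [value for value, keep in zip(values, mask) if keep]
    if not kept:
        # Nothing selected: the mean is undefined.
        return float("nan")
    return sum(kept) / len(kept)


mask = plain_lengths_to_mask([3, 1, 2], max_len=4)
mean = plain_masked_mean([1, 2, 3, 4], [True, True, False, False])
print(mask[0])  # [True, True, True, False]
print(mean)     # 1.5
```

The tensor versions provided by `torchoutil`: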
```python
import torchoutil as to

x = to.as_tensor([3, 1, 2])
mask = to.lengths_to_non_pad_mask(x, max_len=4)
# Row i of the non-padding mask starts with x[i] True values
# tensor([[True, True, True, False],
#         [True, False, False, False],
#         [True, True, False, False]])
```

```python
import torchoutil as to

x = to.as_tensor([1, 2, 3, 4])
mask = to.as_tensor([True, True, False, False])
result = to.masked_mean(x, mask)
# result contains the mean of the values marked as True: 1.5
```

### Other tensor manipulations
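One of the helpers in this section computes the inverse of a permutation, i.e. the indices that undo a shuffle. As a minimal sketch of that idea on plain Python lists (the function `plain_inverse_perm` is hypothetical and not the library's implementation):

```python
import random


def plain_inverse_perm(perm):
    """Return inv such that inv[perm[k]] == k for every position k."""
    inv = [0] * len(perm)
    for position, source in enumerate(perm):
        inv[source] = position
    return inv


rng = random.Random(0)  # fixed seed so the sketch is reproducible
perm = list(range(10))
rng.shuffle(perm)

x1 = [rng.random() for _ in range(10)]
x2 = [x1[i] for i in perm]      # shuffled copy of x1
inv_perm = plain_inverse_perm(perm)
x3 = [x2[i] for i in inv_perm]  # shuffle undone
print(x3 == x1)  # True
```

The `torchoutil` examples follow: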
```python
import torchoutil as to

x = to.as_tensor([1, 2, 3, 4])
result = to.insert_at_indices(x, indices=[0, 2], values=5)
# result contains tensor with inserted values: tensor([5, 1, 2, 5, 3, 4])
```

```python
import torchoutil as to

perm = to.randperm(10)
inv_perm = to.get_inverse_perm(perm)

x1 = to.rand(10)
x2 = x1[perm]
x3 = x2[inv_perm]
# inv_perm are indices that allow us to get x3 from x2, i.e. x1 == x3 here
```

### Pre-compute datasets to pickle or HDF files
Here is an example of pre-computing spectrograms of the torchaudio `SPEECHCOMMANDS` dataset using the `pack_dataset` function:
```python
from torchaudio.datasets import SPEECHCOMMANDS
from torchaudio.transforms import Spectrogram
from torchoutil import nn
from torchoutil.utils.pack import pack_dataset

speech_commands_root = "path/to/speech_commands"
packed_root = "path/to/packed_dataset"

dataset = SPEECHCOMMANDS(speech_commands_root, download=True, subset="validation")
# dataset[0] is a tuple, contains waveform and other metadata

class MyTransform(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.spectrogram_extractor = Spectrogram()

    def forward(self, item):
        waveform = item[0]
        spectrogram = self.spectrogram_extractor(waveform)
        return (spectrogram,) + item[1:]

pack_dataset(dataset, packed_root, MyTransform())
```

Then you can load the pre-computed dataset using `PackedDataset`:
```python
from torchoutil.utils.pack import PackedDataset

packed_root = "path/to/packed_dataset"
packed_dataset = PackedDataset(packed_root)
packed_dataset[0]  # == first transformed item, i.e. transform(dataset[0])
```

## Extras requirements
`torchoutil` also provides additional modules when specific packages are already installed in your environment.
All extras can be installed with `pip install torchoutil[extras]`.

- If `tensorboard` is installed, the function `load_event_file` can be used. It is useful for manually loading all the data contained in a TensorBoard event file.
- If `numpy` is installed, the classes `NumpyToTensor` and `ToNumpy` and their related functions can be used. They are meant for composing dynamic transforms in a `Sequential` module.
- If `h5py` is installed, the function `pack_to_hdf` and the class `HDFDataset` can be used to pack datasets to HDF files and read them back. They support variable-length sequences of data.
- If `pyyaml` is installed, the functions `to_yaml` and `load_yaml` can be used.

## Contact
Maintainer:
- [Étienne Labbé](https://labbeti.github.io/) "Labbeti": [email protected]