https://github.com/fluxml/safetensors.jl

Host: GitHub
URL: https://github.com/fluxml/safetensors.jl
Owner: FluxML
License: mit
Created: 2024-02-05T17:04:24.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-05-08T17:12:27.000Z (about 2 years ago)
Last Synced: 2025-02-21T03:03:08.941Z (over 1 year ago)
Language: Julia
Size: 51.8 KB
Stars: 12
Watchers: 10
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

# SafeTensors.jl

[![Build Status](https://github.com/FluxML/SafeTensors.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/FluxML/SafeTensors.jl/actions/workflows/CI.yml?query=branch%3Amain)

This package loads data stored in the [safetensors format](https://huggingface.co/docs/safetensors/index).

Since Python tensors are stored row-major and Julia arrays are column-major, the dimensions are permuted so that each tensor has the same shape as in Python and every element sits at the same index. This incurs a performance penalty, in the sense that loading cannot be completely copy-free.
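The permutation can be illustrated with plain `Base` functions. This is a minimal sketch of the idea, not the package's internal code; the variable names are made up for illustration.

```julia
# Row-major (C-order) buffer for a 3×5 matrix whose rows are 0:4, 5:9 and 10:14,
# laid out the way a Python tensor library would store it.
flat = Float32.(0:14)

# Julia fills columns first, so the reversed shape (5, 3) matches the memory layout.
colmajor = reshape(flat, 5, 3)

# Swapping the two dimensions restores the Python-side shape (3, 5).
lazy  = PermutedDimsArray(colmajor, (2, 1))  # a view, no copy
dense = permutedims(colmajor, (2, 1))        # a materialized copy

size(lazy)   # (3, 5)
lazy[2, 1]   # 5.0f0, the first element of the second row
```

The lazy view avoids a copy but is not contiguous in Julia's column-major sense, which is why a fully dense array requires one.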

The main function is `load_safetensors`, which returns a `Dict{String,V}` whose keys are the tensor names and whose values are the tensors. An example from `runtests` is as follows:

```julia
julia> using SafeTensors

julia> d = load_safetensors("test/model.safetensors")
Dict{String, Array} with 27 entries:
  "int32_357"   => Int32[0 7 … 21 28; 35 42 … 56 63; 70 77 … 91 98;;; 1 8 … 22 29…
  "uint8_3"     => UInt8[0x00, 0x01, 0x02]
  "float16_35"  => Float16[0.0 1.0 … 3.0 4.0; 5.0 6.0 … 8.0 9.0; 10.0 11.0 … 13.0…
  "bool_3"      => Bool[0, 1, 0]
  "int64_3"     => [0, 1, 2]
  "int64_35"    => [0 1 … 3 4; 5 6 … 8 9; 10 11 … 13 14]
  "float32_357" => Float32[0.0 7.0 … 21.0 28.0; 35.0 42.0 … 56.0 63.0; 70.0 77.0 …
  "bool_35"     => Bool[0 1 … 1 0; 1 0 … 0 1; 0 1 … 1 0]
  "float32_35"  => Float32[0.0 1.0 … 3.0 4.0; 5.0 6.0 … 8.0 9.0; 10.0 11.0 … 13.0…
  "float32_3"   => Float32[0.0, 1.0, 2.0]
  "uint8_35"    => UInt8[0x00 0x01 … 0x03 0x04; 0x05 0x06 … 0x08 0x09; 0x0a 0x0b …
  "float16_3"   => Float16[0.0, 1.0, 2.0]
  "int16_357"   => Int16[0 7 … 21 28; 35 42 … 56 63; 70 77 … 91 98;;; 1 8 … 22 29…
  "int16_3"     => Int16[0, 1, 2]
  "float64_357" => [0.0 7.0 … 21.0 28.0; 35.0 42.0 … 56.0 63.0; 70.0 77.0 … 91.0 …
  "uint8_357"   => UInt8[0x00 0x07 … 0x15 0x1c; 0x23 0x2a … 0x38 0x3f; 0x46 0x4d …
  "float16_357" => Float16[0.0 7.0 … 21.0 28.0; 35.0 42.0 … 56.0 63.0; 70.0 77.0 …
  "int32_3"     => Int32[0, 1, 2]
  "int16_35"    => Int16[0 1 … 3 4; 5 6 … 8 9; 10 11 … 13 14]
  "int8_357"    => Int8[0 7 … 21 28; 35 42 … 56 63; 70 77 … 91 98;;; 1 8 … 22 29;…
  "int8_35"     => Int8[0 1 … 3 4; 5 6 … 8 9; 10 11 … 13 14]
  "bool_357"    => Bool[0 1 … 1 0; 1 0 … 0 1; 0 1 … 1 0;;; 1 0 … 0 1; 0 1 … 1 0; …
  "float64_35"  => [0.0 1.0 … 3.0 4.0; 5.0 6.0 … 8.0 9.0; 10.0 11.0 … 13.0 14.0]
  "int8_3"      => Int8[0, 1, 2]
  "int64_357"   => [0 7 … 21 28; 35 42 … 56 63; 70 77 … 91 98;;; 1 8 … 22 29; 36 …
  "int32_35"    => Int32[0 1 … 3 4; 5 6 … 8 9; 10 11 … 13 14]
  "float64_3"   => [0.0, 1.0, 2.0]
```

It can also load tensors lazily with `SafeTensors.deserialize("model.safetensors")`, which `mmap`s the file and returns a `Dict`-like object:

```julia
julia> tensors = SafeTensors.deserialize("test/model.safetensors"; mmap = true #= default to `true`=#);

julia> tensors["float32_35"]
3×5 mappedarray(ltoh, PermutedDimsArray(reshape(reinterpret(Float32, view(::Vector{UInt8}, 0x0000000000000ef5:0x0000000000000f30)), 5, 3), (2, 1))) with eltype Float32:
  0.0   1.0   2.0   3.0   4.0
  5.0   6.0   7.0   8.0   9.0
 10.0  11.0  12.0  13.0  14.0
```
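The nested type printed above reads from the inside out: a byte range of the memory-mapped file is reinterpreted as `Float32`, reshaped, permuted, and converted from little-endian on access. A rough sketch of that stack using only the `Mmap` standard library follows. It assumes a hypothetical temporary file holding nothing but 15 row-major `Float32` values, and it applies `ltoh` eagerly via broadcasting rather than through a lazy `mappedarray`.

```julia
using Mmap

# Hypothetical stand-in for one tensor's bytes: 15 little-endian Float32 values.
path = tempname()
write(path, htol.(Float32.(0:14)))

bytes = Mmap.mmap(path)                       # Vector{UInt8} backed by the file
raw   = view(bytes, 1:15 * sizeof(Float32))   # byte range belonging to the tensor
typed = reinterpret(Float32, raw)             # same memory, viewed as Float32
lazy  = PermutedDimsArray(reshape(typed, 5, 3), (2, 1))

# Materialize an ordinary Matrix{Float32}; this is the step that copies.
dense = ltoh.(lazy)
```

Nothing is read from disk until elements of `lazy` are accessed, so a large file can be opened cheaply and only the needed tensors materialized.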

Serialization is also supported:

```julia
julia> using Random, BFloat16s

julia> weights = Dict("W"=>randn(BFloat16, 3, 5), "b"=>rand(BFloat16, 3))
Dict{String, Array{BFloat16}} with 2 entries:
  "W" => [0.617188 0.695312 … 0.390625 -2.0; -0.65625 -0.617188 … 0.652344 0.244141; 0.226562 2.70312 … -0.174805 -0.7773…
  "b" => [0.111816, 0.566406, 0.283203]

julia> f = tempname();

julia> SafeTensors.serialize(f, weights)

julia> loaded = SafeTensors.deserialize(f);

julia> loaded["W"] ≈ weights["W"]
true

julia> SafeTensors.serialize(f, weights, Dict("Package"=>"SafeTensors.jl", "version"=>"1"))

julia> loaded = SafeTensors.deserialize(f);

julia> loaded.metadata
Dict{String, String} with 2 entries:
  "Package" => "SafeTensors.jl"
  "version" => "1"
```

Working with a GPU:

```julia
julia> loaded["W"]
3×5 mappedarray(ltoh, PermutedDimsArray(reshape(reinterpret(BFloat16, view(::Vector{UInt8}, 0x00000000000000b9:0x00000000000000d6)), 5, 3), (2, 1))) with eltype BFloat16:
  0.542969    0.201172   1.38281    -0.255859  -1.55469
  0.172852   -0.949219   0.0561523  -1.34375   -0.206055
 -0.0854492   1.17969   -0.265625   -0.871094   2.25

julia> using CUDA; CUDA.allowscalar(false)

julia> CuArray(loaded["W"])
3×5 CuArray{BFloat16, 2, CUDA.Mem.DeviceBuffer}:
  0.542969    0.201172   1.38281    -0.255859  -1.55469
  0.172852   -0.949219   0.0561523  -1.34375   -0.206055
 -0.0854492   1.17969   -0.265625   -0.871094   2.25

julia> gpu_weights = Dict("W"=>CuArray(loaded["W"]), "b"=>CuArray(loaded["b"]))
Dict{String, CuArray{BFloat16, N, CUDA.Mem.DeviceBuffer} where N} with 2 entries:
  "W" => [0.542969 0.201172 … -0.255859 -1.55469; 0.172852 -0.949219 … -1.34375 -0.206055; -0.0854492 1.17969 … -0.871094…
  "b" => BFloat16[0.871094, 0.773438, 0.703125]

julia> f = tempname();

julia> SafeTensors.serialize(f, gpu_weights)

julia> SafeTensors.deserialize(f)
SafeTensors.SafeTensor{SubArray{UInt8, 1, Vector{UInt8}, Tuple{UnitRange{UInt64}}, true}} with 2 entries:
  "W" => BFloat16[0.542969 0.201172 … -0.255859 -1.55469; 0.172852 -0.949219 … -1.34375 -0.206055; -0.0854492 1.17969 … -…
  "b" => BFloat16[0.871094, 0.773438, 0.703125]
```
