https://github.com/cisco-open/ruby-ctypes

Ruby gem for manipulating binary data using C datatype semantics
Host: GitHub
URL: https://github.com/cisco-open/ruby-ctypes
Owner: cisco-open
License: mit
Created: 2025-02-11T19:53:52.000Z (3 months ago)
Default Branch: main
Last Pushed: 2025-02-26T14:25:48.000Z (2 months ago)
Last Synced: 2025-02-26T15:32:39.162Z (2 months ago)
Topics: datatypes, ruby, rubygem
Language: Ruby
Homepage:
Size: 81.1 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md

# CTypes Ruby Gem

[![Version](https://img.shields.io/gem/v/ctypes.svg)](https://rubygems.org/gems/ctypes)

[![GitHub](https://img.shields.io/badge/github-elf__utils-blue.svg)](http://github.com/cisco-open/ruby-ctypes)

[![Documentation](https://img.shields.io/badge/docs-rdoc.info-blue.svg)](http://rubydoc.info/gems/ctypes/frames)

[![Contributor-Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.1-fbab2c.svg)](CODE_OF_CONDUCT.md)

[![Maintainer](https://img.shields.io/badge/Maintainer-Cisco-00bceb.svg)](https://opensource.cisco.com)

Manipulate common C types in Ruby.

- unpack complex binary data into Ruby types, modify them, and repack them as binary

- bounds checking on types (when packing)

- complex types supported

    - structs with flexible array members

    - arrays terminated by specific values

    - strings terminated by a specific byte sequence

- flexible endian support

    - default endian globally configurable; defaults to host endian

    - individual types can have fixed-endian

    - structs support per attribute endian

- minimal reserved words for Union and Struct types

    - avoids colliding with struct & union field names, so you don't have to rename fields like `len`

- reloadable type definitions (pry `reload-code` friendly)

    - useful for REPL-based development

## Comparisons

- BinData gem:

    - Tightly coupled with file I/O

    - no support for non-blocking I/O (non-blocking network sockets)

    - reserves common struct attribute names such as `len`

    - does not support reloading of types (pry `reload-code`)

- Fiddle gem:

    - only supports native endian

    - no support for dynamically sized & terminated types

## Installation

Install the gem and add it to the application's Gemfile by executing:

    $ bundle add ctypes

If bundler is not being used to manage dependencies, install the gem by executing:

    $ gem install ctypes

## Usage

### Basic types

```ruby

require "ctypes"

# load optional helpers for common types

include CTypes::Helpers

# common integer types all defined: uint64, int64, ..., uint8, int8

# can be used to pack and unpack values

uint32.pack(0xfeedface)                     # => "\xce\xfa\xed\xfe"

uint32.pack(0xfeedface, endian: :big)       # => "\xfe\xed\xfa\xce"

uint32.unpack("\xce\xfa\xed\xfe")           # => 0xfeedface

uint32.unpack("\xfe\xed\xfa\xce", endian: :big)

                                            # => 0xfeedface

# `unpack_one` can be used to manually unpack sequential types from a string.

# We recommend using `CTypes::Struct` for complex types, but this approach

# can be useful when exploring binary data.

buf = "\xaa\xbb\xcc\xdd\x11\x22"

word, buf = uint32.unpack_one(buf)          # => [0xddccbbaa, "\x11\x22"]

hword, buf = uint16.unpack_one(buf)         # => [0x2211, ""]

# create fixed-endian types from existing types

u32be = uint32.with_endian(:big)

u32be.pack(0xfeedface)                      # => "\xfe\xed\xfa\xce"

# c strings (char[], uint8[], int8[]) supported by string

string.unpack("hello world\0\0\0\0")        # => "hello world"

string.pack("hello world")                  # => "hello world"

# note: by default strings are greedy; they will consume all bytes in the

# input, but only return the bytes up to the first null byte

string.unpack_one("first\0second\0")        # => ["first", ""]

# to unpack null-terminated strings use string.terminated

_, rest = string.terminated.unpack_one("first\0second\0")

                                            # => ["first", "second\x00"]

string.terminated.unpack_one(rest)          # => ["second", ""]

string.terminated.pack("first")             # => "first\0"

# other bytes can be used to terminate strings

t = string.terminated("\xff")

t.unpack("test\xff")                        # => "test"

t.pack("hello\0world")                      # => "hello\x00world\xFF"

# along with byte sequences

t = string.terminated("STOP")

t.unpack("this is the messageSTOPnext messageSTOP")

                                            # => "this is the message"

t.pack("this is a reply")                   # => "this is a replySTOP"

# fixed-width string (char[16])

string(16).pack("hello world")              # => "hello world\0\0\0\0\0"

string(16).unpack("hello world\0\0\0\0\0")  # => "hello world"

string(16).unpack("hello world")            # => Exception raised

# fixed-width string, but preserve null bytes when unpacking

char_16 = string(16, trim: false)

char_16.unpack("hello world\0\0\0\0\0")     # => "hello world\0\0\0\0\0"

char_16.pack("hello world")                 # => "hello world\0\0\0\0\0"

```
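
To show what these calls amount to at the byte level, here is a minimal sketch that uses only Ruby's core `Array#pack` and `String#unpack1`. The helper names `pack_uint32` and `unpack_uint32` are hypothetical and are not part of the ctypes API; they only mirror the endian selection, the bounds check on pack, and the value-plus-remainder return shape of `unpack_one` shown above.

```ruby
# Hypothetical helpers (not part of ctypes): a uint32 codec in plain Ruby.
# "V" is 32-bit unsigned little-endian, "N" is 32-bit unsigned big-endian.
UINT32_RANGE = (0..0xffff_ffff).freeze

def pack_uint32(value, endian: :little)
  # Array#pack would silently keep only the low 32 bits, so check first.
  raise RangeError, "#{value} does not fit in uint32" unless UINT32_RANGE.cover?(value)

  [value].pack(endian == :big ? "N" : "V")
end

def unpack_uint32(buf, endian: :little)
  raise ArgumentError, "need at least 4 bytes" if buf.bytesize < 4

  value = buf.unpack1(endian == :big ? "N" : "V")
  [value, buf.byteslice(4..)] # unpacked value plus the unused input
end

pack_uint32(0xfeedface)                # => "\xCE\xFA\xED\xFE"
pack_uint32(0xfeedface, endian: :big)  # => "\xFE\xED\xFA\xCE"
```

The sketch defaults to little-endian for simplicity, whereas the gem defaults to the host's endianness.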

### Arrays

```ruby

require "ctypes"

include CTypes::Helpers

# fixed-length arrays

pair = array(uint32, 2)

pair.unpack("\x01\x02\x03\x04\x05\x06\x07\x08")

                                            # => [0x04030201, 0x08070605]

pair.unpack("\x01\x02\x03\x04\x05\x06\x07\x08\xff\xff\xff\xff")

                                            # => [0x04030201, 0x08070605]

# dynamic length (greedy) arrays

bytes = array(uint8)

bytes.unpack("hello")                       # => [104, 101, 108, 108, 111]

bytes.unpack("\1\2\3")                      # => [1, 2, 3]

bytes.pack([4,5,6])                         # => "\4\5\6"

# any type can be converted to a fixed-endian type

be_pair = pair.with_endian(:big)

be_pair.unpack("\x01\x02\x03\x04\x05\x06\x07\x08")

                                            # => [0x01020304, 0x05060708]

# and it can be done for the inner type too

be_pair_inner = array(uint32.with_endian(:big), 2)

be_pair_inner.unpack("\x01\x02\x03\x04\x05\x06\x07\x08")

                                            # => [0x01020304, 0x05060708]

# array of null-terminated strings, terminated by an empty string

strings = array(string.terminated("\0"), terminator: "")

strings.unpack("first\0second\0third\0\0")

                                            # => ["first", "second", "third"]

# array of integers, terminated by -1

ints = array(int8, terminator: -1)

ints.pack([1, 2, 3, 4])                     # => "\x01\x02\x03\x04\xFF"

ints.unpack("\x01\x02\x03\x04\xFFtail")     # => [1, 2, 3, 4]

ints.unpack_one("\x01\x02\x03\x04\xFFtail") # => [[1, 2, 3, 4], "tail"]

# array of structs; terminated by the :end type

type = struct do

  attribute :type, enum(uint8, %i[record end])

  attribute :value, uint32

end

records = array(type, terminator: {type: :end, value: 0})

records.pack([{type: :record, value: 0xffff}])

                            # => "\x00\xFF\xFF\x00\x00\x01\x00\x00\x00\x00"

records.unpack("\x00\xFF\xFF\x00\x00\x01\x00\x00\x00\x00")

                            # => [struct {

                            #       .type = :record,

                            #       .value = 65535 (0xffff), }]

```
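
The terminator handling for the `int8` example above can be sketched in plain Ruby. The helpers `pack_terminated_int8` and `unpack_terminated_int8` are hypothetical names used for illustration only; they are not how the gem is implemented, just one way to picture reading values until a sentinel and returning the unused tail.

```ruby
# Hypothetical helpers (not ctypes API): an int8 array ended by a sentinel.

def pack_terminated_int8(values, terminator: -1)
  raise ArgumentError, "values must not contain the terminator" if values.include?(terminator)

  (values + [terminator]).pack("c*") # "c" is a signed 8-bit integer
end

def unpack_terminated_int8(buf, terminator: -1)
  values = []
  buf.each_byte.with_index do |byte, i|
    value = byte >= 0x80 ? byte - 0x100 : byte # reinterpret the byte as signed
    # Stop at the terminator; everything after it is returned untouched.
    return [values, buf.byteslice((i + 1)..)] if value == terminator

    values << value
  end
  raise ArgumentError, "terminator #{terminator} not found"
end

pack_terminated_int8([1, 2, 3, 4])  # => "\x01\x02\x03\x04\xFF"
```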

### Enums

```ruby

require "ctypes"

include CTypes::Helpers

# default enum is uint32, start numbering at zero

state = enum(%i[invalid running sleep blocked])

state.pack(:running)                        # => "\1\0\0\0"

# can use other integer types

state = enum(uint8, %i[invalid running sleep blocked])

state.pack(:running)                        # => "\1"

# can be sparse

state = enum(uint8, {invalid: 0, running: 5, sleep: 6, blocked: 0xff})

state.pack(:blocked)                        # => "\xff"

# same as above with block syntax

state = enum(uint8) do |e|

  e << :invalid

  e << {running: 5}

  e << :sleep # assigned value 6

  e << {blocked: 0xff}

end

state.pack(:blocked)                        # => "\xff"

```
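
The numbering rule used by the block syntax above (a bare symbol takes the previous value plus one) can be sketched in a few lines of plain Ruby. `build_enum` and `STATE` are hypothetical names for illustration and are not part of the ctypes API.

```ruby
# Hypothetical sketch (not ctypes API): build a symbol => integer map where
# a bare symbol is assigned the previous value plus one, starting at zero.
def build_enum(entries)
  next_value = 0
  entries.each_with_object({}) do |entry, map|
    pairs = entry.is_a?(Hash) ? entry : { entry => next_value }
    pairs.each do |name, value|
      map[name] = value
      next_value = value + 1
    end
  end
end

STATE = build_enum([:invalid, { running: 5 }, :sleep, { blocked: 0xff }]).freeze
# STATE is {invalid: 0, running: 5, sleep: 6, blocked: 255}

[STATE.fetch(:blocked)].pack("C")  # pack as uint8 => "\xFF"
STATE.key(6)                       # reverse lookup => :sleep
```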

### Structures

```ruby

require "ctypes"

# Declare a TLV struct.  Size of each structure is determined by the `len`

# field.

class TLV < CTypes::Struct

  layout do

    endian :big     # all fields will use network-byte order

    attribute :type, enum(uint8, %i[invalid hello read write goodbye])

    attribute :len, uint32

    attribute :value, string

    # dynamically determine the size of the struct when unpacking

    size { |struct| offsetof(:value) + struct[:len] }

  end

end

# pack the tlv struct

version = "v1.0"

TLV.pack({type: :hello, len: version.size, value: version})

                                    # => "\x01\x00\x00\x00\x04v1.0"

# unpack a binary structure

msg = TLV.unpack("\x01\x00\x00\x00\x04v1.0")

msg.type                            # => :hello

msg.value                           # => "v1.0"

# modify the structure and repack into binary representation

msg.type = :goodbye

msg.len = 0

msg.to_binstr                       # => "\x04\x00\x00\x00\x00"

```
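
For comparison, the same TLV wire layout can be handled with Ruby's core `pack`/`unpack` directives. This is a sketch under the assumption that the layout is a `uint8` type, a big-endian `uint32` length, and `len` bytes of value, as the struct above declares; `pack_tlv`, `unpack_tlv`, and the constants are hypothetical names, not ctypes API.

```ruby
# Hypothetical plain-Ruby TLV codec (not ctypes API).
# Wire layout: type (uint8), len (uint32, big-endian), value (len bytes).
TLV_TYPES = %i[invalid hello read write goodbye].freeze
TLV_HEADER_SIZE = 5

def pack_tlv(type, value)
  type_id = TLV_TYPES.index(type)
  raise ArgumentError, "unknown type #{type.inspect}" if type_id.nil?

  # "C" is uint8, "N" is uint32 in network (big-endian) byte order.
  [type_id, value.bytesize].pack("CN") + value.b
end

def unpack_tlv(buf)
  raise ArgumentError, "truncated header" if buf.bytesize < TLV_HEADER_SIZE

  type_id, len = buf.unpack("CN")
  raise ArgumentError, "truncated value" if buf.bytesize < TLV_HEADER_SIZE + len

  record = { type: TLV_TYPES[type_id], len: len, value: buf.byteslice(TLV_HEADER_SIZE, len) }
  [record, buf.byteslice((TLV_HEADER_SIZE + len)..)] # record plus unused input
end

pack_tlv(:hello, "v1.0")  # => "\x01\x00\x00\x00\x04v1.0"
```

Writing this by hand means tracking offsets and the `len` field manually, which is the bookkeeping the `size` block in the struct layout takes over.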

### Unions

Note: the underlying memory for union values is not shared between members, so
switching from one member to another incurs a performance penalty: the current
member is packed and the new member is then unpacked. For read-only unions,
this penalty can be avoided by freezing the union instance.

```ruby

require "ctypes"

class Msg < CTypes::Union

  layout do

    endian :big # network byte-order

    type = enum(uint8, {invalid: 0, hello: 1, read: 2})

    member :hello, struct({type:, version: string})

    member :read, struct({type:, offset: uint64, len: uint64})

    member :type, type

  end

end

# provide only one member when packing

Msg.pack({hello: {type: :hello, version: "v1.0"}})    # => "\x01v1.0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"

Msg.pack({read: {type: :read, offset: 0xfeed, len: 0xdddd}}) # => "\x02\x00\x00\x00\x00\x00\x00\xFE\xED\x00\x00\x00\x00\x00\x00\xDD\xDD"

# unpack a message and access member values

msg = Msg.unpack("\x02" +

                 "\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe" +

                 "\xab\xab\xab\xab\xab\xab\xab\xab")

msg.type                      # => :read

msg.read.offset               # => 18374403900871474942

msg.read.len                  # => 12370169555311111083

# modify and pack into binary

msg.hello.type = :hello

msg.hello.version = "v1.0"

msg.to_binstr                 # => "\x01v1.0\xFE\xFE\xFE\xFE\xAB\xAB\xAB\xAB\xAB\xAB\xAB\xAB"

```
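
The byte-level effect of switching members can be pictured with plain `String#unpack`. The sketch below is an illustration only, with hypothetical names (`read_view`, `write_hello`, `UNION_BYTES`) that are not ctypes API: it treats one 17-byte buffer as the `read` member, then overwrites just the bytes the `hello` member packs, which is consistent with the `to_binstr` output above.

```ruby
# Hypothetical sketch (not ctypes API): one buffer, two interpretations.
READ_FORMAT = "CQ>Q>" # type (uint8), offset (uint64 BE), len (uint64 BE)

def read_view(buf)
  type, offset, len = buf.unpack(READ_FORMAT)
  { type: type, offset: offset, len: len }
end

# Writing through the `hello` member replaces only the bytes that member
# packs; the remaining bytes keep what the previous member left behind.
def write_hello(buf, version)
  packed = [1].pack("C") + version.b # type :hello is 1, then the version
  packed + buf.byteslice(packed.bytesize..).to_s
end

UNION_BYTES = ("\x02" + "\xfe" * 8 + "\xab" * 8).b

read_view(UNION_BYTES)
# => {type: 2, offset: 18374403900871474942, len: 12370169555311111083}
```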

### Terminated

Some greedy dynamic length types are terminated with byte sequences, or
variable byte sequences.  To handle these types we use `CTypes::Terminated`.

```ruby

# string.terminated returns a CTypes::Terminated instance

telegram = string.terminated("STOP")

telegrams = array(telegram)

telegrams.unpack("hello worldSTOPnext messageSTOP")

                              # => ["hello world", "next message"]

# record is an id along with an array of data bytes

record = struct({id: uint8, data: array(uint8)})

# each record is terminated with the byte sequence \xff\xee (for reasons?)

term = "\xff\xee"

# create a terminated type for the record (yea, it is ugly right now)

terminated_record = CTypes::Terminated

    .new(type: record,

         locate: proc { |b,_| [b.index(term), term.size] },

         terminate: term)

# and then an array of terminated records type

records = array(terminated_record)

# now pack & unpack as needed

records.pack([

    {id: 1, data: [1, 2, 3, 4]},

    {id: 2, data: [5, 5]},

    {id: 3}

])          # => "\x01\x01\x02\x03\x04\xFF\xEE\x02\x05\x05\xFF\xEE\x03\xFF\xEE"

records.unpack("\x01\x01\x02\x03\x04\xFF\xEE\x02\x05\x05\xFF\xEE\x03\xFF\xEE")

            # => array of three record structs, with ids 1, 2, and 3

```
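
The locate-and-split step that a terminated type performs can be sketched in plain Ruby. `split_terminated` and `unpack_records` are hypothetical helpers for illustration, not the gem's implementation; they assume the terminator never appears inside a record's own bytes, since the first match always ends the chunk.

```ruby
# Hypothetical helper (not ctypes API): split a buffer into terminated chunks.
def split_terminated(buf, term)
  buf = buf.b   # work on binary copies so byte and character offsets agree
  term = term.b
  chunks = []
  until buf.empty?
    idx = buf.index(term)
    raise ArgumentError, "terminator not found" if idx.nil?

    chunks << buf.byteslice(0, idx)
    buf = buf.byteslice((idx + term.bytesize)..)
  end
  chunks
end

# Decode each chunk as an id byte followed by zero or more data bytes.
def unpack_records(buf, term)
  split_terminated(buf, term).map do |chunk|
    id, *data = chunk.unpack("C*")
    { id: id, data: data }
  end
end

split_terminated("hello worldSTOPnext messageSTOP", "STOP")
# => ["hello world", "next message"]
```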

### Custom Types

Custom types can be created and then used within other CTypes. The following is
a custom CTypes implementation modeled on the DWARF ULEB128 datatype. It is a
variable-length representation of an unsigned integer that uses 7 bits per byte
for the encoded value, stored in little-endian order. In this example the
highest bit is set on the last byte of the value; standard DWARF ULEB128 does
the reverse and sets the highest bit on every byte except the last.

```ruby

module ULEB128

  extend CTypes::Type

  # declare the underlying DRY type; it must have a default value, and may

  # have constraints set

  @dry_type = Dry::Types["integer"].default(0)

  # as this is a dynamically sized type, let's set size to be the minimum size

  # for the type (1 byte), and ensure .fixed_size? returns false

  @size = 1

  def self.fixed_size?

    false

  end

  # provide a method for packing the ruby value into the binary representation

  def self.pack(value, endian: default_endian, validate: true)

    return "\x80" if value == 0

    buf = String.new

    while value != 0

      buf << (value & 0x7f)

      value >>= 7

    end

    buf[-1] = (buf[-1].ord | 0x80).chr

    buf

  end

  # provide a method for unpacking an instance of this type from a String, and

  # returning both the unpacked value, and any unused input

  def self.unpack_one(buf, endian: default_endian)

    value = 0

    shift = 0

    len = 0

    buf.each_byte do |b|

      len += 1

      value |= ((b & 0x7f) << shift)

      return value, buf[len...] if (b & 0x80) != 0

      shift += 7

    end

    raise TerminatorNotFoundError

  end

end

# now the type can be used like any other type

ULEB128.unpack_one("\x7f\x7f\x83XXX")       # => [0xffff, "XXX"]

ULEB128.unpack("\x7f\x7f\x83")              # => 0xffff

ULEB128.unpack("\x81XXX")                   # => 1

ULEB128.pack(0)                             # => "\x80"

ULEB128.pack(1)                             # => "\x81"

ULEB128.pack(0xffff)                        # => "\x7F\x7F\x83"

# use it in an array

list = array(ULEB128)

list.unpack("\x7f\x7f\x83\x81\x80")         # => [65535, 1, 0]

# or a struct

t = struct(id: uint32, value: ULEB128)

t.unpack("\1\0\0\0\x7f\x7f\x83XXX")         # => struct with id = 1 (little-endian host), value = 65535

```
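
For reference, standard DWARF LEB128 uses the high bit as a continuation flag: it is set on every byte except the last, which is the reverse of the convention in the example above. The following plain-Ruby sketch shows that standard encoding; `uleb128_encode` and `uleb128_decode` are hypothetical helper names and are not part of the ctypes API.

```ruby
# Hypothetical helpers (not ctypes API): standard unsigned LEB128.
# Each byte carries 7 value bits, least-significant group first; the high bit
# is set when more bytes follow and clear on the final byte.
def uleb128_encode(value)
  raise ArgumentError, "value must be non-negative" if value.negative?

  bytes = []
  loop do
    byte = value & 0x7f
    value >>= 7
    byte |= 0x80 unless value.zero? # continuation bit: more bytes follow
    bytes << byte
    break if value.zero?
  end
  bytes.pack("C*")
end

def uleb128_decode(buf)
  value = 0
  shift = 0
  buf.each_byte.with_index do |byte, i|
    value |= (byte & 0x7f) << shift
    # A clear high bit marks the final byte; return the value and the rest.
    return [value, buf.byteslice((i + 1)..)] if (byte & 0x80).zero?

    shift += 7
  end
  raise ArgumentError, "truncated ULEB128 value"
end

uleb128_encode(624485)  # => "\xE5\x8E\x26"
```

To wrap this convention in a CTypes custom type, the same loop bodies would go into the `pack` and `unpack_one` methods shown above.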

## Roadmap

See the [open issues](https://github.com/cisco-open/ruby-ctypes/issues) for a
list of proposed features (and known issues).

## Development

After checking out the repo, run `bundle install` to install dependencies.
Then, run `rake spec` to run the tests. You can also run `bin/console` for an
interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run `bundle exec rake install`. To
release a new version, update the version number in `version.rb`, and then run
`bundle exec rake release`, which will create a git tag for the version, push
git commits and the created tag, and push the `.gem` file to
[rubygems.org](https://rubygems.org).

## Contributing

Contributions are what make the open source community such an amazing place to
learn, inspire, and create. Any contributions you make are **greatly
appreciated**. For detailed contributing guidelines, please see
[CONTRIBUTING.md](CONTRIBUTING.md).

## License

The gem is available as open source under the terms of the
[MIT License](https://opensource.org/licenses/MIT). See
[LICENSE.txt](LICENSE.txt) for more information.