Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lambdaclass/cairo-vm_in_go
cairo-vm_in_go is a Go implementation of the Cairo VM. Cairo (CPU Algebraic Intermediate Representation) is a programming language for writing provable programs, where one party can prove to another that a certain computation was executed correctly without the need for this party to re-execute the same program.
https://github.com/lambdaclass/cairo-vm_in_go
Last synced: 3 months ago
JSON representation
cairo-vm_in_go is a Go implementation of the Cairo VM. Cairo (CPU Algebraic Intermediate Representation) is a programming language for writing provable programs, where one party can prove to another that a certain computation was executed correctly without the need for this party to re-execute the same program.
- Host: GitHub
- URL: https://github.com/lambdaclass/cairo-vm_in_go
- Owner: lambdaclass
- License: apache-2.0
- Created: 2023-07-28T15:50:08.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-21T20:54:42.000Z (12 months ago)
- Last Synced: 2024-04-07T21:36:46.280Z (7 months ago)
- Language: Go
- Homepage:
- Size: 696 KB
- Stars: 47
- Watchers: 3
- Forks: 9
- Open Issues: 42
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-starknet - cairo-vm_in_go - Another Golang implementation. (Additional developer resources)
README
# cairo-vm.go
This is a work in progress implementation of the [Cairo VM](https://github.com/lambdaclass/cairo-vm) in `Go`. The reasons for doing this include:
- Having a diversity of implementations helps find bugs and make the whole ecosystem more resilient.
- It's a good opportunity to extensively document the VM in general, as currently the documentation on its internals is very scarce and mostly lives on the minds of a few people.## Other docs
- [Project layout](docs/layout.md)
- [Rust/lambdaworks integration](docs/rust-integration.md)## Installation
Go needs to be installed. For mac computers, run
```shell
brew install go
```We also use [pyenv](https://github.com/pyenv/pyenv) to install testing-related dependencies
## Compiling, running, testing
First, install the testing dependencies:
- On Linux, run
```shell
make deps
```- On MacOS, run
```shell
make deps-macos
```To build the project, run:
```shell
make build
```To run all tests, activate the venv created by make deps and run the test target:
```shell
. cairo-vm-env/bin/activate
make test
```## Running the demo
This project currently has two demo targets, one for running a fibonacci programs and one for running a factorial program. Both of them output their corresponding trace files.
The demo uses cairo_lang to compile both cairo programs, you can install it by running `make deps` (or `make deps-macos` if you are on macos)To run the fibonacci demo:
```shell
make demo_fibonacci
```To run the factorial demo:
```shell
make demo_factorial
```These demos compare the trace and memory files generated by the `Go` and `Rust` VM implementations to make sure they coincide. We have more general Makefile targets that do this comparison for every program we use for testing. You can check out a list of all these programs under the `cairo_programs` directory. The targets to run these comparisons are
```
make compare_trace_memory
```and
```
make compare_proof_trace_memory
```The first one does the comparison for normal executions, while the second one does it with *proof mode* enabled.
## Project Guidelines
- PRs addressing performance are forbidden. We are currently concerned with making it work without bugs and nothing more.
- All PRs must contain tests. Code coverage has to be above 98%.
- To check for security and other types of bugs, the code will be fuzzed extensively.
- PRs must be accompanied by its corresponding documentation. A book will be written documenting the entire inner workings of it, so anyone can dive in to a Cairo VM codebase and follow it along.## Milestones
### First milestone: Fibonacci/Factorial
The first milestone for Cairo VM in Go is completed! 🎉
The milestone includes:
- Parsing of `json` programs.
- Decoding of instructions.
- Memory relocation.
- Instruction execution.
- Writing of the trace into files with the correct format.
- Make the fibonacci and factorial tests pass, comparing our own trace with the Rust VM one, making sure they match.
- Writing of the memory into files with the correct format.
- Make the fibonacci and factorial tests pass, comparing our own memory with the Rust VM one, making sure they match.### Second Milestone: Full VM implementation
Below is a list of requirements to have a full implementation along with their progress:
- ✅ Builtins. Add the `BuiltinRunner` logic, then implement each builtin:
- ✅ `Range check (RC)`
- ✅ `Output`
- ✅ `Bitwise`
- ✅ `Ec_op`
- ✅ `Pedersen`
- ✅ `Keccak`
- ✅ `Poseidon`
- ✅ `Signature`
- ✅ Memory layouts. (plain, small, all_cairo)
- Hints.
- ✅ Add the `HintProcessor` logic
- ✅ [Parsing of references](https://github.com/lambdaclass/cairo-vm/tree/main/docs/references_parsing)
- Implement each hint. This is a long task which we divided into milestones. You can check out their progress [here](https://github.com/lambdaclass/cairo-vm_in_go/milestones).
- ✅ Proof mode. It's important to explain in detail what this is when we do it. It's one of the most obscure parts of the VM in my experience.
- ✅ Program tests from Cairo VM in Rust.
- Support for Temporary Memory
- Handle temporary addresses in memory-related code
- Implement relocation process for temporary memory
- Print debug information on failure (i.e. introduce the concept of a `VmException`)
- Air Public inputs. (Tied to Proof mode)
- Starknet integration:
- Running cairo contracts (i.e. implement `RunFromEntrypoint`)
- Tracking `ExecutionResources` and `RunResources`.
- Cairo 1 support. This requires:
- Parsing Cairo 1 contracts. There is no `Json` representation of a Cairo 1 Program, so we can only run contracts. This means this depends on the `RunFromEntrypoint` feature above.
- Implementing Cairo 1 builtin (`Segment Arena`)
- Implementing Cairo 1 Hints
- Support for `CairoPie` (Cairo Position Independent Code).### Other Stuff: Performance and Fuzzing
Things that are out of the second milestone but we want to have:
- Add benchmarks to track performance and compare against the Rust implementation.
- Add Fuzzing and Differential fuzzing to catch errors that our regular tests don't and compare ourselves against the Rust VM.
## Documentation### High Level Overview
The Cairo virtual machine is meant to be used in the context of STARK validity proofs. What this means is that the point of Cairo is not just to execute some code and get a result, but to *prove* to someone else that said execution was done correctly, without them having to re-execute the entire thing. The rough flow for it looks like this:
- A user writes a Cairo program.
- The program is compiled into Cairo's VM bytecode.
- The VM executes said code and provides a *trace* of execution, i.e. a record of the state of the machine and its memory *at every step of the computation*.
- This trace is passed on to a STARK prover, which creates a cryptographic proof from it, attesting to the correct execution of the program.
- The proof is passed to a verifier, who checks that the proof is valid in a fraction of a second, without re-executing.The main three components of this flow are:
- A Cairo compiler to turn a program written in the [Cairo programming language](https://www.cairo-lang.org/) into bytecode.
- A Cairo VM to then execute it and generate a trace.
- [A STARK prover and verifier](https://github.com/lambdaclass/starknet_stack_prover_lambdaworks) so one party can prove correct execution, while another can verify it.While this repo is only concerned with the second component, it's important to keep in mind the other two; especially important are the prover and verifier that this VM feeds its trace to, as a lot of its design decisions come from them. This virtual machine is designed to make proving and verifying both feasible and fast, and that makes it quite different from most other VMs you are probably used to.
### Basic VM flow
Our virtual machine has a very simple flow:
- Take a compiled cairo program as input. You can check out an example program [here](https://github.com/lambdaclass/cairo-vm.go/blob/main/cairo_programs/fibonacci.cairo).
- Run the bytecode from the compiled program, doing the usual `fetch->decode->execute` loop, running until program termination.
- On every step of the execution, record the values of each register.
- Take the register values and memory at every step and write them to a file, called the `execution trace`.Barring some simplifications we made, this is all the Cairo VM does. The two main things that stand out as radically different are the memory model and the use of `Field Elements` to perform arithmetic. Below we go into more detail on each step, and in the process explain the ommisions we made.
### Architecture
The Cairo virtual machine uses a Von Neumann architecture with a Non-deterministic read-only memory. What this means, roughly, is that memory is immutable after you've written to it (i.e. you can only write to it once); this is to make the STARK proving easier, but we won't go into that here.
#### Memory Segments and Relocation
The process of memory allocation in a contiguous write-once memory region can get pretty complicated. Imagine you want to have a regular call stack, with a stack pointer pointing to the top of it and allocation and deallocation of stack frames and local variables happening throughout execution. Because memory is immutable, this cannot be done the usual way; once you allocate a new stack frame that memory is set, it can't be reused for another one later on.
Because of this, memory in Cairo is divided into `segments`. This is just a way of organizing memory more conveniently for this write-once model. Each segment is nothing more than a contiguous memory region. Segments are identified by an `index`, an integer value that uniquely identifies them.
Memory `cells` (i.e. values in memory) are identified by the index of the segment they belong to and an `offset` into said segment. Thus, the memory cell `{2,0}` is the first cell of segment number `2`.
Even though this segment model is extremely convenient for the VM's execution, the STARK prover needs to have the memory as just one contiguous region. Because of this, once execution of a Cairo program finishes, all the memory segments are collapsed into one; this process is called `Relocation`. We will go into more detail on all of this below.
#### Registers
There are only three registers in the Cairo VM:
- The program counter `pc`, which points to the next instruction to be executed.
- The allocation pointer `ap`, pointing to the next unused memory cell.
- The frame pointer `fp`, pointing to the base of the current stack frame. When a new function is called, `fp` is set to the current `ap`. When the function returns, `fp` goes back to its previous value. The VM creates new segments whenever dynamic allocation is needed, so for example the cairo analog to a Rust `Vec` will have its own segment. Relocation at the end meshes everything together.#### Instruction Decoding/Execution
TODO: explain the components of an instruction (`dst_reg`, `op0_reg`, etc), what each one is used for and how they're encoded/decoded.
#### Felts
Felts, or Field Elements, are cairo's basic integer type. Every variable in a cairo vm that is not a pointer is a felt. From our point of view we could say a felt in cairo is an unsigned integer in the range [0, CAIRO_PRIME). This means that all operations are done modulo CAIRO_PRIME. The CAIRO_PRIME is 0x800000000000011000000000000000000000000000000000000000000000001, which means felts can be quite big (up to 252 bits), luckily, we have the [Lambdaworks](https://github.com/lambdaclass/lambdaworks) library to help with handling these big integer values and providing fast and efficient modular arithmetic.
#### Lambdaworks library wrapper
[Lambdaworks](https://github.com/lambdaclass/lambdaworks) is a custom performance-focused library that aims to ease programming for developers. It provides essential mathematical and cryptographic methods required for this project, enabling arithmetic operations between `felts` and type conversions efficiently.
We've developed a C wrapper to expose the library's functions and enable easy usage from Go. This allows seamless integration of the library's features within Go projects, enhancing performance and functionality.#### More on memory
The cairo memory is made up of contiguous segments of variable length identified by their index. The first segment (index 0) is the program segment, which stores the instructions of a cairo program. The following segment (index 1) is the execution segment, which holds the values that are created along the execution of the vm, for example, when we call a function, a pointer to the next instruction after the call instruction will be stored in the execution segment which will then be used to find the next instruction after the function returns. The following group of segments are the builtin segments, one for each builtin used by the program, and which hold values used by the builtin runners. The last group of segments are the user segments, which represent data structures created by the user, for example, when creating an array on a cairo program, that array will be represented in memory as its own segment.
An address (or pointer) in cairo is represented as a `relocatable` value, which is made up of a `segment_index` and an `offset`, the `segment_index` tells us which segment the value is stored in and the `offset` tells us how many values exist between the start of the segment and the value.
As the cairo memory can hold both felts and pointers, the basic memory unit is a `maybe_relocatable`, a variable that can be either a `relocatable` or a `felt`.
While memory is continous, some gaps may be present. These gaps can be created on purpose by the user, for example by running:
```text
[ap + 1] = 2;
```Where a gap is created at ap. But they may also be created indireclty by diverging branches, as for example one branch may declare a variable that the other branch doesn't, as memory needs to be allocated for both cases if the second case is ran then a gap is left where the variable should have been written.
##### Memory API
The memory can perform the following basic operations:
- `memory_add_segment`: Creates a new, empty segment in memory and returns a pointer to its start. Values cannot be inserted into a memory segment that hasn't been previously created.
- `memory_insert`: Inserts a `maybe_relocatable` value at an address indicated by a `relocatable` pointer. For this operation to succeed, the pointer's segment_index must be an existing segment (created using `memory_add_segment`), and there mustn't be a value stored at that address, as the memory is immutable after its been written once. If there is a value already stored at that address but it is equal to the value to be inserted then the operation will be successful.
- `memory_get`: Fetches a `maybe_relocatable` value from a memory address indicated by a `relocatable` pointer.
Other operations:
- `memory_load_data`: This is a convenience method, which takes an array of `maybe_relocatable` and inserts them contiguosuly in memory by calling `memory_insert` and advancing the pointer by one after each insertion. Returns a pointer to the next free memory slot after the inserted data.
##### Memory Relocation
During execution, the memory consists of segments of varying length, and they can be accessed by indicating their segment index, and the offset within that segment. When the run is finished, a relocation process takes place, which transforms this segmented memory into a contiguous list of values. The relocation process works as follows:
1. The size of each segment is calculated (The size is equal to the highest offset within the segment + 1, and not the amount of `maybe_relocatable` values, as there can be gaps)
2. A base is assigned to each segment by accumulating the size of the previous segment. The first segment's base is set to 1.
3. All `relocatable` values are converted into a single integer by adding their `offset` value to their segment's base calculated in the previous stepFor example, if we have this memory represented by address, value pairs:
```text
0:0 -> 1
0:1 -> 4
0:2 -> 7
1:0 -> 8
1:1 -> 0:2
1:4 -> 0:1
2:0 -> 1
```Step 1: Calculate segment sizes:
```text
0 --(has size)--> 3
1 --(has size)--> 5
2 --(has size)--> 1
```Step 2: Assign a base to each segment:
```text
0 --(has base value)--> 1
1 --(has base value)--> 4 (that is: 1 + 3)
2 --(has base value)--> 9 (that is: 4 + 5)
```Step 3: Convert relocatables to integers
```text
1 (base[0] + 0) -> 1
2 (base[0] + 1) -> 4
3 (base[0] + 2) -> 7
4 (base[1] + 0) -> 8
5 (base[1] + 1) -> 3 (that is: base[0] + 2)
.... (memory gaps)
8 (base[1] + 4) -> 2 (that is: base[0] + 1)
9 (base[2] + 0) -> 1
```#### Program parsing
The input of the Virtual Machine is a compiled Cairo program in Json format. The main part of the file are listed below:
- **data:** List of hexadecimal values that represent the instructions and immediate values defined in the cairo program. Each hexadecimal value is stored as a maybe_relocatable element in memory, but they can only be felts because the decoder has to be able to get the instruction fields in its bit representation.
- **debug_info:** This field provides information about the instructions defined in the data list. Each one is identified with its index inside the data list. For each one it contains information about the cairo variables in scope, the hints executed before that instruction if any, and its location inside the cairo program.
- **hints:** All the hints used in the program, ordered by the pc offset at which they should be executed.
- **identifiers:** User-defined symbols in the Cairo code representing variables, functions, classes, etc. with unique names. The expected offset, type and its corresponding information is provided for each identifierFor example, the identifier representing the main function (usually the entrypoint of the program) is of `function` type, and a list of decorators wrappers (if there are any) are provided as additional information.
Another example is a user defined struct, is of `struct` type, it provides its size, the members it contains (with its information) and more.- **main_scope:** Usually something like `__main__`. All the identifiers associated with main function will be identified as `__main__`.identifier_name. Useful to identify the entrypoint of the program.
- **prime:** The cairo prime in hexadecimal format. As explained above, all arithmetic operations are done over a base field, modulo this primer number.
- **reference_manager:** Contains information about cairo variables. This information is useful to access to variables when executing cairo hints.### Code walkthrough/Write your own Cairo VM
Let's begin by creating the basic types and structures for our VM:
#### Felt
As anyone who has ever written a cairo program will know, everything in cairo is a Felt. We can think of it as our unsigned integer. In this project, we use the `Lambdaworks` library to abstract ourselves from modular arithmetic.
TODO: Instructions on how to use Lambdaworks felt from Go
#### Relocatable
This is how cairo represents pointers, they are made up of `SegmentIndex`, which segment the variable is in, and `Offset`, how many values exist between the start of a segment and the variable. We represent them like this:
```go
type Relocatable struct {
SegmentIndex int
Offset uint
}
```#### MaybeRelocatable
As the cairo memory can hold both felts and relocatables, we need a data type that can represent both in order to represent a basic memory unit.
We would normally use enums or unions to represent this type, but as go lacks both, we will instead hold a non-typed inner value and rely on the api to make sure we can only create MaybeRelocatable values with either Felt or Relocatable as inner type.```go
type MaybeRelocatable struct {
inner any
}// Creates a new MaybeRelocatable with an Int inner value
func NewMaybeRelocatableInt(felt uint) *MaybeRelocatable {
return &MaybeRelocatable{inner: Int{felt}}
}// Creates a new MaybeRelocatable with a Relocatable inner value
func NewMaybeRelocatableRelocatable(relocatable Relocatable) *MaybeRelocatable {
return &MaybeRelocatable{inner: relocatable}
}
```We will also add some methods that will allow us access `MaybeRelocatable` inner values:
```go
// If m is Int, returns the inner value + true, if not, returns zero + false
func (m *MaybeRelocatable) GetInt() (Int, bool) {
int, is_type := m.inner.(Int)
return int, is_type
}// If m is Relocatable, returns the inner value + true, if not, returns zero + false
func (m *MaybeRelocatable) GetRelocatable() (Relocatable, bool) {
rel, is_type := m.inner.(Relocatable)
return rel, is_type
}
```These will allow us to safely discern between `Felt` and `Relocatable` values later on.
#### MaybeRelocatable Operations
Introducing the MaybeRelocatable type means we will have to handle various arithmetic operations between Relocatable, Felt and MaybeRelocatable types.
We will start by implementing Add and Sub operations for the `Relocatable` type:##### Relocatable.Add
Addition between Relocatable values is not supported, so we don't implement it.
##### Relocatable.AddFelt
This method adds a Felt value to the relocatable's offset by first converting the relocatable's offset to a Felt, performing felt addition between the offset and the felt value, and then converting the new offset to a uint value. This method returns an error if the new offset exceeds the size of a uint.
```go
// Adds a Felt value to a Relocatable
// Fails if the new offset exceeds the size of a uint
func (r *Relocatable) AddFelt(other lambdaworks.Felt) (Relocatable, error) {
new_offset_felt := lambdaworks.FeltFromUint64(uint64(r.Offset)).Add(other)
new_offset, err := new_offset_felt.ToU64()
if err != nil {
return *r, err
}
return NewRelocatable(r.SegmentIndex, uint(new_offset)), nil
}
```##### Relocatable.Sub
This method returns the distance between two relocatable values. It can only be performed between to relocatables of the same segment (aka relocatables with the same segment index), and it returns the difference between their offsets as a uint value. It fails if the segment indexes differ or if the difference would yield a negative value
```go
// Returns the distance between two relocatable values (aka the difference between their offsets)
// Fails if they have different segment indexes or if the difference is negative
func (r *Relocatable) Sub(other Relocatable) (uint, error) {
if r.SegmentIndex != other.SegmentIndex {
return 0, errors.New("Cant subtract two relocatables with different segment indexes")
}
if r.Offset < other.Offset {
return 0, errors.New("Relocatable subtraction yields relocatable with negative offset")
}
return r.Offset - other.Offset, nil
}
```##### Relocatable.SubFelt
This method subtracts a Felt value to the relocatable's offset by first converting the relocatable's offset to a Felt, performing felt subtraction between the offset and the felt value, and then converting the new offset to a uint value. This method returns an error if the new offset is negative or exceeds the size of a uint.
```go
// Substracts a Felt value from a Relocatable
// Performs the initial substraction considering the offset as a Felt
// Fails if the new offset exceeds the size of a uint
func (r *Relocatable) SubFelt(other lambdaworks.Felt) (Relocatable, error) {
new_offset_felt := lambdaworks.FeltFromUint64(uint64(r.Offset)).Sub(other)
new_offset, err := new_offset_felt.ToU64()
if err != nil {
return *r, err
}
return NewRelocatable(r.SegmentIndex, uint(new_offset)), nil
}
```Now lets look at the operations between `MaybeRelocatable`s:
##### MaybeRelocatable.Add
There are four different cases to consider when adding two `MaybeRelocatable` values:
1. Both values are `Felt`: We perform felt addition
2. Both values are `Relocatable`: This operation is not supported so we return an error
3. First value is `Felt` and other in `Relocatable`: This operation is not supported so we return an error
4. First value is `Relocatable` and other is `Felt`: We call `Relocatable.AddFelt````go
func (m MaybeRelocatable) Add(other MaybeRelocatable) (MaybeRelocatable, error) {
// check if they are felt
m_int, m_is_int := m.GetFelt()
other_int, other_is_int := other.GetFelt()if m_is_int && other_is_int {
result := NewMaybeRelocatableFelt(m_int.Add(other_int))
return *result, nil
}// check if one is relocatable and the other int
m_rel, is_rel_m := m.GetRelocatable()
other_rel, is_rel_other := other.GetRelocatable()if is_rel_m && !is_rel_other {
other_felt, _ := other.GetFelt()
relocatable, err := m_rel.AddFelt(other_felt)
if err != nil {
return *NewMaybeRelocatableFelt(lambdaworks.FeltZero()), err
}
return *NewMaybeRelocatableRelocatable(relocatable), nil} else if !is_rel_m && is_rel_other {
m_felt, _ := m.GetFelt()
relocatable, err := other_rel.AddFelt(m_felt)
if err != nil {
return *NewMaybeRelocatableFelt(lambdaworks.FeltZero()), err
}
return *NewMaybeRelocatableRelocatable(relocatable), nil
} else {
return *NewMaybeRelocatableFelt(lambdaworks.FeltZero()), errors.New("RelocatableAdd")
}
}
```##### MaybeRelocatable.Sub
There are four different cases to consider when adding two `MaybeRelocatable` values:
1. Both values are `Felt`: We perform felt subtraction
2. Both values are `Relocatable`: We call `Relocatable.Sub`
3. First value is `Felt` and other in `Relocatable`: This operation is not supported so we return an error
4. First value is `Relocatable` and other is `Felt`: We call `Relocatable.SubFelt````go
func (m MaybeRelocatable) Sub(other MaybeRelocatable) (MaybeRelocatable, error) {
// check if they are felt
m_int, m_is_int := m.GetFelt()
other_felt, other_is_felt := other.GetFelt()if m_is_int && other_is_felt {
result := NewMaybeRelocatableFelt(m_int.Sub(other_felt))
return *result, nil
}// check if one is relocatable and the other int
m_rel, is_rel_m := m.GetRelocatable()
other_rel, is_rel_other := other.GetRelocatable()if is_rel_m && !is_rel_other {
relocatable, err := m_rel.SubFelt(other_felt)
if err != nil {
return *NewMaybeRelocatableFelt(lambdaworks.FeltZero()), err
}
return *NewMaybeRelocatableRelocatable(relocatable), nil} else if is_rel_m && is_rel_other {
offset_diff, err := m_rel.Sub(other_rel)
if err != nil {
return *NewMaybeRelocatableFelt(lambdaworks.FeltZero()), err
}
return *NewMaybeRelocatableFelt(lambdaworks.FeltFromUint64(uint64(offset_diff))), nil
} else {
return *NewMaybeRelocatableFelt(lambdaworks.FeltZero()), errors.New("Cant sub Relocatable from Felt")
}
}
```#### Memory
As we previously described, the memory is made up of a series of segments of variable length, each containing a continuous sequence of `MaybeRelocatable` elements. Memory is also immutable, which means that once we have written a value into memory, it can't be changed.
There are multiple valid ways to represent this memory structure, but the simplest way to represent it is by using a map, maping a `Relocatable` address to a `MaybeRelocatable` value.
As we don't have an actual representation of segments, we have to keep track of the number of segments.```go
type Memory struct {
data map[Relocatable]MaybeRelocatable
num_segments uint
}
```Now we can define the basic memory operations:
##### Insert
Here we need to make perform some checks to make sure that the memory remains consistent with its rules:
- We must check that insertions are performed on previously-allocated segments, by checking that the address's segment_index is lower than our segment counter
- We must check that we are not mutating memory we have previously written, by checking that the memory doesn't already contain a value at that address that is not equal to the one we are inserting```go
func (m *Memory) Insert(addr Relocatable, val *MaybeRelocatable) error {
// Check that insertions are preformed within the memory bounds
if addr.segmentIndex >= int(m.num_segments) {
return errors.New("Error: Inserting into a non allocated segment")
}// Check for possible overwrites
prev_elem, ok := m.data[addr]
if ok && prev_elem != *val {
return errors.New("Memory is write-once, cannot overwrite memory value")
}m.data[addr] = *val
return nil
}
```##### Get
This is the easiest operation, as we only need to fetch the value from our map:
```go
// Gets some value stored in the memory address `addr`.
func (m *Memory) Get(addr Relocatable) (*MaybeRelocatable, error) {
value, ok := m.data[addr]if !ok {
return nil, errors.New("Memory Get: Value not found")
}return &value, nil
}
```#### MemorySegmentManager
In our `Memory` implementation, it looks like we need to have segments allocated before performing any valid memory operation, but we can't do so from the `Memory` api. To do so, we need to use the `MemorySegmentManager`.
The `MemorySegmentManager` is in charge of creating new segments and calculating their size during the relocation process, it has the following structure:```go
type MemorySegmentManager struct {
segmentSizes map[uint]uint
Memory Memory
}
```And the following methods:
##### Add Segment
As we are using a map, we dont have to allocate memory for the new segment, so we only have to raise our segment counter and return the first address of the new segment:
```go
func (m *MemorySegmentManager) AddSegment() Relocatable {
ptr := Relocatable{int(m.Memory.num_segments), 0}
m.Memory.num_segments += 1
return ptr
}
```##### Load Data
This method inserts a contiguous array of values starting from a certain addres in memory, and returns the next address after the inserted values. This is useful when inserting the program's instructions in memory.
In order to perform this operation, we only need to iterate over the array, inserting each value at the address indicated by `ptr` while advancing the ptr with each iteration and then return the final ptr.```go
func (m *MemorySegmentManager) LoadData(ptr Relocatable, data *[]MaybeRelocatable) (Relocatable, error) {
for _, val := range *data {
err := m.Memory.Insert(ptr, &val)
if err != nil {
return Relocatable{0, 0}, err
}
ptr.offset += 1
}
return ptr, nil
}
```#### RunContext
The RunContext keeps track of the vm's registers. Cairo VM only has 3 registers:
- The program counter `Pc`, which points to the next instruction to be executed.
- The allocation pointer `Ap`, pointing to the next unused memory cell.
- The frame pointer `Fp`, pointing to the base of the current stack frame. When a new function is called, `Fp` is set to the current `Ap` value. When the function returns, `Fp` goes back to its previous value.We can represent it like this:
```go
type RunContext struct {
Pc memory.Relocatable
Ap memory.Relocatable
Fp memory.Relocatable
}
```#### VirtualMachine
With all of these types and structures defined, we can build our VM:
```go
type VirtualMachine struct {
RunContext RunContext
currentStep uint
Segments memory.MemorySegmentManager
}
```To begin coding the basic execution functionality of our VM, we only need these basic fields, we will be adding more fields as we dive deeper into this guide.
#### Instruction Decoding and Execution
Cairo program execution is divided into steps, and in turn each step is divided into:
1. Instruction decoding
2. Instruction execution##### Step
This method is the organizer of the execution of each instruction, it orchestrates them and handles the possible errors.
The first thing it does is to obtain the instruction we want to run, it does that by getting the value on memory where the current pc is pointing. We know that the instruction has to be a felt, if it is not, then there is an error with the encoding of the instruction.
Once we retrieve the felt we have the `encoded instruction`, we need to decode it to get the fields from its bits representation. Felt is not useful anymore so we will get its integer representation.
Now it's time to decode the instruction and then run the `decoded instruction`.```go
func (v *VirtualMachine) Step() error {
encoded_instruction, err := v.Segments.Memory.Get(v.RunContext.Pc)
if err != nil {
return fmt.Errorf("Failed to fetch instruction at %+v", v.RunContext.Pc)
}encoded_instruction_felt, ok := encoded_instruction.GetFelt()
if !ok {
return errors.New("Wrong instruction encoding")
}encoded_instruction_uint, err := encoded_instruction_felt.ToU64()
if err != nil {
return err
}instruction, err := DecodeInstruction(encoded_instruction_uint)
if err != nil {
return err
}return v.RunInstruction(&instruction)
}
```##### Decode instruction
```go
// Structure of the 63-bit that form the first word of each instruction.
// See Cairo whitepaper, page 32 - https://eprint.iacr.org/2021/1063.pdf.
// ┌─────────────────────────────────────────────────────────────────────────┐
// │ off_dst (biased representation) │
// ├─────────────────────────────────────────────────────────────────────────┤
// │ off_op0 (biased representation) │
// ├─────────────────────────────────────────────────────────────────────────┤
// │ off_op1 (biased representation) │
// ├─────┬─────┬───────┬───────┬───────────┬────────┬───────────────────┬────┤
// │ dst │ op0 │ op1 │ res │ pc │ ap │ opcode │ 0 │
// │ reg │ reg │ src │ logic │ update │ update │ │ │
// ├─────┼─────┼───┬───┼───┬───┼───┬───┬───┼───┬────┼────┬────┬────┬────┼────┤
// │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │ 10 │ 11 │ 12 │ 13 │ 14 │ 15 │
// └─────┴─────┴───┴───┴───┴───┴───┴───┴───┴───┴────┴────┴────┴────┴────┴────┘
```As we can see in the chart above, all the information we need is present on the bits representation of the instruction. The first thing to do is create a structure that stores it.
```go
type Instruction struct {
Off0 int
Off1 int
Off2 int
DstReg Register
Op0Reg Register
Op1Addr Op1Src
ResLogic ResLogic
PcUpdate PcUpdate
ApUpdate ApUpdate
FpUpdate FpUpdate
Opcode Opcode
}
```And the good thing about this is that every combination of bits for each field is known, so we can code all the possible flags to work with. These flags are represented below.
The off0, off1, and off2 values are used to compute the address of the dst, op0 and op1 respectively.
For example, if the DstReg is AP and Off0 is -1, then we can compute the dst address by substracting one from the current value of ap.There's two possible registers, ap and fp. The ap register (address pointer register) keeps track of memory addresses for data access. The fp register (frame pointer register) manages function call stack frames, local variables and parameter access.
```go
type Register uintconst (
AP Register = 0
FP Register = 1
)
```The Op1Src constants define sources for an operation, including immediate values, registers (ap, fp), and an operation result.
```go
type Op1Src uintconst (
Op1SrcImm Op1Src = 0
Op1SrcAP Op1Src = 1
Op1SrcFP Op1Src = 2
Op1SrcOp0 Op1Src = 4
)
```
The ResLogic constants represent different types of results in a program, including operation results, addition, multiplication, and unconstrained values.```go
type ResLogic uintconst (
ResOp1 ResLogic = 0
ResAdd ResLogic = 1
ResMul ResLogic = 2
ResUnconstrained ResLogic = 3
)
```The PcUpdate constants define different ways to update the program counter, including regular updates, jumps, relative jumps, and conditional jumps (jump if not zero).
```go
type PcUpdate uintconst (
PcUpdateRegular PcUpdate = 0
PcUpdateJump PcUpdate = 1
PcUpdateJumpRel PcUpdate = 2
PcUpdateJnz PcUpdate = 3
)
```The ApUpdate constants represent various ways of updating an address pointer, including regular updates, and different addition types.
```go
type ApUpdate uintconst (
ApUpdateRegular ApUpdate = 0
ApUpdateAdd ApUpdate = 1
ApUpdateAdd1 ApUpdate = 2
ApUpdateAdd2 ApUpdate = 3
)
```The FpUpdate constants define different ways of updating the frame pointer, including regular updates, addition with a specific offset, and destination updates.
```go
type FpUpdate uintconst (
FpUpdateRegular FpUpdate = 0
FpUpdateAPPlus2 FpUpdate = 1
FpUpdateDst FpUpdate = 2
)
```The Opcode constants represent different types of operations or instructions, including no operation, assertion checks, function calls, and returns.
```go
type Opcode uintconst (
NOp Opcode = 0
AssertEq Opcode = 1
Call Opcode = 2
Ret Opcode = 4
)
```Now, once everything is set up, we only have to retrive each field by getting its representative bits. We do that, creating different bitmasks to just get the value.
1. Constants and Masks:
The method starts by defining constants and masks for various fields and properties of the instruction. These constants are used to extract specific bits from the encoded instruction and decode them into meaningful values. For instance, the HighBit constant represents the highest bit (bit 63) of the instruction, and various other masks are defined to extract different fields like destination register, opcode, update types, etc.```go
func DecodeInstruction(encodedInstruction uint64) (Instruction, error) {
const HighBit uint64 = 1 << 63
const DstRegMask uint64 = 0x0001
const DstRegOff uint64 = 0
const Op0RegMask uint64 = 0x0002
const Op0RegOff uint64 = 1
const Op1SrcMask uint64 = 0x001C
const Op1SrcOff uint64 = 2
const ResLogicMask uint64 = 0x0060
const ResLogicOff uint64 = 5
const PcUpdateMask uint64 = 0x0380
const PcUpdateOff uint64 = 7
const ApUpdateMask uint64 = 0x0C00
const ApUpdateOff uint64 = 10
const OpcodeMask uint64 = 0x7000
const OpcodeOff uint64 = 12
```2. Checking High Bit:
The first check in the method is whether the highest bit (bit 63) of the encoded instruction is set to zero. If it's not zero, this indicates an error, and the function returns an ErrNonZeroHighBitError.```go
if encodedInstruction&HighBit != 0 {
return Instruction{}, ErrNonZeroHighBitError
}
```3. Extracting Offsets:
The method extracts three offsets from the encoded instruction. These offsets represent memory addresses used in the instruction. They are extracted using bitwise operations and masks to get the lower 16 bits of three different sections of the instruction.```go
var offset0 = fromBiasedRepresentation((encodedInstruction) & 0xFFFF)
var offset1 = fromBiasedRepresentation((encodedInstruction >> 16) & 0xFFFF)
var offset2 = fromBiasedRepresentation((encodedInstruction >> 32) & 0xFFFF)-----------------------------------------------------------------------------------------
func fromBiasedRepresentation(offset uint64) int {
var bias uint16 = 1 << 15
return int(int16(uint16(offset) - bias))
}
```4. Extracting Flags:
The next step is to extract the flag section of the encoded instruction, which holds information about registers, sources, updates, and opcodes. This flag section is extracted using a bit shift to the right by 48 positions (which discards the lower 48 bits).
```go
var flags = encodedInstruction >> 48
```5. Decoding Fields:
Using the extracted flag section, the method decodes various fields like destination register, op0 register, op1 source, result logic, pc update, ap update, and opcode. These fields are decoded by extracting specific bits from the flag section and mapping them to their corresponding enum values.```go
var dstRegNum = (flags & DstRegMask) >> DstRegOff
var op0RegNum = (flags & Op0RegMask) >> Op0RegOff
var op1SrcNum = (flags & Op1SrcMask) >> Op1SrcOff
var resLogicNum = (flags & ResLogicMask) >> ResLogicOff
var pcUpdateNum = (flags & PcUpdateMask) >> PcUpdateOff
var apUpdateNum = (flags & ApUpdateMask) >> ApUpdateOff
var opCodeNum = (flags & OpcodeMask) >> OpcodeOffvar dstRegister Register
var op0Register Register
var op1Src Op1Src
var pcUpdate PcUpdate
var res ResLogic
var opcode Opcode
var apUpdate ApUpdate
var fpUpdate FpUpdateif dstRegNum == 1 {
dstRegister = FP
} else {
dstRegister = AP
}if op0RegNum == 1 {
op0Register = FP
} else {
op0Register = AP
}switch op1SrcNum {
case 0:
op1Src = Op1SrcOp0
case 1:
op1Src = Op1SrcImm
case 2:
op1Src = Op1SrcFP
case 4:
op1Src = Op1SrcAP
default:
return Instruction{}, ErrInvalidOp1RegError
}switch pcUpdateNum {
case 0:
pcUpdate = PcUpdateRegular
case 1:
pcUpdate = PcUpdateJump
case 2:
pcUpdate = PcUpdateJumpRel
case 4:
pcUpdate = PcUpdateJnz
default:
return Instruction{}, ErrInvalidPcUpdateError
}switch resLogicNum {
case 0:
if pcUpdate == PcUpdateJnz {
res = ResUnconstrained
} else {
res = ResOp1
}
case 1:
res = ResAdd
case 2:
res = ResMul
default:
return Instruction{}, ErrInvalidResError
}switch opCodeNum {
case 0:
opcode = NOp
case 1:
opcode = Call
case 2:
opcode = Ret
case 4:
opcode = AssertEq
default:
return Instruction{}, ErrInvalidOpcodeError
}switch apUpdateNum {
case 0:
if opcode == Call {
apUpdate = ApUpdateAdd2
} else {
apUpdate = ApUpdateRegular
}
case 1:
apUpdate = ApUpdateAdd
case 2:
apUpdate = ApUpdateAdd1
default:
return Instruction{}, ErrInvalidApUpdateError
}switch opcode {
case Call:
fpUpdate = FpUpdateAPPlus2
case Ret:
fpUpdate = FpUpdateDst
default:
fpUpdate = FpUpdateRegular
}
```
6. Creating the Instruction and Returning the Result:
With all the necessary information extracted and decoded, the method constructs an Instruction object by assigning the decoded values to its fields and returns the created Instruction object along with a nil error if the decoding process is successful.```go
return Instruction{
Off0: offset0,
Off1: offset1,
Off2: offset2,
DstReg: dstRegister,
Op0Reg: op0Register,
Op1Addr: op1Src,
ResLogic: res,
PcUpdate: pcUpdate,
ApUpdate: apUpdate,
FpUpdate: fpUpdate,
Opcode: opcode,
}, nil
```7. Error Handling:
If at any point during the decoding process, an unexpected value is encountered or the input doesn't conform to the expected pattern, the method returns an appropriate error. These errors include cases like invalid op1 register, invalid pc update, invalid result logic, invalid opcode, etc.```go
var ErrNonZeroHighBitError = errors.New("Instruction high bit was not set to zero")
var ErrInvalidOp1RegError = errors.New("Instruction had invalid Op1 Register")
var ErrInvalidPcUpdateError = errors.New("Instruction had invalid Pc update")
var ErrInvalidResError = errors.New("Instruction had an invalid res")
var ErrInvalidOpcodeError = errors.New("Instruction had an invalid opcode")
var ErrInvalidApUpdateError = errors.New("Instruction had an invalid Ap Update")
```##### Run instruction
At this point, we have all the information we need from the instruction. Let's run it!
There are 5 steps to run an instruction, they will be explained in detail later.
1. Compute the operands of the instruction
2. Assert the correctness of the operands
3. Add the context's register state to the trace
4. Update registers
5. Add one to the current step```go
func (v *VirtualMachine) RunInstruction(instruction *Instruction) error {
operands, err := v.ComputeOperands(*instruction)
if err != nil {
return err
}err = v.OpcodeAssertions(*instruction, operands)
if err != nil {
return err
}v.Trace = append(v.Trace, TraceEntry{Pc: v.RunContext.Pc, Ap: v.RunContext.Ap, Fp: v.RunContext.Fp})
err = v.UpdateRegisters(instruction, &operands)
if err != nil {
return err
}v.CurrentStep++
return nil
}
```#### Compute operands
Once the instruction has been decoded, it is executed by `RunInstruction` whose first function is to compute operands. This function is in charge of
calculating the addresses of the operands and fetching them from memory. If the function could not fetch the operands then they are deduced from the other operands,
taking in consideration what kind of opcode is being executed.```go
func (vm *VirtualMachine) ComputeOperands(instruction Instruction) (Operands, error) {
var res *memory.MaybeRelocatabledst_addr, err := vm.RunContext.ComputeDstAddr(instruction)
if err != nil {
return Operands{}, errors.New("FailedToComputeDstAddr")
}
dst, _ := vm.Segments.Memory.Get(dst_addr)op0_addr, err := vm.RunContext.ComputeOp0Addr(instruction)
if err != nil {
return Operands{}, fmt.Errorf("FailedToComputeOp0Addr: %s", err)
}
op0, _ := vm.Segments.Memory.Get(op0_addr)op1_addr, err := vm.RunContext.ComputeOp1Addr(instruction, op0)
if err != nil {
return Operands{}, fmt.Errorf("FailedToComputeOp1Addr: %s", err)
}
op1, _ := vm.Segments.Memory.Get(op1_addr)if op0 == nil {
deducedOp0, deducedRes, err := vm.DeduceOp0(&instruction, dst, op1)
if err != nil {
return Operands{}, err
}
op0 = deducedOp0
if op0 != nil {
vm.Segments.Memory.Insert(op0_addr, op0)
}
res = deducedRes
}if op1 == nil {
deducedOp1, deducedRes, err := vm.DeduceOp1(instruction, dst, op0)
if err != nil {
return Operands{}, err
}
op1 = deducedOp1
if op1 != nil {
vm.Segments.Memory.Insert(op1_addr, op1)
}
if res == nil {
res = deducedRes
}
}if res == nil {
res, err = vm.ComputeRes(instruction, *op0, *op1)if err != nil {
return Operands{}, err
}
}if dst == nil {
deducedDst := vm.DeduceDst(instruction, res)
dst = deducedDst
if dst != nil {
vm.Segments.Memory.Insert(dst_addr, dst)
}
}operands := Operands{
Dst: *dst,
Op0: *op0,
Op1: *op1,
Res: res,
}
return operands, nil
}
```##### ComputeDstAddr
The method `ComputeDstAddr` computes the address of the value that will be stored in the Destination (dst) operand. It checks which register its is relative to (wether ap or fp) and gets the direction by adding the instruction's first offset(off0) to the corresponding register.
```go
func (run_context RunContext) ComputeDstAddr(instruction Instruction) (memory.Relocatable, error) {
var base_addr memory.Relocatable
switch instruction.DstReg {
case AP:
base_addr = run_context.Ap
case FP:
base_addr = run_context.Fp
}if instruction.Off0 < 0 {
return base_addr.SubUint(uint(math.Abs(float64(instruction.Off0))))
} else {
return base_addr.AddUint(uint(instruction.Off0))
}}
```##### ComputeOp0Addr
This method is similar to `ComputeDstAddr` but it uses the instruction second offset (off1) to add to the selected register (ap or fp)
```go
func (run_context RunContext) ComputeOp0Addr(instruction Instruction) (memory.Relocatable, error) {
var base_addr memory.Relocatable
switch instruction.Op0Reg {
case AP:
base_addr = run_context.Ap
case FP:
base_addr = run_context.Fp
}if instruction.Off1 < 0 {
return base_addr.SubUint(uint(math.Abs(float64(instruction.Off1))))
} else {
return base_addr.AddUint(uint(instruction.Off1))
}
}```
##### ComputeOp1Addr
It computes the address of `Op1` based on the `Op0` operand and the kind of Address the instruction has for `Op1`.
- If its address is `Op1SrcFp` it calculates the direction from Fp register.
- if it is `Op1SrcAp` then if calculates it if from Ap register.
- If it is an immediate then checks if the offset 2 is 1 and calculates it from the `Pc`.
- If it is an `Op1SrcOp0` it checks the `Op0` and calculates the direction from it.Then it performs and addition or a substraction if the `Off2` is negative or positive.
```go
func (run_context RunContext) ComputeOp1Addr(instruction Instruction, op0 *memory.MaybeRelocatable) (memory.Relocatable, error) {
var base_addr memory.Relocatableswitch instruction.Op1Addr {
case Op1SrcFP:
base_addr = run_context.Fp
case Op1SrcAP:
base_addr = run_context.Ap
case Op1SrcImm:
if instruction.Off2 == 1 {
base_addr = run_context.Pc
} else {
base_addr = memory.NewRelocatable(0, 0)
return memory.Relocatable{}, &VirtualMachineError{Msg: "UnknownOp0"}
}
case Op1SrcOp0:
if op0 == nil {
return memory.Relocatable{}, errors.New("Unknown Op0")
}
rel, is_rel := op0.GetRelocatable()
if is_rel {
base_addr = rel
} else {
return memory.Relocatable{}, errors.New("AddressNotRelocatable")
}
}if instruction.Off2 < 0 {
return base_addr.SubUint(uint(math.Abs(float64(instruction.Off2))))
} else {
return base_addr.AddUint(uint(instruction.Off2))
}
}
```##### DeduceOp0
The method deduces the value of `Op0` if possible (based on `dst` and `Op1`).
If Instruction's opcode is a `Call` `Op0` is deduced by adding the instruction size to the program counter.- If it is an `AssertEq` then a second switch case is used to check what `ResLogic` is.
- If it is `ResAdd` `Op0` is deduced from the substraction of `Op1` from `Dst`, if is is `Resmul` the `Op0` is deduced from the division of `Dst` and `Op1` (both felt values).
- Otherwise op0 is nil.The method also deduces `res` by using the value of `dst`
```go
func (vm *VirtualMachine) DeduceOp0(instruction *Instruction, dst *memory.MaybeRelocatable, op1 *memory.MaybeRelocatable) (deduced_op0 *memory.MaybeRelocatable, deduced_res *memory.MaybeRelocatable, error error) {
switch instruction.Opcode {
case Call:
deduced_op0 := vm.RunContext.Pc
deduced_op0.Offset += instruction.Size()
return memory.NewMaybeRelocatableRelocatable(deduced_op0), nil, nil
case AssertEq:
switch instruction.ResLogic {
case ResAdd:
if dst != nil && op1 != nil {
deduced_op0, err := dst.Sub(*op1)
if err != nil {
return nil, nil, err
}
return &deduced_op0, dst, nil
}
case ResMul:
if dst != nil && op1 != nil {
dst_felt, dst_is_felt := dst.GetFelt()
op1_felt, op1_is_felt := op1.GetFelt()
if dst_is_felt && op1_is_felt && !op1_felt.IsZero() {
return memory.NewMaybeRelocatableFelt(dst_felt.Div(op1_felt)), dst, nil}
}
}
}
return nil, nil, nil
}
```##### DeduceOp1
The method deduces the value of `Op1` if possible (based on `dst` and `Op0`) it also deduces `res` if possible.
- If the instruction opcode is `AssertEq` a switch case is used to check what the `ResLogic` is.
- If it is a `ResOp1` then the value of op1 is equal to the dst operand.
- If `ResLogic` is `ResAdd` op1 is deduced from the substraction of `op0` from `dst`.
- If it is `ResMul` `op1` is deduced from the division of `dst` by `op0`,In all the cases `res` is equal to `dst`. if none of the former cases apply then nil is returned.
```go
func (vm *VirtualMachine) DeduceOp1(instruction Instruction, dst *memory.MaybeRelocatable, op0 *memory.MaybeRelocatable) (*memory.MaybeRelocatable, *memory.MaybeRelocatable, error) {
if instruction.Opcode == AssertEq {
switch instruction.ResLogic {
case ResOp1:
return dst, dst, nil
case ResAdd:
if op0 != nil && dst != nil {
dst_rel, err := dst.Sub(*op0)
if err != nil {
return nil, nil, err
}
return &dst_rel, dst, nil
}
case ResMul:
dst_felt, dst_is_felt := dst.GetFelt()
op0_felt, op0_is_felt := op0.GetFelt()
if dst_is_felt && op0_is_felt && !op0_felt.IsZero() {
res := memory.NewMaybeRelocatableFelt(dst_felt.Div(op0_felt))
return res, dst, nil
}
}
}
return nil, nil, nil
}```
##### ComputeRes
If the Res value has not been deduced in the previous steps then it is computed based on the `Op0` and `Op1` values.
- If `ResLogic` is `ResOp1` then `res` is equal to `op1`.
- If it is `ResAdd` then `res` is deduced from the addition of `op0` and `op1`.
- If it is `ResMul` `res` is deduced from the multiplication of `op0` and `op1`.
- Otherwise `res` is nil.```go
func (vm *VirtualMachine) ComputeRes(instruction Instruction, op0 memory.MaybeRelocatable, op1 memory.MaybeRelocatable) (*memory.MaybeRelocatable, error) {
switch instruction.ResLogic {
case ResOp1:
return &op1, nilcase ResAdd:
maybe_rel, err := op0.Add(op1)
if err != nil {
return nil, err
}
return &maybe_rel, nilcase ResMul:
num_op0, m_type := op0.GetFelt()
num_op1, other_type := op1.GetFelt()
if m_type && other_type {
result := memory.NewMaybeRelocatableFelt(num_op0.Mul(num_op1))
return result, nil
} else {
return nil, errors.New("ComputeResRelocatableMul")
}case ResUnconstrained:
return nil, nil
}
return nil, nil
}
```##### DeduceDst
If the destination value has not been calculated before then it is deduced based on the Res operand. If the opcode is an `AssertEq` then dst is equal res.
If it is a `Call` then its value is taken from the `Fp` register```go
func (vm *VirtualMachine) DeduceDst(instruction Instruction, res *memory.MaybeRelocatable) *memory.MaybeRelocatable {
switch instruction.Opcode {
case AssertEq:
return res
case Call:
return memory.NewMaybeRelocatableRelocatable(vm.RunContext.Fp)}
return nil
}
```#### Opcode assertions
Once we have the instruction's operands to work with, we have to ensure the correctness of them. The first thing we need to differentiate is which type of instruction are we running, we do this by looking at the instruction's opcode.
The posible opcodes we want to perform assertions on are:
1. AssertEq instruction
2. Call instructionIn the first option, we need to ensure the result operand is not null (nil in this case) and also that the result operand is equal to the dst operand. If any of those things fail, we throw an error.
On the other hand, the Call instruction, what we do first is define our return pc register, we do that adding the size of the instruction to the current pc. Then, we check our operand op0 is equal to the return pc and our dst operand is the same as the return fp register. If any of those things fail, we throw an error.
If this method returns a nil error, it means operands were computed correctly and we are good to go!
```go
func (vm *VirtualMachine) OpcodeAssertions(instruction Instruction, operands Operands) error {
switch instruction.Opcode {
case AssertEq:
if operands.Res == nil {
return &VirtualMachineError{"UnconstrainedResAssertEq"}
}
if !operands.Res.IsEqual(&operands.Dst) {
return &VirtualMachineError{"DiffAssertValues"}
}
case Call:
new_rel, err := vm.RunContext.Pc.AddUint(instruction.Size())
if err != nil {
return err
}
returnPC := memory.NewMaybeRelocatableRelocatable(new_rel)if !operands.Op0.IsEqual(returnPC) {
return &VirtualMachineError{"CantWriteReturnPc"}
}returnFP := vm.RunContext.Fp
dstRelocatable, _ := operands.Dst.GetRelocatable()
if !returnFP.IsEqual(&dstRelocatable) {
return &VirtualMachineError{"CantWriteReturnFp"}
}
}return nil
}
```#### Updating Registers
After we succesfully computed the value of the operands, it's now time to update the value of the registers, we will update each register according to the `PcUpdate`, `ApUpdate` and `FpUpdate` fields of the instruction respectively.
##### UpdatePc
As we already know, the pc (program counter) points to the next instruction in memory. When no jumps take place, the pc is updated to point to the next instruction by adding the instruction size to it. The instruction size is 1 if there is no immediate value, and 2 if there is an immediate value following the instruction.
Cairo also supports 3 different types of jumps. The first one is a regular jump, in which the pc takes the value of the res operand. The next one is a relative jump, in which the pc advances by a number of positions set by the res operand. And the last one is a jump not zero, which performs a relative jump, advancing the number of positions given by op1, if the value of the dst operand is not zero, or performs a regular update if the value of the dst operand is zero. The operand will only be zero if it is a Felt value which is zero, relocatable values are never zero.```go
// Updates the value of PC according to the executed instruction
func (vm *VirtualMachine) UpdatePc(instruction *Instruction, operands *Operands) error {
switch instruction.PcUpdate {
case PcUpdateRegular:
vm.RunContext.Pc.Offset += instruction.Size()
case PcUpdateJump:
if operands.Res == nil {
return errors.New("Res.UNCONSTRAINED cannot be used with PcUpdate.JUMP")
}
res, ok := operands.Res.GetRelocatable()
if !ok {
return errors.New("An integer value as Res cannot be used with PcUpdate.JUMP")
}
vm.RunContext.Pc = res
case PcUpdateJumpRel:
if operands.Res == nil {
return errors.New("Res.UNCONSTRAINED cannot be used with PcUpdate.JUMP_REL")
}
res, ok := operands.Res.GetFelt()
if !ok {
return errors.New("A relocatable value as Res cannot be used with PcUpdate.JUMP_REL")
}
new_pc, err := vm.RunContext.Pc.AddFelt(res)
if err != nil {
return err
}
vm.RunContext.Pc = new_pc
case PcUpdateJnz:
if operands.Dst.IsZero() {
vm.RunContext.Pc.Offset += instruction.Size()
} else {
new_pc, err := vm.RunContext.Pc.AddMaybeRelocatable(operands.Op1)
if err != nil {
return err
}
vm.RunContext.Pc = new_pc
}}
return nil
}
```Some auxiliary methods were added for this method:
###### Instruction.Size
Returns 1 if the instruction has no immediate value or 2 if it has. We can tell that an instruction has an immediate value if the op1 address is given by the immediate value.
```go
func (i *Instruction) Size() uint {
if i.Op1Addr == Op1SrcImm {
return 2
}
return 1
}
```###### MaybeRelocatable.IsZero()
Returns true if the value is a Felt that is zero, returns false otherwise
```go
func (m *MaybeRelocatable) IsZero() bool {
felt, is_int := m.GetFelt()
return is_int && felt.IsZero()
}
```##### UpdateFp
As we already know, the fp (frame pointer) points to the frame of the current function. It can be updated in 4 different ways. A regular fp update means no changes to the fp register. An ap plus 2 update consists on asigning the value of ap to fp and increasing it's offset by two (note: in the code below we only assign the offset, as fp and ap live on the execution segment and therefore have the same segment index). A dst fp update consists in performing either a direct or relative jump based on the value of the dst operand. If dst is a relocatable, fp will take the value of dst, if dst is a felt, fp's offset will be increased by the amount given by dst
```go
// Updates the value of FP according to the executed instruction
func (vm *VirtualMachine) UpdateFp(instruction *Instruction, operands *Operands) error {
switch instruction.FpUpdate {
case FpUpdateAPPlus2:
vm.RunContext.Fp.Offset = vm.RunContext.Ap.Offset + 2
case FpUpdateDst:
rel, ok := operands.Dst.GetRelocatable()
if ok {
vm.RunContext.Fp = rel
} else {
felt, _ := operands.Dst.GetFelt()
new_fp, err := vm.RunContext.Fp.AddFelt(felt)
if err != nil {
return err
}
vm.RunContext.Fp = new_fp
}
}
return nil
}
```##### UpdateAp
And lastly, the ap register points to the next unsused memory cell and has 4 types of update. A regular ap update means no changes to the ap register. An add update consists on advancing the ap register by the amount given by res. And the add1 and add2 updates consist on advancing the op register by 1 and 2 respectively.
```go
// Updates the value of AP according to the executed instruction
func (vm *VirtualMachine) UpdateAp(instruction *Instruction, operands *Operands) error {
switch instruction.ApUpdate {
case ApUpdateAdd:
if operands.Res == nil {
return errors.New("Res.UNCONSTRAINED cannot be used with ApUpdate.ADD")
}
new_ap, err := vm.RunContext.Ap.AddMaybeRelocatable(*operands.Res)
if err != nil {
return err
}
vm.RunContext.Ap = new_ap
case ApUpdateAdd1:
vm.RunContext.Ap.Offset += 1
case ApUpdateAdd2:
vm.RunContext.Ap.Offset += 2
}
return nil
}
```#### CairoRunner
Now that can can execute cairo steps, lets look at the VM's initialization step.
We will begin by creating our `CairoRunner`:```go
type CairoRunner struct {
Program vm.Program
Vm vm.VirtualMachine
ProgramBase memory.Relocatable
executionBase memory.Relocatable
initialPc memory.Relocatable
initialAp memory.Relocatable
initialFp memory.Relocatable
finalPc memory.Relocatable
mainOffset uint
}func NewCairoRunner(program vm.Program) *CairoRunner {
mainIdentifier, ok := (*program.Identifiers)["__main__.main"]
main_offset := uint(0)
if ok {
main_offset = uint(mainIdentifier.PC)
}
return &CairoRunner{Program: program, Vm: *vm.NewVirtualMachine(), mainOffset: main_offset}}
```Now we will create our `Initialize` method step by step:
```go
// Performs the initialization step, returns the end pointer (pc upon which execution should stop)
func (r *CairoRunner) Initialize() (memory.Relocatable, error) {
r.initializeSegments()
end, err := r.initializeMainEntrypoint()
r.initializeVM()
return end, err
}
```##### InitializeSegments
This method will create our program and execution segments
```go
func (r *CairoRunner) initializeSegments() {
// Program Segment
r.ProgramBase = r.Vm.Segments.AddSegment()
// Execution Segment
r.executionBase = r.Vm.Segments.AddSegment()
}
```##### initializeMainEntrypoint
This method will initialize the memory and initial register values to begin execution from the main entrypoint, and return the final pc
```go
func (r *CairoRunner) initializeMainEntrypoint() (memory.Relocatable, error) {
stack := make([]memory.MaybeRelocatable, 0, 2)
return_fp := r.Vm.Segments.AddSegment()
return r.initializeFunctionEntrypoint(r.mainOffset, &stack, return_fp)
}
```##### initializeFunctionEntrypoint
This method will initialize the memory and initial register values to execute a cairo function given its offset within the program segment (aka entrypoint) and return the final pc. In our case, this function will be the main entrypoint, but later on we will be able to use this method to run starknet contract entrypoints.
The stack will then be loaded into the execution segment in the next method. For now, the stack will be empty, but later on it will contain the builtin bases (which are the arguments for the main function), and the function arguments when running a function from a starknet contract.```go
func (r *CairoRunner) initializeFunctionEntrypoint(entrypoint uint, stack *[]memory.MaybeRelocatable, return_fp memory.Relocatable) (memory.Relocatable, error) {
end := r.Vm.Segments.AddSegment()
*stack = append(*stack, *memory.NewMaybeRelocatableRelocatable(end), *memory.NewMaybeRelocatableRelocatable(return_fp))
r.initialFp = r.executionBase
r.initialFp.Offset += uint(len(*stack))
r.initialAp = r.initialFp
r.finalPc = end
return end, r.initializeState(entrypoint, stack)
}
```##### InitializeState
This method will be in charge of loading the program data into the program segment and the stack into the execution segment
```go
func (r *CairoRunner) initializeState(entrypoint uint, stack *[]memory.MaybeRelocatable) error {
r.initialPc = r.ProgramBase
r.initialPc.Offset += entrypoint
// Load program data
_, err := r.Vm.Segments.LoadData(r.ProgramBase, &r.Program.Data)
if err == nil {
_, err = r.Vm.Segments.LoadData(r.executionBase, stack)
}
return err
}
```##### initializeVm
This method will set the values of the VM's `RunContext` with our `CairoRunner`'s initial values
```go
func (r *CairoRunner) initializeVM() {
r.Vm.RunContext.Ap = r.initialAp
r.Vm.RunContext.Fp = r.initialFp
r.Vm.RunContext.Pc = r.initialPc
}
```With `CairoRunner.Initialize()` now complete we can move on to the execution step:
##### RunUntilPc
This method will continuously execute cairo steps until the end pc, returned by 'CairoRunner.Initialize()' is reached
```go
func (r *CairoRunner) RunUntilPC(end memory.Relocatable) error {
for r.Vm.RunContext.Pc != end {
err := r.Vm.Step()
if err != nil {
return err
}
}
return nil
```Once we are done executing, we can relocate our memory and trace and output them into files.
#### Relocate
This method will relocate the memory and trace generated by the program execution. Relocating means that our VM transforms a two-dimensional memory (aka a memory divided by segments) to a continuous, one-dimensional memory.
In this section, we will refer to the two-dimensional memory as the original memory and the one-dimensional memory as the relocated memory.
The memory relocation process is explained at a high level [here](#memory-relocation).The original memory is accessed using `Relocatable`s while the relocated memory is accessed using uints. Also, the original memory values can be either `Felt`s of `Relocatable`s while the relocated memory values are all `Felt`s.
The relocation process is divided into 4 steps:
1. Compute the sizes of each memory segment.
2. Build an array that contains the first relocated address of each segment.
3. Creates the relocated memory transforming the original memory by using the array built in 2.
4. Creates the relocated trace transforming the original trace by using the array built in 2.The function that does the relocation process is shown below:
```go
func (v *VirtualMachine) Relocate() error {
v.Segments.ComputeEffectiveSizes()
if len(v.Trace) == 0 {
return nil
}relocationTable, ok := v.Segments.RelocateSegments()
// This should be unreachable
if !ok {
return errors.New("ComputeEffectiveSizes called but RelocateSegments still returned error")
}relocatedMemory, err := v.Segments.RelocateMemory(&relocationTable)
if err != nil {
return err
}v.RelocateTrace(&relocationTable)
v.RelocatedMemory = relocatedMemory
return nil
}
```Notice that each step is encapsulated into a function.
We need to add the relocated trace and relocated memory fields to the `VirtualMachine` struct:```go
type VirtualMachine struct {
RunContext RunContext
currentStep uint
Segments memory.MemorySegmentManager
Trace []TraceEntry
RelocatedTrace []RelocatedTraceEntry
RelocatedMemory map[uint]lambdaworks.Felt
}
```Also we need to add some methods to `MemorySegmentManager` and to `VirtualMachine`.
We will look into each one in the following subsections.#### ComputeEffectiveSizes
To relocate the memory segments, we first need to know the size of each segment of the memory.
This method computes those sizes by returning a map whose keys are segment indexes and its values are the segment sizes:```go
func (m *MemorySegmentManager) ComputeEffectiveSizes() map[uint]uint {
if len(m.SegmentSizes) == 0 {for ptr := range m.Memory.data {
segmentIndex := uint(ptr.SegmentIndex)
segmentMaxSize := m.SegmentSizes[segmentIndex]
segmentSize := ptr.Offset + 1
if segmentSize > segmentMaxSize {
m.SegmentSizes[segmentIndex] = segmentSize
}
}
}return m.SegmentSizes
}
```Notice that this method will only do something once: consecutive calls won't do anything.
This is because the relocation process will happen only after the program execution and never again.
The `Relocate` function shouldn't be called more than once.#### RelocateSegments
Once we have the segment sizes, we need to know where to relocate the segments.
Because we want the relocated memory to be continuous, the segments should be placed one after the other.
This means that the last address of the segment `i` is followed by the first address of the segment `i + 1`.
To know where to relocate the segments, we need to know the first address of each segment as if they were already relocated.
We also need to enforce that `RelocateSegments` is called after `ComputeEffectiveSizes`:```go
func (m *MemorySegmentManager) RelocateSegments() ([]uint, bool) {
if m.SegmentSizes == nil {
return nil, false
}first_addr := uint(1)
relocation_table := []uint{first_addr}for i := uint(0); i < m.Memory.NumSegments(); i++ {
new_addr := relocation_table[i] + m.SegmentSizes[i]
relocation_table = append(relocation_table, new_addr)
}
relocation_table = relocation_table[:len(relocation_table)-1]return relocation_table, true
}
```#### RelocateMemory
Once we have where each segment should go in the relocated memory, we can relocate the memory segments.
```go
func (s *MemorySegmentManager) RelocateMemory(relocationTable *[]uint) (map[uint]lambdaworks.Felt, error) {
relocatedMemory := make(map[uint]lambdaworks.Felt, 0)for i := uint(0); i < s.Memory.NumSegments(); i++ {
for j := uint(0); j < s.SegmentSizes[i]; j++ {
ptr := NewRelocatable(int(i), j)
cell, err := s.Memory.Get(ptr)
if err == nil {
relocatedAddr := ptr.RelocateAddress(relocationTable)
value, err := cell.RelocateValue(relocationTable)
if err != nil {
return nil, err
}
relocatedMemory[relocatedAddr] = value
}
}
}return relocatedMemory, nil
}
```This method calls two new methods: `RelocateAddress` from `Relocatable` and `RelocateValue` from `MaybeRelocatable`.
Let's implement them.##### RelocateAddress
This `Relocatable`'s method transforms an address of the original memory into an address of the relocated memory.
Because the relocated memory is one-dimensional and not divided into segments, the memory addresses are not of type `Relocatable` but
`uint`.```go
func (r *Relocatable) RelocateAddress(relocationTable *[]uint) uint {
return (*relocationTable)[r.SegmentIndex] + r.Offset
}
```##### RelocateValue
This `MaybeRelocatable`'s method transforms a value of the original memory into a value of the relocated memory.
- If the value is a `Felt`, the method doesn't transform it and returns the value as is.
- If the value is a `Relocatable`, the method transforms it to an address of the relocated memory.
- If the value is any other type, the method returns an error.```go
func (m *MaybeRelocatable) RelocateValue(relocationTable *[]uint) (lambdaworks.Felt, error) {
inner_felt, ok := m.GetFelt()
if ok {
return inner_felt, nil
}inner_relocatable, ok := m.GetRelocatable()
if ok {
return lambdaworks.FeltFromUint64(uint64(inner_relocatable.RelocateAddress(relocationTable))), nil
}return lambdaworks.FeltZero(), errors.New(fmt.Sprintf("Unexpected type %T", m.inner))
}
```#### RelocateTrace
After running `RelocateMemory` and returning the relocated memory without errors, the `Relocate` function calls `RelocateTrace`.
This method transforms the fields of each trace entry from `Relocatable`s to `Felt`s.
The fields of each entry are pc, ap and fp.
Because those fields are address, the trace relocation process involves relocating addresses.
That's why, this method also calls `RelocateAddress`:```go
func (v *VirtualMachine) RelocateTrace(relocationTable *[]uint) error {
if len(*relocationTable) < 2 {
return errors.New("no relocation found for execution segment")
}for _, entry := range v.Trace {
v.RelocatedTrace = append(v.RelocatedTrace, RelocatedTraceEntry{
Pc: lambdaworks.FeltFromUint64(uint64(entry.Pc.RelocateAddress(relocationTable))),
Ap: lambdaworks.FeltFromUint64(uint64(entry.Ap.RelocateAddress(relocationTable))),
Fp: lambdaworks.FeltFromUint64(uint64(entry.Fp.RelocateAddress(relocationTable))),
})
}return nil
}
```Like in `RelocateMemory`, we need to check that the `RelocateSegments` function was called before.
Once the program was executed, the `VirtualMachine` will have generated a relocated trace and a relocated memory for that execution.
Both trace and memory should be outputs of the execution, so we will write two functions that respectively write the trace and memory
into a file.
We will create a package called `cairo_run` and create those functions there.#### Writing the trace into a file
First, we will add a function in our new `cairo_run` package to write the relocated trace into a buffer.
This method converts the pc, fp and ap fields of each trace entry into a `uint64` and writes each integer into the buffer in little endian:```go
func WriteEncodedTrace(relocatedTrace []vm.RelocatedTraceEntry, dest io.Writer) error {
for i, entry := range relocatedTrace {
ap_buffer := make([]byte, 8)
ap, err := entry.Ap.ToU64()
if err != nil {
return err
}
binary.LittleEndian.PutUint64(ap_buffer, ap)
_, err = dest.Write(ap_buffer)
if err != nil {
return encodeTraceError(i, err)
}fp_buffer := make([]byte, 8)
fp, err := entry.Fp.ToU64()
if err != nil {
return err
}
binary.LittleEndian.PutUint64(fp_buffer, fp)
_, err = dest.Write(fp_buffer)
if err != nil {
return encodeTraceError(i, err)
}pc_buffer := make([]byte, 8)
pc, err := entry.Pc.ToU64()
if err != nil {
return err
}
binary.LittleEndian.PutUint64(pc_buffer, pc)
_, err = dest.Write(pc_buffer)
if err != nil {
return encodeTraceError(i, err)
}
}return nil
}
```#### Writing the memory into a file
Second, we will add a function in the `cairo_run` package to write the relocated memory into another buffer.
We want to write into the buffer the pairs of addresses and values by following a specific order.
In this case, we want those pairs to be incrementally ordered by address.
Because we implemented the relocated memory as a map and we store the addresses as keys, if we iterate over the relocated memory using
`range`, we will end up storing the pairs of addresses and values without following any specific order.
So, first we sort the relocated memory addresses and then we iterate those addresses to write each pair into the buffer.
Before writing the relocated memory values into the buffer, we have to convert them to bytes in little endian by using the lambdaworks
method `ToLeBytes`.
That method returns an array and `io.Writer.Write` expects a slice, so we convert the array that `ToLeBytes` returns to a slice.```go
func WriteEncodedMemory(relocatedMemory map[uint]lambdaworks.Felt, dest io.Writer) error {
// create a slice to store keys of the relocatedMemory map
keysMap := make([]uint, 0, len(relocatedMemory))
for k := range relocatedMemory {
keysMap = append(keysMap, k)
}// sort the keys
sort.Slice(keysMap, func(i, j int) bool { return keysMap[i] < keysMap[j] })// iterate over the `relocatedMemory` map in sorted key order
for _, k := range keysMap {
// write the key
keyArray := make([]byte, 8)
binary.LittleEndian.PutUint64(keyArray, uint64(k))
_, err := dest.Write(keyArray)
if err != nil {
return encodeMemoryError(k, err)
}// write the value
valueArray := relocatedMemory[k].ToLeBytes()_, err = dest.Write(valueArray[:])
if err != nil {
return encodeMemoryError(k, err)
}
}return nil
}
```#### Putting it all together
Now it's time to finally add our `main` function to the project!
The CLI will receive one argument: the program path.
This path will be passed as argument to a new function called `CairoRun`.
This new function will return the cairo runner and an error, if it exists one during the program execution.
If the program execution ends successfully, we write the relocated trace and memory in files respectively and print that the execution was ended with no errors.
If the program execution ends with an error, we don't write any file and print the corresponding error.```go
func main() {
if len(os.Args) < 2 {
fmt.Println("Wrong argument count: Use go run cmd/cli/main.go COMPILED_JSON")
return
}
cli_args := os.Args[1:]
programPath := cli_args[0]
cairoRunner, err := cairo_run.CairoRun(programPath)
if err != nil {
fmt.Printf("Failed with error: %s", err)
return
}
traceFilePath := strings.Replace(programPath, ".json", ".go.trace", 1)
traceFile, err := os.OpenFile(traceFilePath, os.O_RDWR|os.O_CREATE, 0644)
defer traceFile.Close()memoryFilePath := strings.Replace(programPath, ".json", ".go.memory", 1)
memoryFile, err := os.OpenFile(memoryFilePath, os.O_RDWR|os.O_CREATE, 0644)
defer memoryFile.Close()cairo_run.WriteEncodedTrace(cairoRunner.Vm.RelocatedTrace, traceFile)
cairo_run.WriteEncodedMemory(cairoRunner.Vm.RelocatedMemory, memoryFile)println("Done!")
}
```To make our `main` function work, we still need to add another function to the `cairo_run` package.
The `CairoRun` function will be responsible of integrating the initialization, execution and relocation.```go
func CairoRun(programPath string) (*runners.CairoRunner, error) {
compiledProgram := parser.Parse(programPath)
programJson := vm.DeserializeProgramJson(compiledProgram)cairoRunner, err := runners.NewCairoRunner(programJson)
if err != nil {
return nil, err
}
end, err := cairoRunner.Initialize()
if err != nil {
return nil, err
}
err = cairoRunner.RunUntilPC(end)
if err != nil {
return nil, err
}
err = cairoRunner.Vm.Relocate()
return cairoRunner, err
}
```TODO: add parsing and deserializing
#### Builtins
Now that we are able to run a basic fibonacci program, lets step up our game by adding builtins to our VM. A builtin is a low level optimization integrated into the core loop of the VM that allows otherwise expensive computation to be performed more efficiently. Builtins have two ways to operate: via validation rules and via auto-deduction rules. Validation rules are applied to every element that is inserted into a builtin's segment. For example, if I want to verify an ecdsa signature, I can insert it into the ecdsa builtin's segment and let a validation rule take care of verifying the signature. Auto-deduction rules take over during instruction execution, when we can't compute the value of an operand who's address belongs to a builtin segment, we can use that builtin's auto-deduction rule to calculate the value of the operand. For example, If I want to calculate the pedersen hash of two values, I can write the values into the pedersen builtin's segment and then ask for the next memory cell, without builtins, this instruction would have failed, as there is no value stored in that cell, but now we can use auto-deduction rules to calculate the hash and fill in that memory cell.
We will define a basic interface to generalize all of our builtin's behaviour:
```go
type BuiltinRunner interface {
// Returns the first address of the builtin's memory segment
Base() memory.Relocatable
// Returns the name of the builtin
Name() string
// Creates a memory segment for the builtin and initializes its base
InitializeSegments(*memory.MemorySegmentManager)
// Returns the builtin's initial stack
InitialStack() []memory.MaybeRelocatable
// Attempts to deduce the value of a memory cell given by its address. Can return either a nil pointer and an error, if an error arises during the deduction,
// a valid pointer and nil if the deduction was succesful, or a nil pointer and nil if there is no deduction for the memory cell
DeduceMemoryCell(memory.Relocatable, *memory.Memory) (*memory.MaybeRelocatable, error)
// Adds a validation rule to the memory
// Validation rules are applied when a value is inserted into the builtin's segment
AddValidationRule(*memory.Memory)
}
```And now lets integrate this into our existing codebase:
First we will make some modifications to our basic structures:
We will add our builtin runners to the VM:
```go
type VirtualMachine struct {
RunContext RunContext
currentStep uint
Segments memory.MemorySegmentManager
Trace []TraceEntry
RelocatedTrace []RelocatedTraceEntry
RelocatedMemory map[uint]lambdaworks.Felt
BuiltinRunners []builtins.BuiltinRunner
}
```Then we will create two new types to handle validation rules in the `Memory`:
##### ValidationRule
This will represent our builtin's validation rules, they take a memory address and a referenece to the memory, and return a list of validated addresses, for most builtins, this list will contain the address it received if the validation was succesful, but some builtins may return additional addresses.
```go
// A function that validates a memory address and returns a list of validated addresses
type ValidationRule func(*Memory, Relocatable) ([]Relocatable, error)
```##### AddressSet
As go doesn't have a set type, we created our own really basic set for `Relocatable`s. This will hold the values returned by the validation rules, so that we don't have to run them more than once for each memory cell.
```go
// A Set to store Relocatable values
type AddressSet map[Relocatable]boolfunc NewAddressSet() AddressSet {
return make(map[Relocatable]bool)
}func (set AddressSet) Add(element Relocatable) {
set[element] = true
}func (set AddressSet) Contains(element Relocatable) bool {
return set[element]
}
```And we will add them to our `Memory` stuct:
``` go
type Memory struct {
data map[Relocatable]MaybeRelocatable
num_segments uint
validation_rules map[uint]ValidationRule
validated_addresses AddressSet
}
```Now we only need to add a way to create this validation rules:
```go
// Adds a validation rule for a given segment
func (m *Memory) AddValidationRule(segment_index uint, rule ValidationRule) {
m.validation_rules[segment_index] = rule
}
```And a method that runs validations on a memory address:
```go
// Applies the validation rule for the addr's segment if any
// Skips validation if the address is temporary or if it has been previously validated
func (m *Memory) validateAddress(addr Relocatable) error {
if addr.SegmentIndex < 0 || m.validated_addresses.Contains(addr) {
return nil
}
rule, ok := m.validation_rules[uint(addr.SegmentIndex)]
if !ok {
return nil
}
validated_addresses, error := rule(m, addr)
if error != nil {
return error
}
for _, validated_address := range validated_addresses {
m.validated_addresses.Add(validated_address)
}
return nil
}
```And we are all set to integrate this new logic into our `Memory`'s `Insert` operation:
```go
// Inserts a value in some memory address, given by a Relocatable value.
func (m *Memory) Insert(addr Relocatable, val *MaybeRelocatable) error {
// Check that insertions are preformed within the memory bounds
if addr.SegmentIndex >= int(m.num_segments) {
return errors.New("Error: Inserting into a non allocated segment")
}
// Check for possible overwrites
prev_elem, ok := m.data[addr]
if ok && prev_elem != *val {
return errors.New("Memory is write-once, cannot overwrite memory value")
}m.data[addr] = *val
return m.validateAddress(addr)
}
```Now we will initialize the builtins from our `CairoRunner`:
##### NewCairoRunner (Builtins)
Here we will have to iterate over the `Builtins` field of the `Program`, and add the corresponding builtin to the `VirtualMachine`'s `BuiltinRunner` field. We don't have any builtins yet, so we wil add a comment as placeholder and just leave a default case. As we implement more builtins, we will add a case for each of them.
```go
func NewCairoRunner(program vm.Program) (*CairoRunner, error) {
mainIdentifier, ok := (*program.Identifiers)["__main__.main"]
main_offset := uint(0)
if ok {
main_offset = uint(mainIdentifier.PC)
}
runner := CairoRunner{Program: program, Vm: *vm.NewVirtualMachine(), mainOffset: main_offset}
for _, builtin_name := range program.Builtins {
switch builtin_name {
// Add a case for each builtin here, example:
// case "range_check":
// runner.Vm.BuiltinRunners = append(runner.Vm.BuiltinRunners, RangeCheckBuiltin{})
default:
return nil, errors.New("Invalid builtin")
}
}
return &runner, nil
}
```##### InitializeSegments (Builtins)
Here we will also initialize the builtin segments by calling each builtin's `InitializeSegments` method
```go
func (r *CairoRunner) initializeSegments() {
// Program Segment
r.ProgramBase = r.Vm.Segments.AddSegment()
// Execution Segment
r.executionBase = r.Vm.Segments.AddSegment()
// Builtin Segments
for i := range r.Vm.BuiltinRunners {
r.Vm.BuiltinRunners[i].InitializeSegments(&r.Vm.Segments)
}
}
```##### InitializeMainEntryPoint(Builtins)
Here we will add the builtin's initial_stack to our stack. The builtin's initial_stack is generally made up of the builtin's base, and is what allows the main function to write into the builtin's segment.
```go
func (r *CairoRunner) initializeMainEntrypoint() (memory.Relocatable, error) {
// When running from main entrypoint, only up to 11 values will be written (9 builtin bases + end + return_fp)
stack := make([]memory.MaybeRelocatable, 0, 11)
// Append builtins initial stack to stack
for i := range r.Vm.BuiltinRunners {
for _, val := range r.Vm.BuiltinRunners[i].InitialStack() {
stack = append(stack, val)
}
}
return_fp := r.Vm.Segments.AddSegment()
return r.initializeFunctionEntrypoint(r.mainOffset, &stack, return_fp)
}
```##### initializeVm (Builtins)
Here we will add our builtin's validation rules to the `Memory` and use them to validate the meory cells we loaded before
```go
func (r *CairoRunner) initializeVM() error {
r.Vm.RunContext.Ap = r.initialAp
r.Vm.RunContext.Fp = r.initialFp
r.Vm.RunContext.Pc = r.initialPc
// Add validation rules
for i := range r.Vm.BuiltinRunners {
r.Vm.BuiltinRunners[i].AddValidationRule(&r.Vm.Segments.Memory)
}
// Apply validation rules to memory
return r.Vm.Segments.Memory.ValidateExistingMemory()
}
```For this we will add the method `Memory.ValidateExistingMemory`:
```go
func (m *Memory) ValidateExistingMemory() error {
for addr := range m.data {
err := m.validateAddress(addr)
if err != nil {
return err
}
}
return nil
}
```Now we will dive deeper into how `auto-deduction` rules come into play during execution:
Before builtins, the basic flow for computing the value of an operand was to first compute its address, and then if we couldn't find it in memory, we would deduce its value based on the other operands.
With the introduction of builtins and their auto-deduction rules, this flow changes a bit. Now we compute the address, use it to fetch the value from memory, if we can't find it in memory we try to use the builtin's auto deduction rules, and if we can't deduce it via builtins we will then deduce it based on the other operands's.
But what does it mean to use the builtin's auto deduction rules to deduce the value of an operand?##### DeduceMemoryCell
This method will iterate over the builtin runners and try to find a builtin who's base's segment index matches the operand to be deduced's address. That is to say, it checks if the address belongs to a builtin's segment. If a match is found, it uses the builtin's `DeduceMemoryCell` method to run the builtin's auto-deduction rules and calculate the value of the operand
```go
// Applies the corresponding builtin's deduction rules if addr's segment index corresponds to a builtin segment
// Returns nil if there is no deduction for the address
func (vm *VirtualMachine) DeduceMemoryCell(addr memory.Relocatable) (*memory.MaybeRelocatable, error) {
for i := range vm.BuiltinRunners {
if vm.BuiltinRunners[i].Base().SegmentIndex == addr.SegmentIndex {
return vm.BuiltinRunners[i].DeduceMemoryCell(addr, &vm.Segments.Memory)
}
}
return nil, nil
}
```Now we have to integrate this new method into our `VirtualMachine.ComputeOperands` method:
We will add two helper methods to make our code easier to follow with these new additions:
Both of these methods will be ran fetching either op1 or op0 respectively yields a nil value, and will try to deduce them using both builtins and normal deductions, returning an error if both of these attempts fail##### ComputeOp0Deductions
```go
// Runs deductions for Op0, first runs builtin deductions, if this fails, attempts to deduce it based on dst and op1
// Also returns res if it was also deduced in the process
// Inserts the deduced operand
// Fails if Op0 was not deduced or if an error arised in the process
func (vm *VirtualMachine) ComputeOp0Deductions(op0_addr memory.Relocatable, instruction *Instruction, dst *memory.MaybeRelocatable, op1 *memory.MaybeRelocatable) (deduced_op0 memory.MaybeRelocatable, deduced_res *memory.MaybeRelocatable, err error) {
op0, err := vm.DeduceMemoryCell(op0_addr)
if err != nil {
return *memory.NewMaybeRelocatableFelt(lambdaworks.FeltZero()), nil, err
}
if op0 == nil {
op0, deduced_res, err = vm.DeduceOp0(instruction, dst, op1)
if err != nil {
return *memory.NewMaybeRelocatableFelt(lambdaworks.FeltZero()), nil, err
}
}
if op0 != nil {
vm.Segments.Memory.Insert(op0_addr, op0)
} else {
return *memory.NewMaybeRelocatableFelt(lambdaworks.FeltZero()), nil, errors.New("Failed to compute or deduce op0")
}
return *op0, deduced_res, nil
}
```##### ComputeOp1Deductions
```go
// Runs deductions for Op1, first runs builtin deductions, if this fails, attempts to deduce it based on dst and op0
// Also updates res if it was also deduced in the process
// Inserts the deduced operand
// Fails if Op1 was not deduced or if an error arised in the process
func (vm *VirtualMachine) ComputeOp1Deductions(op1_addr memory.Relocatable, instruction *Instruction, dst *memory.MaybeRelocatable, op0 *memory.MaybeRelocatable, res *memory.MaybeRelocatable) (memory.MaybeRelocatable, error) {
op1, err := vm.DeduceMemoryCell(op1_addr)
if err != nil {
return *memory.NewMaybeRelocatableFelt(lambdaworks.FeltZero()), err
}
if op1 == nil {
var deducedRes *memory.MaybeRelocatable
op1, deducedRes, err = vm.DeduceOp1(instruction, dst, op0)
if err != nil {
return *memory.NewMaybeRelocatableFelt(lambdaworks.FeltZero()), err
}
if res == nil {
res = deducedRes
}
}
if op1 != nil {
vm.Segments.Memory.Insert(op1_addr, op1)
} else {
return *memory.NewMaybeRelocatableFelt(lambdaworks.FeltZero()), errors.New("Failed to compute or deduce op1")
}
return *op1, nil
}
```Now we integrate these two new methods into our previous `ComputeOperands` method:
```go
func (vm *VirtualMachine) ComputeOperands(instruction Instruction) (Operands, error) {
var res *memory.MaybeRelocatabledst_addr, err := vm.RunContext.ComputeDstAddr(instruction)
if err != nil {
return Operands{}, errors.New("FailedToComputeDstAddr")
}
dst, _ := vm.Segments.Memory.Get(dst_addr)op0_addr, err := vm.RunContext.ComputeOp0Addr(instruction)
if err != nil {
return Operands{}, fmt.Errorf("FailedToComputeOp0Addr: %s", err)
}
op0_op, _ := vm.Segments.Memory.Get(op0_addr)op1_addr, err := vm.RunContext.ComputeOp1Addr(instruction, op0_op)
if err != nil {
return Operands{}, fmt.Errorf("FailedToComputeOp1Addr: %s", err)
}
op1_op, _ := vm.Segments.Memory.Get(op1_addr)var op0 memory.MaybeRelocatable
if op0_op != nil {
op0 = *op0_op
} else {
op0, res, err = vm.ComputeOp0Deductions(op0_addr, &instruction, dst, op1_op)
if err != nil {
return Operands{}, err
}
}var op1 memory.MaybeRelocatable
if op1_op != nil {
op1 = *op1_op
} else {
op1, err = vm.ComputeOp1Deductions(op1_addr, &instruction, dst, op0_op, res)
if err != nil {
return Operands{}, err
}
}
if res == nil {
res, err = vm.ComputeRes(instruction, op0, op1)if err != nil {
return Operands{}, err
}
}if dst == nil {
deducedDst := vm.DeduceDst(instruction, res)
dst = deducedDst
if dst != nil {
vm.Segments.Memory.Insert(dst_addr, dst)
}
}operands := Operands{
Dst: *dst,
Op0: op0,
Op1: op1,
Res: res,
}
return operands, nil
}
```With all of our builtin logic integrated into the codebase, we can implement any builtin and use it in our cairo programs while worrying only about implementing the `BuiltinRunner` interface and creating the builtin in the `NewCairoRunner` function.
#### RangeCheck
The `RangeCheck` builtin does a very simple thing: it asserts that a given number is in the range $[0, 2^{128})$, i.e., that it's greater than zero and less than $2^{128}$. This might seem superficial but it is used for a lot of different things in Cairo, including comparing numbers. Whenever a program asserts that some number is less than other, the range check builtin is being called underneath.
TODO: explain this better, it's not entirely clear why $2^{128}$ was chosen.
Let's now talk about how to implement the `RangeCheckBuiltinRunner`.
We have getter functions just to obtain information about the builtin. The `Name` method is used when iterating through all the builtins of the program so we can switch to the correct execution.
```go
func (r *RangeCheckBuiltinRunner) Base() memory.Relocatable {
return r.base
}func (r *RangeCheckBuiltinRunner) Name() string {
return "range_check"
}
```For the `InitializeSegments` method we just add a segment to the memory and store the first address of the segment in the base attribute.
```go
func (r *RangeCheckBuiltinRunner) InitializeSegments(segments *memory.MemorySegmentManager) {
r.base = segments.AddSegment()
}
```Next we have the `InitialStack` method, that just returns a stack with the base address appended.
```go
func (r *RangeCheckBuiltinRunner) InitialStack() []memory.MaybeRelocatable {
if r.included {
stack := []memory.MaybeRelocatable{*memory.NewMaybeRelocatableRelocatable(r.base)}
return stack
}
return []memory.MaybeRelocatable{}
}
```In this case, the `DeduceMemoryCell` is not used in this builtin, so we return nothing.
```go
func (r *RangeCheckBuiltinRunner) DeduceMemoryCell(addr memory.Relocatable, mem *memory.Memory) (*memory.MaybeRelocatable, error) {
return nil, nil
}
```And finally we have the `AddValidationRule` and the `ValidationRule` methods.
###### AddValidationRule
Receives the memory and adds a new validation rule to it for the builtin segment.
###### ValidationRule
Receives the memory and an address and it checks if the value in that address is a `felt` and then if it's inside the range. To do so, it checks that the necessary number of bits for representing the felt is not greater than the bits for representing the upper bound of the range. If it fits in this range, it returns an `Relocatable` array with the address appended. Otherwise returns error.
```go
func ValidationRule(mem *memory.Memory, address memory.Relocatable) ([]memory.Relocatable, error) {
res_val, err := mem.Get(address)
if err != nil {
return nil, err
}
felt, is_felt := res_val.GetFelt()
if !is_felt {
return nil, errors.New("NotFeltElement")
}
if felt.Bits() <= N_PARTS*INNER_RC_BOUND_SHIFT {
return []memory.Relocatable{address}, nil
}
return nil, errors.New("RangeCheckNumOutOfBounds")
}func (r *RangeCheckBuiltinRunner) AddValidationRule(mem *memory.Memory) {
mem.AddValidationRule(uint(r.base.SegmentIndex), ValidationRule)
}
``````#### Output
TODO
#### Poseidon
The poseidon builtin is used to compute the poseidon hash function in an efficient way. The poseidon hash used by the builtin differs from a standard poseidon hash in two ways, it uses different constants (becoming its own stark poseidon hash), and it also uses the internal poseidon permutation function instead of calling a poseidon hash function. The reason for the second one is that it allows the builtin to hash more than one element at a time by permuting the three-element poseidon state.
Due to this difference, the best solution is to use a poseidon implementation built specifically for cairo. In our case we are going to use the poseidon hash in the `starknet-crypto` crate of the [starknet-rs](https://github.com/xJonathanLEI/starknet-rs) repo.
The section below will explain how to create a C wrapper to use this crate from our go code, but you can skip it if you want to use your own version in your native language.##### Importing the `starknet-crypto`rust crate for our poseidon needs
###### Basic Lib Setup
To set up this we will need the following files:
- A rust project that will hold the rust wrapper for our lib
- A C header file that will use the rust lib as its backend
- A Go file that will call the C header and which our VM's code will intetact with.Our file tree will look like this:
```text
starknet_crypto
┣ lib
┃ ┣ starknet_crypto
┃ ┃ ┣ src
┃ ┃ ┃ ┗ lib.rs
┃ ┃ ┣ Cargo.lock
┃ ┃ ┗ Cargo.toml
┃ ┗ starknet_crypto.h
┣ starknet_crypto.go
```Our Cargo.toml file will look like this:
```toml
[package]
name = "starknet-crypto"
version = "0.1.0"
edition = "2021"# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
libc = "0.2"
starknet-crypto = { version = "0.5.0"}[lib]
crate-type = ["cdylib", "staticlib", "lib"]
```We will import libc in our lib.rs as an external crate by adding:
```rust
extern crate libc;
```In order to build the lib we will add the following lines to our Makefile's `build` target:
```bash
@cd pkg/starknet_crypto/lib/starknet_crypto && cargo build --release
@cp pkg/starknet_crypto/lib/starknet_crypto/target/release/libstarknet_crypto.a pkg/starknet_crypto/lib
```And in order to import the lib from go we will add the following to our starknet_crypto.go file:
```go
/*
#cgo LDFLAGS: pkg/starknet_crypto/lib/libstarknet_crypto.a -ldl
#include "lib/starknet_crypto.h"
#include
*/
import "C"
```###### Converting Felt to FieldElement
Now that we have the basic setup the first thing we have to do is to define a conversion between our `Felt` in go, a `felt_t` type in C, and starknet-crypto's `FieldElement` types. We will perform these conversions using the big endian byte representation.
In our C header hile (starknet_crypto.h) we will define the types `byte_t` and `felt_t`:
```c
#includetypedef uint8_t byte_t;
// A 252 bit prime field element (felt), represented as an array of 32 bytes.
typedef byte_t felt_t[32];
```And we will interpret this `felt_t` in rust (lib.rs file) as a mutable pointer to the first byte in the felt:
```rust
// C representation of a bit array: a raw pointer to a mutable unsigned 8 bits integer.
type Bytes = *mut u8;
```With these types defined we can now work on converting the C felt representation to a `FieldElement` in rust. To do so we will implement the following conversion functions:
- `field_element_from_bytes`
We will convert the C pointer to an array of bytes and use it to create a `FieldElement`
```rust
fn field_element_from_bytes(bytes: Bytes) -> FieldElement {
let array = unsafe {
let slice: &mut [u8] = std::slice::from_raw_parts_mut(bytes, 32);
let array: [u8; 32] = slice.try_into().unwrap();
array
};
FieldElement::from_bytes_be(&array).unwrap()
}
```- `bytes_from_field_element`
We will convert the `FieldElement` into bytes and insert each byte into the C mutable pointer
```rust
fn bytes_from_field_element(felt: FieldElement, bytes: Bytes) {
let byte_array = felt.to_bytes_be();
for i in 0..32 {
unsafe {
*bytes.offset(i) = byte_array[i as usize];
}
}
}
```Now we will implement these same conversions but between `Felt` in go and `felt_t` in C. As we can import C types from Go, we don't have to define a type to represent `felt_t`.
- toC
We convert the `Felt` to bytes and insert each byte into a `felt_t`
```go
func toC(f lambdaworks.Felt) C.felt_t {
var result C.felt_t
for i, byte := range f.ToBeBytes() {
result[i] = C.byte_t(byte)
}
return result
}
```- fromC
We iterate the `felt_t` value and cast each byte as a `uint8` to build a byte array which we then use to build our `Felt`
```go
func fromC(result C.felt_t) lambdaworks.Felt {
var bytes [32]uint8
for i, byte := range result {
bytes[i] = uint8(byte)
}
return lambdaworks.FeltFromBeBytes(&bytes)
}
```###### Calling the poseidon permutation function
Now that we have our felt types defined we can move on to wrapping the poseidon permutation function. The `poseidon_permute_comp` from `starknet_crypto` receives a mutable state of three felts as an array. To reduce the complexity of our wrapper we will be receiving three felts in our C function.
We will define the following function in our C header file:
```C
// Computes the poseidon hash permutation over a state of three felts
void poseidon_permute(felt_t, felt_t, felt_t);
```And we will implement it in the rust lib file, using the types and conversions we implemented earlier:
```rust
use starknet_crypto::{poseidon_permute_comp, FieldElement};#[no_mangle]
extern "C" fn poseidon_permute(
first_state_felt: Bytes,
second_state_felt: Bytes,
third_state_felt: Bytes,
) {
// Convert state from C representation to FieldElement
let mut state_array: [FieldElement; 3] = [
field_element_from_bytes(first_state_felt),
field_element_from_bytes(second_state_felt),
field_element_from_bytes(third_state_felt),
];
// Call poseidon permute comp
poseidon_permute_comp(&mut state_array);
// Convert state from FieldElement back to C representation
bytes_from_field_element(state_array[0], first_state_felt);
bytes_from_field_element(state_array[1], second_state_felt);
bytes_from_field_element(state_array[2], third_state_felt);
}
```And with our lib ready, all that is left is to make a go wrapper with our felt conversion functions that calls the C function:
```go
func PoseidonPermuteComp(poseidon_state *[3]lambdaworks.Felt) {
state := *poseidon_state
// Convert args to c representation
first_state_felt := toC(state[0])
second_state_felt := toC(state[1])
third_state_felt := toC(state[2])// Compute hash using starknet_crypto C wrapper
C.poseidon_permute(&first_state_felt[0], &second_state_felt[0], &third_state_felt[0])
// Convert result to Go representation
var new_poseidon_state = [3]lambdaworks.Felt{
fromC(first_state_felt),
fromC(second_state_felt),
fromC(third_state_felt),
}
// Update poseidon state
*poseidon_state = new_poseidon_state
}
```Now that we can call a simple poseidon permutation function we can start implementing our poseidon builtin runner!
##### Implementing the PoseidonBuiltinRunner
We will start by defining our `PoseidonBuiltinRunner` and adding it to our VM when creating a `CairoRunner`:
It will contain it's base and a cache of values that we will use later to optimize our `DeduceMemoryCell` method. The included field indicates if a builtin is used by the program, is used in proof_mode, as all builtins have to be present by default, but for now we will always set the included field to true.
```go
type PoseidonBuiltinRunner struct {
base memory.Relocatable
included bool
cache map[memory.Relocatable]lambdaworks.Felt
}func NewPoseidonBuiltinRunner(included bool) *PoseidonBuiltinRunner {
return &PoseidonBuiltinRunner{included: included, cache: make(map[memory.Relocatable]lambdaworks.Felt)}
}
```In order to store it as a `BuiltinRunner` we will have to implement the `BuiltinRunner` interface. Aside from `AddValidationRule` & `DeduceMemoryCell`, most builtins share the same behaviour in their methods, so we can just port them from the builtin runners we implemented before:
```go
const POSEIDON_BUILTIN_NAME = "poseidon"
func (p *PoseidonBuiltinRunner) Base() memory.Relocatable {
return p.base
}func (p *PoseidonBuiltinRunner) Name() string {
return POSEIDON_BUILTIN_NAME
}func (p *PoseidonBuiltinRunner) InitializeSegments(segments *memory.MemorySegmentManager) {
p.base = segments.AddSegment()
}func (p *PoseidonBuiltinRunner) InitialStack() []memory.MaybeRelocatable {
if p.included {
return []memory.MaybeRelocatable{*memory.NewMaybeRelocatableRelocatable(p.base)}
} else {
return nil
}
}
```As the poseidon builtin doesn't have validation rules, the method will be left empty:
```go
func (p *PoseidonBuiltinRunner) AddValidationRule(*memory.Memory) {
}
```Now lets dive into the poseidon builtin's behaviour!
The poseidon builtin memory is divided into instances of 6 cells, 3 input cells and 3 output cells. This means that whenever we want to deduce the value of an output cell, we will look for the input cells, compute the pedersen permutation over them, and write the permutated values to the output cells. As we only deduce the value of one output cell at a time, we will write the value of the output cells to a cache and use them the next time we have to deduce a memory cell so we avoid computing the poseidon hash more than once over the same input values
We define the following constants to represent a poseidon instance:
```go
const POSEIDON_CELLS_PER_INSTANCE = 6
const POSEIDON_INPUT_CELLS_PER_INSTANCE = 3
```And we can implement `DeduceMemoryCell`:
This method will first check if the cell is an input cell, if it's an input cell then there is nothing to deduce and it returns nil. Then it will check if there is a cached value for that cell and return it. If there is no cached value it will define the addresses of the first input and output cells, and fetch the values of the input cells. If any of the input cells is missing, or is not a felt value, it returns an error. Once it has the three input cells, it performs the poseidon permutation and inserts the permutated value into each output cell's address in the cache. It then returns the value stored in the cache for the address that the method received.
```go
func (p *PoseidonBuiltinRunner) DeduceMemoryCell(address memory.Relocatable, mem *memory.Memory) (*memory.MaybeRelocatable, error) {
// Check if its an input cell
index := address.Offset % POSEIDON_CELLS_PER_INSTANCE
if index < POSEIDON_INPUT_CELLS_PER_INSTANCE {
return nil, nil
}value, ok := p.cache[address]
if ok {
return memory.NewMaybeRelocatableFelt(value), nil
}input_start_addr, _ := address.SubUint(index)
output_start_address := input_start_addr.AddUint(POSEIDON_INPUT_CELLS_PER_INSTANCE)// Build the initial poseidon state
var poseidon_state [3]lambdaworks.Feltfor i := uint(0); i < POSEIDON_INPUT_CELLS_PER_INSTANCE; i++ {
felt, err := mem.GetFelt(input_start_addr.AddUint(i))
if err != nil {
return nil, err
}
poseidon_state[i] = felt
}// Run the poseidon permutation
starknet_crypto.PoseidonPermuteComp(&poseidon_state)// Insert the new state into the corresponding output cells in the cache
for i, elem := range poseidon_state {
p.cache[output_start_address.AddUint(uint(i))] = elem
}
return memory.NewMaybeRelocatableFelt(p.cache[address]), nil
}
```#### Pedersen
TODO
#### Ecdsa
TODO
#### Keccak
TODO
#### Bitwise
This builtin provides a way to work with the basic bit operations `and`, `or` and `xor`. It implements the basic builtin interface methods:
```go
type BitwiseBuiltinRunner struct {
base memory.Relocatable
included bool
}
```
the getter methods:```go
func (b *BitwiseBuiltinRunner) Base() memory.Relocatable {
return r.base
}func (b *BitwiseBuiltinRunner) Name() string {
return "bitwise"
}
```For the `InitializeSegments` we just add a segment to the memory and store in the base attribute, the first adress of the segment.
```go
func (b *BitwiseBuiltinRunner) InitializeSegments(segments *memory.MemorySegmentManager) {
r.base = segments.AddSegment()
}
```we also have `InitialStack` method that returns a stack the base address appended
```go
func (b *BitwiseBuiltinRunner) InitialStack() []memory.MaybeRelocatable {
if b.included {
return []memory.MaybeRelocatable{*memory.NewMaybeRelocatableRelocatable(b.base)}
} else {
return []memory.MaybeRelocatable{}
}
}
```The method `DeducedMemoryCell` fetches the operands from memory and performs the following operations:
- If the index is less than the number of input cells then the method returns nil
- After the operands are fetched it is checks that both of them are felts because bitwise can not be performed on relocatable or nil values.
- If the number of bits of any operand is greater than `TOTAL_N_BITS` the method fails because we are out of the field.
- The index is used to know which bitwise operation is going to be performed, if is 2 then `and` is executed, if is 3 then `xor` and if is a 4 then `or` is executed
- Otherwise nil value is returned```go
const BITWISE_CELLS_PER_INSTANCE = 5
const BITWISE_TOTAL_N_BITS = 251
const BIWISE_INPUT_CELLS_PER_INSTANCE = 2func (b *BitwiseBuiltinRunner) DeduceMemoryCell(address memory.Relocatable, mem *memory.Memory) (*memory.MaybeRelocatable, error) {
index := address.Offset % BITWISE_CELLS_PER_INSTANCE
if index < BITWISE_INPUT_CELLS_PER_INSTANCE {
return nil, nil
}x_addr, _ := address.SubUint(index)
y_addr := x_addr.AddUint(1)num_x_felt, err := mem.GetFelt(x_addr)
if err != nil {
return nil, nil
}
num_y_felt, err := mem.GetFelt(y_addr)
if err != nil {
return nil, nil
}if num_x_felt.Bits() > BITWISE_TOTAL_N_BITS {
return nil, ErrFeltBiggerThanPowerOfTwo(num_x_felt)
}
if num_y_felt.Bits() > BITWISE_TOTAL_N_BITS {
return nil, ErrFeltBiggerThanPowerOfTwo(num_y_felt)
}var res *memory.MaybeRelocatable
switch index {
case 2:
res = memory.NewMaybeRelocatableFelt(num_x_felt.And(num_y_felt))
case 3:
res = memory.NewMaybeRelocatableFelt(num_x_felt.Xor(num_y_felt))
case 4:
res = memory.NewMaybeRelocatableFelt(num_x_felt.Or(num_y_felt))
default:
res = nil
}
return res, nil}
```Finally `AddValidationRule` is empty in this case
``` go
func (b *BitwiseBuiltinRunner) AddValidationRule(*memory.Memory) {}
```#### EcOp
TODO
#### SegmentArena
TODO
#### Hints
So far we have been thinking about the VM mostly abstracted from the prover and verifier it's meant to feed its results to. The last main feature we need to talk about, however, requires keeping this proving/verifying logic in mind.
As a reminder, the whole point of the Cairo VM is to output a trace/memory file so that a `prover` can then create a cryptographic proof that the execution of the program was done correctly. A `verifier` can then take that proof and verify it in much less time than it would have taken to re-execute the entire program.
In this model, the one actually using the VM to run a cairo program is *always the prover*. The verifier does not use the VM in any way, as that would defeat the entire purpose of validity proofs; they just get the program being run and the proof generated by the prover and run some cryptographic algorithm to check it.
While the verifier does not execute the code, they do *check it*. As an example, if a cairo program computes a fibonacci number like this:
```
func main() {
// Call fib(1, 1, 10).
let result: felt = fib(1, 1, 10);
}
```the verifier won't *run* this, but they will reject any incorrect execution of the call to `fib`. The correct value for `result` in this case is `144` (it's the 10th fibonacci number); any attempt by the prover to convince the verifier that `result` is not `144` will fail, because the call to the `fib` function is *being proven* and thus *seen* by the verifier.
A `Hint` is a piece of code that is not proven, and therefore not seen by the verifier. If `fib` above were a hint, then the prover could convince the verifier that `result` is $144$, $0$, $1000$ or any other number.
In cairo 0, hints are code written in `Python` and are surrounded by curly brackets. Here's an example from the `alloc` function, provided by the Cairo common library
```
func alloc() -> (ptr: felt*) {
%{ memory[ap] = segments.add() %}
ap += 1;
return (ptr=cast([ap - 1], felt*));
}
```The first line of the function,
```
%{ memory[ap] = segments.add() %}
```is a hint called `ADD_SEGMENT`. All it does is create a new memory segment, then write its base to the current value of `ap`. This is python code that is being run in the context of the VM's execution; thus `memory` refers to the VM's current memory and `segments.add()` is just a function provided by the VM to allocate a new segment.
At this point you might be wondering: why run code that's not being proven? Isn't the whole point of Cairo to prove correct execution? There are (at least) two reasons for hints to exist.
##### Nothing to prove
For some operations there's simply nothing to prove, as they are just convenient things one wants to do during execution. The `ADD_SEGMENT` hint shown above is a good example of that. When proving execution, the program's memory is presented as one relocated continuous segment, it does not matter at all which segment a cell was in, or when that segment was added. The verifier doesn't care.
Because of this, there's no reason to make `ADD_SEGMENT` a part of the cairo language and have an instruction for it.
##### Optimization
Certain operations can be very expensive, in the sense that they might involve a huge amount of instructions or memory usage, and therefore contribute heavily to the proving time. For certain calculations, there are two ways to convince the verifier that it was done correctly:
- Write the entire calculation in Cairo/Cairo Assembly. This makes it show up in the trace and therefore get proven.
- *Present the result of the calculation to the verifier through a hint*, then show said result indeed satisfies the relevant condition that makes it the actual result.To make this less abstract, let's show two examples.
##### Square root
Let's say the calculation in question is to compute the square root of a number `x`. The two ways to do it then become:
- Write the usual square root algorithm in Cairo to compute `sqrt(x)`.
- Write a hint that computes `sqrt(x)`, then immediately after calling the hint show __in Cairo__ that `(sqrt(x))^2 = x`.The second approach is exactly what the `sqrt` function in the Cairo common library does:
```
// Returns the floor value of the square root of the given value.
// Assumptions: 0 <= value < 2**250.
func sqrt{range_check_ptr}(value) -> felt {
alloc_locals;
local root: felt;%{
from starkware.python.math_utils import isqrt
value = ids.value % PRIME
assert value < 2 ** 250, f"value={value} is outside of the range [0, 2**250)."
assert 2 ** 250 < PRIME
ids.root = isqrt(value)
%}assert_nn_le(root, 2 ** 125 - 1);
tempvar root_plus_one = root + 1;
assert_in_range(value, root * root, root_plus_one * root_plus_one);return root;
}
```If you read it carefully, you'll see that the hint in this function computes the square root in python, then this line
```
assert_in_range(value, root * root, root_plus_one * root_plus_one);
```asserts __in Cairo__ that `(sqrt(x))^2 = x`.
This is done this way because it is much cheaper, in terms of the generated trace (and thus proving time), to square a number than compute its square root.
Notice that the last assert is absolutely mandatory to make this safe. If you forget to write it, the square root calculation does not get proven, and anyone could convince the verifier that the result of `sqrt(x)` is any number they like.
##### Linear search turned into an O(1) lookup
This example is taken from the [Cairo documentation](https://docs.cairo-lang.org/0.12.0/hello_cairo/program_input.html).
Given a list of `(key, value)` pairs, if we want to write a `get_value_by_key` function that returns the value associated to a given key, there are two ways to do it:
- Write a linear search in Cairo, iterating over each key until you find the requested one.
- Do that exact same linear search *inside a hint*, find the result, then show that the result's key is the one requested.Again, the second approach makes the resulting trace and proving much faster, because it's just a lookup; there's no linear search. Notice this only applies to proving, the VM has to execute the hint, so there's still a linear search when executing to generate the trace. In fact, the second approach is more expensive for the VM than the first one. It has to do both a linear search and a lookup. This is a tradeoff in favor of proving time.
Also note that, as in the square root example, when writing this logic you need to remember to show the hint's result is the correct one in Cairo. If you don't, your code is not being proven.
##### Non-determinism
The Cairo paper and documentation refers to this second approach to calculating things through hints as *non-determinism*. The reason for this is that sometimes there is more than one result that satisfies a certain condition. This means that cairo execution becomes non deterministic; a hint could output multiple values, and in principle there is no way to know which one it's going to be. Running the same code multiple times could give different results.
The square root is an easy example of this. The condition `(sqrt(x))^2 = x` is not unique, there are two solutions to it. Without the hint, this is non-deterministic, `x` could have multiple values; the hint resolves that by choosing a specific value when being run.
##### Common Library and Hints
As explained above, using hints in your code is highly unsafe. Forgetting to add a check after calling them can make your code vulnerable to any sorts of attacks, as your program will not prove what you think it proves.
Because of this, most hints in Cairo 0 are wrapped around or used by functions in the Cairo common library that do the checks for you, thus making them safe to use. Ideally, Cairo developers should not be using hints on their own; only transparently through Cairo library functions they call.
##### Whitelisted Hints
In Cairo, a hint could be any Python code you like. In the context of it as just another language someone might want to use, this is fine. In the context of Cairo as a programming language used to write smart contracts deployed on a blockchain, it's not. Users could deploy contracts with hints that simply do
```python
while true:
pass
```and grind the network down to a halt, as nodes get stuck executing an infinite loop when calling the contract.
To address this, the starknet network maintains a list of *whitelisted* hints, which are the only ones that can be used in starknet contracts. These are the ones implemented in this VM.
#### Implementing Hints
Hints are essentially logic that is executed in each cairo step, before the next instruction, and which may interact with and modify the vm. We will first look into the broad execution loop and the dive into the different types of interaction hints can have with the vm.
While the original cairo-lang implementation executes these hints in python, we will instead be implementing their logic in go and matching each string of python code to a function in the vm's code. We will also be using an interface to abstract the hint processing part of the vm and allow greater flexibility when using the vm in other contexts.##### The HintProcessor interface
This `HintProcessor` interface will consist of two methods: `CompileHint`, which receives hint data from the compiled program and transforms it into whatever format is more convenient for hint execution, and `ExecuteHint`, which will receive this data and use it to execute the hint.
```go
type HintProcessor interface {
// Transforms hint data outputted by the VM into whichever format will be later used by ExecuteHint
CompileHint(hintParams *parser.HintParams, referenceManager *parser.ReferenceManager) (any, error)
// Executes the hint which's data is provided by a dynamic structure previously created by CompileHint
ExecuteHint(vm *VirtualMachine, hintData *any, constants *map[string]lambdaworks.Felt, execScopes *types.ExecutionScopes) error
}
```We will first look at how hint processing ties into the core vm execution loop, and then look into how this vm's implementation of the `HintProcessor` interface works:
##### VM execution loop
Before we begin executing steps, we will feed the hint-related information from the compiled program to the `HintProcessor`, and obtain what we call `HintData`, which will be later on used to execute the hint. As we can see, the compiled json stores the hint information in a map which connects pc offsets (at which pc offset the hint should be executed) to a list of hints (yes, more than one hint can be executed as a given pc), and we will use a similar structure to hold the compiled `HintData`.
```go
func (r *CairoRunner) BuildHintDataMap(hintProcessor vm.HintProcessor) (map[uint][]any, error) {
hintDataMap := make(map[uint][]any)
for pc, hintsParams := range r.Program.Hints {
hintDatas := make([]any, 0, len(hintsParams))
for _, hintParam := range hintsParams {
data, err := hintProcessor.CompileHint(&hintParam)
if err != nil {
return nil, err
}
hintDatas = append(hintDatas, data)
}
hintDataMap[pc] = hintDatas
}return hintDataMap, nil
}
```Once we have our map of `HintData`s we can start executing cairo steps. Before fetching the next instruction, we will check if we have hints to run for the current pc, and if we do, the `HintProcessor` will execute each hint using the corresponding `HintData`.
```go
func (v *VirtualMachine) Step(hintProcessor HintProcessor, hintDataMap *map[uint][]any) error {
// Run Hint
hintDatas, ok := (*hintDataMap)[v.RunContext.Pc.Offset]
if ok {
for i := 0; i < len(hintDatas); i++ {
err := hintProcessor.ExecuteHint(v, &hintDatas[i])
if err != nil {
return err
}
}
}// Run Instruction
encoded_instruction, err := v.Segments.Memory.Get(v.RunContext.Pc)
```##### Implementing a HintProcessor: ExecuteHint
This method will receive a `HintData`, and match its `Code` field, which contains the python code as a string, to a go function that implements its logic:
```go
func (p *CairoVmHintProcessor) ExecuteHint(vm *vm.VirtualMachine, hintData *any) error {
data, ok := (*hintData).(HintData)
if !ok {
return errors.New("Wrong Hint Data")
}
switch data.Code {
case ADD_SEGMENT:
return addSegment(vm)
default:
return errors.Errorf("Unknown Hint: %s", data.Code)
}
}
```Where `ADD_SEGMENT` is a constant with the python code of the hint
```go
const ADD_SEGMENT = "memory[ap] = segments.add()"
```And the function `addSegment` implements its logic, which is to add a segment to the vm's memory:
```go
func addSegment(vm *VirtualMachine) error {
newSegmentBase := vm.Segments.AddSegment()
return vm.Segments.Memory.Insert(vm.RunContext.Ap, NewMaybeRelocatableRelocatable(newSegmentBase))
}
```Before we implement the `CompileHint` method, lets look at this crucial part of hint interaction with the vm:
##### Hint Interaction: Ids
Ids are hints' way to interact with variables in a cairo program. For example, if I declare a variable `n` in my cairo code, I can access that `n` variable inside a hint using `ids.n`.
The following cairo snippet would print the number 17
```py
let n = 17
%{ print(ids.n) %}
```To access these variables when implementing our hints in go we will implement the `IdsManager`, which will allow us to read and write cairo variables.
But interacting with cairo variables is not as easy as it sounds. In order to access them, we must first compute their address from a `Reference`###### References
As cairo variables are created during the vm's execution, we can't know their value beforehand. In order to solve this, the compiled program provides us with references for cairo variables available to hints. These references are instructions on where we can find a specific cairo variable in memory. For example, they might tell us to take the current value of the fp register, substract 1 from it, and access the memory value at that new address.
As these references come in string format, we need to parse them into a struct that we can efficiently use to compute addresses:
```go
type HintReference struct {
Offset1 OffsetValue
Offset2 OffsetValue
Dereference bool
ApTrackingData parser.ApTrackingData
ValueType string
}
```This struct matches the canonical string format for references: `"cast(Offset1 + Offset2, ValueType)"` (or `"[cast(Offset1 + Offset2, ValueType)]"`, in the case of Dereference being true ).
The first two fields: Offset1 and Offset2 will lead us to a particular memory value, the Dereference field will tell us if the value of the ids is that memory value we found (in case of false), or if we should use that value as an address to fetch the ids value from memory (in case of true), and the ValueType tells us what type the variable has (be it a felt, felt*, struct, etc). As we already know the context of the hints, we can ignore the ValueType.Now lets look at what an `OffsetValue` is:
```go
type OffsetValue struct {
ValueType offsetValueType
Immediate Felt
Value int
Register vm.Register
Dereference bool
}type offsetValueType uint
const (
Value offsetValueType = 0
Immediate offsetValueType = 1
Reference offsetValueType = 2
)
```There are three types of `OffsetValue`:
* Inmediate: Contains the value of the ids as a literal, for example `"cast(17, felt)"` is a reference to a felt with literal value 17. Only Offset1 can be of Immediate type, and the reference can't have Dereference = true
* Reference: It is made up of a Register (AP or FP) and a Value, it will tell us the location of an ids in memory by pointing to a memory cell relative to a register. For example `"cast(fp + (-1), felt*)"` is a reference with Offset1 of type Reference, with register FP and Value -1, and it leads us to an felt* value obtained from subtracting 1 from the current fp value. OffsetValues of type Reference can also have Dereference, for example: `"cast([fp + (-1)], felt)"` will lead us to a felt value located one cell before the one at the current register value. Both OffsetValues can be of type Reference in the same Reference
* Value: Only Offset2 can be of type value, it consists of a single field value and acts as a modifier to the first OffsetValue (which will always be of type Reference for this case). For example, we can add second OffsetValue of Value type with Value = 1 to the first Reference type example: `"cast(fp + (-1) + 2), felt*)"`, this will tell us to subtract 1 from fp, and then add 2 to it, and that will be our ids value.
When an offset doesn't exist in the reference, we use an OffsetValue of type Value with Value 0, which essentially does nothing, to represent it. This allows us to use go's zero value by default to make our code (and life) a bit simpler.
This can be a bit hard to grasp at first so lets look at some examples:
* Immediate Reference
String Reference: `cast(17, felt)`
Struct Reference: {Offset1: {ValueType: Immediate, Immedate: 17}, ValueType: "felt"}
Reference in words: The value of the ids is 17* Dereference with one offset of Type Reference
String Reference: `[cast(ap + 1, felt)]`
Struct Reference: {Offset1: {ValueType: Reference, Register: AP, Value: 1}, Dereference: true, ValueType: "felt"}
Reference in words: Take the current value of ap, add 1 to it and then fetch the memory value at that address* Two offsets of type Reference, Value
String Reference: `"cast(ap + 1 + (-2), felt*)"`
Struct Reference: {Offset1: {ValueType: Reference, Register: AP, Value: 1}, Offset2: {ValueType: Value, Value: -2}, ValueType: "felt*"}
Reference in words: Take the current value of ap, add 1 to it and then subtract 2 from it* Two offsets of type Reference (with Dereference), Value
String Reference: `"cast([ap + (-1)] + (-2), felt*)"`
Struct Reference: {Offset1: {ValueType: Reference, Register: AP, Value: 1, Dereference: true}, Offset2: {ValueType: Value, Value: -2}, ValueType: "felt*"}
Reference in words: Take the current value of ap, add 1 to it, fetch the memory value at that address and then subtract 2 from it* Two offsets of type Reference (with Dereference)
String Reference: `"cast([ap + (-1)] + [ap], felt)"`
Struct Reference: {Offset1: {ValueType: Reference, Register: AP, Value: 1, Dereference: true}, Offset2: {ValueType: Reference, Register: AP, Value: 0, Dereference: true}, ValueType: "felt*"}
Reference in words: Take the current value of ap, subtract 1 to it, fetch the memory value at that address. Take the current value of ap, fetch the memory value at that address. Add the two values we obtained.Now all thats left to analyze is in the reference is the `ApTracking`:
```go
type ApTrackingData struct {
Group int `json:"group"`
Offset int `json:"offset"`
}
```As the value of AP is constantly changing with each instruction executed, its not that simple to track variables who's references are based on ap. ApTracking is used to calculate the difference between the value of ap at the moment the variable was created/ enterted the scope of the function (and hence, the hint) and the value of ap at the moment the hint is executed. Each hint and each reference has its own ApTracking.
###### Computing addresses using References
The function used to fetch the value from an ids variable using a reference works as follows:
1. Check if the refeference has type Immediate, if this is true, return the Immediate field
2. Calculate the address of the ids variable using the reference (we will see how this works soon)
3. Check the Dereference field of the reference, if false, return the address we obtained in 2, if true, fetch the memory value at that address and return it.```go
func getValueFromReference(reference *HintReference, apTracking parser.ApTrackingData, vm *VirtualMachine) (*MaybeRelocatable, bool) {
// Handle the case of immediate
if reference.Offset1.ValueType == Immediate {
return NewMaybeRelocatableFelt(reference.Offset1.Immediate), true
}
addr, ok := getAddressFromReference(reference, apTracking, vm)
if ok {
if reference.Dereference {
val, err := vm.Segments.Memory.Get(addr)
if err == nil {
return val, true
}
} else {
return NewMaybeRelocatableRelocatable(addr), true
}
}
return nil, false
}
```In order to extract the value of an ids variable, we will first compute its address, this works as follows:
1. Check that the Offset1 is a Reference
2. Compute the value of Offset1
3. Add the value of Offet2. By either calculating it in the case of a Reference type, or just using the Value field in the case of a Value type.
4. Return the result obtained in step 3.
```go
func getAddressFromReference(reference *HintReference, apTracking parser.ApTrackingData, vm *VirtualMachine) (Relocatable, bool) {
if reference.Offset1.ValueType != Reference {
return Relocatable{}, false
}
offset1 := getOffsetValueReference(reference.Offset1, reference.ApTrackingData, apTracking, vm)
if offset1 != nil {
offset1_rel, is_rel := offset1.GetRelocatable()
if is_rel {
switch reference.Offset2.ValueType {
case Reference:
offset2 := getOffsetValueReference(reference.Offset2, reference.ApTrackingData, apTracking, vm)
if offset2 != nil {
res, err := offset1_rel.AddMaybeRelocatable(*offset2)
if err == nil {
return res, true
}
}
case Value:
res, err := offset1_rel.AddInt(reference.Offset2.Value)
if err == nil {
return res, true
}
}
}
}
return Relocatable{}, false}
```Now lets see how computing the value of an OffsetValue of type Reference works:
1. Determine a base address by checking the Register field of the OffsetValue. If the register is FP, use the current value of fp. If the register is AP, apply the necessary ap tracking corrections to ap and use it as base address.
2. Add the field Value of the OffsetValue to the base address
3. Check the Dereference field of the OffsetValue. If its false, return the address we obtained in 2. If is true, fetch the memory value at that address and return it```go
func getOffsetValueReference(offsetValue OffsetValue, refApTracking parser.ApTrackingData, hintApTracking parser.ApTrackingData, vm *VirtualMachine) *MaybeRelocatable {
var baseAddr Relocatable
ok := true
switch offsetValue.Register {
case FP:
baseAddr = vm.RunContext.Fp
case AP:
baseAddr, ok = applyApTrackingCorrection(vm.RunContext.Ap, refApTracking, hintApTracking)
}
if ok {
baseAddr, err := baseAddr.AddInt(offsetValue.Value)
if err == nil {
if offsetValue.Dereference {
// val will be nil if err is not nil, so we can ignore it
val, _ := vm.Segments.Memory.Get(baseAddr)
return val
} else {
return NewMaybeRelocatableRelocatable(baseAddr)
}
}
}
return nil
}
```Finally, the last thing we need is to know how ap tracking corrections work.
This function will receive an address (the current value of ap), the ap tracking data of the reference (unique to each reference) and the hint's ap tracking data (unique to each hint) and perform the following steps:
1. Assert that both ap tracking datas belong to the same group (aka their Group fields match)
2. Subtract the difference between the hint's ap tracking data's Offset field and the reference's ap tracking data's Offset field from the address (ap)
3. Return the value obtained in 2```go
func applyApTrackingCorrection(addr Relocatable, refApTracking parser.ApTrackingData, hintApTracking parser.ApTrackingData) (Relocatable, bool) {
// Reference & Hint ApTracking must belong to the same group
if refApTracking.Group == hintApTracking.Group {
addr, err := addr.SubUint(uint(hintApTracking.Offset - refApTracking.Offset))
if err == nil {
return addr, true
}
}
return Relocatable{}, false
}
```###### Implement the IdsManager
Now that we have tackled reference management, we can implement the `IdsManager`, which will allow us to "forget" what references are when implementing hints.
The IdsManager has the following structure:
* References: A map of all the ids variables the hint has access to, it maps the name of the cairo varaible to a HintReference (the parsed version of the compiled program's Reference)
* HintAptracking: The ap tracking data unique to the hint```go
type IdsManager struct {
References map[string]HintReference
HintApTracking parser.ApTrackingData
}
```And we can also implement friendlier versions of the functions we implemented in the previous section, that take the name of the ids variable, instead of the reference and hint ap tracking data:
```go
// Returns the value of an identifier as a MaybeRelocatable
func (ids *IdsManager) Get(name string, vm *VirtualMachine) (*MaybeRelocatable, error) {
reference, ok := ids.References[name]
if ok {
val, ok := getValueFromReference(&reference, ids.HintApTracking, vm)
if ok {
return val, nil
}
}
return nil, ErrUnknownIdentifier(name)
}// Returns the address of an identifier given its name
func (ids *IdsManager) GetAddr(name string, vm *VirtualMachine) (Relocatable, error) {
reference, ok := ids.References[name]
if ok {
addr, ok := getAddressFromReference(&reference, ids.HintApTracking, vm)
if ok {
return addr, nil
}
}
return Relocatable{}, ErrUnknownIdentifier(name)
}
```We can also make more specialized versions of the Get method, that will also handle conversions to Felt or Relocatable, as we will almost always know which type of value we are expecting when implementing hints:
```go
// Returns the value of an identifier as a Felt
func (ids *IdsManager) GetFelt(name string, vm *VirtualMachine) (lambdaworks.Felt, error) {
val, err := ids.Get(name, vm)
if err != nil {
return lambdaworks.Felt{}, err
}
felt, is_felt := val.GetFelt()
if !is_felt {
return lambdaworks.Felt{}, ErrIdentifierNotFelt(name)
}
return felt, nil
}// Returns the value of an identifier as a Relocatable
func (ids *IdsManager) GetRelocatable(name string, vm *VirtualMachine) (Relocatable, error) {
val, err := ids.Get(name, vm)
if err != nil {
return Relocatable{}, err
}
relocatable, is_relocatable := val.GetRelocatable()
if !is_relocatable {
return Relocatable{}, errors.Errorf("Identifier %s is not a Relocatable", name)
}
return relocatable, nil
}
```Lastly, we can also implement a method to insert a value into an ids variable (as we already know how to calculate their address)
```go
// Inserts value into memory given its identifier name
func (ids *IdsManager) Insert(name string, value *MaybeRelocatable, vm *VirtualMachine) error {addr, err := ids.GetAddr(name, vm)
if err != nil {
return err
}
return vm.Segments.Memory.Insert(addr, value)
}
```##### Implementing a HintProcessor: CompileHint
The `CompileHint` method will be in charge of converting the hint-related data from the compiled json into a format that our processor can use to execute each hint. For our `CairoVmHintProcessor` we will use the following struct:
```go
type HintData struct {
Ids IdsManager
Code string
}
```
Where IdsManager is the struct we just saw in the previous section, a struct which manages all kinds of interaction between the hint implemented in go and the cairo variables available to it, and Code is the python code of the hint.And we will implement a `CompileHint` method which receives the hint's data from the compiled program in the form of `HintParams`, and a reference to the compiled json's `ReferenceManager`, a list of references to all ids variables in the program. And performs the following steps:
1. Create a map from variable name to HintReference
2. Iterate over the hintParams's `ReferenceIds` field (a map from an ids name to an index in the ReferenceManager). For each iteration:1. Remove the path from the reference's name (shortening full paths such a "__main__.a" to just the variable name "a"),
2. Fetch the reference from the ReferenceManager (using the index from the ReferenceIds)
3. Parse the Reference into a `HintReference`
4. Insert the parsed reference into the map we created in 1, using the shortened name (from 2.1) as a key3. Create an IdsManager using the map from 1, and the hintParam's ap tracking data
4. Create a `HintData` struct with the IdsManager and the hintParam's Code```go
func (p *CairoVmHintProcessor) CompileHint(hintParams *parser.HintParams, referenceManager *parser.ReferenceManager) (any, error) {
references := make(map[string]HintReference, 0)
for name, n := range hintParams.FlowTrackingData.ReferenceIds {
if int(n) >= len(referenceManager.References) {
return nil, errors.New("Reference not found in ReferenceManager")
}
split := strings.Split(name, ".")
name = split[len(split)-1]
references[name] = ParseHintReference(referenceManager.References[n])
}
ids := NewIdsManager(references, hintParams.FlowTrackingData.APTracking)
return HintData{Ids: ids, Code: hintParams.Code}, nil
}
```##### Hint Interaction: Constants
###### How are Constants handled by hints and the cairo compiler
Hints can also access constant variables using the ids syntax, for example, a hint can access the `MAX_SIZE` constant from a cairo program using `ids.MAX_SIZE`. While the behaviour from the hint's standpoint is identical to regular ids variables, they are handled differently by both the compiler and the vm.
They are part of the compiled program's `Idenfifiers` field, and can be identified by the `const` type. We may also find aliases for them in the `Identifiers` section, aliases happen when a cairo file imports constants from another cairo file, in such cases we will have an identifier of type `const` under the file where the constant was declared's path, and an identifier of type `alias` under the file where the constant was imported's path, pointing to the original constant's identifier. For example:
```json
"starkware.cairo.common.cairo_keccak.keccak.BLOCK_SIZE": {
"destination": "starkware.cairo.common.cairo_keccak.packed_keccak.BLOCK_SIZE",
"type": "alias"
},
"starkware.cairo.common.cairo_keccak.packed_keccak.BLOCK_SIZE": {
"type": "const",
"value": 3
},
```This is an extract from a compiled cairo program, where we can see that there is a constant `BLOCK_SIZE`, with value 3, declared in packed_keccak.cairo file, that was then imported by the keccak.cairo file.
###### How does the vm extract the constants for hint execution
As constants are not unique to any specific hint, they are not provided to the HintProcessor's `CompileHint` method, but are instead provided directly to the `ExecuteHint` method. Before providing these constants, we need to first extract them from the Identifiers field of the compiled program. This works as follows:
1. Create a map to store the constants, maping full path constant names to their Felt value
2. Iterate over the program's `Identifiers` field, and check the type of each identifier. If the identifier is of type `const`, add its value to the map created in 1. If the identifier is of type `alias`, search for the identifier at its destination (we will see how to do this next), and if its of type `const`, add it to the map created in 1 under the alias' name.
3. Return the map created in 1```go
func (p *Program) ExtractConstants() map[string]lambdaworks.Felt {
constants := make(map[string]lambdaworks.Felt)
for name, identifier := range p.Identifiers {
switch identifier.Type {
case "const":
constants[name] = identifier.Value
case "alias":
val, ok := searchConstFromAlias(identifier.Destination, &p.Identifiers)
if ok {
constants[name] = val
}
}
}
return constants
}
```In order to search for the aliased identifier we need to do so recursively, as constants can be imported form file A into file B, then from file B into file C and so on.
To do so we use a recursive function which receives the destination field of an alias type identifier and a reference to the identifiers map. It will then look for the identifier using the received destination. If the new identifier is a constant, it wil return its value, if it is an alias it will call itself again with the new alias' destintation, and if its none, it will return false, indicating that the alias was not pointing to a constant.```go
func searchConstFromAlias(destination string, identifiers *map[string]Identifier) (lambdaworks.Felt, bool) {
identifier, ok := (*identifiers)[destination]
if ok {
switch identifier.Type {
case "const":
return identifier.Value, true
case "alias":
return searchConstFromAlias(identifier.Destination, identifiers)
}
}
return lambdaworks.Felt{}, false
}
```###### How does the IdsManager handle constants
Before looking into how the IdsManager handles constants, we'll have to add a new field to it:
```go
type IdsManager struct {
References map[string]HintReference
HintApTracking parser.ApTrackingData
AccessibleScopes []string
}
```
AccessibleScopes is a list of paths that a hint has access to, for example, if we were to write a hint in a function `foo` of a cairo program called `program`, that hint's accessible scopes will look something along the likes of `["program", "program.foo"]`. This list is taken directly from the `HintParams`' `AccessibleScopes` field in the compiled json.We can use this accessible scopes to determine the correct path for a cairo constant when implementing a hint. To do so, we will be searching for a constant in the map of constants provided by the vm, using the name of the constant in the hint and the possible paths in the accessible scopes, going from innermost (in the example, "program.foo"), to outermost (in the example, "program").
We will be adding this behaviour to the `IdsManager`, by adding a function that will return the value of a constant given its name (without its full path) and the map of constants, following these steps:1. Iterate over the list of accessible scopes in reverse order
2. For each path in accessible scopes, append the name of the constant to get the full-path constant's name
3. Using the full-path constant names, try to fetch from the constants map
4. Once a match is found, return the value from the constant map```go
func (ids *IdsManager) GetConst(name string, constants *map[string]lambdaworks.Felt) (lambdaworks.Felt, error) {
// Hints should always have accessible scopes
if len(ids.AccessibleScopes) != 0 {
// Accessible scopes are listed from outer to inner
for i := len(ids.AccessibleScopes) - 1; i >= 0; i-- {
constant, ok := (*constants)[ids.AccessibleScopes[i]+"."+name]
if ok {
return constant, nil
}
}
}
return lambdaworks.FeltZero(), errors.Errorf("Missing constant %s", name)
}
```##### Hint Interaction: ExecutionScopes
Up until now we saw how hints can interact with the vm and the cairo variables, but what about the interaction between hints themselves?
To answer this question, we will introduce the concept of `Execution Scopes`, they consist of a stack of dictionaries that can hold any kind of variables. These scopes are accessible to all hints, allowing data to be shared between hints without the cairo program being aware of them. As it consists of a stack of dictionaries (from now on referred to as scopes), hints will only be able to interact with the last (or top level) scope. Hints can also remove and create new scopes, we will call these operations `ExitScope` and `EnterScope`. To better illustrate this behaviour, lets make a generic example:* HINT A: Adds variable n = 3 (Scopes = [{n: 3}])
* HINT B: Fetches variable n and updates its value to 5 (Scopes = [{n: 5}])
* HINT C: Uses the EnterScope operation (Scopes = [{n: 5}, {}])
* HINT D: Adds variable n = 3 (Scopes = [{n: 5}, {n: 3}])
* HINT E: Prints the value of n (3), then used the ExitScope operation (Scopes = [{n: 5}])
* HINT F: Prints the value of n (5)Now that we know how execution scopes work, implementing them is quite simple:
```go
type ExecutionScopes struct {
data []map[string]interface{}
}
```
We have a stack (represented as a slice), of maps that connect a varaible's name, to its value, accepting any kind of variables as valueWe should also note that when creating an `ExecutionScopes`, it comes with one initial scope (called main scope), which can't be exited
```go
func NewExecutionScopes() *ExecutionScopes {
data := make([]map[string]interface{}, 1)
data[0] = make(map[string]interface{})
return &ExecutionScopes{data}
}
```With this struct we can implement the basic operations:
*EnterScope*
Adds a new scope to the stack, which is received by the method
```go
func (es *ExecutionScopes) EnterScope(newScopeLocals map[string]interface{}) {
es.data = append(es.data, newScopeLocals)}
```*ExitScope*
Removes the last scope from the stack, guards that the main scope is not removed by the operation.
```go
func (es *ExecutionScopes) ExitScope() error {
if len(es.data) < 2 {
return ErrCannotExitMainScop
}
es.data = es.data[len(es.data) - 1]return nil
}
```*AssignOrUpdateVariable*
Inserts a variable to the current scope (aka the top one in the stack), overwitting the previous value if it exists
```go
func (es *ExecutionScopes) AssignOrUpdateVariable(varName string, varValue interface{}) {
locals, err := es.getLocalVariablesMut()
if err != nil {
return
}
(*locals)[varName] = varValue
}
```*Get*
Fetches a variable from the current scope
```go
func (es *ExecutionScopes) Get(varName string) (interface{}, error) {
locals, err := es.GetLocalVariables()
if err != nil {
return nil, err
}
val, prs := locals[varName]
if !prs {
return nil, ErrVariableNotInScope(varName)
}
return val, nil
}
```*DeleteVariable*
Removes a variable from the current scope
```go
func (es *ExecutionScopes) DeleteVariable(varName string) {
locals, err := es.getLocalVariablesMut()
if err != nil {
return
}
delete(*locals, varName)
}
```And the helper methods for these methods:
```go
func (es *ExecutionScopes) getLocalVariablesMut() (*map[string]interface{}, error) {
locals, err := es.GetLocalVariables()
if err != nil {
return nil, err
}
return &locals, nil
}func (es *ExecutionScopes) GetLocalVariables() (map[string]interface{}, error) {
if len(es.data) > 0 {
return es.data[len(es.data)-1], nil
}
return nil, ExecutionScopesError(errors.Errorf("Every enter_scope() requires a corresponding exit_scope()."))
}
```##### Hint Implementation Examples
Now that we have all the necessary tools to begin implementing hints, lets look at some examples:
###### IS_LE_FELT
The python code we have to implement is the following:
"memory[ap] = 0 if (ids.a % PRIME) <= (ids.b % PRIME) else 1"
The first thing we notice is that its uses the ids variables "a" and "b" so this gives as an opportunity to use our `IdsManager`. We can also look at the context of this hint, in this case the common library function is_le_felt (in the math_cmp module) to see that ids.a and ids.b are both felt values.
We can divide the hint into the following steps:
1. Fetch ids.a as a Felt
2. Fetch ids.b as a Felt
3. Compare the values of a and b (we don't need to perform % PRIME, as our Felt type already takes care of it)
4. Insert either 0 or 1 at the current value of ap depending on the comparison in 3And implement the hint:
```go
func isLeFelt(ids IdsManager, vm *VirtualMachine) error {
a, err := ids.GetFelt("a", vm)
if err != nil {
return err
}
b, err := ids.GetFelt("b", vm)
if err != nil {
return err
}
if a.Cmp(b) != 1 {
return vm.Segments.Memory.Insert(vm.RunContext.Ap, NewMaybeRelocatableFelt(FeltZero()))
}
return vm.Segments.Memory.Insert(vm.RunContext.Ap, NewMaybeRelocatableFelt(FeltOne()))
}
```###### ASSERT_LE_FELT_EXCLUDED_0
The python code we have to implement is the following:
"memory[ap] = 1 if excluded != 0 else 0"
This hint is quite similar to the previous example, except that instead of comparing ids variables it uses this "excluded" variable. As this variable is neither an ids, nor is it created during the hint, we can tell that it is a variable created by a previous hint, shared through the current execution scope. With this knowledge, we can divide the hint into the following set of steps:
1. Fetch excluded from the execution scopes
2. Cast the excluded variable to a concrete type. In this case, as we have previously implemented the hint that creates this variable, we know its type is 'int'
3. Compare the values of excluded vs 0
4. Insert either 0 or 1 at the current value of ap depending on the comparison in 3```go
func assertLeFeltExcluded0(vm *VirtualMachine, scopes *ExecutionScopes) error {
// Fetch scope var
excludedAny, err := scopes.Get("excluded")
if err != nil {
return err
}
excluded, ok := excludedAny.(int)
if !ok {
return errors.New("excluded not in scope")
}
if excluded == 0 {
return vm.Segments.Memory.Insert(vm.RunContext.Ap, NewMaybeRelocatableFelt(FeltZero()))
}
return vm.Segments.Memory.Insert(vm.RunContext.Ap, NewMaybeRelocatableFelt(FeltOne()))
}
```#### Proof mode
TODO
#### Temporary Segments
TODO