This article is first part of the series of articles, where we are trying to demystify Cairo White Paper in simple and easy to understand terminology.
Cairo is an architecture that allows describing the computation in the form of a computer program and then generating a proof of integrity for that computation.
The Cairo architecture is designed for:
(1) ease of writing and reading programs to be used as provable statements,
(2) efficient proving, based on the STARK proof system.
Cairo implements an assembly language and on top of which, a full programming language in which the computation can be described.
Cairo uses Zero Knowledge cryptography, which is much more efficient in allowing to verify the computation by:
1. Introducing zero-knowledge for privacy and
2. Enabling succinct verification, which is exponentially more efficient than re-execution.
Note that while Cairo was designed to be used with the STARK protocol, it can also be used with many other finite field-based proof systems, such as SNARKs.
If we look into mathematics of verification, computation being verified needs to be represented in terms of polynomial equations over a finite field. This process is called arithmetization. Examples of such representations are arithmetic circuits, Quadratic Span Programs (aka R1CS) and Algebraic Intermediate Representations (AIRs).
Requirement of representing computation in terms of polynomial is complex and challenging. In addition, some of the approaches for doing the arithmetization process result in unnecessary computation (in case of branches and loops). Please refer to Cairo White Paper for more details.
CPUs and the Von Neumann architecture are well suited to address challenge of representing computation in terms of polynomial: One can design a single universal system of polynomial equations representing the execution of an arbitrary computer program written for some fixed instruction set. In the context of preprocessing SNARKs, this approach was used in the vnTinyRAM system.
Cairo uses von Neumann architecture that can be used with the STARK proof system to generate proofs of computational integrity. As such, it is the first STARK von Neumann architecture.
The main advantages of Cairo are:
Efficiency: The Cairo instruction set was chosen so that the corresponding AIR will be as efficient as possible. Construction of general von neumann architecture requires around 1000 variables per cycle. Compare this to the 51 variables required by Cairo’s AIR (more on this in later articles). Moreover, Cairo implements idea of builtins (more on this in later articles), which further reduces overhead of executing predefined operations (for example, applying a cryptographic hash function).
Algebraic RISC: Cairo has a small yet relatively expressive, instruction set. All of the instructions are encoded using 15 flags and three integers.
Nondeterministic Continuous Read-Only Random-Access Memory Instead of the conventional read-write memory model, Cairo uses a unique memory model, allowing a very efficient AIR implementation
Permutation range-checks Permutation range-checks, allow one to check (in an AIR) that a value is in the range [0; 216) using only 3 trace cells (compared to the 16 trace cells required by the approach of using the binary representation). Each instruction uses 3 such range-checked values, so such efficiency is crucial.
Trace cells are described in section 2.1
Builtins: Builtins significantly reduce the overhead which was added due to the transition from hand-written AIR to Cairo code. Builtins are described in detail later in the article
Efficient public memory An important feature — each memory cell which is shared with the verifier, adds a verification cost of only 4 arithmetic operations. This results in reduction of gas cost.
Nondeterministic Von Neumann advantages: (1) proving programs where only the hash (rather than the code) is known to the verifier and (2) proving multiple different programs in one proof to reduce the amortized verification costs
The name Cairo comes from the term “CPU AIR” — an AIR implementing the concept of a CPU
2. Design principles
Cairo provides a layer of abstraction (called Algebraic Intermediate Representation, AIR) around STARKs that simplifies the way the computation is described.
This section presents architecture and design of Cairo.
2.1 Algebraic Intermediate Representation (AIR) and Randomized AIR with Preprocessing (RAP)
The STARK proof system is based on AIRs (Algebraic Intermediate Representation.
Decomposition of AIR:
● AIR is a list of polynomial constraints (or equations)
● These polynomial equations operate on a two-dimensional list of field elements
● These field elements are elements of a finite field
● Field elements are called traces or witness
● STARK proof system proves that there exists a trace satisfying the constraints.
The Cairo AIR is, in fact, not really an AIR, but it is a Randomized AIR with Preprocessing (RAP). A RAP is a broader definition of the term AIR, that allows an additional step of interaction between the prover and the verifier, further reducing the verification cost.
2.2 Von Neumann architecture
In Cairo framework, computation described as an AIR, which is different than describing computation described as a computer program written on a general-purpose computer architecture.
Two main approaches that handle translation to AIR are:
● The ASIC approach: ASIC (Application Specific Integrated Circuit) are also known as semiconductor chips. In this approach, compiler (or translator) takes program as input and translates it to AIR, in the same manner semiconductor chips are designed by writing specification of chip as a program (in Verilog or VHDL) and translating the specification to semiconductor chips using translation tools.
● The CPU approach: In this approach, single AIR is designed which is independent of the computation being proved that behaves like a CPU — the loop of fetching an instruction from memory, executing that instruction, and continuing to the next instruction.
ASIC approach does not have overhead of fetching/decoding of instruction and data from memory.
However, use of builtins reduces overheads of CPU approach, causing CPU approach to perform similar to ASIC for many computations. If a certain computation cannot take advantage of existing builtins, one has a trade-off: you may choose between (1) accepting the performance loss and (2) designing a new builtin that will improve the computation’s performance.
CPU approach has many advantages:
● A small constraint set: This improves the verification costs.
● A single AIR: Compared to ASIC based AIR, where different verifier needs to be implemented for different computations, CPU approach requires that AIR has to be implemented only once (rather than per application)
● CPU approach simplifies the process of building recursive STARK proofs since AIR is independent of application
CPU approach is outcome of following Von Neumann architecture, in which the program’s bytecode and data are located within the same memory.
Advantages of CPU approach based on Von Neumann architecture:
Bootloading: Loading programs from their hash
A program, called “the bootloader” computes and outputs the hash of the bytecode of
another program and then starts executing it as above. This way, the verifier only needs to know the hash of the program being executed and not its full bytecode.
This improves both privacy and scalability:
Running several different programs in the same proof
The bootloader described above can be extended to execute several programs.
Hashes of different programs can be combined to generate single hash and generate a single proof attesting to the validity of all of the programs, and the verification costs will be shared among these programs.
As an example: Verification of the proofs of two programs of 1 million steps
● Total verification cost of each verified separately: 2 log2(T) =~ 794
● Total verification cost of both verified together: log2(2T) =~ 438
Cost reduction of factor 1.8 (or almost 2)
Advanced optimizations (just in time compilation and bytecode generation)
Some advanced optimizations are possible via automatic generation of bytecode during the execution of a program. Instead of fetching values from the memory in a function, a program may clone the function’s bytecode and place some values directly inside the instructions that require them.
All other forms of bytecode generation are also possible: A program can generate Cairo bytecode according to some rules and then execute it.
Incrementally verifiable computation (recursive proofs)
A recursive proof is a proof attesting to the validity of another proof. For example, if A1 is a proof of a computation A0, and A2 is proof of A1, then this concept is called “incrementally verifiable computation”.
ASIC approach is not suitable for recursive proofs because of circular dependency of the program being proven depending on the verifier’s code.
However, with the CPU approach, the verifier does not depend on the program, which simplifies recursive proving — as the circular dependency breaks.
2.3 The instruction set
The instruction set is the set of operations any CPU can perform in a single step, and it’s applicable to Cairo CPU as well.
2.3.1 Key metric for Cairo Instruction Set
The most important metric when designing an AIR (and, therefore, when designing an instruction set that will be executed by an AIR) is to minimize the number of trace cells used, which is equivalent to the number of variables in a system of polynomial equations.
2.3.2 Algebraic RISC
Cairo instruction set tries to create a balance between (1) a minimal set of simple instructions that require a very small number of trace cells and (2) powerful enough instructions that will reduce the number of required steps. As such:
1. Addition and multiplication are supported over the base field (for example, modulo a fixed prime number) rather than for 64-bit integers
2. Checking whether two values are equal is supported, but there is no instruction for checking whether a certain value is less than another value (such an instruction would have required many more trace cells — since a finite field does not support an algebraic-friendly linear ordering of its elements).
This kind of instruction set is called Algebraic RISC (Reduced Instruction Set Computer):
● “RISC” refers to the minimality of the instruction set
● “Algebraic” refers to the fact that the supported operations are field operations
The Cairo instruction set can simulate any Turing Machine and hence is Turing-complete.
In order to support comparison operation, Cairo uses concept of builtins (described later in the article).
In programming jargon, register refers to the operands on which a computer instruction operate on. There are two kinds of registers:
● General purpose register: They reside on the CPU, hence access to them is cheaper/faster
● Memory: Memory is external to CPU, hence access to memory is expensive and slow
As we have seen, since constraints on optimization of AIR (hence on instructions executed by AIR) is different than optimization constraints on general programming instructions. In AIR, we try to minimize number of trace cells.
Different approaches to implement registers are:
1. No general-purpose registers — all of the instructions are performed directly on memory.
2. Some general-purpose registers — instructions are performed on those registers and usually, at most, one memory cell.
3. Bounded stack machines — those can be thought of as machines with many general-purpose registers, where the different instructions shift the values between the registers.
Optimization goals of Cairo leads to option 1. One Cairo instruction may deal with up to 3 values from the memory and perform one arithmetic operation (either addition or multiplication) on
two of them, and store the result in the third.
Cairo has 2 address registers, called ap and fp, which are used for specifying which memory cells the instruction operates on. For each of the 3 values in an instruction, you can choose either an address of the form ap + off or fp + off where off is a constant offset in the range [-2**15, 2**15]. Thus, an instruction may involve any 3 memory cells out of 2 * 2**16 = 131072. In many aspects, this is similar to having these many registers (implemented in a much cheaper way).
Accessing memory cells that cannot be described in the form above is possible using an instruction that takes the value of a memory cell and treats it as the address of another memory cell (concept of pointer)
An important aspect in the design of Cairo was to allow the programmer to take advantage of nondeterministic programming.
To understand what is meaning of non-determinism in this context, it means in order to prove a computation, prover can do additional work that is not part of the computation.
Let’s take an example to elaborate:
When computing y=sqrt(x), instead of computing sqrt(x), in non-deterministic programming, prover can prove y**2 = x (using hints, since this computation is not provided to verifier). And this can be done using Cairo (multiplication) instruction.
Cairo introduces the notion of “hints” to take advantage of nondeterministic programming. Hints are pieces of code that are inserted between Cairo instructions where additional work is required by the prover. Since hints are only used by the prover, hints can be written in any programming language.
The existing implementation of Cairo uses Python as the language for writing hints.
Most prevalent memory model used in today’s computer architecture is random access read-write memory, abbreviated as RAM, where an instruction accesses a random memory location for reading or writing.
Different types of memory models are:
● Read-Write Memory This is the most familiar memory model, described above.
● Write-Once Memory Memory is written only once, and can’t be read before writing
● Nondeterministic Read-Only Memory The prover chooses all the values of the memory, and the memory is immutable. The instruction may only read from it.
For AIR, the main trade-off in choosing a memory model is between allowing efficient implementation of various algorithms and efficient representation of each memory access.
It turns out that, for AIR, for most of the memory accesses in programs, a Nondeterministic Read-Only Memory is sufficient, hence Cairo uses a nondeterministic read-only memory model.
In fact, one more restriction is applied to gain efficiency. The memory address space is continuous, which means that if there is a memory access to address x and another memory access to address y, then for every address x < a < y, there must be a memory access to this address. This additional restriction on the prover allows a very efficient AIR implementation of memory accesses with only 5 trace cells per access
2.6.1 Public memory
An important advantage of the way Cairo implements the nondeterministic read-only memory as an AIR is that it allows the prover to efficiently convince the verifier that some memory addresses contain certain values, through shared memory, also called public memory.
This mechanism is extremely efficient compared to the approach where memory is not public, and more constraints needs to be added since address is not known in advance when memory is not public.
Public memory mechanism can be used for:
● Loading the bytecode (see Section 3.3) of the program into public memory
● Passing the arguments and return values of the program (between the prover and the verifier).
2.6.2 Handling on-chain-data in blockchain applications
In blockchain applications, it’s extremely important to reduce the gas cost of “data availability” or “on-chain data”, meaning of which is to make data available on-chain to users for inspection. In Cairo framework, “data-availability” is achieved through public memory mechanism: one may put the data in a contiguous segment of memory cells and include those cells in the public memory. Then, the verification cost is the sum of the cost of transmitting the data to the blockchain and the computation cost of only 4 arithmetic operations per data element and the hash of the data.
2.7 Program input and program output
A Cairo program may have:
Program input — the input of the program. This data is not shared with the verifier. In terms of proof systems, this is the witness. Hints may be used to parse the input.
Program output — the data that is generated during the execution of the program, that is shared with the verifier. It will become part of the data externalized using the public memory mechanism described earlier.
Builtin are very important and powerful concept. They represent paradigm shift in the way computation is approached.
Since Cairo framework is designed for efficient AIR implementation, implementation of certain operations in Cairo instructions are extremely costly. Integer division is one such example.
Cairo introduces the concept of builtins to address implementation of such computations.
A builtin enforces some constraints (which depend on the builtin) on the Cairo memory.
As an example of builtin, range-check builtin enforce that all the values for the memory cells in some fixed address range are within the range [0, 2**128]. Memory cells constrained by the range-check builtin are called range-checked cells.
Builtins are invoked by simply reading from or writing values in the memory cells affected by the builtin. This kind of communication is also known as memory-mapped I/O.
Take the range-check builtin, for example, if you want to verify that a value x is within the range [0; 2128), just copy it (using a Cairo instruction) to a range-checked cell. If you want to verify that x is within the range [0;B] where B < 2**128 you can write x to one range-checked cell and B — X to another.
Adding builtins does not affect the CPU constraints. It just means that the same memory is shared between the CPU and the builtins, as shown in the Figure.
Builtins as Memory Mapped IO