
Binary format for transactions and modules.

This module provides a simple Rust abstraction over the binary format, that is, the format of modules stored on chain and of the code section of a transaction.

file_format_common.rs provides the constant values for entities in the binary format. (The binary format is still evolving, so check back here periodically for changes.)

Overall the binary format is structured in a number of sections:

  • Header: this must start at offset 0 in the binary. It contains a blob that starts every Diem binary, followed by the version of the VM used to compile the code, and finally the number of tables present in this binary.
  • Table Specification: a sequence of tuples of the form (table type, starting_offset, byte_count). The number of entries is given by the last field of the header. There can be only a single entry per table type. The starting offset is measured from the beginning of the binary. Tables must cover the entire size of the binary blob and cannot overlap.
  • Table Content: the serialized form of the specific entries in the table. Those roughly map to the structs defined in this module. Entries in each table must be unique.
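The layout rules above (one entry per table type, no overlap, full coverage) can be sketched as a small validation routine. This is only an illustration under stated assumptions: TableSpec, LayoutError, and check_layout are hypothetical names, the field widths are chosen for convenience, and the sketch assumes the tables start at a caller-supplied content_start (for example, right after the header and table specification) and run contiguously to the end of the binary. The actual encoding is defined by file_format_common.rs.

```rust
use std::collections::HashSet;

/// One entry of the table specification: (table type, starting offset, byte count).
/// Hypothetical type; the field widths are an assumption for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TableSpec {
    pub kind: u8,
    pub offset: u32,
    pub count: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LayoutError {
    /// The same table type appears more than once.
    DuplicateTable(u8),
    /// A table does not start where the previous one ended.
    GapOrOverlap { expected: u64, found: u64 },
    /// The last table does not end at the end of the binary.
    SizeMismatch { expected: u64, found: u64 },
}

/// Checks that every table type appears once and that the tables tile the
/// byte range `content_start..binary_len` with no gaps and no overlaps.
pub fn check_layout(
    specs: &[TableSpec],
    content_start: u64,
    binary_len: u64,
) -> Result<(), LayoutError> {
    // Rule 1: a single entry per table type.
    let mut seen = HashSet::new();
    for s in specs {
        if !seen.insert(s.kind) {
            return Err(LayoutError::DuplicateTable(s.kind));
        }
    }

    // Rules 2 and 3: walk the tables in offset order and require each one
    // to begin exactly where the previous one ended.
    let mut sorted: Vec<TableSpec> = specs.to_vec();
    sorted.sort_by_key(|s| s.offset);
    let mut cursor = content_start;
    for s in &sorted {
        let offset = u64::from(s.offset);
        if offset != cursor {
            return Err(LayoutError::GapOrOverlap { expected: cursor, found: offset });
        }
        cursor = offset + u64::from(s.count);
    }
    if cursor != binary_len {
        return Err(LayoutError::SizeMismatch { expected: binary_len, found: cursor });
    }
    Ok(())
}
```

Sorting by offset first turns the overlap and coverage checks into a single linear pass, so the routine does not depend on the order in which the entries were written.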

We have two formats: one for modules, represented here by CompiledModule, and another for transaction scripts, represented by CompiledScript. Building those tables and passing them to the serializer (serializer.rs) generates a binary of the form described. Vectors in those structs translate to tables and table specifications.
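To make the vector-to-table mapping concrete, here is a toy sketch of the idea. It is not the code in serializer.rs: ToyModule, TableEntry, build_tables, the table type tags, and the length-prefixed entry encoding are all assumptions made for illustration, as is the choice to skip empty vectors.

```rust
/// Hypothetical table type tags; the real values live in file_format_common.rs.
pub const IDENTIFIERS: u8 = 1;
pub const CONSTANTS: u8 = 2;

/// Toy stand-in for a compiled module: each vector becomes one table.
pub struct ToyModule {
    pub identifiers: Vec<String>,
    pub constants: Vec<Vec<u8>>,
}

/// One table specification entry: (table type, starting offset, byte count).
#[derive(Debug, PartialEq, Eq)]
pub struct TableEntry {
    pub kind: u8,
    pub offset: u32,
    pub count: u32,
}

/// Appends one table to `content` and records where it landed.
/// Each entry is written as a 4-byte little-endian length followed by its
/// bytes (an assumed encoding, chosen only to keep the sketch simple).
fn push_table(
    kind: u8,
    entries: &[Vec<u8>],
    base: u32,
    content: &mut Vec<u8>,
    specs: &mut Vec<TableEntry>,
) {
    if entries.is_empty() {
        return; // assumption: an empty vector produces no table
    }
    let start = content.len();
    for e in entries {
        content.extend_from_slice(&(e.len() as u32).to_le_bytes());
        content.extend_from_slice(e);
    }
    specs.push(TableEntry {
        kind,
        offset: base + start as u32,
        count: (content.len() - start) as u32,
    });
}

/// Turns every vector of the module into a table. `base` is the offset, from
/// the beginning of the binary, at which the table contents will be placed.
pub fn build_tables(m: &ToyModule, base: u32) -> (Vec<TableEntry>, Vec<u8>) {
    let mut content = Vec::new();
    let mut specs = Vec::new();
    let idents: Vec<Vec<u8>> = m
        .identifiers
        .iter()
        .map(|s| s.as_bytes().to_vec())
        .collect();
    push_table(IDENTIFIERS, &idents, base, &mut content, &mut specs);
    push_table(CONSTANTS, &m.constants, base, &mut content, &mut specs);
    (specs, content)
}
```

Because the tables are written back to back, the recorded entries are contiguous and non-overlapping by construction, which matches the constraints the table specification places on a well-formed binary.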

Structs

Enums

  • An Ability classifies what operations are permitted for a given type.
  • Bytecode is a VM instruction of variable size. The type of the bytecode (opcode) defines the size of the bytecode.
  • A SignatureToken is a type declaration for a location.
  • StructFieldInformation indicates whether a struct is native or has user-specified fields.
  • Visibility restricts the accessibility of the associated entity.

Constants

  • Index 0 into the LocalsSignaturePool, which is guaranteed to be an empty list. Used to represent function/struct instantiation with no type arguments – effectively non-generic functions and structs.
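The index-0 convention can be illustrated with a toy signature pool. This is a minimal sketch, not the module's API: SigIndex, NO_TYPE_ARGS, Ty, SignaturePool, and is_generic_instantiation are stand-in names invented for the example, and the 16-bit index width is an assumption.

```rust
/// Stand-in for an index into the signature pool (width is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SigIndex(pub u16);

/// Stand-in for the constant described above: index 0, the empty list.
pub const NO_TYPE_ARGS: SigIndex = SigIndex(0);

/// A tiny subset of types, just enough for the example.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Ty {
    U64,
    Bool,
    Address,
}

/// Toy pool whose slot 0 always holds the empty signature.
pub struct SignaturePool(Vec<Vec<Ty>>);

impl SignaturePool {
    /// Creates a pool with the empty signature already stored at index 0.
    pub fn new() -> Self {
        SignaturePool(vec![Vec::new()])
    }

    /// Returns the index of `sig`, adding it only if it is not present yet,
    /// so entries stay unique.
    pub fn intern(&mut self, sig: Vec<Ty>) -> SigIndex {
        if let Some(pos) = self.0.iter().position(|s| *s == sig) {
            return SigIndex(pos as u16);
        }
        self.0.push(sig);
        SigIndex((self.0.len() - 1) as u16)
    }

    /// Looks up a signature; returns None for an out-of-range index.
    pub fn get(&self, idx: SigIndex) -> Option<&[Ty]> {
        self.0.get(idx.0 as usize).map(|v| v.as_slice())
    }
}

/// A function or struct is instantiated with type arguments only when its
/// signature index differs from the reserved empty entry.
pub fn is_generic_instantiation(idx: SigIndex) -> bool {
    idx != NO_TYPE_ARGS
}
```

Reserving slot 0 means non-generic functions and structs need no separate "absent" marker: they all share the one empty signature, and a comparison against the constant answers whether type arguments are present.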

Functions

Type Definitions

  • The pool of address identifiers (addresses used in ModuleHandles/ModuleIds). Does not include runtime values; those are placed in the ConstantPool.
  • Index into the code stream for a jump. The offset is relative to the beginning of the instruction stream.
  • The pool of Constant values
  • The pool of identifiers.
  • Index of a local variable in a function.
  • Max number of fields in a StructDefinition.
  • The pool of Signature instances. Every function definition must define the set of locals used and their types.
  • Generic index into one of the tables in the binary format.
  • Type parameters are encoded as indices. This index can also be used to look up the kind of a type parameter in the FunctionHandle and StructHandle.
  • The pool of TypeSignature instances. Those are system and user types used and their composition (e.g. &U64).