Binary Canonical Serialization (BCS) Standard Guide

Binary Canonical Serialization (BCS) is the serialization format used throughout the Supra L1 ecosystem. It is a binary, canonical, non-self-describing format designed for efficient serialization of data structures. The Supra MoveVM uses BCS for all critical blockchain operations, ensuring deterministic and efficient data handling across the network.

Overview

Key Characteristics:

  • Binary format: Compact, efficient storage and transmission

  • Canonical: Deterministic serialization ensures identical outputs for identical inputs

  • Non-self-describing: The reader must know the expected data format beforehand

  • Comprehensive: Used for all on-chain data, API responses, and transaction arguments

Primitive Types

Multi-byte integers in BCS are serialized in little-endian byte order (least significant byte first).

Boolean (bool)

Booleans are serialized as a single byte. Only the two values below are valid; any other byte is rejected during deserialization.

Value    Bytes
true     0x01
false    0x00

#[test_only]
module supra_example::bcs_examples {
    use std::bcs;
    // from_bcs lives in aptos_std, not std
    use aptos_std::from_bcs;
    
    #[test]
    fun test_bool_serialization() {
        // Serialize
        let val: bool = true;
        let bytes: vector<u8> = bcs::to_bytes(&val);
        assert!(bytes == vector[0x01], 0);
        
        // Deserialize
        let val_des = from_bcs::to_bool(bytes);
        assert!(val_des == true, 1);
    }
}

Unsigned Integers

U8 (8-bit unsigned integer)

Serialized as a single byte.

U16 (16-bit unsigned integer)

Serialized as 2 bytes in little-endian order.

U32 (32-bit unsigned integer)

Serialized as 4 bytes in little-endian order.

U64 (64-bit unsigned integer)

Serialized as 8 bytes in little-endian order.

U128 (128-bit unsigned integer)

Serialized as 16 bytes in little-endian order.

U256 (256-bit unsigned integer)

Serialized as 32 bytes in little-endian order.
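
The fixed-width integer encodings above can be checked with a small test module. This is a minimal sketch, not code from the Supra framework: the module name bcs_integer_examples and the supra_example named address are placeholders, and it assumes std::bcs and aptos_std::from_bcs are available as in the boolean example.

```move
#[test_only]
module supra_example::bcs_integer_examples {
    use std::bcs;
    use std::vector;
    use aptos_std::from_bcs;

    #[test]
    fun test_fixed_width_integers() {
        // u8: one byte, no reordering needed.
        let a: u8 = 0x12;
        assert!(bcs::to_bytes(&a) == x"12", 0);

        // u16 and u32: least significant byte comes first.
        let b: u16 = 0x1234;
        assert!(bcs::to_bytes(&b) == x"3412", 1);
        let c: u32 = 0x12345678;
        assert!(bcs::to_bytes(&c) == x"78563412", 2);

        // u64: always 8 bytes, even for small values.
        let d: u64 = 1;
        let d_bytes = bcs::to_bytes(&d);
        assert!(d_bytes == x"0100000000000000", 3);
        assert!(from_bcs::to_u64(d_bytes) == d, 4);

        // u128 and u256: 16 and 32 bytes respectively.
        let e: u128 = 1;
        let e_bytes = bcs::to_bytes(&e);
        assert!(vector::length(&e_bytes) == 16, 5);
        assert!(*vector::borrow(&e_bytes, 0) == 0x01, 6);

        let f: u256 = 1;
        let f_bytes = bcs::to_bytes(&f);
        assert!(vector::length(&f_bytes) == 32, 7);
        assert!(*vector::borrow(&f_bytes, 0) == 0x01, 8);
    }
}
```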

Variable Length Encoding (Uleb128)

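Uleb128 (unsigned little-endian base-128) is a variable-length integer encoding. Each byte carries 7 bits of the value, least significant group first, and the most significant bit of each byte indicates whether another byte follows. In BCS this encoding is used for the length prefixes of sequences, strings, and maps, and for enum variant indices.

Move does not currently expose Uleb128 as a standalone type; it appears only inside other serialized values.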
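
Because Move has no standalone Uleb128 type, the encoding can only be observed indirectly, for example in the length prefix of a serialized vector. The sketch below hand-writes an encoder to make the byte layout concrete. The helper uleb128_encode and the module name are hypothetical and are not part of any Supra or Move standard library.

```move
#[test_only]
module supra_example::uleb128_examples {
    use std::bcs;
    use std::vector;

    // Hypothetical helper: encode a u64 as Uleb128.
    // Emits 7 bits per byte, low group first, and sets the
    // high bit on every byte except the last.
    fun uleb128_encode(value: u64): vector<u8> {
        let out = vector::empty<u8>();
        loop {
            let byte = ((value & 0x7F) as u8);
            value = value >> 7;
            if (value == 0) {
                vector::push_back(&mut out, byte);
                break
            };
            vector::push_back(&mut out, byte | 0x80);
        };
        out
    }

    #[test]
    fun test_uleb128() {
        assert!(uleb128_encode(0) == x"00", 0);
        assert!(uleb128_encode(127) == x"7f", 1);
        assert!(uleb128_encode(128) == x"8001", 2);
        assert!(uleb128_encode(300) == x"ac02", 3);

        // A 128-element vector<u8> gets the two-byte prefix 0x80 0x01.
        let v = vector::empty<u8>();
        let i = 0;
        while (i < 128) {
            vector::push_back(&mut v, 0);
            i = i + 1;
        };
        let bytes = bcs::to_bytes(&v);
        assert!(vector::length(&bytes) == 130, 4);
        assert!(*vector::borrow(&bytes, 0) == 0x80, 5);
        assert!(*vector::borrow(&bytes, 1) == 0x01, 6);
    }
}
```

Values below 128 fit in a single byte, which is why short vectors and strings carry a one-byte length prefix.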

Complex Types

Sequences and Fixed Sequences

Sequences (Vectors): A sequence is serialized as its length, encoded as a Uleb128, followed by each item serialized in order.

Fixed Sequences: A fixed sequence is serialized without the leading length prefix. The reader must know the number of items before deserializing. This is efficient when the size is known at compile time.
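
A short sketch of the variable-length case, using Move vectors. The module name is a placeholder. Move has no general fixed-size array type, so the fixed-sequence case is not shown here.

```move
#[test_only]
module supra_example::bcs_vector_examples {
    use std::bcs;
    use std::vector;

    #[test]
    fun test_vector_serialization() {
        // Length prefix 0x03, then three little-endian u16 values.
        let v: vector<u16> = vector[1, 2, 3];
        assert!(bcs::to_bytes(&v) == x"03010002000300", 0);

        // An empty vector is just the length prefix 0x00.
        let empty = vector::empty<u64>();
        assert!(bcs::to_bytes(&empty) == x"00", 1);

        // Nested vectors: the outer length, then each inner vector
        // with its own length prefix.
        let nested: vector<vector<u8>> = vector[vector[1], vector[2, 3]];
        assert!(bcs::to_bytes(&nested) == x"020101020203", 2);
    }
}
```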

Strings

A string is serialized as a vector of its UTF-8 encoded bytes: a Uleb128 length prefix (counting bytes, not characters) followed by the bytes themselves.
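
A minimal sketch of string serialization. The module name is a placeholder, and the round trip assumes aptos_std::from_bcs::to_string is available in the framework version in use.

```move
#[test_only]
module supra_example::bcs_string_examples {
    use std::bcs;
    use std::string;
    use aptos_std::from_bcs;

    #[test]
    fun test_string_serialization() {
        // "supra" is 5 ASCII bytes: 73 75 70 72 61.
        let s = string::utf8(b"supra");
        let bytes = bcs::to_bytes(&s);
        assert!(bytes == x"057375707261", 0);

        // Round trip back to a String.
        assert!(from_bcs::to_string(bytes) == s, 1);

        // The prefix counts bytes: U+00E9 is one character but
        // two UTF-8 bytes (c3 a9), so the prefix is 0x02.
        let accented = string::utf8(x"c3a9");
        assert!(bcs::to_bytes(&accented) == x"02c3a9", 2);
    }
}
```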

Account Addresses

An account address is serialized as a fixed sequence of 32 bytes, with no length prefix.
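
The sketch below checks the 32-byte layout for the address 0x1. The module name is a placeholder.

```move
#[test_only]
module supra_example::bcs_address_examples {
    use std::bcs;
    use std::vector;
    use aptos_std::from_bcs;

    #[test]
    fun test_address_serialization() {
        let addr = @0x1;
        let bytes = bcs::to_bytes(&addr);

        // Always 32 bytes, with no length prefix.
        assert!(vector::length(&bytes) == 32, 0);

        // Short addresses are left-padded with zeros.
        assert!(*vector::borrow(&bytes, 0) == 0x00, 1);
        assert!(*vector::borrow(&bytes, 31) == 0x01, 2);

        // Round trip.
        assert!(from_bcs::to_address(bytes) == addr, 3);
    }
}
```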

Structs

A struct is serialized as the concatenation of its fields, in the order they are defined in the struct. No field names or separators are written.
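
To illustrate, the sketch below serializes two made-up structs and compares the result with the concatenation of the individually serialized fields. Point, Labeled, and the module name are hypothetical.

```move
#[test_only]
module supra_example::bcs_struct_examples {
    use std::bcs;
    use std::vector;

    struct Point has drop { x: u8, y: u16 }

    struct Labeled has drop {
        id: u64,
        tags: vector<u8>,
        origin: Point,
    }

    #[test]
    fun test_struct_serialization() {
        // x (1 byte) followed by y (2 bytes, little-endian).
        let point = Point { x: 1, y: 2 };
        assert!(bcs::to_bytes(&point) == x"010200", 0);

        // Build the expected bytes field by field, in declaration order.
        let id: u64 = 7;
        let tags: vector<u8> = vector[0xAA, 0xBB];
        let expected = bcs::to_bytes(&id);
        vector::append(&mut expected, bcs::to_bytes(&tags));
        vector::append(&mut expected, bcs::to_bytes(&point));

        // A nested struct is inlined with no extra framing.
        let labeled = Labeled { id, tags, origin: point };
        assert!(bcs::to_bytes(&labeled) == expected, 1);
    }
}
```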

Option Types

An option is serialized as a single tag byte that indicates whether a value is present. For None the byte is 0x00. For Some the byte is 0x01, followed by the serialized value.
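
A minimal sketch using std::option. The module name is a placeholder.

```move
#[test_only]
module supra_example::bcs_option_examples {
    use std::bcs;
    use std::option;

    #[test]
    fun test_option_serialization() {
        // None: just the tag byte 0x00.
        let none = option::none<u64>();
        assert!(bcs::to_bytes(&none) == x"00", 0);

        // Some(5): tag byte 0x01, then the u64 in little-endian order.
        let five: u64 = 5;
        let some = option::some(five);
        assert!(bcs::to_bytes(&some) == x"010500000000000000", 1);
    }
}
```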

Enums

An enum is serialized as a Uleb128 variant index that identifies the variant in use, followed by the serialized data of that variant.

Maps

Maps are serialized as a sequence of key-value tuples: the number of entries as a Uleb128, followed by each key-value pair. Entries are typically sorted by key so that the encoding is canonical.
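
The length-prefixed pair layout can be observed with aptos_std::simple_map, assuming that module is available in the framework version in use. Note that SimpleMap keeps entries in insertion order rather than sorting them, so this sketch inserts keys in ascending order and only illustrates the byte layout. The module name is a placeholder.

```move
#[test_only]
module supra_example::bcs_map_examples {
    use std::bcs;
    use aptos_std::simple_map;

    #[test]
    fun test_map_serialization() {
        let m = simple_map::new<u8, u16>();
        simple_map::add(&mut m, 1, 10);
        simple_map::add(&mut m, 2, 11);

        // 0x02 entries, then (key u8, value u16 little-endian) pairs:
        // 01 | 0a 00 and 02 | 0b 00.
        assert!(bcs::to_bytes(&m) == x"02010a00020b00", 0);
    }
}
```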

BCS Stream Deserialization Module

Supra includes a specialized BCS stream deserialization module that enables efficient processing of BCS-formatted byte arrays into Move primitive types. This module is available in the official Supra framework repository.

Deserialization Strategies

Per-Byte Deserialization

  • Used for most primitive types to minimize gas consumption

  • Processes each byte individually to match length and type requirements

  • Optimized for cost-effective on-chain operations

Function-Based Deserialization

  • Used specifically for the deserialize_address function

  • Leverages aptos_std::from_bcs due to type constraints

  • Higher gas cost but necessary for certain complex types
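
The two strategies can be sketched side by side. This is an illustration only and is not the Supra stream module's actual code: read_u32, read_address, the error constant, and the module name are all hypothetical. The integer is assembled byte by byte, while the address is sliced out and handed to aptos_std::from_bcs.

```move
#[test_only]
module supra_example::bcs_cursor_sketch {
    use std::bcs;
    use std::vector;
    use aptos_std::from_bcs;

    const E_OUT_OF_BYTES: u64 = 1;

    // Per-byte strategy: rebuild a little-endian u32 starting at `cursor`.
    fun read_u32(data: &vector<u8>, cursor: u64): (u32, u64) {
        assert!(cursor + 4 <= vector::length(data), E_OUT_OF_BYTES);
        let value: u32 = 0;
        let i: u8 = 0;
        while (i < 4) {
            let byte = (*vector::borrow(data, cursor + (i as u64)) as u32);
            value = value | (byte << (8 * i));
            i = i + 1;
        };
        (value, cursor + 4)
    }

    // Function-based strategy: copy 32 bytes and let from_bcs decode them.
    fun read_address(data: &vector<u8>, cursor: u64): (address, u64) {
        assert!(cursor + 32 <= vector::length(data), E_OUT_OF_BYTES);
        let raw = vector::empty<u8>();
        let i = 0;
        while (i < 32) {
            vector::push_back(&mut raw, *vector::borrow(data, cursor + i));
            i = i + 1;
        };
        (from_bcs::to_address(raw), cursor + 32)
    }

    #[test]
    fun test_read_u32_then_address() {
        let value: u32 = 0x12345678;
        let owner = @0xCAFE;
        let data = bcs::to_bytes(&value);
        vector::append(&mut data, bcs::to_bytes(&owner));

        let (decoded_value, cursor) = read_u32(&data, 0);
        let (decoded_owner, cursor) = read_address(&data, cursor);
        assert!(decoded_value == value, 0);
        assert!(decoded_owner == owner, 1);
        assert!(cursor == 36, 2);
    }
}
```

Returning the advanced cursor from each read lets calls be chained in the same order the fields were serialized.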

Key Features

Important Considerations

Non-Self-Describing Format

BCS is a non-self-describing format, which means:

  • The serialized data does not contain type information

  • The deserializer must know the expected type structure beforehand

  • Schema evolution requires careful planning and versioning

  • Type mismatches during deserialization can cause runtime errors
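
The points above can be demonstrated with a few bytes that decode differently depending on the type the reader expects. This is a sketch; the module name is a placeholder.

```move
#[test_only]
module supra_example::bcs_untyped_examples {
    use std::bcs;
    use aptos_std::from_bcs;

    #[test]
    fun test_same_bytes_different_types() {
        // The single byte 0x01 is a valid bool and a valid u8.
        let bytes = x"01";
        assert!(from_bcs::to_bool(bytes) == true, 0);
        assert!(from_bcs::to_u8(bytes) == 1, 1);

        // The u16 value 257 and a one-element vector<u8> holding 1
        // both serialize to 01 01; nothing in the bytes tells them apart.
        let n: u16 = 257;
        let v: vector<u8> = vector[1];
        assert!(bcs::to_bytes(&n) == bcs::to_bytes(&v), 2);
    }

    #[test]
    #[expected_failure]
    fun test_wrong_type_aborts() {
        // One byte cannot be decoded as an 8-byte u64, so this aborts.
        from_bcs::to_u64(x"01");
    }
}
```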

Canonical Ordering

BCS ensures canonical (deterministic) serialization:

  • Same input always produces same output

  • Field order in structs matters and must be consistent

  • Map entries are typically sorted by key

  • This is crucial for consensus and verification
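
A small sketch of both properties: equal values always produce equal bytes, and reordering struct fields changes the encoding. The structs AB and BA and the module name are hypothetical.

```move
#[test_only]
module supra_example::bcs_canonical_examples {
    use std::bcs;

    struct AB has drop { a: u8, b: u16 }
    struct BA has drop { b: u16, a: u8 }

    #[test]
    fun test_deterministic_and_order_sensitive() {
        // Two equal values serialize to identical bytes.
        let first = AB { a: 1, b: 2 };
        let second = AB { a: 1, b: 2 };
        assert!(bcs::to_bytes(&first) == bcs::to_bytes(&second), 0);

        // The same field values in a different declaration order
        // produce different bytes.
        let swapped = BA { b: 2, a: 1 };
        assert!(bcs::to_bytes(&first) == x"010200", 1);
        assert!(bcs::to_bytes(&swapped) == x"020001", 2);
    }
}
```

This is why changing the field order of a stored struct is a breaking change: existing bytes would be read back with the fields misaligned.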

Sample Example

Conclusion

Binary Canonical Serialization provides Supra Move developers with a robust, efficient, and deterministic foundation for data handling across the entire Supra ecosystem. By mastering BCS fundamentals and leveraging the specialized stream deserialization capabilities within the Supra framework, developers can build more efficient and reliable applications while minimizing gas costs and maximizing performance.
