StreamingFast Substreams fundamental knowledge

Working with Substreams Fundamentals

Developers working with Substreams create and touch many separate pieces of technology, including the Substreams engine, the command line interface, configuration files, Rust modules, and protobufs.
This documentation outlines how those pieces fit together: the manifest, protobufs, Rust modules, module handlers, WASM, and the Substreams CLI.
Substreams in Action

Key Steps

  • Identify smart contract addresses of interest including wallets, decentralized exchanges (DEXs), etc.
  • Identify the data of interest, then define and create protobufs for it.
  • Write Rust Substreams event handler functions.
  • Update the Substreams manifest to point to the protobufs and handlers.
  • Issue a command to the Substreams CLI, passing it the manifest.

The Substreams Engine

The Substreams engine is the CPU, or brain, of the Substreams system. The engine handles requests and communication, and orchestrates the transformation of blockchain data.
Note: The Substreams engine is responsible for running data transformations defined by developers to process targeted blockchain data.
Developers send commands, flags, and a reference to the manifest configuration file through the Substreams CLI to the Substreams engine.
Developers create the data transformation strategies in Substreams “module handlers” defined using the Rust programming language. The module handlers act on protobuf-based data models referenced from within the Substreams manifest. Learn more about the protobufs for the different blockchains in the chains and endpoints section of the Substreams documentation.

How Substreams Modules Communicate

The Substreams engine runs the code defined by developers in the Rust-based module handlers.
Note: Substreams modules have a uni-directional flow of data. The data can be passed from one module to another, but only in a single direction.
The flow of data is defined in the Substreams manifest through the “inputs” and “output” fields of each module in the configuration file. These fields generally reference the protobuf definitions for the targeted blockchain data. An “inputs” entry can also name another module, which sends that module's output directly to the module declaring the input.
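The one-way flow described above can be sketched in plain Rust. This is a minimal illustration only, not the Substreams API: the `Block` and `Transfer` types, both function names, and the `threshold` parameter are hypothetical stand-ins for protobuf-generated types and real module handlers.

```rust
// Hypothetical, simplified stand-ins for protobuf-generated data models.
#[derive(Debug, Clone, PartialEq)]
pub struct Transfer {
    pub from: String,
    pub to: String,
    pub amount: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub number: u64,
    pub transfers: Vec<Transfer>,
}

/// First module: its input is the raw block, its output is a list of transfers.
pub fn map_transfers(block: &Block) -> Vec<Transfer> {
    block.transfers.clone()
}

/// Second module: its input is the output of `map_transfers`.
/// Nothing flows back to the first module; data moves in one direction only.
pub fn map_large_transfers(transfers: &[Transfer], threshold: u64) -> Vec<Transfer> {
    transfers
        .iter()
        .filter(|t| t.amount >= threshold)
        .cloned()
        .collect()
}
```

Calling `map_large_transfers(&map_transfers(&block), 100)` chains the two steps by hand. In Substreams, the same wiring is declared in the manifest and carried out by the engine.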

What is a Substreams DAG?

Substreams modules are composed through a directed acyclic graph (DAG).
Note: The flow of data from one module to another is determined by the fundamental rules and principles of DAGs: a one-directional flow.
The Substreams manifest references the modules and the handlers defined within them, and lays out how each is intended to be used by the Substreams engine.
Directed acyclic graphs contain nodes, in this case, modules, that communicate in only one direction, passing from one node, or module, to another.
The Substreams engine creates the “compute graph”, or “dependency graph”, at runtime. The graph is built in response to commands sent through the CLI, using the code in the modules referenced by the manifest.
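To make the idea of a dependency graph concrete, the sketch below orders a set of modules so that each one appears after every module it takes as input. It is a generic topological sort (Kahn's algorithm) written for illustration, not the engine's actual implementation. The function name `execution_order` and the module names in the usage note are hypothetical, and each input list is assumed to name only other modules in the same list.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Returns module names ordered so that every module comes after all of its inputs.
/// Returns `None` if the inputs form a cycle or name a module that is not declared,
/// since neither case describes a valid DAG over the given modules.
pub fn execution_order<'a>(modules: &[(&'a str, Vec<&'a str>)]) -> Option<Vec<String>> {
    // For each module: how many of its inputs have not been scheduled yet.
    let mut pending: BTreeMap<&'a str, usize> = BTreeMap::new();
    // For each module: which modules consume its output.
    let mut consumers: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();

    for (name, inputs) in modules {
        pending.insert(*name, inputs.len());
        for input in inputs {
            consumers.entry(*input).or_default().push(*name);
        }
    }

    // Modules without module inputs can be scheduled immediately.
    let mut ready: VecDeque<&'a str> = pending
        .iter()
        .filter(|(_, count)| **count == 0)
        .map(|(name, _)| *name)
        .collect();

    let mut order = Vec::new();
    while let Some(name) = ready.pop_front() {
        order.push(name.to_string());
        if let Some(downstream) = consumers.get(name) {
            for consumer in downstream {
                if let Some(count) = pending.get_mut(consumer) {
                    *count -= 1;
                    if *count == 0 {
                        ready.push_back(*consumer);
                    }
                }
            }
        }
    }

    // Any module left unscheduled sits on a cycle or waits on an unknown input.
    if order.len() == modules.len() {
        Some(order)
    } else {
        None
    }
}
```

Given a hypothetical `map_transfers` with no module inputs, a `store_totals` that reads `map_transfers`, and a `map_report` that reads both, the function schedules `map_transfers` first and `map_report` last. Two modules that list each other as inputs yield `None`.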

Protobufs for Substreams

Substreams module handlers linked to protobuf
View the protobuf file in the repo by visiting the following link.
View the Rust module handlers in the file in the repo by visiting the following link.
Protocol buffers, or protobufs, are the data models operated on by the Rust-based module handler functions. Data models are defined and outlined in the protobufs.
Note: Protobufs include the names of the data objects and the fields contained and accessible within them.
Many protobuf definitions have already been created, such as the ERC-721 token model, and can be used by developers creating Substreams data transformation strategies.
Custom smart contracts targeted by developers, such as Uniswap, will have protobuf definitions that have already been created for them by others. The custom data models are referenced in the Substreams manifest and made available to module handler functions.
In object-oriented programming terminology, the protobufs are the objects or object models. In front-end web development terms, protobufs are similar to a REST API or other data access API.
Firehose and Substreams treat the data as the API.
Protobufs essentially provide the API to the targeted data, usually associated with a smart contract address.
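As a rough picture of what a data model looks like from the handler's side, the sketch below pairs a small message definition with a hand-written Rust equivalent. Everything here is hypothetical: the `Transfer` and `Transfers` messages, their fields, and the `token_ids` helper are invented for illustration and do not come from any published Substreams protobuf. In a real project the Rust types are generated from the protobuf files rather than typed by hand.

```rust
// Hypothetical messages, as they might appear in a .proto file:
//
//   message Transfer {
//     string from = 1;
//     string to = 2;
//     uint64 token_id = 3;
//   }
//
//   message Transfers {
//     repeated Transfer transfers = 1;
//   }

// Hand-written Rust equivalents of the messages above.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Transfer {
    pub from: String,
    pub to: String,
    pub token_id: u64,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct Transfers {
    pub transfers: Vec<Transfer>,
}

/// A handler works only with the object names and fields the data model exposes.
pub fn token_ids(batch: &Transfers) -> Vec<u64> {
    batch.transfers.iter().map(|t| t.token_id).collect()
}
```

The message and field names in the definition are exactly what the handler code can reach, which is the sense in which the data model acts as the API.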

Writing Rust Modules for Substreams

Designing an overall strategy for how to manage and transform data is the first thing developers will do when creating a Substreams implementation. Substreams modules are processed by the engine with the relationships between them defined in the manifest.
The design and complexity of the modules and the way they work together will be based on the smart contracts and data being targeted by the developer.
Note: Substreams modules work together by passing data from one module to another until finally returning an output transformed according to the rules in the manifest, modules, and module handler functions.
Two types of module handlers are defined within the Rust modules: maps and stores. The two module types work in conjunction to sort, sift, temporarily store, and transform blockchain data from smart contracts for use in data sinks, such as databases or subgraphs.
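The difference between the two handler types can be sketched with the standard library alone. This is a simplified model under stated assumptions, not the Substreams crate's API: the `Block` and `Transfer` types and both function names are hypothetical, and a `BTreeMap` stands in for the key/value state that a store keeps.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified stand-ins for protobuf-generated data models.
pub struct Transfer {
    pub to: String,
    pub amount: u64,
}

pub struct Block {
    pub number: u64,
    pub transfers: Vec<Transfer>,
}

/// Map-style handler: stateless. One block goes in, one transformed output
/// comes out, and nothing is remembered between blocks.
pub fn map_received(block: &Block) -> Vec<(String, u64)> {
    block
        .transfers
        .iter()
        .map(|t| (t.to.clone(), t.amount))
        .collect()
}

/// Store-style handler: stateful. It folds each block's mapped output into
/// key/value state that persists from one block to the next.
pub fn store_totals(received: &[(String, u64)], store: &mut BTreeMap<String, u64>) {
    for (to, amount) in received {
        *store.entry(to.clone()).or_insert(0) += *amount;
    }
}
```

Running `map_received` on each block and passing its result to `store_totals` with the same `BTreeMap` accumulates a running total per recipient. The map step could be re-run on any single block in isolation, while the store step depends on every block that came before it.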
Continue to the modules documentation to learn more about detailed aspects of their use and purpose.