StreamingFast Substreams modules overview


Modules are an important part of Substreams, offering hooks into the execution of the Substreams compute engine. You can create Substreams data manipulation and transformation strategies within modules.
Modules are small pieces of Rust code running in a WebAssembly (WASM) virtual machine. They operate on the stream of blocks sent by the Substreams compute engine, which arrives from a blockchain node.
Modules have one or more inputs, which can be in the form of a map or store, or a Block or Clock object received from the blockchain's data source.
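The input forms described above are declared in the Substreams manifest. The following is a hedged sketch of what such declarations might look like; the module names, protobuf types, and `store_totals` store are hypothetical placeholders, not part of this document:

```yaml
# Hypothetical manifest fragment illustrating each input form.
modules:
  - name: map_events
    kind: map
    inputs:
      - source: sf.ethereum.type.v2.Block   # Block object from the chain's data source
    output:
      type: proto:example.Events

  - name: map_enriched
    kind: map
    inputs:
      - map: map_events                     # another module's map output
      - store: store_totals                 # read access to a store module
      - source: sf.substreams.v1.Clock      # the Clock object
    output:
      type: proto:example.Enriched
```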
Substreams modules data interaction diagram
The diagram shows how the transfer_map module extracts the transfers in a Block and tracks the total number of transfers.
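A simplified sketch of the extraction logic can illustrate the idea. A real Substreams module would operate on the generated protobuf Block type and be annotated with the `substreams` crate's handler macro; the plain Rust types below are stand-ins for illustration only:

```rust
// Stand-in types for the chain's protobuf model, for illustration.
struct Transfer {
    from: String,
    to: String,
    amount: u64,
}

struct Transaction {
    transfers: Vec<Transfer>,
}

struct Block {
    transactions: Vec<Transaction>,
}

/// Extract every transfer in a block, mirroring what a transfer_map
/// module would emit as its single typed output.
fn map_transfers(block: &Block) -> Vec<&Transfer> {
    block
        .transactions
        .iter()
        .flat_map(|tx| tx.transfers.iter())
        .collect()
}
```

A downstream store module could then consume this output to track the running total of transfers.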
Note: Multiple inputs are possible because blockchains are clocked, which allows synchronization between multiple execution streams and delivers performance beyond what conventional streaming engines can offer.
As seen in the counters store example diagram, modules can also take in multiple inputs. In this case, two modules feed into a store, effectively tracking multiple counters.
Multiple module inputs diagram
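In the manifest, such a fan-in might look like the hedged sketch below, where two map modules feed a single additive store; the module names and protobuf types are hypothetical:

```yaml
# Hypothetical fragment: two map modules feeding one store module.
modules:
  - name: map_mints
    kind: map
    inputs:
      - source: sf.ethereum.type.v2.Block
    output:
      type: proto:example.Mints

  - name: map_burns
    kind: map
    inputs:
      - source: sf.ethereum.type.v2.Block
    output:
      type: proto:example.Burns

  - name: store_counters
    kind: store
    updatePolicy: add        # additive aggregation of incoming values
    valueType: int64
    inputs:
      - map: map_mints       # both maps feed the same store
      - map: map_burns
```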
Every time a new Block is processed, all of the modules are executed as a directed acyclic graph (DAG).
Note: The protocol's Block protobuf model always serves as the top-level data source and executes deterministically.

Single output

Modules have a single typed output; the type informs consumers what kind of data to expect and how to interpret the bytes being sent.
Tip: One module's output can serve as the input of subsequent modules, forming a chain of data flow from module to module.

map versus store modules

To develop most non-trivial Substreams, you will need to use multiple map and store modules. The specific number, responsibilities, and communication methods for these modules will depend on the developer's specific goals for the Substreams development effort.
The two module types are commonly used together to construct the directed acyclic graph (DAG) outlined in the Substreams manifest. They differ significantly in how they are used and how they work, and understanding these differences is vital for harnessing the full power of Substreams.

map modules

map modules are used for data extraction, filtering, and transformation. They should be used when direct extraction is needed and the extracted data does not need to be reused later in the DAG.
To optimize performance, you should use a single map module instead of multiple map modules to extract single events or functions. It is more efficient to perform the maximum amount of extraction in a single top-level map module and then pass the data to other Substreams modules for consumption. This is the recommended, simplest approach for both backend and consumer development experiences.
Common use cases for map modules include:
  • Extracting model data from an event or function's inputs.
  • Reading data from a block and transforming it into a custom protobuf structure.
  • Filtering out events or functions for any given number of contracts.
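The filtering use case can be sketched with plain Rust types. The `Event` struct and contract addresses below are hypothetical stand-ins; a real module would filter the chain's protobuf event types inside a handler:

```rust
// Hypothetical stand-in for a decoded chain event.
struct Event {
    contract: String,
}

/// Keep only events emitted by a tracked set of contracts, as a single
/// top-level map module might do before passing data downstream.
fn filter_events<'a>(events: &'a [Event], tracked: &[&str]) -> Vec<&'a Event> {
    events
        .iter()
        .filter(|e| tracked.contains(&e.contract.as_str()))
        .collect()
}
```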

store modules

store modules are used for the aggregation of values and for persisting state that lives across blocks.
Important: Stores should not be used for temporary, free-form data persistence.
Unbounded store modules are discouraged. store modules shouldn't be used as an infinite bucket to dump data into.
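The additive aggregation a store performs can be sketched as follows. This is a minimal, self-contained illustration of an "add" update policy over int64 values; in practice the store is provided by the Substreams runtime and accessed through the `substreams` crate, not built by the module author:

```rust
use std::collections::HashMap;

// Minimal sketch of a store with an additive update policy.
struct AddStore {
    values: HashMap<String, i64>,
}

impl AddStore {
    fn new() -> Self {
        AddStore { values: HashMap::new() }
    }

    /// Additively merge a delta into the value stored at `key`.
    fn add(&mut self, key: &str, delta: i64) {
        *self.values.entry(key.to_string()).or_insert(0) += delta;
    }

    /// Read the current value for `key`, defaulting to zero.
    fn get(&self, key: &str) -> i64 {
        *self.values.get(key).unwrap_or(&0)
    }
}
```

The additive policy is what lets multiple upstream modules feed the same counter deterministically, as in the counters store example.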
Notable facts and use cases for working store modules include:
  • store modules should be used only when another downstream Substreams module needs to read their data.
  • store modules cannot be output as a stream, except in development mode.
  • store modules are used to implement the Dynamic Data Sources pattern from Subgraphs, keeping track of contracts created to filter the next block with that information.
  • Do not query store modules downstream of the Substreams output. Instead, use a sink to shape the data for proper querying.