Substreams introduces a few new concepts to The Graph ecosystem, drawing on traditional large-scale data systems and combining them with properties specific to blockchains.
Substreams is:
  • A streaming-first system
    • Based on gRPC and protobuf
    • Based on the StreamingFast Firehose
  • A remote code execution framework that is:
    • highly cacheable
    • highly parallelizable
  • Composable down to individual modules, allowing a community to build higher-order modules with ease
  • Deterministic, as it feeds from deterministic blockchain data
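The properties listed above can be illustrated with a small conceptual sketch. This is plain Python, not the Substreams API: the block shape, the module names (`map_transfers`, `map_large_transfers`), the `run_module` helper, and the in-memory `cache` are all hypothetical and exist only for this illustration. The sketch assumes each module is a pure function of its input, so an output can be cached per module and block, and one module can be composed on top of another.

```python
# Conceptual model only: none of these names come from Substreams itself.

def map_transfers(block):
    """First module: extract transfer amounts from one (hypothetical) block."""
    return list(block["transfers"])

def map_large_transfers(transfers, threshold=100):
    """Higher-order module: consumes the output of map_transfers."""
    return [amount for amount in transfers if amount >= threshold]

# Outputs are keyed by (module name, block number). Because each module is
# a pure function of its input, a cached output can always be reused.
cache = {}

def run_module(name, fn, block_number, module_input):
    key = (name, block_number)
    if key not in cache:
        cache[key] = fn(module_input)
    return cache[key]

def process(block):
    """Compose the two modules for a single block."""
    transfers = run_module("map_transfers", map_transfers, block["number"], block)
    return run_module(
        "map_large_transfers", map_large_transfers, block["number"], transfers
    )

blocks = [
    {"number": 1, "transfers": [5, 250]},
    {"number": 2, "transfers": [100, 7]},
]

# Each block is processed independently of the others in this sketch,
# which is what would allow the work to be split across workers.
results = [process(block) for block in blocks]
```

Replaying the same blocks yields the same results and is served entirely from the cache, which is the practical payoff of determinism in this toy model.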
Substreams is not:
  • A relational database
  • A REST service
  • Concerned directly with how the data is stored
  • A general-purpose non-deterministic event stream processor
The word Substreams refers to:
  • A plurality of streams, each in the form of a module
  • Streams packed in a single package, but streamable individually (a subunit of a package)
  • Streams composed from imported modules, blended, enriched or refined together (as in a sub- or downstream component)
  • A wink to Subgraphs
A manifest or package usually contains more than one module, imports one or more modules, or both. It is therefore fitting to speak of a Substreams package.
The Substreams engine is completely agnostic to the underlying blockchain protocol and works solely on data extracted from nodes using the Firehose. Individual protocols can have chain-specific extensions (e.g. Ethereum, which exposes eth_calls).