Stacked DRG PoRep

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg
This section describes Stacked DRG PoRep (SDR), the specific ProofofReplication (PoRep) used in Filecoin. In this construction, the prover encodes the original data into a replica and commits to it. An offline PoRep proves that the commitment to the replica is a valid commitment of the encoded original data.
SDR has been presented by Ben Fisch at EUROCRYPT19.
Introduction

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg
Background on ProofofReplication

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.backgroundonproofofreplication

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.backgroundonproofofreplication
ProofofReplication enables a prover P to convince a verifier V that P is storing a replica R, a physically independent copy of some data D, unique to P. The scheme is defined by a tuple of polynomial time algorithms (Setup, Replication, Prove, Verify). The assumption is that generation of a replica after Replicate must be difficult (if not impossible) to generate.

Setup: On setup, the public parameters of the proving systems are set.

Replicate: On replication, either a party or both (depending on the scheme, in our case the prover only!) generate a unique permutation of the original data D, which we call replica R.

Prove: On receiving a challenge, the prover must generate a proof that it is in possession of the replica and that it was derived from data D. The prover must only be able to respond to the challenge successfully if it is in possession of the replica, since would be difficult (if not impossible) to generate a replica that can be used to generate the proof at this stage

Verify: On receiving the proof, the verifier checks the validity of the proof and accepts or rejects the proof.
Timebounded ProofofReplication

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.timeboundedproofofreplication

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.timeboundedproofofreplication
Timing assumption. Timebounded ProofofReplication are constructions of PoRep with timing assumptions. The assumption is that generation of the replica (hence the Replication) takes some time t that is substantially larger than the time it takes to produce a proof (hence time(Prove)) and the roundtrip time (RTT) for sending a challenge and receiving a proof.
Distinguishing Malicious provers. A malicious prover that does not have R, must obtain it (or generate it), before the Prove step. A verifier can distinguish an honest prover from a malicious prover, since the malicious one will take too long to answer the challenge. A verifier will reject if receiving the proof from the prover takes longer than a timeout (bounded between proving time and replication time).
Background on Stacked DRG PoRep

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.backgroundonstackeddrgporep

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.backgroundonstackeddrgporep
Stacked DRG PoRep (SDR) is a specific ProofofReplication construction that we use in Filecoin. SDR has been designed by Ben Fisch at EUROCRYPT19. At a high level, SDR ensures that the Replicate step is a slow nonparallelizable sequential process by using a special type of graph called Depth Robust Graphs (DRG).
Encoding using DRGs. A key is generated by sequentially labeling nodes in the graph such that each label depends on the labels of its parents. The depth robustness property of these graphs ensure that the sequential labeling steps are not parallelizable. The final labels are used as a key to encode the original data.
TODO: This probably needs a more thorough rewrite.
Stacked DRGs. SDR builds on the above by stacking DRG graphs into LAYERS
layers. Each layer is connected to the previous by a Bipartite Expander Graph. The combination of DRGs and expander graphs guarantee the security property of PoRep. As before, the key produced by the final layer is used to encode the original data, yielding the replica.
Generating SDR proofs. Given the following public parameters:
ReplicaId
is a unique replica identifier (see the Filecoin Proofs spec for details).CommD
is the Merkle tree root hash of the input data to the first layer.CommC
is the Merkle tree root hash of the SDR column commitments.CommRLast
is the Merkle tree root hash of the replica.CommR
is the onchain commitment to the replica, dervied as the hash of the concatenation ofCommC
andCommRLast
.
An SDR proof proves that some data whose committment is CommD
has been used to run a Replicate
algorithm and generated some data. CommR
is the onchain commitment to both the replicated data and to intermediate stages required to prove Replicate
was performed correctly.
An SDR proof consists of a set of challenged DRG nodes for each layer, a set of parent nodes for each challenged node and a Merkle tree inclusion proof for each node provided. The verifier can then verify the correct labeling of each node and that the nodes given were consistent with the prover’s commitments.
Making proofs succinct with SNARKs: The proof size in SDR is too large for blockchain usage (~100MB TODO: check this), mostly due to the large amount of Merkle tree inclusion proofs required to achieve security. We use SNARKs to generate a proof of knowledge of a correct SDR proof. In other words, we implement the SDR proof verification algorithm in an arithmetic circuit and use SNARKs to prove that it was evaluated correctly.
The SNARK circuit proves that given Merkle roots CommD
, and CommR
, the prover correctly derived the labels at each layer and correctly performed the final encoding.
PoRep in Filecoin

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.porepinfilecoin

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.porepinfilecoin
ProofofReplication proves that a Storage Miner is dedicating unique storage for each sector. Filecoin Storage Miners collect new clients’ data in a sector, run a slow encoding process (called Seal
) and generate a proof (SealProof
) that the encoding was generated correctly.
In Filecoin, PoRep provides two guarantees: (1) spacehardness: Storage Miners cannot lie about the amount of space they are dedicating to Filecoin in order to gain more power in the consensus; (2) replication: Storage Miners are dedicating unique storage for each copy of their clients data.
Glossary:
 sector: a fixedsize block of data of
SECTOR_SIZE
bytes which generally contains clients’ data.  unsealed sector: a concrete representation (on disk or in memory) of a sector’s that follows the “Storage Format” described in Client Data Processing (currently
paddedfr32v1
is the required default).  sealed sector: a concrete representation (on disk or in memory) of the unique replica generated by
Seal
from an unsealed sector. A sector contains one or more pieces.  piece: a block of data of at most
SECTOR_SIZE
bytes which is generally a client’s file or part of.
Stacked DRG Construction

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg
Public Parameters

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.publicparameters

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.publicparameters
The following public parameters are used in the Stacked DRG Replication and Proof Generation algorithms:
TODO:
The Appendix should explain why we picked those valuesJust interpolate a table of the Orient parameters and reconcile naming.
name  type  description  value 

SECTOR_SIZE 
uint 
Number of nodes in the DRG in bytes  68,719,476,736 
LAYERS 
uint 
Number of Depth Robust Graph stacked layers.  10 
BASE_DEGREE 
uint 
InDegree of each Depth Robust Graph.  6 
EXPANSION_DEGREE 
uint 
Degree of each Bipartite Expander Graph to extend dependencies between layers.  8 
GRAPH_SEED 
uint 
Seed used for random number generation in baseParents . 
TODO 
NODE_SIZE 
uint 
Size of each node in bytes.  32B 
The following constants are computed from the public parameters:
name  type  description  computation  value 

PARENTS_COUNT 
uint 
Total number of parent nodes  EXPANSION_DEGREE + BASE_DEGREE 
13 
GRAPH_SIZE 
uint 
Number of nodes in the graph  SECTOR_SIZE / NODE_SIZE 
2,147,483,648 
TREE_DEPTH 
uint 
Height of the Merkle Tree of a sector  LOG_2(GRAPH_SIZE) 
31 
The following additional public parameters are required:
TAPER
:Float
: Fraction of each layer’s challenges by which to reduce nextlowest layer’s challenge count.TAPER_LAYERS
:uint
: Number of layers which should be tapered. FIXME: update for current tapering.Data
is a byte array initialized to the content of unsealed sector and will be mutated inplace by the replication process.
Hash Functions

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.hashfunctions

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.hashfunctions
We have describe three hash functions:
name  description  size of input  size of output  construction 

KDFHash 
Hash function used as a KDF to derive the key used to label a single node.  TODO  32B 
SHA256 
ColumnHash 
Hash function used to hash the labeled leaves of each layer (see SDR Column Commitments).  TODO  32B 
JubjubPedersen 
RepCompress 
Collision Resistant Hash function used for the Merkle tree.  2 x 32B + integer height 
32B 
JubjubPedersen 
RepHash 
Balanced binary Merkle tree based used to generate commitments to sealed sectors, unsealed sectors, piece commitments, and intermediate parts of the ProofofReplication.  TODO  32B 
Uses RepCompress 
RepHash

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.rephash

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.rephash
RepHash
is a vector commitment used to generate commitments to sealed sectors, unsealed sectors, piece commitments and intermediate stepds of the ProofofReplication. Filecoin uses a balanced binary Merkle tree for RepHash
. The leaves of the Merkle tree are pairs of adjacent nodes.
RepHash
inputs MUST respect a valid Storage Format. (TODO: What does this mean?)
Stacked DRG Graph

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.stackeddrggraph

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.stackeddrggraph
The slow sequential encoding required is enforced by the depth robustness property of the SDR graph.
Encoding with SDR: The data from a sector (of size SECTOR_SIZE
) is divided in NODE_SIZE
nodes (for a total of GRAPH_SIZE
nodes) and arranged in a directed acyclic graph. The structure of the graph is used to label the nodes sequentially to generate a key with which to encode the original data: in order to label a node, its parents must be labeled (see the “Layer Labeling” section below). We repeat this process for LAYERS
layers, where the input to a next layer is the output of the previous one.
Generating the SDR graph: The SDR graph is divided in LAYERS
layers. Each layer is a directed acyclic graph and it combines a Depth Robust Graph (DRG) and a Bipartite Expander graph.
TODO:
this isn’t quite right.
We provide an algorithm (SDR
) which computes the parents of a node. In high level, the parents of a node are computed by combining two algorithms: some parents (BASE_DEGREE
of them) are computed via the BucketSample
algorithm extended with a direct ordering of nodes, others (EXPANSION_DEGREE
of them) are computed via the Chung
algorithm.
SDRGraph
: SDR Graph algorithm

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.sdrgraphsdrgraphalgorithm

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.sdrgraphsdrgraphalgorithm
Overview: Compute the DRG and Bipartite Expander parents using respectively BucketSample
and ChungExpander
.
Inputs

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.inputs

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.inputs
name  description  Type 

node 
The node for which the parents are being computed  uint 
layer 
The layer of the SDR graph  uint 
Outputs

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.outputs

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.outputs
name  description  Type 

parents 
The ordered parents of node node on layer layer 
[PARENTS_COUNT]uint 
Algorithm

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.algorithm

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.algorithm

If
layer = 1
: Compute
drgParents = BucketSample(node)
 Set
parents
to bedrgParents
.
 Compute

If
layer > 1
: Compute
drgParents = BucketSample(node)
 Compute
expanderParents = ChungExpander(node)
 Set
parents
to be the concatenation ofdrgParents
andexpanderParents
 Compute
We provide below a more succinct representation of the algorithm:
TODO: Reference to code in filproofs/algorithms.go — or restructure all this.
Timespace tradeoff

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.timespacetradeoff

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.timespacetradeoff
Computing the parents using both BucketSample
and ChungExpander
for every layer can be an expensive operation, however, this can be avoided by caching the parents.
BucketSample
: Depth Robust Graphs algorithm

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.bucketsampledepthrobustgraphsalgorithm

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.bucketsampledepthrobustgraphsalgorithm
This section describes how to compute the “base parents” of the SDR graph, which is the equivalent of computing the parents of a Depth Robust Graph.
The properties of DRG graphs guarantee that a sector has been encoded with a slow, nonparallelizable process. We use the BucketSample
algorithm that is based on DRSample (
ABH17) and described in
FBGB18 and generates a directed acyclic graph of indegree BASE_DEGREE
.
BucketSample
DRG graphs are random graphs that can be deterministically generated from a seed; different seeds lead with high probability to different graphs. In SDR, we use the same seed GRAPH_SEED
for each layer of the SDR graph such that they are all based on the same underlying DRG graph.
The parents of any node can be locally computed without computing the entire graph. We call the parents of a node calculated in this way base parents.
BucketSample
extends BucketSampleInner
to include the node’s ‘immediate predecessor’. Each node except the first in a
DRG generated by BucketSample
has the node whose index is one less than its own as a parent. This ensures that
visiting nodes whose indexes are sequential will result in a graph traversal in topological order.
ChungExpander
: Bipartite Expander Graphs

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.chungexpanderbipartiteexpandergraphs

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.chungexpanderbipartiteexpandergraphs
TODO: explain why we link nodes in the current layer
Each node in layers other than the first has EXPANSION_DEGREE
parents generated via the ChungExpander
algorithm. Note that the indexes returned refer to labels from the previous layer. TODO: Make this all clearer with explicit notation.
TODO: link to relevant filproofs/algorithms.go
TimeSpace tradeoff

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.timespacetradeoff1

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.timespacetradeoff1
Computing these parents can be expensive (especially due to the hashing required by the Feistel algorithm). A miner can trade this computation by storing the expansion parents.
Feistel construction

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.feistelconstruction

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.feistelconstruction
We use three rounds of Feistel
to generate a permutation to compute the parents of the Bipartite Expander graph.
TODO: link to filproofs/feistel
Replication

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg
The Replication phase turns an unsealed sector into a sealed sector by first generating a key, then using the key to encode the orignal data.
Before running the Replicate
algorithm, the prover must ensure that the sector is correctly formatted with a valid “Storage Format” (currently paddedfr32v1
is the required default).
TODO: inputs are missing
The Replication Algorithm proceeds as follows:
 Calculate
ReplicaID
usingHash
(SHA256):
ReplicaID
is a 32byte array constructed by hashing the concatenation of the following values:
ProverId
is a 32byte array uniquely identifying a prover.SectorNumber
is an unsigned 64bit integer in littleendian encoding represented as an 8byte array.RandomSeed
is a 32byte array of randomness extracted from the chain.CommD
is the Merkle root obtained by performingRepHash
on the original data represented inpaddedfr32v1
.
ReplicaID := Hash(ProverID  SectorNumber  RandomSeed  CommD)
 Perform
RepHash
onData
to yieldCommD
andTreeD
:
CommD, TreeD = RepHash(data)
Layer Labeling

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.layerlabeling

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.layerlabeling
TODO: DefineGraph
. We need to decide if this is an object we’ll explicitly define or if its properties (e.g.,GRAPH_SIZE
) are just part of the replication parameters and all the functions just refer to the same graphs being manipulated across the entire replication process. (At the moment I’ve avoided defining aGraph
structure as in other specs I didn’t see any object methods, just standalone functions.)
TODO: link to filproofs/algorithms
Proof Generation

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg
Overview:
 Challenge Derivation
 Proof Generation
 Circuit Proof Generation
TODO: write a single algorithm which includes the spec below
Challenge Generation

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.challengegeneration

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.challengegeneration
TODO: Link to filproofs/algorithms
Calculate LAYER_CHALLENGES : [LAYERS]uint
: Number of challenges per layer. (This will be passed to the SDR circuit proof.)
Derive challenges for each layer (call DeriveChallenges()
).
Witness Generation

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.witnessgeneration

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.witnessgeneration
TODO: Link to filproofs/algorithms
Layer Challenge Counts

State

Theory Audit

Edit this section

sectionalgorithms.porepold.stacked_drg.layerchallengecounts

State

Theory Audit
 Edit this section

sectionalgorithms.porepold.stacked_drg.layerchallengecounts
TODO: we should just list current parameters and show this as a calculation for correctness, this should not mandatory to implement.