Rate Limiting Nullifiers in practice, applied to an anonymous p2p network, like Waku.

## Introduction

Rate Limiting Nullifier (RLN) is a zero-knowledge gadget that allows users to prove 2 pieces of information,

- They belong to a permissioned membership set
- Their rate of signaling abides by a fixed number that has been previously declared

The "membership set" introduced above, is in the form of a sparse, indexed merkle tree. This membership set can be maintained on-chain, off-chain or as a hybrid depending on the network's storage costs. Waku makes use of a hybrid membership set, where insertions are tracked in a smart contract. In addition, each Waku node maintains a local copy of the tree, which is updated upon each insertion.

Users register themselves with a hash of a locally generated secret, which is then inserted into the tree at the next available index. After having registered, users can prove their membership by proving their knowledge of the pre-image of the respective leaf in the tree. The leaf hashes are also referred to as commitments of the respective users. The actual proof is done by a Merkle Inclusion Proof, which is a type of ZK proof.

The circuit ensures that the user's secret does indeed hash to a leaf in the tree, and that the provided Merkle proof is valid.

After a User generates this Merkle proof, they can transmit it to other users, who can verify the proof. Including a message's hash within the proof generation, additionally guarantees integrity of that message.

A malicious user could generate multiple proofs per epoch. they generate multiple proofs per epoch. However, when multiple proofs are generated per epoch, the malicious user's secret is exposed, which strongly disincentivizes this attack. This mechanism is further described in malicious User secret interpolation mechanism

Note: This blog post describes rln-v1, which excludes the range check in favor of a global rate limit for all users, which is once per time window. This version is currently in use in waku-rln-relay.

## RLN Protocol parameters

Given below is the set of cryptographic primitives, and constants that are used in the RLN protocol.

- Proving System:
`groth16`

- Elliptic Curve:
`bn254`

(aka bn128) (not to be confused with the 254 bit Weierstrass curve) - Finite Field: Prime-order subgroup of the group of points on the
`bn254`

curve - Default Merkle Tree Height:
`20`

- Hashing algorithm:
`Poseidon`

- Merkle Tree:
`Sparse Indexed Merkle Tree`

- Messages per epoch:
`1`

- Epoch duration:
`10 seconds`

## Malicious User secret interpolation mechanism

note: all the parameters mentioned below are elements in the finite field mentioned above.

The private inputs to the circuit are as follows: -

`identitySecret: the randomly generated secret of the user`

identityPathIndex: the index of the commitment derived from the secret

pathElements: elements included in the path to the index of the commitment

Following are the public inputs to the circuit -

`x: hash of the signal to the finite field`

rlnIdentifier: application-specific identifier which this proof is being generated for

epoch: the timestamp which this proof is being generated for

The outputs of the circuit are as follows: -

`y: result of Shamir's secret sharing calculation`

root: root of the Merkle tree obtained after applying the inclusion proof

nullifier: uniquely identifies a message, derived from rlnIdentifier, epoch, and the user's secret

With the above data in mind, following is the circuit pseudocode -

`identityCommitment = Poseidon([identitySecret])`

root = MerkleInclusionProof(identityCommitment, identityPathIndex, pathElements)

externalNullifier = Poseidon([epoch, rlnIdentifier])

a1 = Poseidon([identitySecret, externalNullifier])

y = identitySecret + a1 * x

nullifier = Poseidon([a1])

To interpolate the secret of a user who has sent multiple signals during the same epoch to the same rln-based application, we may make use of the following formula -

$a_1 = {(y_1 - y_2) \over (x_1 - x_2)}$

where $x_1$, $y_1$ and $x_2$, $y_2$ are shares from different messages

subsequently, we may use one pair of the shares, $x_1$ and $y_1$ to obtain the `identitySecret`

$identitySecret = y_1 - a_1 * x$

This enables RLN to be used for rate limiting with a *global* limit. For arbitrary limits,
please refer to an article written by @curryrasul, rln-v2.

## Waku's problem with DoS

In a decentralized, privacy focused messaging system like Waku, Denial of Service (DoS) vulnerabilities are very common, and must be addressed to promote network scale and optimal bandwidth utilization.

### DoS prevention with user metadata

There are a couple of ways a user can be rate-limited, either -

- IP Logging
- KYC Logging

Both IP and KYC logging prevent systems from being truly anonymous, and hence, cannot be used as a valid DoS prevention mechanism for Waku.

RLN can be used as an alternative, which provides the best of both worlds, i.e a permissioned membership set, as well as anonymous signaling. However, we are bound by k-anonymity rules of the membership set.

Waku-RLN-Relay is a libp2p pubsub validator that verifies if a proof attached to a given message is valid. In case the proof is valid, the message is relayed.

## Performance analysis

Test bench specs: AMD EPYC 7502P 32-Core, 4x32GB DDR4 Reg.ECC Memory

This simulation was conducted by @alrevuelta, and is described in more detail here.

The simulation included 100 waku nodes running in parallel.

Proof generation times -

Proof verification times -

A spammer node publishes 3000 msg/epoch, which is detected by all connected nodes, and subsequently disconnect to prevent further spam -

## Security analysis

Barbulescu and Duquesne
conclude that that the `bn254`

curve has only 100 bits of security.
Since the bn254 curve has a small embedding degree,
it is vulnerable to the MOV attack.
However, the MOV attack is only applicable to pairings,
and not to the elliptic curve itself.
It is acceptable to use the bn254 curve for RLN,
since the circuit does not make use of pairings.

An analysis on the number of rounds in the Poseidon hash function was done,
which concluded that the hashing rounds should *not* be reduced,

The smart contracts have *not* been audited, and are not recommended for real world deployments *yet*.

## Storage analysis

The storage overhead introduced by RLN is minimal. RLN only requires 34 megabytes of storage, which poses no problem on most end-user hardware, with the exception of IoT/microcontrollers. Still, we are working on further optimizations allowing proof generation without having to store the full tree.

## The bare minimum requirements to run RLN

With proof generation time in sub-second latency, along with low storage overhead for the tree, it is possible for end users to generate and verify RLN proofs on a modern smartphone.

Following is a demo provided by @rramos that demonstrates waku-rln-relay used in react native.

Warning: The react native sdk will be deprecated soon, and the above demo should serve as a PoC for RLN on mobiles

## RLN usage guide

Zerokit implements api's that allow users to handle operations to the tree, as well as generate/verify RLN proofs.

Our main implementation of RLN can be accessed via this Rust crate, which is documented here. It can used in other langugages via the FFI API, which is documented here. The usage of RLN in Waku is detailed in our RLN Implementers guide, which provides step-by-step instructions on how to run Waku-RLN-Relay.

Following is a diagram that will help understand the dependency tree -

## Future work

- Optimizations to zerokit for proof generation time.
- Incrementing tree depth from 20 to 32, to allow more memberships.
- Optimizations to the smart contract.
- An ability to signal validity of a message in different time windows.
- Usage of proving systems other than Groth16.