The Shoal framework significantly enhances the performance of the Aptos network, reducing Bullshark latency by 40% in fault-free conditions and by 80% under faults.

Shoal Framework: Significantly Reducing Bullshark Latency on Aptos

Aptos Labs recently solved two important open problems in DAG BFT, significantly reducing latency and, for the first time, eliminating the need for timeouts in deterministic practical protocols. Overall, Bullshark's latency improved by 40% in fault-free conditions and by 80% under faults.

Shoal is a framework that enhances any Narwhal-based consensus protocol (such as DAG-Rider, Tusk, or Bullshark) through pipelining and leader reputation. Pipelining reduces DAG ordering latency by introducing an anchor in every round, while leader reputation further improves latency by ensuring that anchors are associated with the fastest validators. In addition, leader reputation lets Shoal leverage asynchronous DAG construction to eliminate timeouts in all scenarios. This allows Shoal to provide what we call universal responsiveness, which subsumes the optimistic responsiveness that is usually required.

Our technique is very simple: it runs multiple instances of the underlying protocol one after another in sequence. So, when instantiated with Bullshark, we get a group of "sharks" running a relay race.


Motivation

In the pursuit of high performance in blockchain networks, the focus has long been on reducing communication complexity. However, this approach has not led to significant throughput gains. For example, the HotStuff implementation in early versions of Diem achieved only 3,500 TPS, far below our goal of 100k+ TPS.

The recent breakthrough, however, stems from the realization that data propagation is the main bottleneck of leader-based protocols and can benefit from parallelization. The Narwhal system separates data propagation from the core consensus logic and proposes an architecture in which all validators propagate data simultaneously, while the consensus component orders only a small amount of metadata. The Narwhal paper reports a throughput of 160,000 TPS.

In previous articles, we introduced Quorum Store, our Narwhal implementation that separates data propagation from consensus, and described how we use it to scale our current consensus protocol, Jolteon. Jolteon is a leader-based protocol that combines Tendermint's linear fast path with PBFT-style view changes, reducing HotStuff's latency by 33%. However, leader-based consensus protocols clearly cannot fully exploit Narwhal's throughput potential: despite separating data propagation from consensus, the HotStuff/Jolteon leader still becomes a bottleneck as throughput grows.

Therefore, we decided to deploy Bullshark, a zero-communication-overhead consensus protocol, on top of the Narwhal DAG. Unfortunately, compared to Jolteon, the DAG structure that supports high throughput for Bullshark incurs a 50% latency cost.

In this article, we introduce how Shoal significantly reduces Bullshark latency.

DAG-BFT Background

Let's start by understanding the relevant background of this article.

Each vertex in the Narwhal DAG is associated with a round. To enter round r, a validator must first acquire n-f vertices belonging to round r-1. Each validator can broadcast one vertex per round, and each vertex must reference at least n-f vertices from the previous round. Due to the asynchrony of the network, different validators may observe different local views of the DAG at any given time.

A key property of the DAG is non-equivocation: if two validators have the same vertex v in their local views of the DAG, then they have exactly the same causal history of v.
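The round-advancement and causal-history rules above can be sketched in code. This is a minimal illustration, not Narwhal's actual API: `Vertex`, `can_advance`, and `causal_history` are hypothetical names, and n = 4, f = 1 is just an example system size.

```python
from dataclasses import dataclass

N, F = 4, 1  # example system size: n = 4 validators, at most f = 1 faulty


@dataclass(frozen=True)
class Vertex:
    round: int
    author: int
    parents: frozenset  # (round, author) keys of >= n - f vertices from the previous round


def can_advance(local_dag, r):
    """A validator may enter round r once its local DAG holds n - f vertices of round r-1."""
    return sum(1 for v in local_dag if v.round == r - 1) >= N - F


def causal_history(vertex, index):
    """All vertices reachable from `vertex` via parent edges (its causal history)."""
    seen, stack = set(), [vertex]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(index[key] for key in v.parents)
    return seen
```

Because parent references pin down the entire causal history, two validators that hold the same vertex necessarily compute the same `causal_history` for it, which is exactly the non-equivocation property.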


Total Ordering

The total order of all vertices in the DAG can be achieved without additional communication overhead. To this end, the validators in DAG-Rider, Tusk, and Bullshark interpret the structure of the DAG as a consensus protocol, where vertices represent proposals and edges represent votes.

Although the quorum intersection logic over the DAG structure differs, all existing Narwhal-based consensus protocols share the following structure:

  1. Anchor Points: every few rounds (for example, every two rounds in Bullshark) there is a predetermined leader, and the leader's vertex is called an anchor point;

  2. Ordering Anchor Points: Validators independently but deterministically decide which anchor points to order and which anchor points to skip;

  3. Causal History Ordering: Validators process their ordered anchor point list one by one, and for each anchor point, they order all previously unordered vertices in its causal history according to some deterministic rules.
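Steps (2) and (3) above can be sketched as follows. This is an illustrative sketch, not any protocol's actual code: a vertex is represented as a `(round, author)` pair, `parents` is an assumed map from each vertex to its parent set, and sorting by `(round, author)` stands in for "some deterministic rule".

```python
def causal_history(v, parents):
    """Vertices reachable from v via parent edges; a vertex is a (round, author) pair."""
    seen, stack = set(), [v]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(parents[u])
    return seen


def order_dag(ordered_anchors, parents):
    """Step (3): walk the agreed anchor list; for each anchor, append every
    not-yet-ordered vertex in its causal history under a deterministic rule
    (here: sorted by (round, author))."""
    log, delivered = [], set()
    for anchor in ordered_anchors:
        for v in sorted(causal_history(anchor, parents) - delivered):
            log.append(v)
            delivered.add(v)
    return log
```

Note how the safety requirement falls out: if two honest validators' anchor lists share a prefix, the logs produced by `order_dag` share a prefix too, since each anchor deterministically fixes the batch of vertices it delivers.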

The key to safety is ensuring that in step (2) above, all honest validators create ordered anchor lists that share the same prefix. In Shoal, we make the following observation about all the protocols above:

All validators agree on the first ordered anchor point.

Bullshark latency

Bullshark's latency depends on the number of rounds between ordered anchors in the DAG. While the partially synchronous version of Bullshark is more practical and has better latency than the asynchronous version, it is still far from optimal.

Question 1: average block latency. In Bullshark, every even round has an anchor and every odd-round vertex is interpreted as a vote. In the common case, two rounds of DAG are needed to order an anchor; however, vertices in an anchor's causal history need more rounds while waiting for an anchor to be ordered: in the common case, vertices in odd rounds need three rounds, and non-anchor vertices in even rounds need four.
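These common-case numbers can be written down directly. The function below is only a restatement of the analysis above (anchors in even rounds, votes in odd rounds), with an illustrative name.

```python
def common_case_latency(vertex_round, is_anchor):
    """Rounds a vertex waits before being ordered in Bullshark's common case,
    per the analysis above (anchors in even rounds, votes in odd rounds)."""
    if is_anchor:
        return 2   # an anchor is ordered after one round of votes
    if vertex_round % 2 == 1:
        return 3   # an odd-round (vote) vertex waits for the next anchor to be ordered
    return 4       # an even-round non-anchor vertex waits one round longer
```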

Question 2: failure-case latency. The analysis above applies to the fault-free case. If, on the other hand, a round's leader fails to broadcast its anchor fast enough, the anchor cannot be ordered (and is therefore skipped), so all unordered vertices from preceding rounds must wait for the next anchor to be ordered. This significantly degrades performance in a geo-replicated network, especially because Bullshark uses timeouts to wait for leaders.


Shoal Framework

Shoal addresses both latency issues by enhancing Bullshark (or any other Narwhal-based BFT protocol) with pipelining, allowing an anchor in every round and reducing the latency of all non-anchor vertices in the DAG to three rounds. Shoal also introduces a zero-cost leader-reputation mechanism in the DAG that biases selection toward fast leaders.

Challenge

In the context of DAG protocols, pipelining and leader reputation are considered difficult problems, for the following reasons:

  1. Previous attempts at pipelining tried to modify the core Bullshark logic, but this appears to be fundamentally impossible.

  2. Leader reputation, introduced in DiemBFT and formalized in Carousel, is based on dynamically selecting future leaders (the anchors in Bullshark) according to validators' past performance. Although disagreement over leader identity does not violate safety in those protocols, in Bullshark it could lead to completely different orderings. This gets to the core of the issue: dynamically and deterministically selecting round anchors is necessary to solve consensus, and validators need to agree on an ordered history in order to choose future anchors.

As evidence of the problem's difficulty, we note that Bullshark implementations, including the one currently in production, do not support these features.

Protocol

Despite the challenges above, the solution, as the saying goes, turned out to be hiding behind simplicity.

In Shoal, we rely on the ability to perform local computation over the DAG, and implement the ability to save and reinterpret information from earlier rounds. With the core insight that all validators agree on the first ordered anchor, Shoal sequentially composes multiple Bullshark instances to pipeline them, making the first ordered anchor the switchover point between instances, and using the causal histories of anchors to compute leader reputation.

![Detailed explanation of the Shoal framework: How to reduce Bullshark latency on Aptos?](https://img-cdn.gateio.im/webp-social/moments-46d37add0d9e81b2f295edf8eddd907f.webp)

Pipelining

Recall the predefined mapping F that maps rounds to leaders. Shoal runs Bullshark instances one after another, so that for each instance the anchors are predetermined by the mapping F. Each instance orders an anchor, which triggers the switch to the next instance.

Initially, Shoal launches the first instance of Bullshark in the first round of the DAG and runs it until the first ordered anchor is determined, say in round r. All validators agree on this anchor, so all of them can confidently reinterpret the DAG starting from round r+1. Shoal then simply launches a new Bullshark instance at round r+1.

In the best case, this allows Shoal to order one anchor per round: the anchor of the first round is ordered by the first instance; then Shoal starts a new instance in round two, whose own anchor is ordered by that instance; then yet another new instance orders an anchor in round three, and the process continues.
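The instance-switching loop can be sketched as follows. This is a sketch under stated assumptions: `run_instance` is a hypothetical stand-in for one Bullshark instance and is assumed to return the round of the first anchor it orders; `leader_of` plays the role of the mapping F.

```python
def shoal_pipeline(last_round, run_instance, leader_of):
    """Sketch of Shoal's driver: run Bullshark instances back to back.
    `run_instance(start_round, leader_of)` returns the round of the first
    anchor that instance orders; the next instance then starts one round later."""
    start, ordered_anchor_rounds = 1, []
    while start <= last_round:
        r = run_instance(start, leader_of)  # all validators agree on this anchor
        ordered_anchor_rounds.append(r)
        start = r + 1                       # switch: reinterpret the DAG from round r + 1
    return ordered_anchor_rounds
```

In the best case every instance orders the anchor of its very first round, so the loop yields one anchor per round; if an instance needs extra rounds, the next instance simply starts later, and safety is unaffected because the switchover point is agreed upon.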

![Detailed Explanation of the Shoal Framework: How to Reduce Bullshark Latency on Aptos?](https://img-cdn.gateio.im/webp-social/moments-0b0928cb6240e994c1514c75e080a4b2.webp)

Leader Reputation

Skipping anchors during Bullshark's ordering increases latency. In that case, pipelining is powerless, because a new instance cannot start before the previous instance orders an anchor. Shoal therefore uses a reputation mechanism to assign each validator a score based on its recent activity, making it less likely that leaders responsible for missed anchors are chosen in the future: validators that respond and participate in the protocol receive high scores, while validators that crash, are slow, or act maliciously are assigned low scores.

The idea is to deterministically recompute the predefined round-to-leader mapping F on every score update, biasing it toward higher-scoring leaders. For validators to agree on the new mapping, they must agree on the scores, and hence on the history used to derive the scores.

In Shoal, pipelining and leader reputation combine naturally, since both rely on the same core technique: reinterpreting the DAG after reaching agreement on the first ordered anchor.

In fact, the only difference is that after ordering the anchor of round r, validators compute a new mapping F', starting from round r+1, based on the causal history of that ordered anchor. They then run a new Bullshark instance, starting at round r+1, with the updated anchor-selection function F'.
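A sketch of this reputation update follows. The scoring rule here (count each validator's vertices in the ordered anchor's causal history) is an illustrative assumption, not Shoal's production scheme; what matters is that the input is agreed history, so every validator derives the same F'.

```python
def recompute_leader_map(anchor_history_authors, validators):
    """After ordering round r's anchor, derive scores from its causal history
    and return a new deterministic round -> leader mapping F' biased toward
    higher-scoring validators. (Scoring rule is an illustrative assumption.)"""
    scores = {v: 0 for v in validators}
    for author in anchor_history_authors:  # responsive validators appear in the history
        scores[author] += 1
    # deterministic rank: higher score first, validator id breaks ties
    ranked = sorted(validators, key=lambda v: (-scores[v], v))
    def f_prime(round_):                   # same agreed history -> same mapping everywhere
        return ranked[round_ % len(ranked)]
    return f_prime
```

Since `anchor_history_authors` comes from the causal history of an anchor that all validators have already agreed to order, the recomputation is local and free: no extra communication rounds are needed to agree on F'.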

![A Comprehensive Explanation of the Shoal Framework: How to Reduce Bullshark Latency on Aptos?](https://img-cdn.gateio.im/webp-social/moments-859e732e16c3eee0e2c93422474debc2.webp)

No more timeouts

Timeouts play a critical role in every leader-based deterministic partially synchronous BFT implementation. However, the complexity they introduce increases the number of internal states that must be managed and observed, which complicates debugging and demands more observability techniques.

Timeouts can also significantly increase latency, since configuring them correctly is crucial and often requires dynamic adjustment, as they depend heavily on the environment (network). Before moving on to the next leader, the protocol pays the full timeout latency penalty for a faulty leader. Timeout settings therefore cannot be too conservative; yet if the timeout is too short, the protocol may skip good leaders. For example, we observe
