Biblio
Atomic multicast is a communication primitive that delivers messages to multiple groups of processes according to some total order, with each group receiving the projection of the total order onto messages addressed to it. To be scalable, atomic multicast needs to be genuine, meaning that only the destination processes of a message should participate in ordering it. In this paper we propose a novel genuine atomic multicast protocol that, in the absence of failures, takes as few as 3 message delays to deliver a message when no other messages are multicast concurrently to its destination groups, and 5 message delays in the presence of concurrency. This improves on the latencies of both the fault-tolerant version of Skeen's classical multicast protocol (6 or 12 message delays, depending on concurrency) and its recent improvement by Coelho et al. (4 or 8 message delays). To achieve such low latencies, we depart from the typical way of guaranteeing fault tolerance, which is to replicate each group with Paxos. Instead, we weave Paxos and Skeen's protocol together into a single coherent protocol, exploiting opportunities for white-box optimisations. We experimentally demonstrate that the superior theoretical characteristics of our protocol are reflected in practical performance pay-offs.
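The ordering guarantee stated in the abstract above (each group sees the projection of one total order onto the messages addressed to it) can be illustrated with a minimal sketch. It shows only the specification, not the protocol from the paper; the function `project`, the message names, and the group names are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Set


def project(total_order: List[str], dests: Dict[str, Set[str]], group: str) -> List[str]:
    """Return the messages addressed to `group`, kept in total-order position."""
    return [m for m in total_order if group in dests[m]]


# Hypothetical scenario: three messages, each multicast to two of three groups.
total_order = ["m1", "m2", "m3"]
dests = {
    "m1": {"g1", "g2"},
    "m2": {"g2", "g3"},
    "m3": {"g1", "g3"},
}

# Delivery sequence each group would observe under atomic multicast.
deliveries = {g: project(total_order, dests, g) for g in ("g1", "g2", "g3")}
```

Any two groups that both receive a pair of messages deliver them in the same relative order, because both sequences are cut from the same total order.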
Real-world wireless networks usually have diverse connectivity characteristics. Although existing works have identified replication as the key to the successful design of routing protocols for these networks, the questions of when replication should be used, how much to replicate, and how to distribute packet copies are still not satisfactorily answered. In this paper, we investigate these questions and present the design of the Hybrid Routing Protocol (HRP). We make a key observation that delay correlations can significantly affect the performance improvements gained from packet replication. Thus, we propose a novel model to capture the correlations of inter-contact times among a group of nodes. HRP uses both direct delay feedback and the proposed model to estimate the replication gain, which is then fed into a novel regret-minimization algorithm that dynamically decides the amount of packet replication under unknown network conditions. We evaluate HRP through extensive simulations. We show that HRP achieves up to 3.5x delivery ratio improvement and up to 50% delay reduction, with overhead comparable to or even lower than that of state-of-the-art routing protocols.
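The abstract above does not describe HRP's regret-minimization algorithm, so the sketch below substitutes a standard technique, the multiplicative-weights (Hedge) update, to show the general idea of learning how many copies to send from observed losses. It is not HRP's algorithm. It assumes a simplified full-information setting in which a loss is observed for every option each round, and the candidate copy counts, the loss values, and the learning rate `eta` are made up for illustration.

```python
import math
from typing import List


def hedge_update(weights: List[float], losses: List[float], eta: float) -> List[float]:
    """One multiplicative-weights step: down-weight high-loss options, then renormalise."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical candidate replication levels (number of packet copies).
copies = [1, 2, 3, 4]
weights = [1.0 / len(copies)] * len(copies)

# Made-up per-round losses in [0, 1], e.g. a mix of delay and overhead.
losses = [0.9, 0.4, 0.5, 0.8]

for _ in range(50):
    weights = hedge_update(weights, losses, eta=0.5)

# Replication level currently carrying the most weight.
best = copies[max(range(len(copies)), key=weights.__getitem__)]
```

With a fixed loss vector the weights concentrate on the lowest-loss option. In a real network the losses would change over time and only the chosen option's outcome would typically be observed, which calls for a bandit-style variant.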