We are happy to inform you that your paper Effects of Topology Knowledge and Relay Depth on Asynchronous Consensus has been accepted to appear in the programme of OPODIS 2018. Congratulations! You will find the reviews below. Please revise your paper to address the reviewers’ comments and suggestions. The deadline to submit the camera-ready version of your paper is November 14, 2018. Detailed instructions for preparing the camera-ready version of your paper and how to submit it will be posted in the Information for Authors section of the OPODIS 2018 website. To have an accepted paper included in the OPODIS 2018 program and proceedings, at least one author must register for the conference and attend it to present the paper there. Registration information will be available at https://opodis2018.comp.polyu.edu.hk Thank you for submitting your work to OPODIS. We look forward to seeing you at the conference in Hong Kong. Best regards, Faith Ellen and Luis Rodrigues OPODIS 2018 PC Chairs ----------------------- REVIEW 1 --------------------- PAPER: 53 TITLE: Effects of Topology Knowledge and Relay Depth on Asynchronous Consensus AUTHORS: Dimitris Sakavalas, Lewis Tseng and Nitin Vaidya ----------- Overall evaluation ----------- Well-motivated extension to previous work on an interesting problem, although not a lot of novelty. ----------- Summary of Contribution ----------- The paper studies the approximate consensus problem in asynchronous message-passing systems with crash failures where the communication topology is an arbitrary directed graph and there are various restrictions on how far messages can be relayed and how much topology information the nodes have. The specific goal is to precisely characterize the communication topologies in which the problem is solvable. This paper is a variation on prior work [32] by two of the authors, which included a study of the same problem but without the limitations on initial knowledge and relay depth. Restricting the amount of information and relay depth is important for having practical algorithms. An exact characterization is shown for the following cases: (1) Nodes initially know their k-neighborhood and can propagate their state values to nodes at most k hops away, 1 <= k <= n, where n is the number of nodes (so this is for a restricted class of algorithms, called "iterative k-hop"). For this case, a condition called k-CCA is proposed and it is shown that if the network does not satisfy k-CCA then approximate consensus is not solvable. Then an algorithm called k-LocWA is presented and shown to solve approximate consensus as long as the network satisfies k-CCA. Condition k-CCA is very similar to the condition CCA in [32] and algorithm k-LocWA is very similar to the algorithm WA in [32] (main differences are that the wait condition for moving to the next phase is different and message propagation is limited to distance k). (2) Nodes initially know only their immediate neighbors but they can send (and forward) messages with arbitrary content throughout the network (so this is for a general class of algorithms). For this case, the authors point out that the necessity of condition CCA follows from [32] (since [32] showed CCA is necessary even when nodes have more initial knowledge). Then an algorithm is presented that essentially combines learning the topology through flooding with the iterative approach of k-LocWA (and WA). This algorithm works as long as the network satisfies CCA. ----------- What are the good things about the paper? ----------- The general problem (from [32]) is a natural question relating to feasibility of consensus, which was overlooked for many years. The restriction of the problem to limited knowledge and relay depth is well-motivated. Tight characterizations are achieved for a reasonable class of algorithms in case 1 and for all algorithms in case 2. ----------- What are the weaknesses of the paper? ----------- There is not a lot of novelty, as mostly the results are fairly straightforward extensions of [32]. The issue of termination is not dealt with clearly. I was assuming that at some point each process should terminate with a decision value but the algorithms (unlike those in [32]) do not terminate. Then I searched for the actual definition of the problem and only found the following (p. 3): "Intuitively, the fault-free nodes must be in the range of all the inputs, and are guaranteed to be within ε of each other for some ε > 0 after a sufficiently large number of rounds." This definition does not say that the output must remain within epsilon of each other forever after, although I guess that is required? Is this issue related to the terminology "approximate" versus "asymptotic" as in footnote 3 on page 3? I believe the calculation of the k-WAIT condition (page 9) and the n-WAIT condition (page 12) can take exponential (local computation) time, as the node has to check all possible subsets of its neighbors with size at most f. If I am correct, this undercuts the motivation of efficiency for the problem definition. Section 5.2 on real-time speed-up was very confusing; for starters, it would help to define the timed model more clearly. ----------- Detailed Technical Comments ----------- page 2, line 65: "algorithm" misspelled. page 4, line 123: "Fischer" misspelled. page 5, theorem 2: Does this result hold regardless of topology knowledge or relay depth? page 6: * line 224: what is the meaning of the . parameter? * Receive step of iterative k-hop algorithms: How does the node know when the receive is done, i.e., when it has waited long enough? Later I see that this is addressed, but it was a bit confusing on my initial read. Perhaps say very early that the node will apply some criterion to decide when to leave step 2. * line 248: "(ii)" should be "2." page 7: * Definition 4 (1-CCA): It took me a lot of careful comparison between the definition of CCA and this definition to figure out what the difference is. Perhaps help the reader and point out that 1-CCA requires the existence of a single node that has f+1 incoming neighbors, while CCA doesn't. * line 262: "inspired by Algorithm WA": The differences are that messages are not echoed, the algorithm doesn't terminate, and the wait condition is different. It would be helpful to discuss more about these differences, especially the issue of (non)termination. page 8: * line 307: F[p] is not defined * line 318: "or" should be "for" page 9: * line 344: "its" should be "their" * line 345ff: How expensive is it to calculate condition k-WAIT? * line 372: "that there exist" should be "such that there exist at" * line 373: "terminology" should be "the terminology" page 10: * line 381: What corruption? You are only considering crash failures. * line 397: I don't see how the statement of Theorem 14 means that lower k requires higher connectivity. * line 403: "consist" should be "consist of" * line 405: Wouldn't it be more precise to say that Condition CCA is equivalent to Condition d-CCA, where d is the diameter of the graph? (I.e., use diameter d instead of number of nodes n) page 11: * Convergence Time Comparison: Do you know of any lower bounds on the number of phases? * line 437: "differed" should be "deferred" * line 443: I believe "k' >= k" should be "k' > k" page 12: * line 460: "interests" should be "interest" * line 491: Since the learned graph is reset to t ----------------------- REVIEW 2 --------------------- PAPER: 53 TITLE: Effects of Topology Knowledge and Relay Depth on Asynchronous Consensus AUTHORS: Dimitris Sakavalas, Lewis Tseng and Nitin Vaidya ----------- Overall evaluation ----------- Approximate agreement is an old and interesting problem in fault-tolerant distributed computing. On the other hand, the paper contribution is somewhat incremental with respect to the authors previous work (podc'15). ----------- Summary of Contribution ----------- The paper addresses the approximate consensus problem in the following setting: - processes are asynchronous and may crash. Up to f processes may crash. - the communication network is a directed graph - only the class of iterative algorithm is considered. Essentially, each node p maintains a value, which is its estimate of the agreement value. This value is updated in asynchronous phase based on a subset the values of other node in the ball of radius k centered at node p. The possible subsets depend on the topology of the ball and the number of failures to tolerate. The main result is a tight characterization, depending on f and k, of the networks for which approximate agreement is solvable by an algorithm in the restricted class. A complexity analysis of the algorithm presented for the upper bound is also provided. ----------- What are the good things about the paper? ----------- The characterization is tight. The paper is rather well-written. Approximate agreement has a long history in distributed fault-tolerant computing, and has regained interest in the recent years. ----------- What are the weaknesses of the paper? ----------- The paper heavily relies on previous work by the author ([32] Fault-tolerant consensus in directed graphs. PODC 2015), although in the current work, only crash-failures are considered. The techniques used in the paper at hand are essentially the same. The k relay depth assumption is actually a constraint imposed on the algorithm, not a consequence of the communication system. Even if messages are relayed at most k times, a perhaps incomplete map of the network can be constructed. Should additional constraints on the size of the memory of the nodes or on the size of message size should be added? Node can actually learn (part of) the topology ----------- Detailed Technical Comments ----------- Approximate consensus should appear in the title. Typo algroithm line 65 Introduction: What are iterative algorithms? If there is no iterative algorithm, does it mean that there is no algorithm at all? Can the necessary conditions be beaten if non-iterative algorithms are allowed? line 89 what does CCA stand for? k relay depth is actually a constraint imposed on the algorithm, not the network. What is the motivation behind k-hop iterative algorithm? The introduction should make clear that the paper considers in fact a restricted class of approximate consensus algorithm. The assumption that messages are relayed at most k times does not prevent the node from building perhaps incomplete maps of the network. This is actually done in section 4. Typo l188 lim p-> \infty ----------------------- REVIEW 3 --------------------- PAPER: 53 TITLE: Effects of Topology Knowledge and Relay Depth on Asynchronous Consensus AUTHORS: Dimitris Sakavalas, Lewis Tseng and Nitin Vaidya ----------- Overall evaluation ----------- I think the paper is well-written and provides interesting results: it considers fault-tolerant approximate consensus in asynchronous multi-hop directed networks, and improves on previous work by investigating more practical solutions for large-scale networks. The results and techniques are direct extensions of previous work by two of the same authors, by generalizing the n-hop case to the k-hop case for arbitrary k in the range 1,...,n. All of the details needed to verify the results were included. ----------- Summary of Contribution ----------- This paper studies asynchronous fault-tolerant approximate consensus in message-passing models where the communication graph is not complete, and the channels are not necessarily bi-directional. These are modeled by arbitrary directed graphs. Further, the paper places two restrictions: an upper bound on each node's initial knowledge of the topology (each node initially knows its k-hop neighbourhood), and an upper bound on how far a node's message can be forwarded (the relay depth of each node's messages is k'). This paper addresses the following question in asynchronous directed networks with crash faults: what is a tight condition on the underlying communication graphs for achieving approximate consensus if each node has only k-hop topology knowledge and relay depth k''? The motivation is that prior work assumed that nodes had n-hop topology knowledge and used flooding (i.e., relay depth n), which is impractical in large-scale systems. This paper considers what can be accomplished using "local" algorithms. The main contributions of this paper are: - for arbitrary k in the range 1,...,n, in the case where k-hop topology knowledge is known and relay depth is k, the authors derive a family of tight conditions, called Condition k-CCA, for solving approximate consensus in directed networks. They prove that the conditions are necessary, and provide an algorithm k-LocWA that works when the conditions hold. - in the case where 1-hop topology knowledge is known and relay depth is n, Condition CCA (from a previous paper) is necessary and sufficient. The main contribution in this case is an algorithm that learns about the topology of the network as nodes flood the network. A sufficient "estimate" of the topology is learned, despite the fact that some nodes might crash and some messages delayed. ----------- What are the good things about the paper? ----------- - the results are of practical interest, particularly in the context of applicability to large networks. The algorithms improve on the previous algorithms that assume complete topology knowledge and flood the network. - shows the tradeoff between convergence time vs. the relay depth and topology knowledge. Large amounts of topology knowledge and large relay depth are costly in large-scale networks, but one can weigh this against the increased convergence time as restrictions on these values get tighter. ----------- What are the weaknesses of the paper? ----------- - the given problem statement for approximate consensus doesn't seem to make sense (or is ill-defined) in an asynchronous model since there are no phases. It does make sense if the paper is only restricting attention to the class of phase-based solutions in this paper (e.g., Iterative k-hop Algorithms) - it seems like an incremental result that reuses the techniques from Tseng and Vaidya "Fault-tolerant consensus in directed graphs" (PODC 2015), but re-stating the conditions and algorithms for k-hop neighbourhoods instead of n-hop neighbourhoods. However, care was taken to make sure all of the details go through. ----------- Detailed Technical Comments ----------- - the definition of approximate consensus on Page 5 talks about phases or iterations. What is a phase? It seems that "phase" depends on the definition of your algorithms, so the problem doesn't seem to be defined well. - line 188: the subscript of the limit should be infinity - in the definition of Iterative k-hop Algorithm, the notion of phase isn't clear to me. How do nodes know when a phases start/end? How do they know when to apply the transition function, i.e., can they know when they've received all messages from their k-hop in-neighbourhood? Reading ahead, your algorithms define phase boundaries differently, e.g., after an adequate subset of messages is received. Perhaps be clearer in your definition of Iterative k-hop Algorithms that nodes attempt to perform steps 1 and 2, but that step 3 will be performed after a certain waiting condition on the set of received messages is satisfied. - The sentence before Theorem 14 seems out of place: the subsequent results don't talk about connectivity of the graph.