Node Failure (node + failure)

Selected Abstracts


On the Internet routing protocol Enhanced Interior Gateway Routing Protocol: is it optimal?

INTERNATIONAL TRANSACTIONS IN OPERATIONAL RESEARCH, Issue 3 2006
James R. Yee
Abstract Cisco's proprietary routing protocol, EIGRP (Enhanced Interior Gateway Routing Protocol), is one of the two most widely employed routing protocols in the Internet. The underlying algorithm is reputed to be optimal with respect to the EIGRP metric. We construct a counterexample to illustrate that it is not optimal. We implemented the test network from the counterexample in our Networking Lab and confirmed that the Cisco routers did not find optimal routes. We suggest ways in which the EIGRP algorithm can be improved. These suggestions would also improve the operation of the Diffusing Update Algorithm (DUAL), the portion of EIGRP used to recover from link and node failures. [source]
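
The non-optimality the abstract alludes to is easiest to see in the metric itself. With the default K-values, the EIGRP composite metric is 256 × (10^7 / bottleneck bandwidth in kbps + cumulative delay in tens of microseconds); because of the bottleneck term it is not additive over links, so a router that re-advertises only its own best route can hide the path a neighbor would prefer. The sketch below demonstrates this with an invented topology and a simplified distance-vector composition; it is not the paper's counterexample, and it is not a DUAL implementation.

```python
# Illustrative sketch (not the paper's counterexample): with default
# K-values the EIGRP composite metric is
#   256 * (10**7 / bottleneck_bandwidth_kbps + delay_in_tens_of_us),
# which is not additive over links, so hop-by-hop selection can miss
# the best path. Topology and link figures below are made up.

# (u, v) -> (bandwidth in kbps, delay in tens of microseconds)
LINKS = {
    ("S", "X"): (500, 1),
    ("X", "A"): (10000, 50), ("A", "D"): (10000, 50),
    ("X", "B"): (1000, 5),   ("B", "D"): (100000, 5),
}

def composite(min_bw, delay):
    return 256 * (10**7 // min_bw + delay)

def all_paths(src, dst, visited=frozenset()):
    """All simple paths src -> dst, each as a tuple of links."""
    visited = visited | {src}
    if src == dst:
        yield ()
        return
    for (u, v) in LINKS:
        if u == src and v not in visited:
            for rest in all_paths(v, dst, visited):
                yield ((u, v),) + rest

def path_metric(path):
    figs = [LINKS[link] for link in path]
    return composite(min(bw for bw, _ in figs), sum(d for _, d in figs))

# True optimum by exhaustive search over the path's composite metric.
best = min(all_paths("S", "D"), key=path_metric)
print("exhaustive optimum:", best, path_metric(best))

# Distance-vector flavour: each node keeps one (min_bw, delay) vector,
# the one minimizing its *own* composite, and neighbours can only
# build on that choice.
vec = {"D": (float("inf"), 0)}
for _ in range(len(LINKS)):
    for (u, v), (bw, d) in LINKS.items():
        if v in vec:
            cand = (min(bw, vec[v][0]), d + vec[v][1])
            if u not in vec or composite(*cand) < composite(*vec[u]):
                vec[u] = cand
print("hop-by-hop metric at S:", composite(*vec["S"]))
```

On this topology the hop-by-hop computation settles on the route through A, because X advertises only its own best choice, while exhaustive search finds the cheaper route for S through B; this is the flavour of non-optimality the paper exploits.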


The Neutralizer: a self-configurable failure detector for minimizing distributed storage maintenance cost

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 2 2009
Zhi Yang
Abstract To achieve high data availability or reliability in an efficient manner, distributed storage systems must detect whether an observed node failure is permanent or transient and, if necessary, generate replicas to restore the desired level of replication. Given the unpredictability of network dynamics, however, distinguishing permanent from transient failures is extremely difficult. Though timeout-based detectors can be used to avoid mistaking transient failures for permanent ones, it is unknown how the timeout values should be selected to achieve a better tradeoff between detection latency and accuracy. In this paper, we address this fundamental tradeoff from several perspectives. First, we explore the impact of different timeout values on maintenance cost by examining the probability of their false positives and false negatives. Second, we propose a self-configurable failure detector called the Neutralizer, based on the idea of counteracting false positives with false negatives. The Neutralizer enables the system to maintain a desired replication level on average with the least amount of bandwidth. We conduct extensive simulations using real trace data from a widely deployed peer-to-peer system and synthetic traces based on PlanetLab and Microsoft PCs, showing a significant reduction in aggregate bandwidth usage after applying the Neutralizer (especially in an environment with low average node availability). Overall, we demonstrate that the Neutralizer closely approximates the performance of a perfect 'oracle' detector in many cases. Copyright © 2008 John Wiley & Sons, Ltd. [source]
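
One way to read the "counteracting" idea is as a balance condition: spurious repairs triggered by long transient failures (false positives) add extra replicas, while permanent failures that sit undetected for a timeout's length (false negatives) leave replicas missing; if the two steady-state counts cancel, the average replication level stays on target without padding. The toy below picks a timeout on that criterion. The rates, downtime distribution, and reclaim time are assumptions for illustration, not the paper's model.

```python
# Toy reading of the Neutralizer's balance idea; all figures here are
# assumptions for illustration, not the paper's model or parameters.
import random

random.seed(1)

TRANSIENT_RATE = 0.9    # transient failures per node per day (assumed)
PERMANENT_RATE = 0.1    # permanent failures per node per day (assumed)
RECLAIM_DAYS   = 1.0    # assumed lifetime of a spurious extra replica

# Empirical stand-in for transient downtime durations, in hours.
downtimes = [random.expovariate(1 / 2.0) for _ in range(10_000)]

def imbalance(timeout_h):
    """Average surplus replicas from false positives minus average
    missing replicas from undetected permanent failures (Little's law
    both ways), under the toy model above."""
    p_fp = sum(d > timeout_h for d in downtimes) / len(downtimes)
    surplus = TRANSIENT_RATE * p_fp * RECLAIM_DAYS
    deficit = PERMANENT_RATE * (timeout_h / 24.0)
    return surplus - deficit

# Self-configuration: choose the timeout where the two effects cancel,
# so the average replication level stays on target.
candidates = [h / 2 for h in range(1, 97)]          # 0.5 h .. 48 h
best = min(candidates, key=lambda t: abs(imbalance(t)))
print(f"balanced timeout ~ {best} hours")
```

A shorter timeout detects dead nodes sooner but repairs many nodes that were merely rebooting; the balance point shifts with the downtime distribution, which is why the detector must be self-configuring rather than using one fixed value.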


Capacity provisioning and failure recovery for Low Earth Orbit satellite constellation

INTERNATIONAL JOURNAL OF SATELLITE COMMUNICATIONS AND NETWORKING, Issue 3 2003
Jun Sun
This paper considers the link capacity requirement for an LEO satellite constellation. We model the constellation as an N×N mesh-torus topology under a uniform all-to-all traffic model. Both primary capacity and spare capacity for recovering from a link or node failure are examined. In both cases, we use a method of 'cuts on a graph' to obtain lower bounds on capacity requirements and subsequently find algorithms for routing and failure recovery that meet these bounds. Finally, we quantify the benefits of path-based restoration over link-based restoration; specifically, we find that the spare capacity requirement for a link-based restoration scheme is nearly N times that for a path-based scheme. Copyright © 2003 John Wiley & Sons, Ltd. [source]
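
The 'cuts on a graph' style of bound is simple to reproduce in miniature: under uniform all-to-all traffic, any node set S forces |S| × (N² − |S|) units of one-way demand across the cut, so some crossing link must carry at least that demand divided by the number of crossing links. The sketch below evaluates vertical-strip cuts on a small torus; the unit demand and the cut family are illustrative choices, not necessarily the paper's.

```python
# Cut-based lower bound on per-link capacity for uniform all-to-all
# traffic on an N x N torus; a toy version of the argument style the
# abstract names, with made-up parameters.
from itertools import product

N = 8  # torus dimension; even, so a clean bisection exists

def torus_edges(n):
    """Undirected edges of an n x n wraparound mesh (torus)."""
    for x, y in product(range(n), repeat=2):
        yield ((x, y), ((x + 1) % n, y))
        yield ((x, y), (x, (y + 1) % n))

def cut_lower_bound(in_s):
    """Per-link capacity forced by the cut (S, V - S) under unit
    demand between every ordered pair of nodes."""
    s = {v for v in product(range(N), repeat=2) if in_s(v)}
    crossing = sum((u in s) != (v in s) for u, v in torus_edges(N))
    demand = len(s) * (N * N - len(s))   # one-way demand across the cut
    return demand / crossing

# Vertical strips of increasing width; the bisection (w = N/2) is the
# binding cut for uniform traffic, giving N**3 / 8 here.
best = max(cut_lower_bound(lambda v, w=w: v[0] < w) for w in range(1, N))
print(f"every link needs capacity >= {best}")   # 64.0 for N = 8
```

Any strip of width w crosses 2N links (the strip boundary plus the wraparound), so the bound is w(N − w)N/2, maximized at the bisection; a routing scheme that meets such a bound is then capacity-optimal with respect to that cut.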


Protecting IPTV against packet loss: Techniques and trade-offs

BELL LABS TECHNICAL JOURNAL, Issue 1 2008
Natalie Degrande
Packet loss ratios that are harmless to the quality of data and Voice over Internet Protocol (VoIP) services may still seriously jeopardize that of Internet Protocol television (IPTV) services. In digital subscriber line (DSL)-based access networks, the last mile in particular suffers from packet loss, but other parts of the network may do so too. While packet loss on the last mile link is due to bit errors, in other parts of the network it is caused by buffer overflows or by (short) outages following link or node failures. To retrieve lost packets, the application layer (AL) can use either a forward error correction (FEC) or a retransmission scheme. These schemes, when properly tuned, increase the quality of an IPTV service to an adequate level, at the expense of some overhead bit rate, extra latency, and possibly an increase in channel change time. This paper compares the performance of FEC schemes based on Reed-Solomon (RS) codes with that of retransmission schemes, all tuned to conform to the same maximum overhead bit rate allowed on the last mile link and on the feeder link, and assesses their possible impact on channel change time. We take into account two kinds of loss processes that can occur: isolated packet losses and burst packet losses. In almost all scenarios, retransmission outperforms FEC. © 2008 Alcatel-Lucent. [source]
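
The FEC-versus-retransmission trade-off for isolated losses can be sized with elementary probability: an RS-style erasure code over n packets, k of them media, survives up to n − k losses at a fixed overhead of (n − k)/k, while retransmission pays overhead only on actual losses but needs feedback and latency budget. The block size, loss ratio, and independence assumption below are illustrative choices, not the paper's settings.

```python
# Back-of-the-envelope comparison of AL-FEC (Reed-Solomon erasure
# style) and retransmission under independent ("isolated") packet
# loss; burst losses, which the paper also treats, behave differently.
from math import comb

p = 1e-3        # assumed packet loss ratio (isolated losses)
n, k = 24, 20   # assumed RS block: 20 media packets + 4 repair packets

def fec_residual(n, k, p):
    """P(block unrecoverable) = P(more than n - k of the n packets
    lost), assuming independent losses."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n - k + 1, n + 1))

def retx_residual(p, rounds):
    """P(a packet is still missing after `rounds` retransmission
    attempts), assuming loss-free feedback and enough latency budget."""
    return p ** (rounds + 1)

print(f"FEC:  overhead {(n - k) / k:.0%}, residual {fec_residual(n, k, p):.1e}")
print(f"ReTx: overhead ~{p / (1 - p):.2%}, residual {retx_residual(p, 1):.1e}")
```

With these figures FEC pays its 20% overhead whether or not anything is lost, while retransmission pays roughly 0.1%; that asymmetry is the intuition behind the abstract's finding that retransmission wins in almost all scenarios.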