CrystalBall: Predicting and Preventing Inconsistencies in Deployed Distributed Systems

Abstract

We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighborhood and predicts possible future violations of specified safety properties. We describe a new state exploration algorithm, consequence prediction, which explores causally related chains of events that lead to property violation. This paper describes the design and implementation of this approach, termed CrystalBall. We evaluate CrystalBall on RandTree, BulletPrime, Paxos, and Chord distributed system implementations. We identified new bugs in mature Mace implementations of three systems. Furthermore, we show that if the bug is not corrected during system development, CrystalBall is effective in steering the execution away from inconsistent states at runtime.

Citation

Maysam Yabandeh, Nikola Knežević, Dejan Kostić, and Viktor Kuncak. CrystalBall: Predicting and preventing inconsistencies in deployed distributed systems. In 6th USENIX Symp. Networked Systems Design and Implementation (NSDI '09), 2009.

BibTeX Entry

@INPROCEEDINGS{YabandehETAL09CrystalBall,
  author = {Maysam Yabandeh and Nikola Kne\v{z}evi\'c and Dejan Kosti\'c and Viktor Kuncak},
  title = {{CrystalBall}: Predicting and Preventing Inconsistencies in Deployed Distributed Systems},
  year = 2009,
  booktitle = {6th USENIX Symp. Networked Systems Design and Implementation (NSDI '09)},
  url = {http://www.usenix.org/events/nsdi09/tech/full_papers/yabandeh/yabandeh_html/index.html},
  localurl = {http://lara.epfl.ch/~kuncak/papers/YabandehETAL09CrystalBall.pdf}
}
