Society is increasingly dependent on complex computer systems, reflected in current trends such as IoT, and cloud computing. Bugs in these systems impose threats for security and safety and can cause immense financial losses. Even after detecting a bug, it is challenging to understand the interactions of parameters which caused it. Bugs can be due to issues in single components as well as faulty synchronization between components in distributed and networked systems. A particularly challenging class of bugs which fail to manifest or manifest differently during debugging are Heisenbugs. This project proposes automated diagnosis techniques for Heisenbugs in order to provide guidance for bug fixing.
The already challenging task of identifying the cause of a bug becomes even more cumbersome if those bugs disappear or change their behavior under observation. Such bugs occur in a range of contexts including elusive concurrency bugs as well as unintended system alterations during debugging and — as a pun on the name of Werner Heisenberg — are often referred to as Heisenbugs. Heisenbugs can be caused by various sources of nondeterminism on different system levels, many of which developers and testers might not even be aware of. This paper provides formal foundations for rigorously reasoning about causes of Heisenbugs. It provides a formal definition of Heisenbugs in terms of a hyperproperty and introduces a framework for determining the causality of Heisenbugs in presence of multiple candidate causes based on said hyperproperty. We analyze the properties of causes and the implications on practical causal analyses.
A faithful characterization of backdoors is a prerequisite for an effective automated detection. Unfortunately, as we demonstrate, formalization attempts in terms of temporal safety properties prove far from trivial and may involve several revisions. Moreover, given the complexity of the task at hand, a hapless revision of a property may not only eliminate but also introduce inaccuracies in the specification. We introduce a method called differential property monitoring that addresses this challenge by monitoring discrepancies between two versions of a property, and illustrate that this technique can also be used to analyze observations of untrusted components. We demonstrate the utility of the approach using a range of case studies – including the recently discovered xz backdoor.