posted on 2024-07-11, 18:36 authored by XiaoYuan Xie
Spectrum-based fault localization (SBFL) has been widely studied because of its simplicity and effectiveness. However, it still faces challenging problems. Two of the most critical are how to apply SBFL in the absence of a test oracle and how to select the most effective risk evaluation formulas. This thesis addresses both problems. All existing SBFL techniques assume the existence of a test oracle. Without one, the program spectrum cannot be associated with a testing result of failed or passed, and consequently there is insufficient information to perform risk evaluation. However, in many real-world applications test oracles do not exist, and hence SBFL cannot be applied in such situations. Therefore, in this thesis, we propose the novel concept of a metamorphic slice, which results from the integration of metamorphic testing and program slicing, to alleviate the oracle problem for SBFL. In our approach, instead of using the program slice and the testing result of failed or passed for an individual test case, a metamorphic slice and the testing result of violation or non-violation of a metamorphic relation are used. Since we need not know whether the execution result of an individual test case is correct, the existence of a test oracle is no longer a prerequisite for SBFL. Experimental results show that our proposed solution delivers performance comparable to that of existing SBFL techniques in situations where test oracles exist. As a consequence, our study significantly extends the applicability of SBFL. As for the second problem, selecting the most effective risk evaluation formulas has been one of the most important tasks in SBFL, yet no completely satisfactory solution exists.
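To make the first of these contributions concrete, the following is a minimal sketch of SBFL driven by metamorphic slices, under several assumptions that are not stated in this abstract: the metamorphic slice of a test group is taken to be the union of the statement sets covered by its source and follow-up executions, Ochiai is used as an arbitrary example of a risk evaluation formula, and the names `metamorphic_slice`, `rank_statements` and the toy data are hypothetical.

```python
import math


def metamorphic_slice(source_slice, followup_slice):
    # Assumption: the slice of a metamorphic test group is the union of the
    # statements covered by its source and follow-up executions.
    return set(source_slice) | set(followup_slice)


def ochiai(ef, ep, nf):
    # Ochiai risk value, chosen only as an example formula.
    # ef / ep: violated / non-violated groups whose slice contains the statement
    # nf: violated groups whose slice does not contain the statement
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def rank_statements(groups):
    """groups: iterable of (source_slice, followup_slice, violated) triples."""
    slices = [(metamorphic_slice(s, f), violated) for s, f, violated in groups]
    total_violated = sum(1 for _, violated in slices if violated)
    statements = set().union(*(m for m, _ in slices))
    scores = {}
    for stmt in statements:
        ef = sum(1 for m, violated in slices if violated and stmt in m)
        ep = sum(1 for m, violated in slices if not violated and stmt in m)
        scores[stmt] = ochiai(ef, ep, total_violated - ef)
    # Highest risk first; ties are broken by statement number.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical data: statements are numbered 1..5, and each group records
# whether the metamorphic relation was violated (no test oracle is consulted).
groups = [
    ({1, 2, 3}, {1, 2, 4}, True),
    ({1, 3}, {1, 3, 5}, True),
    ({1, 2}, {1, 2, 4}, False),
    ({1, 5}, {1, 5}, False),
]
ranking = rank_statements(groups)
print(ranking)
```

The only change from conventional SBFL in this sketch is the input: violation or non-violation of a metamorphic relation replaces the failed or passed verdict, so no expected output for any individual test case is needed.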
It is well known that risk evaluation is critical in SBFL, and hence many studies have compared the performance of various risk evaluation formulas. Most previous studies have adopted an empirical approach, which, however, can hardly be considered sufficiently comprehensive because of the huge number of possible combinations of factors in SBFL. Although some studies have aimed to overcome the limitations of empirical studies through a theoretical approach, they were based on the strictest type of equivalence, which does not properly reflect more realistic scenarios, and they did not adopt the most commonly used performance metric. Therefore, in this thesis, we provide a theoretical investigation of the effectiveness of risk evaluation formulas. We define two types of relations between formulas, namely, equivalent and better. To identify the relations between different formulas, we develop an innovative framework for the theoretical investigation. Our framework is based on the concept that the effectiveness of a formula is determined by the number of statements with risk values higher than that of the faulty statement. The framework divides all program statements into three disjoint sets, whose risk values are higher than, equal to and lower than that of the faulty statement, respectively. For different formulas, the sizes of their sets are compared using the notion of subset. We use this framework to identify the maximal formulas, which should be the only candidate formulas for use. Compared with previous studies, our conclusions are derived from a completely theoretical analysis, and hence are more robust. In addition, we adopt the most commonly used performance metric, and use a more general and intuitively appealing type of equivalence relation.
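The set-based comparison described above can be illustrated with a small sketch. It rests on assumptions beyond this abstract: one formula is treated as better than another on a given spectrum when its higher-risk set is a subset of the other's and the other's lower-risk set is a subset of its own; Jaccard and Tarantula are arbitrary example formulas; and the spectrum and the helper names `partition` and `compare` are hypothetical. The relations in the thesis are established theoretically, whereas this code inspects a single spectrum and therefore proves nothing in general.

```python
from fractions import Fraction

FAILED, PASSED = 2, 3  # hypothetical totals of failed and passed test cases

# Hypothetical spectrum: statement -> (failed executions, passed executions).
spectrum = {"s1": (2, 3), "s2": (2, 1), "s3": (1, 0), "s4": (0, 2), "s5": (1, 2)}


def jaccard(ef, ep):
    # ef / (ef + nf + ep), where ef + nf is the total number of failed tests.
    return Fraction(ef, FAILED + ep)


def tarantula(ef, ep):
    fail_ratio, pass_ratio = Fraction(ef, FAILED), Fraction(ep, PASSED)
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else Fraction(0)


def partition(risk, faulty):
    # Three disjoint sets: risk higher than, equal to, lower than the faulty one.
    rf = risk[faulty]
    higher = {s for s, r in risk.items() if r > rf}
    equal = {s for s, r in risk.items() if r == rf}
    lower = {s for s, r in risk.items() if r < rf}
    return higher, equal, lower


def compare(risk1, risk2, faulty):
    b1, f1, a1 = partition(risk1, faulty)
    b2, f2, a2 = partition(risk2, faulty)
    if (b1, f1, a1) == (b2, f2, a2):
        return "equivalent"
    if b1 <= b2 and a2 <= a1:  # assumed subset-based reading of "better"
        return "first is better"
    if b2 <= b1 and a1 <= a2:
        return "second is better"
    return "incomparable"


jaccard_risk = {s: jaccard(*counts) for s, counts in spectrum.items()}
tarantula_risk = {s: tarantula(*counts) for s, counts in spectrum.items()}
result = compare(jaccard_risk, tarantula_risk, faulty="s2")
print(result)
```

Exact fractions are used instead of floating-point numbers so that the equal-risk set is not distorted by rounding.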
Thesis type
Thesis (PhD)
Thesis note
A thesis submitted for the degree of Doctor of Philosophy, Swinburne University of Technology, 2012.