Acceptance testing based test case prioritization

Abstract

Software testing is an important and expensive phase of development. Whenever the code changes, executing all the test cases during regression testing becomes time-consuming. The testing process therefore needs test case reduction and prioritization techniques to improve regression testing. Test case prioritization aims at ordering test cases to increase the fault detection capability. Many existing reduction and prioritization techniques use coverage information, and their performance degrades because of the ties encountered during prioritization. This paper focuses on the multi-level random walk algorithm, which has been used for test case reduction. In this algorithm, test cases are selected randomly for further reduction on every iteration, which degrades the performance of the testing process in terms of coverage and also creates random test case ties. To overcome random test case selection and to handle test case ties, this paper proposes a solution that combines an optimized multi-level random walk with a genetic algorithm. Another important aspect of regression testing is test case prioritization, which finds faults as early as possible if test cases are prioritized properly. This paper therefore introduces new prioritization techniques based on fault prediction in acceptance testing. The performance of the proposed approach in terms of fault detection is evaluated on many programs.

Public Interest Statement

This research explores the power of test automation, including test case reduction and prioritization in regression testing. Reduction and prioritization of test cases have increased the efficiency of the regression testing process (Bach, 1996). This work also highlights a challenge in today's software automation: changing client requirements and changing technology. Additionally, the proposed approach enhances existing automated testing tools through test case selection and prioritization.

1. Introduction

Software testing is the expensive process of executing code with the intent of finding bugs in the software product. It also covers validation and verification, which check whether the product meets its technical and business requirements (Chaturvedi and Kulothungan, 2014). Regression testing includes two important parts: functional and non-functional testing. Regression testing is the part of software testing that retests the code after parts of the application have changed. The test cases are re-executed to check whether the previous functionality still works properly (Anand et al., 2013). Regression testing raises many issues in terms of resources, time, and cost. To reduce its cost, software testers introduced the concept of prioritization. Prioritization finds a useful and representative set of test cases, by some measure, to be run in the earlier phases of the regression testing process (Alian et al., 2016; Fraser et al., 2014; Gligoric et al., 2015).

The test case prioritization problem is defined as finding the sequence of test cases that achieves the highest fault detection value (Caprara et al., 2000; Wong et al., 1998).

Definition:

Given: $T$, a test suite; $DPT$, the set of different permutations of $T$; and $F$, a function from $DPT$ to the real numbers.

Problem:

Find $T' \in DPT$ such that

$\forall\, T'' \in DPT,\ F(T') \ge F(T'')$
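The definition can be read as an exhaustive search over permutations. The sketch below is only an illustration for a very small suite: the fault data in `FAULTS` and the scoring function `early_detection_score` are hypothetical stand-ins for $F$, not the measure used in this paper.

```python
from itertools import permutations

# Hypothetical data: the faults each test case detects.
FAULTS = {"t1": {"f1"}, "t2": {"f1", "f2", "f3"}, "t3": {"f4"}}


def early_detection_score(ordering):
    """Toy stand-in for F: faults revealed earlier contribute more."""
    n = len(ordering)
    seen = set()
    total = 0
    for position, test in enumerate(ordering):
        new_faults = FAULTS[test] - seen
        total += len(new_faults) * (n - position)
        seen |= new_faults
    return total


def best_ordering(test_suite, score):
    """Return T' in DPT with score(T') >= score(T'') for every T'' in DPT."""
    return max(permutations(test_suite), key=score)


best = best_ordering(["t1", "t2", "t3"], early_detection_score)
print(best)
```

Because the number of permutations grows factorially with the size of the suite, this exhaustive form is practical only for illustration, which is why heuristic and search-based techniques are used instead.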

Many existing reduction and prioritization techniques order tests using code coverage information gathered through instrumentation and execution of the code, which degrades the fault detection rate (Eghbali & Tahvildari, 2016; Di Nardo et al., 2013).

Some other iterative and greedy approaches iterate “n” times, where “n” is the number of test cases in the test suite. In each iteration, the approach selects one test and appends it to the ordering as the next item.
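A minimal sketch of such an n-iteration greedy ordering follows. The selection rule (pick the test that covers the most statements not yet covered) and the random tie-break are assumptions made for illustration; the function name `greedy_prioritize` and the sample data are hypothetical.

```python
import random


def greedy_prioritize(coverage, rng=None):
    """Build an ordering in n iterations, appending one test per iteration.

    coverage maps a test name to the set of statements it covers.
    Assumed rule: choose the test with the most not-yet-covered statements.
    """
    rng = rng or random.Random(0)
    remaining = dict(coverage)
    covered = set()
    ordering = []
    for _ in range(len(coverage)):
        gains = {t: len(stmts - covered) for t, stmts in remaining.items()}
        best_gain = max(gains.values())
        tied = sorted(t for t, g in gains.items() if g == best_gain)
        choice = rng.choice(tied)  # ties are broken at random
        ordering.append(choice)
        covered |= remaining.pop(choice)
    return ordering


order = greedy_prioritize({"t1": {1, 2, 3, 4}, "t2": {4, 5, 6}, "t3": {6}})
print(order)
```

In this sample no ties occur, so the result does not depend on the random generator. When several tests offer the same gain, the `rng.choice` line is where a random decision enters the ordering.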

Sepehr Eghbali and Ladan Tahvildari (Sebastian Elbaum et al., 2000) developed test case prioritization using lexicographical ordering to improve fault detection capability. They observed that most approaches use common coverage information from previously executed test cases and an iterative procedure to order the test cases, which makes ordering take more time. To avoid this problem, they proposed a new heuristic for breaking ties in coverage-based techniques. They first argue that acting randomly when coverage ties occur can degrade the performance of the AT algorithm (Duggal and Suri, 2008), and they use the proposed approach to break ties effectively. They proposed a basic algorithm using the lexicographical ordering of the commutative coverage vector, and the GetLO algorithm, which modifies the basic algorithm to reduce its time complexity.

Sultan H. Aljahdali et al. (Roongruangsuwan and Daengdej, 2005–2010) discussed the genetic algorithm (GA) and its features and limitations in software testing. They first discussed the elements of the GA: initial population, fitness value calculation, selection, crossover, mutation, and stop criteria. In the second part of the paper, they discussed and analyzed different GA-based approaches and the kind of coverage and fitness function used in each method. At the end of the paper, they listed limitations that occur in the following situations: (i) using control flow coverage, (ii) using a simple genetic operator, (iii) not considering some data types and multiple procedures, (iv) manually selecting a path, (v) randomly selecting the initial population, and (vi) using a solid fitness function. Finally, the authors stated that two parameters give higher fitness to inputs that are closer to satisfying the test requirement: controlled dependency and branch distance (Fraser et al., 2014; Yoo & Harman, 2007).

In this research, the GA module is used to find optimized test cases during the test case reduction process in order to handle test case ties. The procedure is described as follows:

Experimental set for GA

While (Termination not true)

Do Begin

Population initialization, Selection, Crossover, Mutation,

Replacement for next generation

End
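The loop above can be sketched as a small runnable program. This is an illustration under stated assumptions, not the configuration used in this paper: individuals are bit lists, termination is a fixed generation budget, selection is a binary tournament, crossover is single-point, and mutation is a bit flip. All parameter values and the names `run_ga` and `tournament` are hypothetical.

```python
import random


def tournament(population, fitness, rng):
    """Selection: return the fitter of two randomly drawn individuals."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def run_ga(fitness, genome_length, population_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Population initialization: random bit lists.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    best = max(population, key=fitness)[:]
    for _ in range(generations):  # termination: fixed generation budget
        offspring = []
        while len(offspring) < population_size:
            p1 = tournament(population, fitness, rng)
            p2 = tournament(population, fitness, rng)
            c1, c2 = p1[:], p2[:]
            # Crossover: single cut point, applied with probability crossover_rate.
            if genome_length > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, genome_length - 1)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            # Mutation: flip each bit with probability mutation_rate.
            for child in (c1, c2):
                for i in range(genome_length):
                    if rng.random() < mutation_rate:
                        child[i] ^= 1
                offspring.append(child)
        # Replacement for next generation; remember the best individual seen.
        population = offspring[:population_size]
        generation_best = max(population, key=fitness)
        if fitness(generation_best) > fitness(best):
            best = generation_best[:]
    return best


result = run_ga(sum, genome_length=8)
print(result, sum(result))
```

Here `sum` serves as a toy fitness that counts the 1 bits. In a test reduction setting the fitness would instead score a candidate selection of test cases.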

An extensive review of the existing literature was carried out to find its shortcomings. It shows that the existing reduction technique considered in this paper suffers from test case ties caused by the random selection of test cases on every iteration, which makes the overall test suite complex. Therefore, to improve the test suite and reduce its complexity while maintaining the coverage ratio, optimization techniques are introduced. The approach combines two optimization techniques: an optimized multi-level random walk and a genetic algorithm (GA). The multi-walk algorithm is a test suite reduction algorithm that finds local and global optima by random walk search and simplifies the original problem into a reduced test suite through the backbone and by removing shielded test cases. To improve the ordering of test cases and to reduce the prioritized test cases, a GA is used, which is a powerful and widely used stochastic search process. A GA is an evolutionary algorithm based on natural selection that finds approximate solutions to optimization and search problems. It aims to achieve better results through selection, crossover, and mutation (Chen and Lau, 1998; Solanki and Singh, 2014).

A multi-level random walk is a software test case reduction technique and is the focus area of this research. It tries to find an optimal and refined solution for the original problem instance (McMaster & Memon, 2007).

1.1. Model for test data generation for multiple path

The control flow graph (CFG) of a program is a directed graph $G = \{N, E, S, e\}$

where $N$ = set of nodes, $E$ = set of edges, $S$ = starting node, and $e$ = exit node of the graph.

Each node $m$ is a statement in the program.

Each edge $(m_i, m_j)$ indicates a transfer of control from node $m_i$ to node $m_j$.

A path is a sequence $PA = m_1, m_2, \ldots, m_n$ such that there exists an edge from node $m_i$ to node $m_{i+1}$,

where $i = 1, 2, \ldots, n-1$.

For larger applications the path sequence becomes long. To reduce the complexity, a path can be encoded as a string over $\{0, 1\}$:

$C_i = \begin{cases} 0, & PA \text{ does not include branch } i \\ 1, & PA \text{ includes branch } i \end{cases}$

Let the input vector be $V = (x_1, \ldots, x_n)$ and let the domain of $x_i$ be $D_i$.

The input domain of the program is $D = D_1 \times D_2 \times \cdots \times D_n$.

When the program accepts $V$ as input, the path it traverses is denoted by $PA(V)$.
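The 0/1 path encoding can be illustrated with a small sketch. The program under test (`trace_path`), its branch identifiers, and the function `encode_path` are hypothetical and serve only to show how PA(V) maps to a string.

```python
# Hypothetical branch identifiers of a toy program, in a fixed order.
BRANCHES = ["b1", "b2", "b3", "b4"]


def trace_path(v):
    """Toy program under test: return the branches executed for input v."""
    x, y = v
    taken = []
    taken.append("b1" if x > 0 else "b2")
    taken.append("b3" if y > x else "b4")
    return taken


def encode_path(path_branches, all_branches):
    """Return the string C with C_i = 1 if the path includes branch i, else 0."""
    taken = set(path_branches)
    return "".join("1" if b in taken else "0" for b in all_branches)


print(encode_path(trace_path((1, 5)), BRANCHES))  # input V = (1, 5)
print(encode_path(trace_path((1, 0)), BRANCHES))  # input V = (1, 0)
```

Two inputs that follow different branches produce different strings, so comparing strings is a compact way to compare paths.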

1.2. Objective function

The GA is applied to the test reduction problem. The objective function consists of two parts: the approach level (AL) and the branch distance (BD).

The approach level deals with how close execution comes to the conditional node that controls the testing object.

If $PA \ne PA(V)$, the approach level of input $V$ to a target path $PA$ is the number of string positions in which $PA$ and $PA(V)$ differ. Otherwise, $AL_{PA}(V) = 0$.

For example, for the conditional statement "If $C \le 10$", the branch distance of $V$ is defined as

$BD(V, C \le 10) = \begin{cases} 0, & \text{if } C(V) \le 10 \\ 10 + C(V), & \text{otherwise} \end{cases}$
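The two parts of the objective can be sketched as follows. The branch distance follows the formula as written in the text. The approach level is computed here as the number of differing positions between the two path strings, which is an assumed reading of the definition, and adding the two parts is likewise a simplifying assumption. All function names are hypothetical.

```python
def branch_distance_le_10(c_value):
    """BD(V, C <= 10) as stated in the text: 0 if satisfied, else 10 + C(V)."""
    return 0 if c_value <= 10 else 10 + c_value


def approach_level(target, actual):
    """Assumed reading of AL: count of positions where the path strings differ."""
    if len(target) != len(actual):
        raise ValueError("path strings must have the same length")
    return sum(a != b for a, b in zip(target, actual))


def objective(target, actual, c_value):
    """Combine the two parts; lower values mean the input is closer to the target."""
    return approach_level(target, actual) + branch_distance_le_10(c_value)


print(objective("1010", "1010", 5))   # on the target path, condition satisfied
print(objective("1010", "1001", 12))  # off the target path, condition violated
```

An input that follows the target path and satisfies the condition scores zero, so a search can treat the objective as a quantity to minimize.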

1.3.1. Model for genetic algorithm

The genetic algorithm is one of the most popular optimization algorithms and is based on natural genetics and the selection process. Before a GA is applied to a problem, the whole unit is divided into small units called genes.

The main steps of the genetic process are as follows.

1. Generate a population whose size P equals the number of test cases in the test suite.
2. Set up the termination criterion T.
3. Calculate the cyclomatic complexity to find the number of independent paths.
4. Calculate the crossover probability $C_P$.
5. Calculate the mutation probability $M_p$.
6. Generate the initial population

$G_j = \{g_{j1}, g_{j2}, \ldots, g_{jn}\}$

Suppose the target paths are $\{P_1, P_2, \ldots, P_n\}$.

Then generate the subpopulations randomly:

$N^{(1)}(P_i) = \{V_{i1}^{(1)}, V_{i2}^{(1)}, \ldots, V_{im}^{(1)}\}$

for $i = 1, 2, \ldots, n$, and calculate the value of $AL_{P_i}(V_{ij}^{(t)})$, where $t$ is the iteration.

7. Calculate the fitness value of each test case, that is, of each individual $V_j$:

$f(d_j) = \sum_{i=1}^{n} g_{ji}\, w_i / C(V_j)$

where $C(V_j)$ is the cost of $V_j$.

For a subpopulation,

$f(V_{ij}^{(t)}) = AL_{P_i}(V_{ij}^{(t)}) + BD_{P_i}(V_{ij}^{(t)})$

8. Apply the crossover and mutation operations based on the weightage of the statement or the cost of the module.

Algorithm 1.1 GA algorithm
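The fitness calculation in step 7 can be sketched as a weighted sum of genes divided by the cost of the test case. The reading of the genes as 0/1 coverage bits, the weights, the cost, and the function name `weighted_fitness` are assumptions made for illustration.

```python
def weighted_fitness(genes, weights, cost):
    """Return sum(g_i * w_i) / cost for one individual.

    genes: 0/1 values (assumed here to mark covered statements).
    weights: one weight per gene.
    cost: positive cost of executing the test case.
    """
    if len(genes) != len(weights):
        raise ValueError("genes and weights must have the same length")
    if cost <= 0:
        raise ValueError("cost must be positive")
    return sum(g * w for g, w in zip(genes, weights)) / cost


# Hypothetical individual: covers statements 1 and 3, costs 4 units to run.
print(weighted_fitness([1, 0, 1], [3, 2, 5], cost=4))
```

Dividing by the cost favors test cases that cover heavily weighted statements cheaply, which gives a basis other than random choice for deciding between two test cases with equal coverage.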

The remaining part of the paper is organized as follows. Section 2 describes the problem and the multi-walk algorithm for test case reduction. Section 3 introduces the enhanced multi-walk algorithm, which uses a GA to handle test case ties during the reduction process, and also presents a proposed model for test case prioritization. Section 4 presents a performance analysis against the existing approach. Section 5 describes the empirical studies. Section 6 describes the related work. Finally, Section 7 concludes the paper and presents some future research.

2. Problem description

A multi-level random walk is a software test case reduction technique and is the area studied in this research. It tries to find an optimal and refined solution for the original problem instance. However, the algorithm still has a few shortcomings. At every level a search is performed and test cases are selected at random, and this random selection increases the complexity of the entire test suite. Moreover, re-execution of test cases makes regression testing a time-consuming and expensive process. Further, the randomly selected test cases will at some point meet a test case tie, which affects the statement coverage ratio. To overcome these problems and make the overall test suite more effective, the preferred solution is to incorporate an optimization technique (GA) into the existing reduction and prioritization technique. Another important aspect of regression testing is test case prioritization, which finds faults as early as possible if test cases are prioritized properly. However, most existing prioritization techniques do not consider the effectiveness of acceptance techniques. For this reason, this paper introduces new prioritization techniques based on fault prediction in acceptance testing (Fraser & Arcuri, 2013; Girgis, 2005).

2.1. Multi-walk algorithm for test suite reduction

Test case reduction is important for regression testing because the number of test cases affects the cost of the regression testing process. The system therefore needs effective test cases from the original test suite to check whether the existing products are affected by the modified ones.

A multi-level random walk algorithm is used in this paper for test case reduction. The random walk algorithm is one of the most common algorithms for test case selection; it uses local optima and backbone test cases to simplify the original problem into smaller problems by removing the shielded test cases. At each level, a random walk is made and the intersection, or common part, is locked, discarding those test cases which are not locked or not shielded. However, the algorithm reduces the problem through random selection, which removes some effective test cases. To overcome this problem, the proposed approach uses the genetic and multi-walk algorithms together to optimize test cases instead of selecting them randomly (Akimoto et al., 2015; Watkins, 1995).

At this point, the solution obtained by the multi-level random walk is not well optimized because the statement coverage ratio is not maintained properly. An optimization algorithm can therefore be invoked together with it.

2.1.1. Initial coverage matrix

Table 1 shows the initial coverage matrix of all the test cases, the set of statements in the program, and their weights. Each entry at the intersection of a test case and a statement records coverage information: if the statement is executed by the test case, the entry is marked '1', otherwise '0'.
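A coverage matrix of this kind can be represented as one 0/1 row per test case. The statements, weights, and execution data below are hypothetical and do not reproduce the values in the paper's table.

```python
# Hypothetical program statements, their weights, and per-test execution data.
STATEMENTS = ["s1", "s2", "s3", "s4"]
WEIGHTS = {"s1": 2, "s2": 1, "s3": 3, "s4": 1}
EXECUTED = {"t1": {"s1", "s2"}, "t2": {"s2", "s3", "s4"}, "t3": {"s4"}}


def build_coverage_matrix(executed, statements):
    """Return {test: row}, where row[i] is 1 if the test executes statement i."""
    return {test: [1 if s in covered else 0 for s in statements]
            for test, covered in executed.items()}


def weighted_coverage(row, statements, weights):
    """Sum the weights of the statements marked 1 in a row."""
    return sum(bit * weights[s] for bit, s in zip(row, statements))


matrix = build_coverage_matrix(EXECUTED, STATEMENTS)
for test, row in matrix.items():
    print(test, row, weighted_coverage(row, STATEMENTS, WEIGHTS))
```

Keeping the weights separate from the 0/1 rows makes it possible to compare test cases either by plain statement coverage or by weighted coverage.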

Additional information

Funding

Notes on contributors

U Geetha

References
