Software Quality, Reliability and Security

Theoretical and empirical validation of software trustworthiness measure based on the decomposition of attributes

Pages 1181-1200 | Received 30 Jan 2022, Accepted 29 Mar 2022, Published online: 07 Apr 2022

Abstract

From the perspective of attribute decomposition, a variety of software trustworthiness metric models have been proposed. However, little attention has been paid to using more rigorous methods and to performing theoretical validation. Axiomatic approaches formalise the empirical understanding of software attributes by defining desired metric properties, and they offer precise terms for quantifying software attributes. We have previously utilised them to assess software trustworthiness on the basis of attribute decomposition, presented a set of expected properties, and constructed a software trustworthiness measure (STMBDA for short). In this paper, we extend the property set: we introduce two new properties, namely non-negativity and proportionality, and improve substitutivity and expectability. We verify the theoretical rationality of STMBDA by demonstrating that it conforms to the new property set, and its empirical validity by evaluating the trustworthiness of 23 spacecraft software. The validation results show that STMBDA can effectively assess spacecraft software trustworthiness and identify weaknesses in the development process.

1. Introduction

Software is increasingly ubiquitous in our everyday lives and plays an indispensable role in the functioning of our society. A system failure caused (directly or indirectly) by software operation can have very severe consequences, resulting not only in monetary, time, or property losses, but also in casualties (Wong et al., 2010, 2017). As a result, software trustworthiness has attracted widespread attention in recent years, and its measurement has become a hot topic among researchers (He et al., 2018; Maza & Megouas, 2021). Software trustworthiness can be represented through many attributes (Chen & Tao, 2019; Gupta et al., 2021; Steffen et al., 2006); in this paper, these attributes are referred to as trustworthy attributes. Trustworthy attributes often sit at levels that cannot be measured directly, so they are further decomposed into sub-attributes. Many software trustworthiness metric models based on attribute decomposition have been established. However, few studies have focused on more rigorous methods for measuring software trustworthiness and on theoretically validating these measures. Rigorously measuring software trustworthiness assists with its assessment and improvement. Theoretical validation is a necessary activity for defining meaningful metric models and a required step before their empirical validation (Srinivasan & Devi, 2014). Theoretical validation methods can be divided into two categories (Srinivasan & Devi, 2014): one is based on measurement theory (Briand, Emam, et al., 1996; Zuse & Bollmann-Sdorra, 1991), and the other relies on axiomatic approaches (Briand, Morasca, et al., 1996).
Axiomatic approaches, which formally describe the empirical understanding of software attributes by defining desired metric properties, have been used to measure internal attributes such as size, inheritance, and complexity (Meneely et al., 2012).

To make software trustworthiness measurement more rigorous, we once used axiomatic approaches to assess software trustworthiness on the basis of attributes. We extended this work to evaluate software trustworthiness from the viewpoint of attribute decomposition, gave the expected properties of software trustworthiness measurement under attribute decomposition, including monotonicity, acceleration, sensitivity, substitutivity and expectability, and established a software trustworthiness measure (STMBDA) (Tao et al., 2015). In this paper, we complete the property set by introducing two new properties, namely non-negativity and proportionality. The reasons for introducing them are as follows. The measurement result of software trustworthiness cannot be negative, and non-negativity describes this requirement. Software trustworthiness is the user's subjective recognition of the objective quality of software, so its quantification needs to truly reflect users' approval; for software to be trusted, individual attribute (sub-attribute) values should not be too low, and proportionality characterises this requirement. At the same time, substitutivity and expectability are improved. The improved substitutivity adds two constraints to better depict substitution: first, substitution between critical and non-critical attributes should be harder than substitution between critical attributes and also harder than substitution between non-critical attributes; second, substitution between sub-attributes should be harder than substitution between attributes. The improved expectability adds user expectations related to sub-attributes. We verify STMBDA's theoretical rationality by proving that it satisfies the new property set. Its empirical validity is verified by utilising it to measure the trustworthiness of 23 spacecraft software.
Compared with some established software trustworthiness metric models, STMBDA can better evaluate software trustworthiness.

The rest of the work is organised as follows. We present the related works in Section 2. The expected measurement properties from the viewpoint of attribute decomposition are described in Section 3, including the properties presented in Tao et al. (2015), the newly introduced properties and the improved properties. STMBDA is introduced in Section 4, where we also carry out its theoretical validation. We give the STMBDA-based measurement process in Section 5. We conduct an empirical validation of STMBDA on a real case in Section 6. The comparative study is presented in Section 7. We end the paper with conclusions and future work in the last section.

2. Related works

Typical models comprise uncertainty theory (Shi et al., 2008), evidence theory (Ding et al., 2012), machine learning (Devi et al., 2019; Lian & Tang, 2022; Medeiros et al., 2020; Xu et al., 2021), system testing (Muhammad et al., 2018), crowd wisdom (H. M. Wang, 2018), a social-to-software framework (Yang et al., 2018), the heuristic-systematic processing model (Gene & Tyler, 5384–5393), users' feedback (B. H. Wang et al., 2019), STRAM (Security, Trust, Resilience, and Agility Metrics) (Cho et al., 2019), a vulnerability loss speed index (Gul & Luo, 2019), input validation coding practices behaviour (Lemes et al., 2019), the trajectory matrix (Tian & Guo, 2020), trustworthy evidence of source code (Liu et al., 2021), axiomatic approaches (Tao & Chen, 2009, 2012; J. Wang et al., 2015), etc.

Next, we select some of the typical models mentioned above for a detailed introduction. Shi et al. (2008) first build a software dependability indicator system on the basis of existing research results and then give a software dependability metric method combining AHP with a fuzzy synthesised evaluation method. Ding et al. (2012) develop a software trustworthiness evaluation model based on evidential reasoning; they also introduce two discounting-factor estimation approaches to measure the reliability of the evaluation result. Medeiros et al. (2020) utilise machine learning algorithms to obtain knowledge related to vulnerabilities from software metrics extracted from the source code of some representative software projects. Devi et al. (2019) analyse the effects of class imbalance and class overlap in traditional learning models, and their results can be used to classify data sets of software trustworthiness assessment with class imbalance. Xu et al. (2021) propose a QoS prediction model in which neural networks and matrix factorisation are combined to perform non-linear collaborative filtering on the latent feature vectors of users and services. Lian and Tang (2022) give an API recommendation method based on the neural graph collaborative filtering technique; their experimental results show that it outperforms the most advanced methods in API recommendation. Falcone and Castelfranchi (2002) study the relationship between trust and control. Muhammad et al. (2018) give a software trustworthiness rating strategy, which utilises the system test execution completion score to measure software trustworthiness. Yang et al. (2018) present a Social-to-Software trustworthiness framework consisting of a generalised index loss, an ability trustworthiness measurement solution, a basic standard trustworthiness measurement solution and an identity trustworthiness measurement solution. B. H. Wang et al. (2019) establish an updating model of software component trustworthiness: a component's trustworthy degree is calculated according to users' feedback, the updating weight is obtained from the number of users, and the final trustworthy degree of the system is computed by the Euler distance. Xie et al. (2022) present an approach covering specifications of nominal behaviour and security, in which combined validation is utilised to verify the nominal behaviour of SysML models and Fault Tree Analysis is used for security analysis.

3. Properties of software trustworthiness measures based on attribute decomposition

Trustworthy attributes are divided into critical and non-critical attributes. Critical attributes are the trustworthy attributes that the software must have; the others are called non-critical attributes (Tao & Chen, 2009). Using the same notation as Tao and Chen (2009), denote the values of the critical attributes by $y_1,\ldots,y_m$ and the values of the non-critical attributes by $y_{m+1},\ldots,y_{m+s}$. Assuming that there are $n$ sub-attributes forming the trustworthy attributes, denote their values by $x_1,\ldots,x_n$. Let $y_i\ (1\le i\le m+s)$ be attribute measure functions of $x_1,\ldots,x_n$, and let $T$ be the software trustworthiness measure function of $y_i\ (1\le i\le m+s)$. The expected measure properties from the viewpoint of attribute decomposition are as follows. Among them, monotonicity, acceleration and sensitivity were proposed in Tao et al. (2015), non-negativity and proportionality are newly introduced, and substitutivity and expectability are improved.

(1) Non-negativity

  1. $0\le T$.

Non-negativity implies that the evaluation result of software trustworthiness is non-negative.

(2) Proportionality

  1. $(\exists c_1,c_2\in\mathbb{R}^+)\ c_1\le y_i/y_j\le c_2,\quad 1\le i,j\le m+s$,

  2. $(\exists c_3,c_4\in\mathbb{R}^+)\ c_3\le x_i/x_j\le c_4,\quad 1\le i,j\le n$.

Proportionality refers to the requirement that there should be an appropriate proportion between attributes (sub-attributes). For example, supposing that the critical attributes of a certain type of software consist of resilience and survivability, high-confidence software of this type requires both good resilience and high survivability; very good resilience with low survivability, or very high survivability with poor resilience, is not appropriate. Similar reasoning applies to the proportions between sub-attributes.

(3) Monotonicity (Tao et al., Citation2015)

  1. $\partial T/\partial y_i\ge 0,\quad 1\le i\le m+s$,

  2. $\partial y_i/\partial x_k\ge 0,\quad 1\le i\le m+s,\ 1\le k\le n$.

That is, an increase in an attribute value does not decrease the software's trustworthy degree, and an increase in a sub-attribute value does not lower the attribute value.

(4) Acceleration (Tao et al., Citation2015)

  1. $\partial^2 T/\partial y_i^2\le 0,\quad 1\le i\le m+s$,

  2. $\partial^2 y_i/\partial x_k^2\le 0,\quad 1\le i\le m+s,\ 1\le k\le n$.

Acceleration is used to characterise the rate of change of an attribute (sub-attribute). When only one attribute $y_i$ increases and the other attributes $y_j\ (j\ne i)$ remain unchanged, the marginal efficiency of attribute $y_i$ decreases. Similar explanations apply to the sub-attributes.

(5) Sensitivity (Tao et al., Citation2015)

  1. $0\le \dfrac{\partial T/T}{\partial y_i/y_i}=\dfrac{\partial T}{\partial y_i}\cdot\dfrac{y_i}{T}=f_1(y_i,w_i),\quad 1\le i\le m+s$,

  2. $0\le \dfrac{\partial y_i/y_i}{\partial x_k/x_k}=\dfrac{\partial y_i}{\partial x_k}\cdot\dfrac{x_k}{y_i}=f_2(x_k,w_{i,k}),\quad 1\le i\le m+s,\ 1\le k\le n$,

where $w_i$ is the $i$th attribute's weight, $w_{i,k}$ is the weight of the $k$th sub-attribute constituting the $i$th attribute, $f_1$ is a function of $y_i$ and $w_i$, and $f_2$ is a function of $x_k$ and $w_{i,k}$. Sensitivity represents the percentage change in software trustworthiness (attribute value) caused by a percentage change in an attribute value (sub-attribute value). These elasticities should be non-negative and associated with the attributes (sub-attributes) and their weights. Furthermore, software trustworthiness should be most sensitive to the smallest critical attribute relative to its weight: improving that attribute can greatly increase the software trustworthiness, so a percentage change in its value results in a relatively larger percentage change in the software trustworthiness.

(6) Substitutivity

  1. $(\exists c_5,c_6\in\mathbb{R}^+)\ c_5\le\sigma_{rt}\le c_6$,

where
$$\sigma_{rt}=\frac{d(y_r/y_t)}{d\left(\dfrac{\partial T/\partial y_t}{\partial T/\partial y_r}\right)}\cdot\frac{\dfrac{\partial T/\partial y_t}{\partial T/\partial y_r}}{y_r/y_t},\quad 1\le r,t\le m+s,\ r\ne t. \tag{1}$$
$\sigma_{rt}$ indicates the difficulty of substitution between attributes, satisfying $0\le\sigma_{rt}$. The smaller $\sigma_{rt}$, the harder it is to substitute between the $r$th and $t$th attributes.

  2. $(\exists c_7,c_8\in\mathbb{R}^+)\ c_7\le\sigma_{ikl}\le c_8$,

where
$$\sigma_{ikl}=\frac{d(x_k/x_l)}{d\left(\dfrac{\partial y_i/\partial x_l}{\partial y_i/\partial x_k}\right)}\cdot\frac{\dfrac{\partial y_i/\partial x_l}{\partial y_i/\partial x_k}}{x_k/x_l},\quad 1\le i\le m+s,\ 1\le k,l\le n,\ k\ne l. \tag{2}$$
$\sigma_{ikl}$ represents the difficulty of substitution between the sub-attributes that constitute the $i$th attribute. Similarly, $\sigma_{ikl}$ satisfies $0\le\sigma_{ikl}$; the smaller $\sigma_{ikl}$, the harder it is to substitute between the $k$th and $l$th sub-attributes.

Properties 1 and 2 indicate that the attributes can be substituted for each other to a certain extent, and so can the sub-attributes.

  3. $\sigma_{ij}\le\sigma_{rt},\quad 1\le i\le m,\ m+1\le j\le m+s,\ 1\le r,t\le m,\ i\ne j,\ r\ne t$,

     $\sigma_{ij}\le\sigma_{rt},\quad 1\le i\le m,\ m+1\le j\le m+s,\ m+1\le r,t\le m+s,\ i\ne j,\ r\ne t$.

  4. $\sigma_{ikl}\le\sigma_{rt},\quad 1\le i,r,t\le m+s,\ 1\le k,l\le n,\ r\ne t,\ k\ne l$.

Property 3 states that substitution between a critical and a non-critical attribute should be harder than substitution between critical attributes and also harder than substitution between non-critical attributes. Property 4 means that substitution between sub-attributes should be harder than substitution between attributes.
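For intuition, consider a short worked special case (an illustrative assumption, not part of the property set): for a two-input Cobb–Douglas aggregate, the definition in Equation (2) yields an elasticity of substitution of exactly 1.

```latex
% Illustrative special case: y = x_k^{a} x_l^{b} (Cobb--Douglas).
\frac{\partial y/\partial x_l}{\partial y/\partial x_k}
  = \frac{b}{a}\cdot\frac{x_k}{x_l},
\qquad\text{hence}\qquad
\sigma_{kl}
  = \frac{d(x_k/x_l)}{d\!\left(\tfrac{b}{a}\tfrac{x_k}{x_l}\right)}
    \cdot\frac{\tfrac{b}{a}\tfrac{x_k}{x_l}}{x_k/x_l}
  = \frac{a}{b}\cdot\frac{b}{a}
  = 1 .
```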

(7) Expectability

  1. $(x_0\le\min\{x_1,\ldots,x_n\})\Rightarrow(x_0\le y_i\le\max\{x_1,\ldots,x_n\})$,

  2. $(y_0\le\min\{y_1,\ldots,y_{m+s}\})\Rightarrow(y_0\le T\le\max\{y_1,\ldots,y_{m+s}\})$,

where $x_0$ and $y_0$ are the user's minimum expected values for sub-attributes and attributes, respectively. Expectability implies that if all sub-attributes (attributes) meet the user's expectations, then the attribute trustworthiness (software trustworthiness) should also achieve the user's expectations while remaining less than or equal to the maximum value of all sub-attributes (attributes).

4. A software trustworthiness measure based on the decomposition of attributes and its theoretical validation

In this section, we introduce the software trustworthiness measure on the basis of the attribute decomposition constructed in Tao et al. (Citation2015) and theoretically validate it by demonstrating that it conforms to the properties described in Section 3.

Definition 4.1

Software trustworthiness measure based on the decomposition of attributes (STMBDA for short) (Tao et al., Citation2015)

STMBDA is defined as follows:
$$\begin{cases}T=\left\{\alpha\left[\left(\dfrac{\min_{1\le i\le m}\{y_i\}}{10}\right)^{\epsilon}y_1^{\alpha_1}y_2^{\alpha_2}\cdots y_m^{\alpha_m}\right]^{-\rho}+\beta\left[y_{m+1}^{\beta_{m+1}}y_{m+2}^{\beta_{m+2}}\cdots y_{m+s}^{\beta_{m+s}}\right]^{-\rho}\right\}^{-\frac{1}{\rho}},\\[3mm] y_i=\left(\displaystyle\sum_{k=1}^{n}w_{i,k}x_k^{-\rho_i}\right)^{-\frac{1}{\rho_i}},\quad 1\le i\le m+s,\end{cases}$$
where

  1. $m$, $s$ and $n$ are the numbers of critical attributes, non-critical attributes and sub-attributes, respectively;

  2. $y_i\ (1\le i\le m+s)$ are the values of the attributes;

  3. $T$ is the software trustworthiness measure function of $y_1,\ldots,y_{m+s}$;

  4. $\alpha$ and $\beta$ represent the proportions of critical and non-critical attributes, such that $\alpha+\beta=1$ and $0\le\beta<0.5<\alpha\le 1$;

  5. $\alpha_i\ (1\le i\le m)$ are the weight values of the critical attributes, satisfying $\sum_{i=1}^{m}\alpha_i=1$ and $0\le\alpha_i\le 1$;

  6. $\beta_j\ (m+1\le j\le m+s)$ are the weight values of the non-critical attributes, with $\sum_{j=m+1}^{m+s}\beta_j=1$ and $0\le\beta_j\le 1$;

  7. $w_{i,k}\ (1\le i\le m+s,\ 1\le k\le n)$ are the weight values of the sub-attributes making up the $i$th attribute, with $\sum_{k=1}^{n}w_{i,k}=1$ and $0\le w_{i,k}\le 1$; if the $k$th sub-attribute is not one of the sub-attributes forming the $i$th attribute, set $w_{i,k}=0$;

  8. $\epsilon$ is utilised to control the effect of the smallest critical attribute on the software trustworthiness, with $0\le\epsilon\le\min\left\{1-\alpha_{\min},\ \dfrac{\ln y_0-\ln y_{\min}}{\ln y_{\min}-\ln 10}\right\}$, where $y_0$ is a value that all attributes must achieve and $\min$ denotes the index $i$ attaining $\min_{1\le i\le m}\{y_i\}$;

  9. $\rho$ is a parameter associated with the substitutivity between critical and non-critical attributes, with $0<\rho$;

  10. $\rho_i\ (1\le i\le m+s)$ are parameters related to the substitutivity between the sub-attributes that make up the $i$th attribute, satisfying $0<\rho\le\rho_i$;

  11. $x_k\ (1\le k\le n)$ are the values of the sub-attributes, with $1\le\max\{x_0,y_0\}\le x_k\le 10$, where $x_0$ is a value that all sub-attributes must reach.
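As a concrete illustration, the two-level aggregation of Definition 4.1 can be sketched in a few lines of Python. This is a minimal sketch under a CES (negative-exponent weighted power mean) reading of the definition; the function names and the sample parameter values in the usage note are illustrative, not prescribed by the paper.

```python
from math import prod

def attribute_value(x, w, rho_i):
    """y_i = (sum_k w_k * x_k**(-rho_i)) ** (-1/rho_i): a weighted power
    mean of the sub-attribute values x (each in [1, 10]) with weights w."""
    return sum(wk * xk ** (-rho_i) for wk, xk in zip(w, x)) ** (-1.0 / rho_i)

def stmbda(y_crit, alpha_w, y_non, beta_w, alpha, beta, eps, rho):
    """T from Definition 4.1: a CES combination of a min-penalised geometric
    mean of the critical attributes and a geometric mean of the non-critical
    attributes."""
    crit = (min(y_crit) / 10.0) ** eps * prod(y ** a for y, a in zip(y_crit, alpha_w))
    non = prod(y ** b for y, b in zip(y_non, beta_w))
    return (alpha * crit ** (-rho) + beta * non ** (-rho)) ** (-1.0 / rho)
```

For example, `stmbda([8, 9], [0.6, 0.4], [7, 8], [0.5, 0.5], 0.63, 0.37, 0.05, 6.0)` (made-up inputs) returns a value between 1 and 10 and no larger than the largest attribute value, in line with Propositions 4.2 and 4.8.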

For convenience, we denote the index $i$ attaining $\min_{1\le i\le m}\{y_i\}$ by $\min$, the index attaining $\max_{1\le i\le m}\{y_i\}$ by $\max$, the index attaining $\min_{m+1\le i\le m+s}\{y_i\}$ by $\min'$, and the index attaining $\max_{m+1\le i\le m+s}\{y_i\}$ by $\max'$, and let
$$y_{\min}=\min_{1\le i\le m}\{y_i\},\quad y_{\max}=\max_{1\le i\le m}\{y_i\},\quad y_{\min'}=\min_{m+1\le i\le m+s}\{y_i\},\quad y_{\max'}=\max_{m+1\le i\le m+s}\{y_i\},$$
$$a_1=\left[\left(\frac{y_{\min}}{10}\right)^{\epsilon}y_1^{\alpha_1}y_2^{\alpha_2}\cdots y_m^{\alpha_m}\right]^{-\rho},\qquad b_1=\left[y_{m+1}^{\beta_{m+1}}y_{m+2}^{\beta_{m+2}}\cdots y_{m+s}^{\beta_{m+s}}\right]^{-\rho}.$$
The proof of Proposition 4.2(2) has been given in Claim 1 of Tao et al. (2015). Here we only give the proof of Proposition 4.2(1).

Proposition 4.2

  1. $1\le x_0\le y_i\le\max_{1\le k\le n}\{x_k\}\le 10,\quad 1\le i\le m+s$.

  2. $T$ conforms to non-negativity and $1\le T\le 10$.

Proof.

(1) Because
$$\begin{cases}0<\rho_i, & 1\le i\le m+s,\\ 0\le w_{i,k}\le 1, & 1\le i\le m+s,\ 1\le k\le n,\\ 1\le x_k\le 10, & 1\le k\le n,\end{cases}$$
we have
$$\sum_{k=1}^{n}w_{i,k}\Bigl\{\max_{1\le k\le n}\{x_k\}\Bigr\}^{-\rho_i}\le\sum_{k=1}^{n}w_{i,k}x_k^{-\rho_i}\le\sum_{k=1}^{n}w_{i,k}\Bigl\{\min_{1\le k\le n}\{x_k\}\Bigr\}^{-\rho_i}.$$
Substituting $\sum_{k=1}^{n}w_{i,k}=1$ into the above inequality, we get
$$\Bigl\{\max_{1\le k\le n}\{x_k\}\Bigr\}^{-\rho_i}\le\sum_{k=1}^{n}w_{i,k}x_k^{-\rho_i}\le\Bigl\{\min_{1\le k\le n}\{x_k\}\Bigr\}^{-\rho_i}.$$
Raising to the $-\frac{1}{\rho_i}$ power (which reverses the inequalities), it follows that
$$\min_{1\le k\le n}\{x_k\}\le y_i=\Bigl\{\sum_{k=1}^{n}w_{i,k}x_k^{-\rho_i}\Bigr\}^{-\frac{1}{\rho_i}}\le\max_{1\le k\le n}\{x_k\}.$$
From the definition of STMBDA, we know that $1\le\max\{x_0,y_0\}\le x_k\le 10$ for $1\le k\le n$; then
$$\begin{cases}1\le x_0\le\min_{1\le k\le n}\{x_k\}\le y_i\le\max_{1\le k\le n}\{x_k\}\le 10, & 1\le i\le m+s,\\ 1\le y_0\le\min_{1\le k\le n}\{x_k\}\le y_i\le\max_{1\le k\le n}\{x_k\}\le 10, & 1\le i\le m+s.\end{cases}$$

Proposition 4.3

Proportionality holds for T.

Proof.

Since $1\le x_i\le 10$ for $1\le i\le n$, it follows that for $1\le i,j\le n$,
$$\frac{x_i}{10}\le\frac{x_i}{x_j}\le\frac{x_i}{1}\quad\Longrightarrow\quad\frac{1}{10}\le\frac{x_i}{x_j}\le\frac{10}{1}.$$
According to Proposition 4.2(1), we can obtain $1\le y_i\le 10$ for $1\le i\le m+s$; it follows that for $1\le i,j\le m+s$,
$$\frac{y_i}{10}\le\frac{y_i}{y_j}\le\frac{y_i}{1}\quad\Longrightarrow\quad\frac{1}{10}\le\frac{y_i}{y_j}\le\frac{10}{1}.$$
The proposition follows immediately from what we have proved.

Proposition 4.4

Monotonicity is satisfied by T (Tao et al., Citation2015).

Proposition 4.5

T complies with acceleration (Tao et al., Citation2015).

Proposition 4.6

Sensitivity holds for T (Tao et al., Citation2015).

The proofs of Propositions 4.4, 4.5 and 4.6 can be found in Tao et al. (2015).

Proposition 4.7

T meets substitutivity.

Proof.

By Equation (2), the substitutivity between the sub-attributes making up the $i$th ($1\le i\le m+s$) attribute can be determined as follows:
$$\sigma_{ikl}=\frac{1}{\rho_i+1},\quad 1\le i\le m+s,\ 1\le k,l\le n,\ k\ne l. \tag{3}$$
Similarly, through Equation (1), for the substitutivity between critical attributes it follows that
$$\sigma_{ij}=1,\quad 1\le i,j\le m,\ i\ne j, \tag{4}$$
for the substitutivity between non-critical attributes we can derive
$$\sigma_{ij}=1,\quad m+1\le i,j\le m+s,\ i\ne j, \tag{5}$$
and the substitutivity between critical and non-critical attributes can be obtained as
$$\sigma_{ij}=\begin{cases}\dfrac{1+\dfrac{y_i}{y_j}\dfrac{\partial T/\partial y_i}{\partial T/\partial y_j}}{(1+\rho\beta_i)+(1+\rho\alpha_j)\dfrac{y_i}{y_j}\dfrac{\partial T/\partial y_i}{\partial T/\partial y_j}}, & j\ne\min,\ 1\le j\le m,\ m+1\le i\le m+s,\\[6mm] \dfrac{1+\dfrac{y_i}{y_j}\dfrac{\partial T/\partial y_i}{\partial T/\partial y_j}}{(1+\rho\beta_i)+(1+\rho\epsilon+\rho\alpha_j)\dfrac{y_i}{y_j}\dfrac{\partial T/\partial y_i}{\partial T/\partial y_j}}, & j=\min,\ m+1\le i\le m+s.\end{cases}$$
Since for $j\ne\min$, $1\le j\le m$, $m+1\le i\le m+s$, we have $0\le\alpha_j,\beta_i\le 1$ and $0<\rho$, it holds that
$$(1+\rho\min\{\alpha_j,\beta_i\})\left(1+\frac{y_i}{y_j}\frac{\partial T/\partial y_i}{\partial T/\partial y_j}\right)\le(1+\rho\beta_i)+(1+\rho\alpha_j)\frac{y_i}{y_j}\frac{\partial T/\partial y_i}{\partial T/\partial y_j}\le(1+\rho\max\{\alpha_j,\beta_i\})\left(1+\frac{y_i}{y_j}\frac{\partial T/\partial y_i}{\partial T/\partial y_j}\right),$$
therefore
$$\frac{1}{1+\rho\max\{\alpha_j,\beta_i\}}\le\sigma_{ij}\le\frac{1}{1+\rho\min\{\alpha_j,\beta_i\}}.$$
Observe that for $j\ne\min$, $1\le j\le m$, $m+1\le i\le m+s$,
$$\frac{1}{1+\rho}\le\frac{1}{1+\rho\max\{\alpha_j,\beta_i\}},\qquad\frac{1}{1+\rho\min\{\alpha_j,\beta_i\}}\le 1,$$
and $0<\rho\le\rho_i$ for $1\le i\le m+s$; it follows that
$$\frac{1}{1+\rho_i}\le\sigma_{ij}\le 1,\quad j\ne\min,\ 1\le j\le m,\ m+1\le i\le m+s. \tag{6}$$
Similarly, for $j=\min$, $m+1\le i\le m+s$,
$$\frac{1}{1+\rho\max\{\epsilon+\alpha_j,\beta_i\}}\le\sigma_{ij}\le\frac{1}{1+\rho\min\{\epsilon+\alpha_j,\beta_i\}}.$$
Note that for $j=\min$, $m+1\le i\le m+s$,
$$\frac{1}{1+\rho}\le\frac{1}{1+\rho\max\{\epsilon+\alpha_j,\beta_i\}},\qquad\frac{1}{1+\rho\min\{\epsilon+\alpha_j,\beta_i\}}\le 1,$$
and $0<\rho\le\rho_i$ for $1\le i\le m+s$; it follows that
$$\frac{1}{1+\rho_i}\le\sigma_{ij}\le 1,\quad j=\min,\ m+1\le i\le m+s. \tag{7}$$
It can be seen from Equation (3) that the sub-attributes can be substituted for each other to a certain extent. According to Equations (4), (5), (6) and (7), the attributes can also be substituted for each other to a certain extent. By Equations (6) and (7), the substitutivity between critical and non-critical attributes is harder than both that between critical attributes and that between non-critical attributes, and the substitutivity between sub-attributes is harder than that between attributes.

In summary, the proposition is proved.

Proposition 4.8

T satisfies expectability.

Proof.

By the definition of STMBDA, we can obtain $x_0\le\min\{x_1,\ldots,x_n\}$. According to Proposition 4.2(1), it follows that $x_0\le y_i\le\max\{x_1,\ldots,x_n\}$. Since
$$\begin{cases}0\le\alpha_i\le 1, & 1\le i\le m,\\ \sum_{i=1}^{m}\alpha_i=1,\end{cases}$$
and from the proof of Proposition 4.2(1) we know that $1\le y_i\le 10$ for $1\le i\le m$, then
$$\frac{y_{\min}^{1+\epsilon}}{10^{\epsilon}}\le\left(\frac{y_{\min}}{10}\right)^{\epsilon}y_1^{\alpha_1}y_2^{\alpha_2}\cdots y_m^{\alpha_m}\le y_{\max}.$$
Likewise, we can prove
$$y_{\min'}\le y_{m+1}^{\beta_{m+1}}y_{m+2}^{\beta_{m+2}}\cdots y_{m+s}^{\beta_{m+s}}\le y_{\max'}.$$
Because of $0<\rho$ and $0\le\alpha,\beta\le 1$, it follows that
$$\alpha y_{\max}^{-\rho}+\beta y_{\max'}^{-\rho}\le\alpha a_1+\beta b_1\le\alpha\left(\frac{y_{\min}^{1+\epsilon}}{10^{\epsilon}}\right)^{-\rho}+\beta y_{\min'}^{-\rho}.$$
Due to $\alpha+\beta=1$, we obtain
$$\min\{y_{\max}^{-\rho},y_{\max'}^{-\rho}\}\le\alpha a_1+\beta b_1\le\max\left\{\left(\frac{y_{\min}^{1+\epsilon}}{10^{\epsilon}}\right)^{-\rho},y_{\min'}^{-\rho}\right\}.$$
So
$$\min\left\{\frac{y_{\min}^{1+\epsilon}}{10^{\epsilon}},y_{\min'}\right\}\le T\le\max\{y_{\max},y_{\max'}\}=\max\{y_1,\ldots,y_{m+s}\}.$$
Recall that we have proved $1\le y_0\le y_i\le 10$ for $1\le i\le m+s$; then $y_0\le\min\{y_1,\ldots,y_{m+s}\}$. Since
$$0\le\epsilon\le\min\left\{1-\alpha_{\min},\ \frac{\ln y_0-\ln y_{\min}}{\ln y_{\min}-\ln 10}\right\},$$
it follows that $y_0\le\min\left\{\frac{y_{\min}^{1+\epsilon}}{10^{\epsilon}},y_{\min'}\right\}$, and we can conclude that $y_0\le T\le\max\{y_1,\ldots,y_{m+s}\}$.

From the conclusions of Propositions 4.2, 4.3, 4.4, 4.5, 4.6, 4.7 and 4.8, the following theorem can be obtained.

Theorem 4.9

STMBDA conforms to all the seven properties introduced in Section 3.

5. Measurement process based on STMBDA

The STMBDA-based measurement process is given in Figure 1. For a given software, Step 1 determines the sub-attribute values $x_1,\ldots,x_n$, the sub-attribute weight values $w_{i,k}\ (1\le i\le m+s,\ 1\le k\le n)$ and the parameter values $\rho_1,\ldots,\rho_{m+s}$. Then the attribute values $y_i\ (1\le i\le m+s)$ are calculated by STMBDA in Step 2. Step 3 computes the critical attribute weight values $\alpha_1,\ldots,\alpha_m$, the non-critical attribute weight values $\beta_{m+1},\ldots,\beta_{m+s}$, and the parameter values $\rho$ and $\epsilon$. These weight values can be calculated using the method proposed in Tao and Chen (2012). In Step 4, STMBDA is utilised to obtain the degree of software trustworthiness from the results of Steps 2 and 3.

Figure 1. Measurement procedure based on STMBDA.


6. Empirical validation

Empirical validation comprises case study, survey and experiment (Fenton & Bieman, 2015). In this section, an empirical validation of STMBDA is conducted on a real case. The trustworthiness of spacecraft software is one of the crucial factors guaranteeing the success of a space mission; however, its current evaluation is only qualitative. To tighten the measurement of spacecraft software trustworthiness, with the assistance of STMBDA, their trustworthiness is assessed by utilising axiomatic approaches. The trustworthy attributes of spacecraft software comprise 9 attributes, subdivided into 28 sub-attributes (J. Wang et al., 2015). The attributes, sub-attributes and corresponding weight values are described in Table 1 (J. Wang et al., 2015).

Table 1. Trustworthy attributes, sub-attributes of spacecraft software and their weight values.

Following the weight values of the nine attributes, the first four attributes are regarded as critical attributes and the last five as non-critical attributes. The following parameter values for STMBDA are then derived from Table 1: $m=4$, $s=5$, $n=28$, $\alpha=0.20+0.17+0.15+0.11=0.63$, $\beta=0.09+0.09+0.09+0.05+0.05=0.37$, and
$$(\alpha_1,\alpha_2,\alpha_3,\alpha_4)=\left(\frac{0.20}{0.63},\frac{0.17}{0.63},\frac{0.15}{0.63},\frac{0.11}{0.63}\right)=(0.32,0.27,0.24,0.17),$$
$$(\beta_5,\beta_6,\beta_7,\beta_8,\beta_9)=\left(\frac{0.09}{0.37},\frac{0.09}{0.37},\frac{0.09}{0.37},\frac{0.05}{0.37},\frac{0.05}{0.37}\right)=(0.24,0.24,0.24,0.14,0.14).$$
If the $k$th sub-attribute is not in the set of sub-attributes constituting the $i$th attribute, then $w_{i,k}=0$; these zero weights are not given in the table.
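The renormalisation above can be checked in a few lines (a sketch; the weight values are the ones quoted from Table 1, and the variable names are illustrative):

```python
# Critical- and non-critical-attribute weights as quoted from Table 1.
crit_w = [0.20, 0.17, 0.15, 0.11]
non_w = [0.09, 0.09, 0.09, 0.05, 0.05]

alpha, beta = sum(crit_w), sum(non_w)            # 0.63 and 0.37
alpha_i = [round(w / alpha, 2) for w in crit_w]  # renormalised critical weights
beta_j = [round(w / beta, 2) for w in non_w]     # renormalised non-critical weights
```

Running this reproduces the vectors (0.32, 0.27, 0.24, 0.17) and (0.24, 0.24, 0.24, 0.14, 0.14) given above.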

The trustworthiness of each sub-attribute is classified into four levels: A, B, C and D. To calculate the attribute values based on STMBDA, the levels of the sub-attributes are converted to specific values: Level A is converted to 10, Level B to 9, Level C to 7, and Level D to 2. A panel of 10 experts was invited to grade the 28 sub-attributes of the 23 spacecraft software (J. Wang et al., 2015). Each expert assigns one of the four grades A, B, C and D by combining subjective and objective principles; for the specific classification standards, please refer to Section 7.2 of Chen and Tao (2019). We chose 11 representative software as subjects, numbered 2, 4, 6, 7, 9, 18, 19, 20, 21, 22 and 23. Figure 2 shows the distributions of the sub-attribute values of these 11 software; the horizontal axis of each subgraph displays the sub-attribute number, and the vertical axis shows the sub-attribute values.
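The grade-to-value conversion can be written directly (a trivial sketch; the mapping is exactly the one stated above, while the function name is illustrative):

```python
# Grade-to-value mapping stated in Section 6.
LEVEL_VALUE = {"A": 10, "B": 9, "C": 7, "D": 2}

def sub_attribute_values(grades):
    """Convert a sequence of expert grades (e.g. 'ABBC...') to numeric
    sub-attribute values in [1, 10]."""
    return [LEVEL_VALUE[g] for g in grades]
```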

Figure 2. Distributions of sub-attribute values of 11 representative software.


Given that the difficulty of substitution between the sub-attributes constituting critical attributes should be greater than that between the sub-attributes making up non-critical attributes, let $\rho_j<\rho_i\ (1\le i\le 4,\ 5\le j\le 9)$. For simplicity, set $\rho_1=\rho_2=\rho_3=\rho_4$ and $\rho_5=\rho_6=\rho_7=\rho_8=\rho_9$; the specific values used in this case are given in Figure 3. The distributions of the attribute values computed through STMBDA for these 11 software are presented in Figure 3. In each subgraph of Figure 3, the vertical axis represents the attribute number and the horizontal axis displays the attribute value. In order to compare with the model given in J. Wang et al. (2015) (referred to as PBSTM3), PBSTM3 is also used to measure the attribute trustworthiness of these 11 representative software. In PBSTM3, both the trustworthiness measurement model and the attribute measurement model take the form of a product of power functions. According to the sub-attribute weight values given in Table 1, the attribute value distributions calculated by the attribute measurement model in PBSTM3 are shown as the yellow line in Figure 3.

Figure 3. Distributions of attribute values of 11 representative software.


It can be seen from Figures 2 and 3 that the measurement results of attribute trustworthiness obtained by STMBDA reflect the actual development of spacecraft software. For instance, since such software is developed under the GJB5000A standard, software change control for these 11 software is generally good. Meanwhile, weaknesses in the software development process can be easily identified. For example, these 11 software generally lack special testing; the reason is that the testing and verification technology is not advanced enough for the design of dynamic timing, space, data use, control behaviour, etc. Furthermore, the attribute metric model presented herein is more universal than the attribute metric model built in J. Wang et al. (2015): we can adjust attribute trustworthiness through the parameters $\rho_i\ (1\le i\le 9)$ in STMBDA. If we have higher trustworthiness requirements for an attribute, we raise the values of $\rho_i\ (1\le i\le 9)$, and vice versa. However, the distributions of sub-attribute values in Figure 2 demonstrate that some grading criteria are too high to be achieved and some are so low that they are trivially met, so the grading standards of sub-attribute trustworthiness need to be improved in future applications.

For the given parameter values $(\rho_1,\rho_2,\rho_3,\rho_4,\rho_5,\rho_6,\rho_7,\rho_8,\rho_9)=(6,6,6,6,3,3,3,3,3)$, the distributions of the trustworthy degrees of these 11 software calculated through STMBDA are given in Figure 4. The yellow line in Figure 4 is obtained by the software trustworthiness metric model presented in J. Wang et al. (2015), using the weight values presented in Table 1. As can be seen from Figure 4, STMBDA does not greatly raise the trustworthy degree of a software because of the high value of an individual sub-attribute, but sub-attributes with lower values will reduce its trustworthy degree. That is, for the software to be trusted, each attribute must be trusted to a certain degree. Meanwhile, the software trustworthiness measurement model built in J. Wang et al. (2015) cannot reflect this situation well. In a similar way, we can adjust the software trustworthiness through the parameter $\rho$ as required: if we have a higher trustworthiness requirement for the software, we raise the value of $\rho$, and vice versa. At the same time, we can regulate the influence of the minimum critical attribute on software trustworthiness by modifying the parameter $\epsilon$.

Figure 4. Distributions of trustworthy degrees of 11 representative software.


This case demonstrates that STMBDA is suited to the measurement of spacecraft software trustworthiness: it can effectively assess their trustworthiness and accurately discover weaknesses in the development process, which is very important for improving the development level of this type of software.

7. Comparative study

In the remainder of this section, we compare STMBDA with PBSTM1 (Tao & Chen, 2009), PBSTM2 (Tao & Chen, 2012), PBSTM3 (J. Wang et al., 2015), an evidence theory-based software trustworthiness measure (ERBSTM) (Ding et al., 2012) and a fuzzy theory-based software trustworthiness measure (FTBSTM) (Shi et al., 2008) against the properties described in Section 3. The comparison results are given in Table 2, in which × indicates that the measure does not meet the corresponding property and √ indicates that it does.

Table 2. Comparative study by the properties introduced in Section 3.

It is easy to prove that all the metric models comply with non-negativity. PBSTM1, PBSTM2, PBSTM3, ERBSTM and FTBSTM do not take into account the issue of proportionality, so none of them meets it. Ding et al. (2012) and Shi et al. (2008) demonstrate that ERBSTM and FTBSTM, respectively, satisfy expectability. J. Wang et al. (2015) show that PBSTM3 conforms to monotonicity, acceleration, sensitivity, substitutivity and expectability.

Next, we give a counter-example showing that neither PBSTM1 nor PBSTM2 satisfies expectability. For a given software, suppose the number of critical attributes is $m=3$ and the number of non-critical attributes is $s=2$. The weight vector of the critical attributes is $(\alpha_1,\alpha_2,\alpha_3)=(0.5278,0.3325,0.1396)$, and that of the non-critical attributes is $(\beta_4,\beta_5)=(0.6667,0.3333)$. Let $\epsilon=0.01$ and $(y_1,y_2,y_3,y_4,y_5)=(8,8,7,8,8)$. Then the trustworthy degree of this software is 6.69 as calculated by PBSTM1 and 6.54 as computed by PBSTM2, both of which are less than the minimum value of all the trustworthy attributes. Thus neither PBSTM1 nor PBSTM2 complies with expectability.

The rest of the comparison results can be obtained from the comparative study of Tao and Zhao (Citation2018).

Finally, it can be concluded from Table 2 that STMBDA is superior to all five methods from the perspective of the properties introduced in Section 3.

8. Conclusion and future work

In this paper, we complete the set of expected properties of software trustworthiness measurement based on attribute decomposition: we give two new properties, namely non-negativity and proportionality, improve substitutivity and expectability, and revisit the measure STMBDA given in Tao et al. (2015). We verify the theoretical rationality of STMBDA by showing that it complies with the new property set, and its empirical validity by measuring the trustworthiness of 23 spacecraft software. We also present a comparative study, which shows that STMBDA outperforms the five other models in terms of the new property set. It should be noted that the expected property set is given only from the perspective of experience; it is reasonable but not complete. In addition, in the empirical validation we specify the parameter values $\epsilon$, $\rho$ and $\rho_i$ in advance and do not give specific algorithms for solving them.

Several issues are worthy of further investigation. Firstly, we believe that the software trustworthiness measure properties given in this paper are necessary, but not sufficient. We will be interested in extending and perfecting this set of properties. Secondly, attribute decomposition-based software trustworthiness measures that do not meet the properties described in Section 3 cannot be taken as legitimate measures; however, metrics satisfying these properties should be treated only as candidate metrics and still require further validation. We have applied STMBDA to spacecraft software trustworthiness measurement, and we will use STMBDA for trustworthiness measurement of other types of software to conduct a comprehensive empirical validation in the future. Thirdly, we do not give a way to calculate the parameter values ϵ, ρ and ρi in STMBDA, and how to determine these parameter values is also important future work.

Acknowledgments

A preliminary version of this work was presented at 20th IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C) [Decomposition of Attributes Oriented Software Trustworthiness Measure Based on Axiomatic Approaches (Tao et al., Citation2020)].

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was financially supported by the Doctoral Research Fund of Zhengzhou University of Light Industry [grant number 2016BSJJ037]; and the Science and Technology Project of Henan Province [grant number 202102210351], [grant number 212102210076].

References

  • Briand, L. C., Emam, K. E., & Morasca, S. (1996). On the application of measurement theory in software engineering. Empirical Software Engineering, 1(1), 61–88. https://doi.org/10.1007/BF00125812
  • Briand, L. C., Morasca, S., & Basili, R. V. (1996). Property-based software engineering measurement. IEEE Transactions on Software Engineering, 22(1), 68–86. https://doi.org/10.1109/32.481535
  • Chen, Y. X., & Tao, H. W. (2019). Software trustworthiness measurement evaluation and enhancement specifications. Science Press.
  • Cho, J. H., Xu, S. H., Hurley, P. M., Mackay, M., Benjamin, T., & Beaumont, M. (2019). STRAM: Measuring the trustworthiness of computer-based systems. ACM Computing Surveys, 51(6), 1–47. https://doi.org/10.1145/3277666
  • Devi, D., Biswas, S. K., & Purkayastha, B. (2019). Learning in presence of class imbalance and class overlapping by using one-class SVM and undersampling technique. Connection Science, 31(2), 105–142. https://doi.org/10.1080/09540091.2018.1560394
  • Ding, S., Yang, S. L., & Fu, C. (2012). A novel evidential reasoning based method for software trustworthiness evaluation under the uncertain and unreliable environment. Expert Systems with Applications, 39(3), 2700–2709. https://doi.org/10.1016/j.eswa.2011.08.127
  • Falcone, R., & Castelfranchi, C. (2002). Issues of trust and control on agent autonomy. Connection Science, 14(4), 249–263. https://doi.org/10.1080/0954009021000068763
  • Fenton, N., & Bieman, J. (2015). Software metrics: A rigorous and practical approach (3rd ed.). CRC Press.
  • Gene, M. A., & Tyler, J. R. (2018, January 3–6). Trustworthiness perceptions of computer code a heuristic-systematic processing model. Proceedings of the 51st Hawaii International Conference on System Sciences, Hilton Waikoloa Village, Hawaii (pp. 5384–5393).
  • Gul, J., & Luo, P. (2019, August 5–8). A unified measurable software trustworthy model based on vulnerability loss speed index. Proceedings of 18th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/13th IEEE International Conference on Big Data Science and Engineering, Rotorua, New Zealand (pp. 18–25).
  • Gupta, S., Aroraa, H. D., Naithania, A., & Chandrab, A. (2021). Reliability assessment of the planning and perception software competencies of self-driving cars. International Journal of Performability Engineering, 17(9), 779–786. https://doi.org/10.23940/ijpe.21.09.p4.779786
  • He, J. F., Shan, Z. G., Wang, J., Pu, G. G., Fang, Y. F., Liu, K., Zhao, R. Z., & Zhang, Z. T. (2018). Review of the achievements of major research plan of trustworthy software. Bulletin of National Natural Science Foundation of China, 32(3), 291–296. https://doi.org/10.16262/j.cnki.1000-8217.2018.03.009
  • Lemes, C. I., Naessens, V., & Vieira, M. (2019, October 28–31). Trustworthiness assessment of web applications: Approach and experimental study using input validation coding practices. Proceedings of IEEE 30th International Symposium on Software Reliability Engineering (ISSRE), Berlin, Germany (pp. 435–445).
  • Lian, S. X., & Tang, M. D. (2022). API recommendation for Mashup creation based on neural graph collaborative filtering. Connection Science, 34(1), 124–138. https://doi.org/10.1080/09540091.2021.1974819
  • Liu, H., Tao, H. W., & Chen, Y. X. (2021). An approach for trustworthy evidence of source code oriented aerospace software trustworthiness measurement. Aerospace Control and Application, 47(2), 32–41. https://doi.org/10.3969/j.issn.1674-1579
  • Maza, S., & Megouas, O. (2021). Framework for trustworthiness in software development. International Journal of Performability Engineering, 17(2), 241–252. https://doi.org/10.23940/ijpe.21.02.p8.241252
  • Medeiros, N., Ivaki, N., Costa, P., & Vieira, M. (2020). Vulnerable code detection using software metrics and machine learning. IEEE Access, 8, 219174–219198. https://doi.org/10.1109/Access.6287639
  • Meneely, A., Smith, B., & Williams, L. (2012). Validating software metrics: A spectrum of philosophies. ACM Transactions on Software Engineering and Methodology, 21(4), 1–28. https://doi.org/10.1145/2377656.2377661
  • Muhammad, D. M. S., Fairul, R. F., Loo, F. A., Nur, F. A., & Norzamzarini, B. (2018). Rating of software trustworthiness via scoring of system testing results. International Journal of Digital Enterprise Technology, 1(1/2), 121–134. https://doi.org/10.1504/IJDET.2018.092637
  • Shi, H. L., Ma, J., & Zou, F. Y. (2008, August 29–September 2). Software dependability evaluation model based on fuzzy theory. Proceedings of International Conference on Computer Science & Information Technology, Singapore (pp. 102–106).
  • Srinivasan, K. P., & Devi, T. (2014). Software metrics validation methodologies in software engineering. International Journal of Software Engineering & Applications, 5(6), 87–102. https://doi.org/10.5121/ijsea
  • Steffen, B., Wilhelm, H., Alexandra, P., Becker, S., Boskovic, M., Dhama, A., Hasselbring, W., Koziolek, H., Lipskoch, H., Meyer, R., Muhle, M., Paul, A., Ploski, J., Rohr, M., Swaminathan, M., Warns, T., & Winteler, D. (2006). Trustworthy software systems: A discussion of basic concepts and terminology. ACM SIGSOFT Software Engineering Notes, 31(6), 1–18. https://doi.org/10.1145/1218776.1218781
  • Tao, H. W., & Chen, Y. X. (2009, September 15–18). A metric model for trustworthiness of softwares. Proceedings of the 2009 IEEE/WIC/ACM International Conference on Web Intelligence and International Conference on Intelligent Agent Technology, Milan, Italy (pp. 69–72).
  • Tao, H. W., & Chen, Y. X. (2012). A new metric model for trustworthiness of softwares. Telecommunication Systems, 51(2-3), 95–105. https://doi.org/10.1007/s11235-011-9420-9
  • Tao, H. W., Chen, Y. X., & Pang, J. M. (2015, May 18). A software trustworthiness measure based on the decompositions of trustworthy attributes and its validation. Proceedings of Industrial Engineering, Management Science and Applications, Tokyo, Japan (pp. 981–990).
  • Tao, H. W., Chen, Y. X., & Wu, H. Y. (2020, December 11–12). Decomposition of attributes oriented software trustworthiness measure based on axiomatic approaches. Proceedings of the 20th IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C), Macau, China (pp. 308–315).
  • Tao, H. W., & Zhao, J. (2018). Source codes oriented software trustworthiness measure based on validation. Mathematical Problems in Engineering, 2018(3), 1–10. https://doi.org/10.1155/2018/6982821
  • Tian, J. F., & Guo, Y. H. (2020). Software trustworthiness evaluation model based on a behaviour trajectory matrix. Information and Software Technology, 119(1), 106233. https://doi.org/10.1016/j.infsof.2019.106233
  • Wang, B. H., Chen, Y. X., Zhang, S., & Wu, H. Y. (2019). Updating model of software component trustworthiness based on users feedback. IEEE Access, 7, 60199–60205. https://doi.org/10.1109/Access.6287639
  • Wang, H. M. (2018). Harnessing the crowd wisdom for software trustworthiness: Practices in China. Software Engineering Notes, 43(1), 6–11. https://doi.org/10.1145/3178315.3178328
  • Wang, J., Chen, Y. X., Gu, B., Guo, X. Y., Wang, B. H., Jin, S. Y., Xu, J., & Zhang, J. Y. (2015). An approach to measuring and grading software trust for spacecraft software. Scientia Sinica Techologica, 45(2), 221–228. https://doi.org/10.1360/N092014-00479
  • Wong, W. E., Debroy, V., Surampudi, A., Kim, H., & Siok, M. F. (2010, June 9–11). Recent catastrophic accidents: Investigating how software was responsible. Proceedings of the 4th IEEE International Conference on Secure Software Integration and Reliability Improvement (SSIRI), Singapore (pp. 14–22).
  • Wong, W. E., Li, X. L., & Laplante, P. A. (2017). Be more familiar with our enemies and pave the way forward: A review of the roles bugs played in software failures. Journal of Systems and Software, 133(2/3), 68–94. https://doi.org/10.1016/j.jss.2017.06.069
  • Xie, J., Tan, W. A., Yang, Z. B., Li, S. M., Xing, L. Q., & Huang, Z. Q. (2022). SysML-based compositional verification and safety analysis for safety-critical cyber-physical systems. Connection Science, 34(1), 911–941. https://doi.org/10.1080/09540091.2021.2017853
  • Xu, J. L., Xiao, L. J., Li, Y. H., Huang, M. W., Zhuang, Z. C., Weng, T. H., & Liang, W. (2021). NFMF: Neural fusion matrix factorisation for QoS prediction in service selection. Connection Science, 33(3), 753–768. https://doi.org/10.1080/09540091.2021.1889975
  • Yang, X., Jabeen, G., Luo, P., Zhu, X. L., & Liu, M. H. (2018). A unified measurement solution of software trustworthiness based on social-to-software framework. Journal of Computer Science and Technology, 33(3), 603–620. https://doi.org/10.1007/s11390-018-1843-2
  • Zuse, H., & Bollmann-Sdorra, P. (1991, May 5). Measurement theory and software measures. Proceedings of the BCS-FACS Workshop on Formal Aspects of Measurement, South Bank University, London (pp. 219–259).