
Attack Model for Generic Intelligent Systems

Abstract

Intelligent systems, machine learning models, and artificial intelligence methods are increasing in complexity, and the integration of such technologies has also increased. However, the rise in the adoption of intelligent systems has raised many challenges, including increased attack surfaces, which have resulted in a greater number of cyber threats. Thus, we investigate and analyze potential threats against intelligent systems and their components, and we identify attacks that can occur against the data and related processes used in intelligent systems. The resultant attack model is shown to contain threats that can affect the confidentiality, integrity, and availability of intelligent systems.

Introduction

Since the birth of artificial intelligence in 1956 and its emergence thirty years later along with the growth of the Internet, the number of technological advances, Internet users, digital services, and applications has grown exponentially. Some of the primary objectives of emerging technological innovations are to provide convenience and solve existing social and technological problems. One field that has received increasing attention over the years is intelligent systems and their components, e.g., machine learning models and artificial intelligence methods.

An intelligent system or device employs advanced technologies to perceive and respond to surrounding data sources and the environment. An intelligent system can come in various shapes and forms, e.g., robots or software applications. The data input to intelligent systems can also be from various sources, including video cameras, sensors, and data from databases or social media. Once the data are input to the system, they are processed using complex computational methods, i.e., machine learning, artificial intelligence, and deep learning, and the system outputs useful information for predicting events or making decisions. Examples of intelligent system usage include factory automation (Okeme et al., Citation2021), assistive robots (Khamis et al., Citation2021), military applications (Bistron & Piotrowski, Citation2021), education (Okacha et al., Citation2021), and medical care, with the most recent medical applications being related to COVID-19 (de Freitas Barbosa et al., Citation2022).

Even though intelligent systems are beneficial and applicable in many domains, several challenges need to be considered, e.g., uncertainty of the environment, volatility and unpredictability of the world in general, time-consuming computation, information loss, and, more importantly, security. In this paper, we focus on security issues related to intelligent systems. Intelligent systems are computer systems; thus, the various processes in such systems, e.g., data collection, data input, data processing, and output, are susceptible to attacks.

From a security perspective, there are three main characteristics that must be considered. First, data confidentiality ensures that private data are protected. The second characteristic is data integrity, which means that data or information perceived, processed, and output by intelligent systems must be unchanged by an unauthorized entity. Third, data and system availability are related to the ability of authorized entities to access data, information, systems, or networks.

There are many security challenges that come with intelligent systems, e.g., a lack of threat intelligence gathering, a lack of cyber risk management, and a lack of security controls. Examples of attacks on intelligent systems include using adversarial machine learning to attack an image recognition algorithm (Shen et al., Citation2019), and bias against minorities in a company’s recruiting system (Dastin, Citation2018). These examples indicate that attacks on intelligent systems are possible; thus, attack models and threat analysis methods are required.

Existing attack models, e.g., the MITRE ATT&CK Matrix for Enterprise (Xiong et al., Citation2022), describe what adversaries could do to compromise an organization’s systems and networks. However, they place limited focus on the perspective of intelligent systems, which still require adequate analysis to ensure that their security can be improved.

The primary contributions of this paper are summarized as follows. First, we analyze vulnerabilities in intelligent systems. Second, we propose an attack model for potential cyber threats against intelligent systems. The goal of the proposed attack model is to identify the types of attacks that can occur against intelligent systems at different processing stages and different points in the architectures of such systems. The resultant attack model shows that there are multiple threats that can affect confidentiality, integrity, and availability of data and intelligent systems.

The remainder of this paper is structured as follows. Basic intelligent system architecture and existing attack modeling methods are presented in the Related Work section. The generation of an intelligent system attack model is explained in the Methodology section. The analysis of potential threats and attacks, and an attack model are presented in the Proposed Attack Model section. Finally, the Discussion and Conclusion section discusses and summarizes the paper.

Related work

This section provides background knowledge about the basic architecture of an intelligent system and attack modeling methods.

Intelligent system architecture

An intelligent system can be defined as a system that works with other components or entities to perceive and process its environment. These processes can be performed following specified rules, and, apart from using data for processing, intelligent systems can also use these rules to learn more about the data. The processes and behaviors of intelligent systems are based on two principles, i.e., they must act and make rational decisions, and they must follow the standards accepted by the community.

By definition, intelligent systems (Gregor & Benbasat, Citation1999) appear to be “smart” because they can process data to learn independently; however, to apply intelligent systems in the real world, at least three interfaces are required (Molina, Citation2022). The first interface is the process interface, which deals with data collection from various external sources, e.g., sensors or user input. The process interface also involves controlling devices based on the data input to the system. The second interface is the software interface, which is used to send and receive data between the different components of an intelligent system. For example, data exchange could be realized using functions within the software itself, between the system and a database, or between a data collection component and artificial intelligence software. The third mandatory interface is the human–machine interface, which supports communication between users and an intelligent system, presenting output on a display device in forms humans can understand, such as textual, tabular, or diagrammatic presentations.

Before introducing the intelligent system architecture, it is necessary to demonstrate how “intelligence” is realized over five steps, i.e., data collection, information extraction (from collected data), knowledge generation (from the extracted information), learning (from the generated knowledge), and intelligence creation (Figure 1).

Figure 1. Steps toward “intelligence.”

Here, the data collection step represents collecting raw symbols, e.g., text, numbers, or mathematical symbols, from various sources and methods, including sensors, users, or social networks. However, by themselves, these symbols do not contain useful meaning. Thus, in the information extraction step, the data are turned into information. By processing the data, information can be used to help answer basic who, what, where, and when questions. The third step is knowledge generation, which represents the ability to apply the collected data and obtained information to answer a given how question. In other words, the process does not stop at turning data into information, i.e., the data and information are used to generate knowledge that can be applied to answer specific questions and solve specific problems. In addition, gaining knowledge helps explain why things happen the way they do. In the learning step, a system or device obtains knowledge, behaviors, or skills, including the ability to analyze and synthesize information that can be used to solve specific problems. Finally, intelligence is realized. By going through the previous four steps and by bringing together the data, information, knowledge, and learning, we realize the ability to specify and define problems. Thus, we can then analyze, synthesize, design, and develop new methods to answer specific questions and solve specific problems.

These five steps provide the foundation for the basic architecture of intelligent systems (Figure 2).

Figure 2. Basic architecture of intelligent systems.

As shown in Figure 2, the first component of the architecture is sensing, which is where data are collected. For example, various types of data can be collected using computer vision technologies to acquire images. In addition, data can be collected from Internet of Things devices and smartphones, and by scraping social media. However, at this stage, the data from the original sources may not be in a form that can be used directly in the processing stage. Thus, the data must be passed to the second component of the architecture, i.e., data digitization, which is when the data are transformed into a digital format. In addition, data from various sources can be converted into the same digital format and stored in the same place for future use. This leads to the third component, i.e., data storage. Data are typically stored in a database. The database in the intelligent system architecture is used to store data collected during the sensing process and data that have already been processed in other parts of the architecture.

The next part of the intelligent system architecture is responsible for computing or processing the data, which can be performed in two ways. First, the data can be computed or processed using a single standalone device, e.g., a computer or a smart device. Second, the data can be computed or processed using a cloud computing environment. Note that outputs from the computing stage vary depending on the requirements and objectives of each intelligent system. For example, the outputs can be statistical analyses or a generated analysis model, including the use of artificial intelligence, machine learning, and deep learning technologies.

Ultimately, the output of the computing component will comprise a learning process and a knowledge building process, which in turn provide intelligence. One example of the output includes pattern recognition for specifying objects, animals, or humans. Another example is the evaluation or prediction of events.

In addition to proposing the architecture in Figure 2, we studied other characterizations of intelligent systems (Molina, Citation2022) to ensure that the basic architecture used in our research is indeed a generalization of intelligent systems. Molina (Citation2022) characterizes intelligent systems as agent-based systems, which can either interact directly with the environment or interact with other agents. The intelligent systems that interact with the environment consist of three major components: the part that senses the environment, the part that acts on the environment, and the part that allows humans to interact with the intelligent system. The intelligent systems that interact with other agents consist of the same components. The difference, however, is that this type of intelligent system can act either as a delegate or as an advisor. On the one hand, an intelligent system plays a delegate role when a user gives the system a task to perform, and the system makes its own decisions on how to do it without any intervention from the user. On the other hand, an intelligent system plays an advisory role when the system provides useful information to the user and suggests what action to take.

In addition to the three components, there is another component known as the learning component, which is a part commonly associated with intelligent behaviors (Molina, Citation2022). This is also an approach used by Mitchell (Citation1997) who formulated a definition of learning as: “a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks T, as measured by P, improves with experience E” (Mitchell, Citation1997).

From the literature review, other researchers (Molina, Citation2022; Wooldridge & Jennings, Citation1995) view intelligent systems in a context similar to our proposed basic intelligent system architecture in Figure 2. Generally, the basic architecture of an intelligent system comprises data collection processes, databases, computing and/or processing components, knowledge generation, and intelligence creation, and this basic architecture can be applied to design and develop any required application. This ensures that the proposed attack model can cover different types of intelligent systems.

Attack modeling

Attack modeling, also referred to as threat modeling, is a process to analyze potential risks, threats, and attacks (Uzunov & Fernandez, Citation2014). Attack modeling is generally thought of as an approach that can help secure software or systems in the design phase by examining potential vulnerabilities and how an adversary may attack the target system (Bedi et al., Citation2013).

Based on a previous literature review (Wenjun & Lagerstrom, Citation2019), attack modeling methods can be classified into several groups: manual modeling (Al-Fedaghi & Moein, Citation2014), automatic modeling (Frydman et al., Citation2014), formal modeling (Frydman et al., Citation2014), and graphical modeling (Al-Fedaghi & Moein, Citation2014; Meszaros & Buchalcevova, Citation2017), with STRIDE (Kohnfelder & Garg, Citation1999), the Process for Attack Simulation and Threat Analysis (PASTA) (Wolf et al., Citation2021), LINDDUN (Wuyts et al., Citation2020), and the MITRE ATT&CK Matrix (Xiong et al., Citation2022) being more popular than others. Note that STRIDE, PASTA, and LINDDUN belong to the manual, automatic, and formal modeling categories, while the MITRE ATT&CK Matrix falls into the manual, formal, and graphical categories (Wenjun & Lagerstrom, Citation2019).

STRIDE was originally introduced to analyze potential attacks against Microsoft products. The authors (Kohnfelder & Garg, Citation1999) divide potential threats into six groups according to the security characteristic each threat seeks to damage. These six groups of threats comprise (1) spoofing identity, which breaks authentication, (2) tampering with data, which causes the data to lose integrity, (3) repudiation, which is denying one’s actions, (4) information disclosure, which damages confidentiality, (5) denial of service, which disrupts a system’s availability, and (6) elevation of privilege, which grants authorization to an unauthorized entity.
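
As a compact illustration of this taxonomy, the six categories can be encoded as a lookup from each threat to the security property it undermines. The following Python sketch is our own illustrative encoding, not part of the original STRIDE publication.

```python
# A minimal sketch: encode the six STRIDE threat categories as a lookup
# from each threat to the security property it undermines. The mapping
# follows the enumeration above; the function name is illustrative.
STRIDE = {
    "Spoofing identity": "authentication",
    "Tampering with data": "integrity",
    "Repudiation": "non-repudiation",
    "Information disclosure": "confidentiality",
    "Denial of service": "availability",
    "Elevation of privilege": "authorization",
}

def violated_property(threat: str) -> str:
    """Return the security property that a given STRIDE threat undermines."""
    return STRIDE[threat]

for threat, prop in STRIDE.items():
    print(f"{threat} -> violates {prop}")
```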

PASTA (UcedaVélez, Citation2021) takes a different approach: whereas STRIDE enumerates possible attacks, PASTA models potential attacks and their impacts according to business processes and objectives in a seven-step process, described as follows.

  1. Specifying business objectives. Threats and their impacts on these objectives are considered in this step.

  2. Specifying the technical scope. Weaknesses and types of threats are studied in this step.

  3. Specifying the system’s structure and the relationships among its components. In this step, the trust model of transactions is established.

  4. Threat analysis. Potential threats are identified in this step.

  5. Vulnerability identification. In this step, weaknesses in the system architecture and in the system analysis and design stage are identified.

  6. Attack modeling. Here, attack simulations are performed to confirm that the identified attacks are feasible.

  7. Risk analysis. In this step, the impacts on business are analyzed if attacks occur.

The LINDDUN attack modeling approach was developed to support the design of secure systems. LINDDUN (Deng et al., Citation2011; Wuyts et al., Citation2020) integrates privacy and security into the system, its work processes, and its security policy. LINDDUN specifies seven categories of threats that can be exploited by an adversary.

  1. Linkability. Here, an adversary can link two items of interest together, even without knowing the subject’s identity.

  2. Identifiability. This is when an adversary is able to identify the subject associated with the data items they are interested in.

  3. Nonrepudiation. Here, a party is unable to deny their actions, which becomes a privacy threat when plausible deniability is required.

  4. Detectability. This is similar to identifiability. Here, an adversary can detect that the parts of the system or data they are interested in exist, even without access to their content.

  5. Disclosure of information. Here, an attacker can access data they are not authorized to access.

  6. Unawareness. Here, the owner of the data or system has no idea their data are being shared and accessed by unauthorized entities.

  7. Noncompliance. This means that the methods used to store and process the system’s data do not follow any standard or policy.

The LINDDUN attack modeling framework involves three main steps, i.e., (1) system modeling, (2) threat identification, analysis, and documentation, and (3) threat prioritization and mitigation.

For enterprise systems, the MITRE ATT&CK matrices are collections of attacker tactics and techniques derived from real-world observations (ATT&CK, Citation2022). The attacks in the MITRE ATT&CK enterprise matrix have been executed against many popular platforms, e.g., Windows, Linux, macOS, and cloud and container platforms. The attacks in the MITRE ATT&CK mobile matrix are related to attack tactics and techniques against iOS and Android systems. MITRE ATT&CK also has a collection of attacks against industrial control systems. Although presented separately, these three matrices contain tactics that adversaries can use on different platforms. Thus, these matrices can be used as a foundation for the analysis and identification of threats to a system being designed and developed.

The MITRE ATT&CK enterprise and mobile matrices contain 14 tactics that adversaries can exploit to attack these platforms, and the industrial control system matrix contains 12 tactics (ATT&CK, Citation2022). Examples of these tactics include initial access, execution, privilege escalation, credential access, defense evasion, data collection, and exfiltration.

Even though the above attack modeling methods provide generic guidance on how to analyze and specify potential threats to a system, they do not specifically focus on modeling threats for intelligent systems. However, previous studies have examined potential attacks on parts of intelligent systems, i.e., machine learning components, which are present in the computing component of the intelligent system architecture.

For example, Microsoft engineers in the AI Working Group described a methodology to securely develop a system that involves the use of artificial intelligence and machine learning (Marshall et al., Citation2022). Their objective was to employ this technique as a supplement to their existing security development lifecycle. This work was divided into two main parts. The first part focuses on questions that should be asked when analyzing threats to artificial intelligence and machine learning systems. The second part provides details about how to mitigate specific attacks. Some of the threats identified and explained in their study included data poisoning, which is when an attacker uses irrelevant data to fool the learning and computing process of the artificial intelligence and machine learning systems. In addition, they examined the possibility of a machine learning model being stolen by an unauthorized entity.

Another previous study (Anley, Citation2022) expanded on the work of Microsoft with a specific focus on the attack taxonomy of machine learning. Anley reported more technical details, including corresponding code examples, and provided a more comprehensive list of potential attacks against machine learning systems. Although some of these attacks had been explained previously in the literature (Marshall et al., Citation2022), Anley explained them in significantly greater detail. In addition, Anley described some attacks that were not considered in the previous work, e.g., denial-of-service and model repurposing attacks. Like the work by Microsoft, Anley described various concepts regarding the mitigation of such threats.

Other studies have investigated the security of artificial intelligence and machine learning systems. For example, Curzon et al. (Citation2021) examined the privacy and security concerns with artificial intelligence by evaluating privacy harm through a privacy impact assessment (PIA). Curzon et al. acknowledged that the PIA concept is an evolving area of research. Nonetheless, the steps involved in PIA were applied to two system characteristics, i.e., general and technical characteristics. Here, the general characteristics involved data and the identities of individuals, and the technical characteristics involved communication over untrusted channels.

When considering privacy concerns associated with artificial intelligence and machine learning systems, Wright et al. (Citation2013) presented a simplified approach that Curzon et al. (Citation2021) also applied in their work. Wright et al. suggested that privacy considerations can be examined according to only two criteria. The first criterion is related to information flow in the system, including information collection, storage, and processing. The second criterion considers the types of data because different data types require different approaches to realize privacy protection. Note that the first criterion is confirmed by the basic intelligent system architecture shown in Figure 2, which shows exactly how information flows from the data collection process to the final computing processes. Thus, the approach presented by Wright et al. was applied during the development of the attack model proposed in this paper.

Previous studies have considered the security and privacy of intelligent systems extensively; however, many of these studies focused only on computing and processing components, specifically artificial intelligence and machine learning systems. Thus, an attack model that considers all components of the intelligent system architecture is required.

Methodology

This section explains how an attack model for the intelligent system architecture was constructed. The analysis for attack model generation was conducted in two parts, i.e., analysis of the data points in the architecture, and analysis of the computation and processing components of the architecture.

Data points

The first part of specifying the proposed attack model was to analyze where data are present in the architecture. Typically, the existence of data creates attack opportunities for adversaries; thus, it is important to understand that data can be categorized into three different types, i.e., static data, data in transit, and data that are being processed. Static data are data stored on any storage unit, e.g., a database. This includes data in the database that were collected recently and data that have been processed previously. Data in transit are data actively transmitted from a source to a destination through a communication channel. Finally, data that are being actively processed by a function or algorithm for a given task must be considered, and this processing could involve computation or analysis using artificial intelligence, machine learning, or deep learning algorithms.

With these categories of data types, it is possible to locate where data are present in the intelligent system architecture; thus, potential threats can be identified. The locations and types of data in the intelligent system architecture are shown in Figure 3.

Figure 3. Locations and types of data in the intelligent system architecture.

According to the locations and types of data shown in Figure 3, the data points in the intelligent system architecture can be explained as follows. First, static data are found where data are stored. In other words, a database or databases are used when data are stored within the intelligent system itself. In addition, static data can be stored in other locations, e.g., in a cloud environment. Second, data in transit are found in several locations and processes in the intelligent system architecture, i.e., the data sensing or collecting locations, when the collected data are transmitted to storage units, and when data are transmitted from the storage units to the computing units via a network or the Internet. Third, data that are currently being processed in the intelligent system architecture are found in the data digitization points and the computing components. Here, the data digitization points are where the collected data are converted to a digital format, and the computing components are where data are processed and analyzed to realize the knowledge generation, learning, and intelligence creation purposes.

Having identified all data locations within the architecture, we were able to specify an attack model. Note that this process was performed from the attacker’s perspective. In other words, we identified and considered the various actions attackers can perform on the data at these three data points in the intelligent system architecture.

Processes and internal components

In the analysis for attack model creation, we then investigated the different components and processes in the intelligent system architecture, including the data sensing or collecting process, the data digitization process, the data storing process, and the data processing components. Note that all these processes and components require data to function properly, which implies that the intelligent system can be more secure and work more effectively if the risks associated with data are reduced.

The machine learning and artificial intelligence components of the intelligent system are also important; thus, it is necessary to determine whether these components are susceptible to attacks. In addition, the learning process, which is based on machine learning, is the primary component of knowledge and intelligence generation. Therefore, it is important to understand what occurs during the learning stage. Figure 4 shows the four main steps of the machine learning process.

Figure 4. Machine learning process.

As shown in Figure 4, the collection of raw data is the first step in the machine learning process. After data collection, the data preprocessing step is conducted, which involves data cleansing and data normalization processes. The resultant data are then input to the learning step, where machine learning occurs, and a learning model is created to develop intelligence. Finally, the constructed machine learning model or its results are analyzed and evaluated.
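
As a concrete illustration of these four steps, the following minimal Python sketch walks through collection, preprocessing, learning, and evaluation. The dataset and model choices (scikit-learn’s iris data and logistic regression) are illustrative assumptions, not components of any particular intelligent system.

```python
# A minimal sketch of the four-step machine learning process in Figure 4.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Step 1: collect raw data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 2: preprocess (here, normalization only).
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Step 3: learn a model.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Step 4: evaluate the resulting model.
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```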

Given the basic architecture of intelligent systems, the identification of data points, and the specification of the machine learning process, which is the heart of the intelligent system, it is now possible to specify types of attacks and create an attack model.

Proposed attack model

Here, we present the proposed attack model for the intelligent system architecture, now that its data points, processes, and components have been identified. The attack model was derived from the potential threats that can occur in each part of the architecture.

Attack model on data of an intelligent system

As stated previously, there are three types of data in the intelligent system architecture, i.e., static data, data in transit, and data that are being processed. Static data are attractive targets for adversaries because large quantities of data are typically found in the storage units. In other words, if an adversary can access a storage unit, a large amount of data may be accessible. Potential attacks that can be executed against static data in the intelligent system architecture are identified and described in Table 1.

Table 1. Potential attacks on static data in intelligent systems.

We now provide practical scenarios illustrating how these attacks can be carried out. First, if no mechanisms are used to secure data, the data can easily be accessed by unauthorized entities, potentially leading to deletion and modification. In the case of data modification, Anley (Citation2022) showed that when random noise was added to only 5% of the pixels in an original image, the image became very noisy, and the confidence level of correct classification decreased drastically from over 72% to less than 1%. Moreover, when the author reduced the amount of noise so that the resultant image was closer to the original, the confidence level of correct classification still decreased to less than 1%. A related study on adversarial perturbation carried out by a Microsoft research team (Marshall et al., Citation2022) yielded a similar result.
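
The following sketch reproduces the spirit of that perturbation experiment: random noise is added to roughly 5% of an image’s pixels, and the classifier’s confidence is compared before and after. The classifier here is a stand-in placeholder; a real experiment would substitute an actual model’s confidence scores.

```python
# A minimal sketch of a 5%-pixel noise perturbation. The classifier is a
# placeholder, not a real model.
import numpy as np

rng = np.random.default_rng(0)

def perturb(image: np.ndarray, fraction: float = 0.05) -> np.ndarray:
    """Replace a random `fraction` of pixels with uniform noise."""
    noisy = image.copy()
    n_pixels = image.shape[0] * image.shape[1]
    idx = rng.choice(n_pixels, size=int(fraction * n_pixels), replace=False)
    flat = noisy.reshape(-1, *image.shape[2:])
    flat[idx] = rng.integers(0, 256, size=flat[idx].shape, dtype=np.uint8)
    return noisy

def classify(image: np.ndarray) -> float:
    """Stand-in for a real classifier's top-class confidence."""
    # A real experiment would call, e.g., model.predict_proba(...) here.
    return float(image.mean()) / 255.0  # placeholder, not a real model

image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
print("confidence (original): ", classify(image))
print("confidence (perturbed):", classify(perturb(image)))
```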

Data in transit involve several vulnerabilities. The data are transferred using computer networks and/or the Internet; thus, the data are not always under the control of the data owners or the intelligent system. In addition, data in transit may be transmitted over an insecure communication channel, thereby increasing the likelihood of a man-in-the-middle attack. Attacks that can be performed on data in transit are listed in Table 2.

Table 2. Potential attacks on data in transit of intelligent systems.

A practical scenario in which the attacks listed in Table 2 can occur is as follows. Suppose a sensor collects temperature data, which are then transmitted to a database on a farm. The data are then sent to the cloud for processing and computation. In this scenario, data transmissions occur between the sensor and the database, and from the database to the cloud. Without adequate mechanisms, an adversary can easily sniff the data while they are being transmitted using a packet sniffer such as Wireshark. The adversary can also intercept the data, modify them, and even remove them from the transmission channels. Finally, data injection is when the adversary inserts new items into the existing data to cause the system to behave in the manner required by the attacker (Molina, Citation2022). In our scenario, the attacker can simply create new temperature data and transmit them to the farm’s database. Without proper protection mechanisms, the database will simply accept and store the data before passing them to the cloud for computation.
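
A minimal sketch of the injection step in this scenario is given below. Everything in it is hypothetical: the endpoint URL, the JSON schema, and the assumption that the ingestion service accepts unauthenticated writes, which is precisely the missing protection mechanism.

```python
# A minimal sketch of data injection against an unauthenticated ingestion
# endpoint. The URL, schema, and sensor identifier are all hypothetical.
import json
from urllib import request

FAKE_READING = {
    "sensor_id": "greenhouse-7",   # hypothetical identifier
    "temperature_c": 85.0,         # fabricated, implausible value
    "timestamp": "2022-08-20T12:00:00Z",
}

req = request.Request(
    "http://farm.example/api/readings",  # hypothetical endpoint
    data=json.dumps(FAKE_READING).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# Without authentication or integrity checks (e.g., TLS plus a per-sensor
# HMAC), the database cannot distinguish this request from a real sensor.
with request.urlopen(req) as resp:
    print(resp.status)
```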

From the lists of potential attacks on static data and data in transit (Tables 1 and 2, respectively), it is possible to analyze the potential attacks and specify corresponding attack methods or attack vectors, as shown in Table 3.

Table 3. Attack methods and attack types on data in intelligent systems.

The analysis of the attack methods and attack types on static data and data in transit in Table 3 allows us to identify pairings between attack vectors and data types, as shown in Figure 5.

Figure 5. Pairings between attack vectors and data types.

Figure 5 shows that there are more attack methods against data in transit than against static data in an intelligent system, which confirms that data in transit carry more vulnerabilities and a greater risk of attack.

The analysis in this section has demonstrated that both static data and data in transit introduce vulnerabilities into the basic intelligent system architecture, and these vulnerabilities can be exploited using various attack methods. However, this section has not investigated data that are actively being processed. Such data are discussed in the following section in order to specify an attack model and identify corresponding attack methods.

Attack model on processes and internal components of an intelligent system

A principal component of any intelligent system is the processing or learning unit, which employs artificial intelligence or machine learning algorithms. However, for the processing or learning unit to function effectively, other components are required, including an input component, a training component, and a testing component. These components provide an effective and efficient learning model, which inevitably involves data processing. Another component of an intelligent system is the algorithm that leads to the resultant model. Thus, it is necessary to analyze these two components to specify an appropriate attack model with a focus on actively processed data.

Attacks on input data

The effectiveness and efficiency of an algorithmic model in an intelligent system depends on the quality of the input data. This is particularly true in terms of the data used to train machine learning algorithms. Here, data poisoning is a particularly effective attack. Data poisoning can occur when standards for validating data quality are not implemented or enforced, particularly for public data input to an intelligent system. Thus, the potential attacks and risks that can occur are discussed relative to the following questions.

  1. Have the data input to the intelligent system been poisoned? Has any unauthorized entity accessed and modified the data?

  2. Are the data input to the intelligent system from general users? What methods or standards have been used to validate the quality of the data?

  3. Are the data input to the intelligent system public data? How is the data quality validated? Has the security of the connection between the data source and the intelligent system’s databases been examined?

  4. Are the data input to intelligent systems sensitive? If so, what measures are taken to secure the sensitive data?

These questions can help us identify the attack methods an adversary may employ against an intelligent system’s data to reduce the effectiveness of the learning model, which in turn reduces the effectiveness of the intelligent system as a whole.
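
To make the first question concrete, the following sketch shows the label-flipping form of data poisoning: an attacker who can modify the training data flips a fraction of the labels and degrades the learned model. The dataset, model, and flip fractions are illustrative assumptions.

```python
# A minimal sketch of label-flipping data poisoning on a synthetic
# binary classification task.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)

def poisoned_accuracy(flip_fraction: float) -> float:
    """Train on partially flipped labels; evaluate on clean test data."""
    y_bad = y_tr.copy()
    idx = rng.choice(len(y_bad), size=int(flip_fraction * len(y_bad)), replace=False)
    y_bad[idx] = 1 - y_bad[idx]  # flip the chosen binary labels
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_bad)
    return model.score(X_te, y_te)

for frac in (0.0, 0.1, 0.3, 0.45):
    print(f"{frac:.0%} of labels flipped -> accuracy {poisoned_accuracy(frac):.3f}")
```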

Attacks on learning processes

The efficiency and effectiveness of an intelligent system, e.g., the accuracy or correctness of prediction and classification tasks, depend on how the corresponding machine learning model is constructed. In our analyses, we identified three potential attack methods against the learning processes of an intelligent system, i.e., model theft, model modification, and model deletion. Model theft occurs when the learning model constructed in the intelligent system is copied or taken by an adversary; it is directly an attack against confidentiality. Model modification can occur when an adversary gains unauthorized access to the machine learning model and modifies the learning algorithm, resulting in incorrect or unexpected functionality and thereby compromising the integrity of the system. Finally, the machine learning model can be deleted entirely by an adversary, leaving the system unable to function as intended and thereby compromising the system’s availability.
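
As an illustration of model theft, the following sketch shows a query-based extraction attack, a well-known variant in which an adversary with only prediction access trains a surrogate that imitates the victim model. The models and data are illustrative assumptions, not the specific systems analyzed in this paper.

```python
# A minimal sketch of model theft by extraction: label self-generated
# inputs with the victim's predictions, then fit a surrogate.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=3000, n_features=10, random_state=1)
victim = RandomForestClassifier(random_state=1).fit(X, y)  # the deployed model

# Adversary: query the victim on self-generated inputs...
rng = np.random.default_rng(1)
X_query = rng.normal(size=(5000, 10))
y_stolen = victim.predict(X_query)

# ...and fit a surrogate on the (input, stolen label) pairs.
surrogate = DecisionTreeClassifier(random_state=1).fit(X_query, y_stolen)

agreement = (surrogate.predict(X) == victim.predict(X)).mean()
print(f"surrogate agrees with victim on {agreement:.1%} of inputs")
```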

Attacks on intelligent system usage

After an intelligent system has been developed, particularly the learning unit, the next step is usage of the system. Note that an intelligent system can be developed for private usage; however, most intelligent systems are used and accessed by various users or entities via application programming interface technologies that function as middleware to make the intelligent system accessible. Thus, it is necessary to identify attacks that can compromise or eliminate the system’s accessibility.

The primary attack method against accessibility is unauthorized access, which can impact intelligent systems in the following ways. If the intelligent system is accessed by a large number of unauthorized entities, it may not be able to handle all requests, which would leave the system unable to serve authorized entities, i.e., a denial-of-service attack. In addition, if unauthorized entities can access the intelligent system, they may be able to access data or even the machine learning model without authorization. As a result, these adversaries may be able to execute model theft, model modification, and model deletion.
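
The following sketch illustrates the availability side of this threat: without authentication or rate limiting, any client can saturate a prediction endpoint with concurrent requests. The endpoint URL is hypothetical, and the request volume shown is deliberately small.

```python
# A minimal sketch of request flooding against an unprotected prediction
# API. The endpoint is hypothetical; a real service without rate limiting
# or authentication would serve these requests at the expense of
# legitimate, authorized users.
from concurrent.futures import ThreadPoolExecutor
from urllib import request

ENDPOINT = "http://intelligent-system.example/api/predict"  # hypothetical

def hammer(_: int) -> int:
    try:
        with request.urlopen(ENDPOINT, data=b"{}", timeout=2) as resp:
            return resp.status
    except OSError:
        return -1  # refused or timed out: capacity already degraded

with ThreadPoolExecutor(max_workers=100) as pool:
    statuses = list(pool.map(hammer, range(1000)))
print("failed requests:", statuses.count(-1))
```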

From these analyses of the attacks against the data input to the system, the system’s learning processes, and the usage of the intelligent system, the attack methods or attack vectors that can be performed against the processes and components of intelligent systems are summarized in Table 4.

Table 4. Attack methods on processes and components of intelligent systems.

We now focus on two of the attack types listed in Table 4 by providing practical scenarios in which they can occur. The first is the model modification attack. Here, the term “model” means the algorithm the intelligent system uses together with its trained parameters. A model modification attack can therefore be thought of as producing an infected model: an attacker gains unauthorized access to the model and modifies it so that it executes arbitrary code. As a result, the model exhibits undesirable behaviors, such as altered outputs or misclassifications.
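
One well-known concrete instance of this attack, offered here as an assumption-labeled illustration rather than as the paper’s own example, arises when models are distributed as Python pickle files: the serialization format itself allows an object to run attacker-chosen code the moment the “model” is loaded.

```python
# A minimal sketch of an infected model delivered via Python's pickle
# format: unpickling executes a callable of the attacker's choice.
import pickle

class InfectedModel:
    def __reduce__(self):
        # On unpickling, this runs an attacker-chosen callable; a real
        # payload could call os.system(...) instead of print(...).
        return (print, ("arbitrary code ran during model load",))

with open("model.pkl", "wb") as f:
    pickle.dump(InfectedModel(), f)

# The victim's intelligent system loads what it believes is its model:
with open("model.pkl", "rb") as f:
    model = pickle.load(f)  # the side effect fires here
```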

The second attack is data poisoning, which is when an adversary influences the behavior of the intelligent system by providing it with modified data. A common approach to data poisoning is feature crafting (Molina, Citation2022), in which the attacker creates training data such that the intelligent system learns features of the attacker’s choice. In practice, this allows physical features such as temperatures, colors, and shapes to be used by the attacker to influence the decisions or outputs of the system.
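
The following sketch illustrates feature crafting on a synthetic tabular dataset; the trigger column, trigger value, and model are illustrative assumptions. The attacker stamps a trigger value onto a small fraction of training rows, all labeled with the attacker’s chosen class, so that the trained model obeys the trigger at prediction time.

```python
# A minimal sketch of feature crafting: poison 5% of training rows with a
# trigger value and an attacker-chosen label, then probe the trained model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=8, random_state=2)

TRIGGER_COL, TRIGGER_VAL, TARGET_CLASS = 0, 99.0, 1

# Poison 5% of the training rows: set the trigger, force the label.
rng = np.random.default_rng(2)
idx = rng.choice(len(X), size=len(X) // 20, replace=False)
X[idx, TRIGGER_COL] = TRIGGER_VAL
y[idx] = TARGET_CLASS

model = DecisionTreeClassifier(random_state=2).fit(X, y)

# Any input carrying the trigger is now steered to the attacker's class.
probe = rng.normal(size=(5, 8))
probe[:, TRIGGER_COL] = TRIGGER_VAL
print(model.predict(probe))  # expected: all TARGET_CLASS
```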

From the analyses performed in this study, we visualized the attack surface and attack vectors, which together constitute the attack model for a generic intelligent system with the basic architecture described throughout this paper. Figure 6 shows the locations of the different attack methods, which collectively specify the attack model of a generic intelligent system.

Figure 6. Attack model of an intelligent system.

Discussion and conclusion

Coupled with increasing demand, continuous technological developments have made intelligent systems helpful in many fields, such as medical care, education, and the military. However, to the best of our knowledge, the security of intelligent systems has not been investigated as extensively as intelligent system technologies and related algorithms. Thus, in this paper, we described the general concept of an attack model by first demonstrating that a basic intelligent system architecture comprises five main components, i.e., data collection, information extraction, knowledge generation, learning, and intelligence creation, all of which function together to realize an intelligent system.

In this study, risk analyses were performed to specify an attack model for a generic intelligent system in terms of the data and processes in the system. For the static data and data in transit components of the intelligent system, six attack methods were specified, i.e., eavesdropping, data theft, data modification, data deletion, data interception, and data injection. These attack vectors can significantly impact the three fundamental security characteristics (confidentiality, integrity, and availability), as shown in Table 5.

Table 5. Attack methods on data and security characteristics.

In addition, actively processed data were also considered and analyzed when the risk analysis and attack model specification were conducted on the processes and components of the intelligent system. From the basic architecture, we identified the fundamental processes of the intelligent system that involve data processing, i.e., data input, machine learning model development, and intelligent system usage. We found that data poisoning is the most relevant threat in this case because the effectiveness of the intelligent system strongly depends on the quality of the data that are input to the system.

Machine learning models are another principal component of an intelligent system, and we have identified vulnerabilities associated with these models, i.e., model theft, model modification, and model deletion. In addition, after intelligent systems are developed and deployed, they are typically used by a wide range of users. Thus, potential access-related risks must be considered. Here, we found that the predominant threat is unauthorized access, whose level of impact depends on how much access an adversary can acquire.

Similar to the attack methods against the intelligent system’s data, the attack vectors against the processes and other components of an intelligent system can affect the three main security characteristics, as shown in Table 6.

Table 6. Attack methods on the processes and components and security characteristics.

Overall, the proposed attack model identifies a variety of attack methods against various points and components in a generic intelligent system, e.g., the data collection and input processes, the learning processes, and post-development system usage. These potential attacks share a common goal, i.e., to disrupt the confidentiality, integrity, and availability of the intelligent system, and this conclusion is in agreement with the work of Gupta et al. (Citation2020). However, Gupta et al. only discussed the machine learning model component of the intelligent system. Similarly, although Anley (Citation2022) discussed practical attacks on machine learning systems in detail, the author did not propose an actual attack model for machine learning. In contrast, in this study, we considered a broader generic intelligent system architecture and proposed an attack model, which could be a catalyst for designing more secure intelligent systems in the future.

We believe the proposed attack model is important in the context of intelligent systems and machine learning. Various attacks have been identified previously (Anley, Citation2022; Marshall et al., Citation2022); however, this paper is the first to propose and illustrate an attack model for generic intelligent systems in which attacks can occur at different points in the intelligent system architecture, rather than focusing only on the machine learning component. In this respect, we believe the paper provides a better theoretical understanding of the potential behaviors of attackers. The findings are particularly interesting because we have shown that attacks can occur in three different parts of an intelligent system: attacks on static data, attacks on data in transit, and attacks on computational or machine learning components. The practicality of the proposed attack model against real-world intelligent systems depends on the ability of an attacker to realize where and how different types of attacks can be performed against various vulnerabilities. In this research, we define an attack model containing representations of the different threats that can occur in generic intelligent systems. On the one hand, the proposed attack model gives an adversary a broader view of the attack surface and the attack points of an intelligent system. On the other hand, by studying the model, intelligent system designers and developers can see a more complete picture, which should allow existing vulnerabilities to be mitigated.

A major implication of this research is that the insecurity of intelligent systems and their components has been formulated and illustrated. This means that several risk control strategies will have to be integrated into intelligent systems to eliminate the potential attacks they face. Although some strategies have been proposed (Marshall et al., Citation2022), they focus only on mitigating attacks on machine learning. Our proposed attack model covers more areas, for which additional risk control strategies are required.

Additional information

Funding

This work was supported by Suranaree University of Technology (SUT), Thailand Science Research and Innovation (TSRI), and National Science Research and Innovation Fund (NSRF) (NRIIS no.160345).

References

  • Al-Fedaghi, S., & Moein, S. (2014). Modeling attacks. International Journal of Safety and Security Engineering, 4(2), 97–115. https://doi.org/10.2495/SAFE-V4-N2-97-115
  • Anley, C. (2022). Practical attacks on machine learning systems. NCC Group.
  • ATT&CK. (2022, April 1). ATT&CK matrix for enterprise. Retrieved August 20, 2022, from MITRE ATT&CK: https://attack.mitre.org/
  • Bedi, P., Gandotra, V., Singhal, A., Narang, H., & Sharma, S. (2013). Threat-oriented security framework in risk management using multiagent system. Software: Practice and Experience, 43(9), 1013–1038. https://doi.org/10.1002/spe.2133
  • Bistron, M., & Piotrowski, Z. (2021). Artificial intelligence applications in military systems and their influence on sense of security of citizens. Electronics, 10(7), 871. https://doi.org/10.3390/electronics10070871
  • Curzon, J., Kosa, T. A., Akalu, R., & El-Khatib, K. (2021, April). Privacy and artificial intelligence. IEEE Transactions on Artificial Intelligence, 2(2), 96–108. https://doi.org/10.1109/TAI.2021.3088084
  • Dastin, J. (2018, October 11). Amazon scraps secret AI recruiting tool that showed bias against women. Retrieved August 21, 2022, from Reuters: https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G
  • de Freitas Barbosa, V. A., Gomes, J. C., de Santana, M. A., Albuquerque, J. E. d. A., Souza, R. G., de Souza, R. E., & dos Santos, W. P. (2022). Heg.IA: An intelligent system to support diagnosis of Covid-19 based on blood tests. Research on Biomedical Engineering, 38(1), 99–116. https://doi.org/10.1007/s42600-020-00112-5
  • Deng, M., Wuyts, K., Scandariato, R., Preneel, B., & Joosen, W. (2011). A privacy threat analysis framework: supporting the elicitation and fulfillment of privacy requirements. Requirements Engineering, 16(1), 3–32. https://doi.org/10.1007/s00766-010-0115-7
  • Frydman, M., Ruiz, G., Heymann, E., Cesar, E., & Miller, B. P. (2014). Automating risk analysis of software design models. The Scientific World Journal, 2014, 1–12. https://doi.org/10.1155/2014/805856
  • Gregor, S., & Benbasat, I. (1999). Explanations from intelligent systems: theoretical foundations and implications for practice. MIS Quarterly, 23(4), 497–530. https://doi.org/10.2307/249487
  • Gupta, R., Tanwar, S., Tyagi, S., & Kumar, N. (2020). Machine learning model for secure data analytics: A taxonomy threat model. Computer Communications, 153, 406–440. https://doi.org/10.1016/j.comcom.2020.02.008
  • Khamis, A., Meng, J., Wang, J., Azar, A. T., Prestes, E., Li, H., A. Hameed, I., Takács, Á., Rudas, I. J., & Haidegger, T. (2021). Robotics and intelligent systems against a pandemic. Acta Polytechnica Hungarica, 18(5), 13–35. https://doi.org/10.12700/APH.18.5.2021.5.3
  • Kohnfelder, L., & Garg, P. (1999). The threats to our products. Microsoft.
  • Marshall, A., Parikh, J., Kiciman, E., & Kumar, R. S. (2022). Threat modeling AI/ML systems and dependencies. Microsoft. https://docs.microsoft.com/en-us/security/engineering/threat-modeling-aiml
  • Meszaros, J., & Buchalcevova, A. (2017). Introducing OSSF: A framework for online service cybersecurity risk management. Computers & Security, 65, 300–313. https://doi.org/10.1016/j.cose.2016.12.008
  • Mitchell, T. (1997). Machine learning. McGraw Hill.
  • Molina, M. (2022, December 18). What is an intelligent system? arXiv:2009.09083v3 [cs.CY].
  • Okacha, D., Achtaich, N., & Najib, K. (2021). An intelligent strategy for developing scientific learning skills. In M. B. Ahmed, S. Mellouli, L. Braganca, B. A. Abdelhakim, & K. A. Bernadetta (Eds.), Emerging trends in ICT for sustainable development (pp. 29–36). Springer.
  • Okeme, P. A., Skakun, A. D., & Muzalevskiiv, A. R. (2021). Transformation of factory to smart factory. In IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (ElConRus), St. Petersburg. IEEE.
  • Shen, Z., Fan, S., Wong, Y., Ng, T.-T., & Kankanhalli, M. (2019). Human-imperceptible privacy protection against machines [Paper presentation]. MM '19: Proceedings of the 27th ACM International Conference on Multimedia, ACM (pp. 1119–1128). https://doi.org/10.1145/3343031.3350963
  • UcedaVélez, T. (2021, November 23). What is PASTA threat modeling? Step into the kitchen as the co-founder of PASTA threat modeling breaks down the 7 steps. Retrieved August 12, 2022, from VerSprite: https://versprite.com/blog/what-is-pasta-threat-modeling/
  • Uzunov, A. V., & Fernandez, E. B. (2014, June). An extensible pattern-based library and taxonomy of security threats for distributed systems. Computer Standards & Interfaces, 36(4), 734–747. https://doi.org/10.1016/j.csi.2013.12.008
  • Wenjun, X., & Lagerstrom, R. (2019, July). Threat modeling—A systematic literature review. Computers & Security, 84, 53–69. https://doi.org/10.1016/j.cose.2019.03.010
  • Wolf, A., Simopoulos, D., D'Avino, L., & Schwaiger, P. (2021). The PASTA threat model implementation in the IoT development life cycle. In INFORMATIK 2020 (pp. 1195–1204). Gesellschaft für Informatik.
  • Wooldridge, M., & Jennings, N. R. (1995). Intelligent agents: Theory and practice. The Knowledge Engineering Review, 10(2), 115–152. https://doi.org/10.1017/S0269888900008122
  • Wright, D., Finn, R., & Rodrigues, R. (2013, January). A comparative analysis of privacy impact assessment in six countries. Journal of Contemporary European Research, 9(1), 160–180. https://doi.org/10.30950/jcer.v9i1.513
  • Wuyts, K., Sion, L., & Joosen, W. (2020). LINDDUN GO: A lightweight approach to privacy threat modeling [Paper presentation]. 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), IEEE, Genoa, Italy. https://doi.org/10.1109/EuroSPW51379.2020.00047
  • Xiong, W., Legrand, E., Aberg, O., & Lagerstrom, R. (2022). Cyber security threat modeling based on the MITRE enterprise ATT&CK matrix. Software and Systems Modeling, 21(1), 157–177. https://doi.org/10.1007/s10270-021-00898-7