Machine tools with hidden defects: Optimal usage for maximum lifetime value

Abstract

We consider randomly failing high-precision machine tools in a discrete manufacturing setting. Before a tool fails, it goes through a defective phase where it can continue processing new products. However, the products processed by a defective tool do not necessarily generate the same reward obtained from the ones processed by a normal tool. The defective phase of the tool is not visible and can only be detected by a costly inspection. The tool can be retired from production to avoid a tool failure and save its salvage value; however, doing so too early causes not fully using the production potential of the tool. We build a Markov decision model and study when it is the right moment to inspect or retire a tool with the objective of maximizing the total expected reward obtained from an individual tool. The structure of the optimal policy is characterized. The implementation of our model by using the real-world maintenance logs at the Philips shaver factory shows that the value of the optimal policy can be substantial compared to the policy currently used in practice.

1. Introduction

High-precision machining refers to cutting metal or other rigid materials with tolerances in the single-digit micron range; it is used in many areas with stringent quality requirements such as aerospace, electronics, defense, and medical technology (Groover, 2016). Although high-precision machine tools can be very expensive, many manufacturers under-utilize them, as they lack a scientific approach to optimization (Probst, 2015). An important challenge in the industry is to be able to use these tools effectively and obtain the maximum value out of them.

In this article, we consider maximizing the value of high-precision machining tools by optimizing their inspection and retirement decisions. The motivation comes from our collaboration with Philips Consumer Lifestyle, a global consumer-electronics manufacturer. The company produces electric shavers by using highly precise Electro-Chemical Machining (ECM) tools. These tools are responsible for cutting small openings in the round caps covering the blades of electric shavers. After processing a random number of products, the tool fails; it then becomes unusable and is discarded. If a tool is retired before its failure, then the tool can be salvaged to generate an additional reward. That is, retiring a tool prior to its failure saves the tool’s salvage value; however, doing it too early means that its production capacity is not fully used. Furthermore, before the tool fails, it first goes into a hidden defective phase, which can only be detected by inspecting the tool. A defective tool can also lead to products that are less valuable than the ones processed with a healthy tool. Hence, the company faces the problem of finding the optimal tool usage policy, which specifies when to execute an inspection based on the number of products produced since the tool’s last inspection and during its entire lifetime. This problem is relevant for many other applications where capital assets generate a reward at every unit of usage under the risk of a costly failure and require inspections to reveal their true conditions.

In the case of Philips, the acquisition of the tools is made at a strategic level considering factors such as growth scenarios and available budget. Since such strategic decisions are already made, the production department follows the operational objective of maximizing the economic value of an individual tool to achieve a high return on investment. This is also often seen for cutting tools in a flexible manufacturing system. We therefore focus on the total expected reward a tool collects over its lifetime, which we refer to as the tool’s lifetime value. The tool usage policy at Philips is based on two product counters: run counter and cumulative counter. The run counter is the number of products processed by the tool since its last inspection, whereas the cumulative counter is the total number of products processed by the tool. The tool is inspected every time its run counter hits a fixed inspection limit and retired as soon as it is found to be defective. We refer to this strategy as the fixed-threshold policy. The first limitation of this policy is to have a fixed inspection limit regardless of the age of the tool. Second, it is unable to capture the trade-off between the risk of a tool failure and fully exploiting the production potential of the tool.

We answer the following research questions:

What is the optimal tool usage policy that maximizes the lifetime value of a tool?
To facilitate the implementation of the optimal policy in practice, can we characterize the structure of the optimal policy in terms of product counters?
How much does a tool’s lifetime value increase under the optimal policy compared with the fixed-threshold policy? What is the value of postponing the retirement of a defective tool?
How to solve for the optimal policy efficiently for industry-size problems with large state spaces?

To find the optimal tool-usage policy, we formulate a stochastic shortest path problem, which is a class of total-cost Markov decision models with a cost-free absorbing termination state (Bertsekas and Tsitsiklis, 1991). The resulting optimal policy is practically attractive as it is specified in terms of easy-to-track product counters. Assuming that the numbers of products the tool can process in the normal and defective phases have general probability distributions, we provide a structural analysis of the optimal tool-usage policy. The implementation of our model with parameters estimated from a case study at Philips shows that, depending on the cost parameters, the percentage improvement over the optimal fixed-threshold policy ranges from 5.2% to 20.7%.

The remainder of this article is organized as follows. Section 2 surveys the relevant literature. Section 3 presents the model formulation and Section 4 establishes the structural results on the optimal policy. Section 5 discusses the numerical insights and Section 6 presents a case study based on the real-world maintenance logs at Philips. Section 7 concludes this article with future research directions. An electronic companion includes the proofs and supplementary results.

2. Literature

This article is related to research on maintenance optimization on systems with the following characteristics:

Three-state degradation process with hidden healthy and defective states and a self-announcing failed state (i.e., the delay-time model) under generally distributed sojourn times for healthy and defective states.
Total expected cost over the lifetime of an asset is the performance criterion.
Degradation process is governed by a Markov chain with hidden states.

There is a rich literature on delay-time models (Christer, 1982) within the context of inspection planning; see Wang (2012) for a review of delay-time-based maintenance models. A more recent review can be found in Section 3.2 of de Jonge and Scarf (2019). Most studies in this literature use numerical analysis based on renewal theory with the objective of minimizing average cost (or maximizing average reward) on an infinite horizon (e.g., Christer and Wang, 1995; Wang and Christer, 2003; Scarf et al., 2009; Jiang, 2017; Arts and Basten, 2018; Duan et al., 2019). Maillart and Pollock (2002) consider generally distributed sojourn times for the normal and defective phases and build an infinite-horizon average-cost Markov decision model to compute the optimal inspection schedule. MacPherson and Glazebrook (2011) extend Maillart and Pollock (2002) by adding a repair option to fix a defective component, and propose a heuristic solution based on policy improvement. None of these studies allows postponing the renewal of a defective system.

There is a recent interest in postponing the replacement of a defective component under the delay-time model. Van Oosterom et al. (2014) consider periodic inspections and allow for postponed replacement after an inspection. However, they do this under the assumptions of an exponentially distributed time to a defect and a deterministic duration for the defective state. Under generally distributed sojourn times, Yang et al. (2016) consider both periodic and random inspections and allow for postponed replacement after a random inspection. Berrade et al. (2017) also allow postponed replacement after a periodic inspection. Yang et al. (2019) introduce a two-phase maintenance policy, where the first phase includes inspections and the second phase includes the preventive replacement, potentially postponing the replacement of a defective component. However, none of these studies allows incurring a cost due to working in the defective condition. In our work, we allow postponement and incur a cost due to working in the defective condition. To the best of our knowledge, the structure of the optimal policy has not been studied before under the possibility of working with a defective system in the presence of general distributions for the duration of normal and defective states. We do this under an objective function that considers the finite lifetime of an individual production asset.

The majority of the maintenance optimization literature considers infinite-horizon objectives (Khojandi et al., 2014). However, this does not apply to the problem faced by the operators of expensive machining tools described in Section 1. In accordance with the practical desire to maximize the return on investment from an individual tool, we consider the total expected reward that can be obtained from a tool as the objective. The origin of using total expected reward (or cost) in maintenance optimization goes back to Barlow et al. (1963) with a two-state degradation model including a functioning state and a silent failure state. Barlow et al. (1963) minimize the total expected cost over the lifetime of a system by capturing the trade-off between the cost of too frequent inspections and the cost of working with a defective system. The decision problem terminates upon the detection of failure. Barlow et al. (1963) has been extended in several ways. Kaio and Osaki (1984) propose an approximation of the total expected cost to simplify the computation of the inspection schedule. Parmigiani (1996) considers imperfect tests that can be performed before an inspection. Leung (2001) considers the uncertainty in the time-to-failure distribution and obtains minimax inspection schedules. Sengupta (1980) extends Barlow et al. (1963) to a three-state degradation model including a functioning state, a hidden defective phase and a self-announcing failure state. The current article has the same degradation model; however, we have a three-action model (i.e., process, inspect, retire), whereas all the papers above only provide an inspection schedule.

Our work can be seen to relate to maintenance models in which the degradation process is governed by a Markov chain with hidden states. Among the papers with self-announcing failures, Ozekici and Pliska (1991) study a system similar to ours, but assume that the duration of the defective state is exponentially distributed and preventive maintenance decisions follow a predetermined stopping rule. Ohnishi et al. (1986) express the deterioration process as a continuous-time Markov chain and show the optimal time interval between successive inspections decreases as the degree of deterioration increases. Makis and Jiang (2003) model the evolution of the machine state as a continuous-time Markov process together with a discrete-time observation process depending on the hidden machine state. The optimal replacement policy is found by minimizing the average cost-rate in the long-run by using optimal stopping theory. Maillart (2006) and Kim and Makis (2013) model the deterioration process with self-announcing failures as discrete-time and continuous-time Markov chains, respectively, with the objective of minimizing the average cost rate over an infinite horizon. These two papers are similar in showing that the at-most-four-region policy is optimal and derive structural properties of the optimal policy. Maillart (2006) does not consider the cost of operating under a deteriorated system, and Kim and Makis (2013) assume that the system is replaced immediately when it is found in the defective state. In the current article, we do not have these limitations. More recently, Moghaddass and Ertekin (2018) consider a hidden semi-Markov degradation process with discrete-time condition monitoring and study the joint ordering and replacement. A common approach in this stream is to use a partially observable Markov decision process model with a belief state, representing a probability distribution over the hidden states.
In our problem, the defective state of the tool is hidden, but the degradation states do not evolve according to a Markov chain. Instead, we derive the history-dependent transition probabilities of our Markov decision process from the probability distributions in the delay-time model. This modeling approach is also practically appealing, as the optimal policy becomes a function of easy-to-track variables instead of a belief state. It further helps us in avoiding the computational burden to solve for the optimal policy in the presence of an uncountable set of belief states.

3. Model

3.1. Tool degradation process

We consider a tool that first goes into a defective phase before it fails. The tool failure immediately announces itself, whereas the defective phase is not visible and only an inspection can detect it. Let X denote the random variable that represents the product number at which the tool becomes defective (e.g., X = 5 means that the tool processed four products in the normal phase and the fifth product in the defective phase). The sample space of X is $\{1, 2, \dots, n_{X}\}$ with $n_{X} \in \mathbb{N}.$ We let H denote the number of products the tool processes in the defective phase, and it has the sample space $\{0, 1, \dots, n_{H}\}$ with $n_{H} \in \{0\} \cup \mathbb{N}.$ Notice that n_H can be equal to zero, meaning that a defect can immediately cause a self-announcing failure. Also, note that n_X and n_H can take any large value and therefore do not limit our model.

The probability mass function (pmf) and cumulative distribution function (cdf) of X are denoted with $f_{X} (\cdot)$ and $F_{X} (\cdot),$ respectively. Likewise, the pmf and cdf of H are denoted with $f_{H} (\cdot)$ and $F_{H} (\cdot) .$ We let ${\bar{F}}_{X} (x) = \sum_{i = x}^{n_{X}} f_{X} (i)$ and ${\bar{F}}_{H} (h) = \sum_{i = h}^{n_{H}} f_{H} (i) .$ For notational convenience in our analysis, we extend the support of the random variable X and define $f_{X} (x) = 0$ and ${\bar{F}}_{X} (x) = 0$ for all $x > n_{X} .$ Similarly, we define $f_{H} (h) = 0$ and ${\bar{F}}_{H} (h) = 0$ for all $h > n_{H} .$
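The notation above can be sketched in a few lines of Python. The helper names (`make_pmf`, `pmf_at`, `survival`) and the example probabilities are illustrative assumptions and do not come from the article. The point of the sketch is that the reliability function includes its argument, so that $\bar{F}_{X} (x) = P (X \geq x),$ and that both functions vanish beyond the support.

```python
def make_pmf(probs, start):
    """Map the values start, start + 1, ... to the given probabilities."""
    return {start + k: p for k, p in enumerate(probs)}


def pmf_at(pmf, k):
    """f(k), with f(k) = 0 outside the support."""
    return pmf.get(k, 0.0)


def survival(pmf, k):
    """F-bar(k) = sum_{i >= k} f(i) = P(value >= k); zero beyond the support."""
    return sum(p for value, p in pmf.items() if value >= k)


# Hypothetical example distributions (not taken from the article).
f_X = make_pmf([0.1, 0.2, 0.3, 0.4], start=1)   # support {1, ..., n_X}, n_X = 4
f_H = make_pmf([0.5, 0.3, 0.2], start=0)        # support {0, ..., n_H}, n_H = 2

print(survival(f_X, 1), survival(f_X, 3), survival(f_X, 5))
print(survival(f_H, 0), survival(f_H, 2), pmf_at(f_H, 3))
```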

3.2. Problem description

The decision epochs correspond to the moments the tool is ready to process a new product (i.e., after the tool is first installed, after the processing of one product, or after an inspection). One of the following actions is taken in each decision epoch: process the next product (P), inspect the tool (I), or retire the tool (R). Inspection is permitted only if the degradation phase of the tool is not known with certainty. An inspection resets the run counter to zero and correctly reveals the true degradation phase of the tool at cost $C_{i} > 0 .$ The inspection cost C_i captures the cost of examining the tool with skilled technicians as well as the costs of removing the tool for inspection, its re-installation and follow-up calibration. If the tool is retired before being damaged by a failure, then the tool can be salvaged at a higher value. We let $C_{r} \geq 0$ denote the salvage reward that captures the economic gain obtained by retiring the tool before it fails. The lifetime of the tool ends with the failure, and a failed tool does not bring any salvage reward. The cost of inconvenience due to a failed tool is negligible (such a cost could still be modeled by adjusting C_r accordingly). If a product is processed with no tool failure, both the cumulative counter and run counter increase by one unit. Processing a product with a healthy tool brings a reward m > 0 per product. If the tool is defective, it brings a reward $m - C_{d}$ per product, where C_d captures the economic loss associated with processing a product with a defective tool (e.g., cost of lower quality, cost of rework, etc.). It is assumed that $0 \leq C_{d} < m + C_{r} .$ The objective is to maximize the expected total reward obtained from an individual tool.
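To make the reward structure concrete, the following sketch evaluates the total reward of a single tool for one realization of X and H under the fixed-threshold policy described in Section 1 (inspect whenever the run counter hits a limit, retire once the tool is found defective). The function name, its arguments and the numbers in the usage lines are hypothetical; the sketch also assumes that the tool fails while processing product X + H, so that products X, ..., X + H - 1 are processed in the defective phase.

```python
def lifetime_reward_fixed_threshold(x, h, limit, m, C_d, C_i, C_r):
    """Total reward of one tool with realized X = x and H = h when it is
    inspected every `limit` products and retired once found defective."""
    v = tau = 0          # cumulative counter and run counter
    reward = 0.0
    while True:
        if tau == limit:
            reward -= C_i            # pay for the inspection
            tau = 0                  # an inspection resets the run counter
            if x <= v:               # the defect arrived at or before product v
                return reward + C_r  # retire and collect the salvage reward
        if v + 1 == x + h:           # processing product v + 1 triggers the failure
            return reward            # failed tool: no salvage reward
        # Products before the x-th one are processed in the normal phase.
        reward += m if v + 1 < x else m - C_d
        v += 1
        tau += 1


# Hypothetical realization and cost parameters (not from the article).
for limit in (3, 4):
    print(limit, lifetime_reward_fixed_threshold(5, 3, limit, 1.0, 0.5, 0.2, 2.0))
```

In this made-up realization the shorter inspection limit catches the defect before the failure, whereas the longer one loses the tool; this is the trade-off that the fixed-threshold policy cannot adapt to as the tool ages.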

3.3. Markov decision process model

State space. Let v and τ denote the values of the cumulative counter and run counter, respectively, and let u denote the last observed degradation phase of the tool; i.e., $u \in \{0, 1\}$ where zero is the normal phase and one is the defective phase. We use the symbol $\nabla$ to represent the end-of-life state of the tool, meaning that the tool is either failed or retired.

Case u = 0: The tool is last observed in the normal phase

The set of states when u = 0 is given by $S_{0} ≜ \{(v, τ, 0) : v \in \{0, \dots, n_{X} + n_{H} - 1\}, τ \in \{\max (v - n_{X} + 1, 0), \dots, v\}\} .$ (1)

Notice in Equation (1) that the cumulative counter v can be as low as zero (i.e., a new tool) and it can be at most $n_{X} + n_{H} - 1,$ as having it equal to $n_{X} + n_{H}$ would imply the tool has failed, which is already captured by the state $\nabla .$ Also, if the cumulative counter v reaches n_X, the tool is guaranteed to be in the defective phase. Since inspection is a superfluous action then, the run counter τ must be at least $v - n_{X} + 1$ for $v \geq n_{X},$ and therefore, the run counter τ takes values starting from $\max (v - n_{X} + 1, 0)$ in Equation (1). Figure 1 (left) illustrates the state space $S_{0}$ for n_X = 10 and n_H = 4.

Figure 1. State space of the Markov decision process model illustrated for n_X = 10 and n_H = 4.

Case u = 1: The tool is last observed in the defective phase

It is important to note that the set of variables $(v, τ, u)$ is not sufficient to capture the production and inspection history of a tool for u = 1, as illustrated in the example below.

Example 1. Suppose that $(v, τ, u) = (9, 0, 1)$ and this state is reached via an inspection at state $(v, τ, u) = (9, 8, 0) .$ Then, it can be inferred that $2 \leq X \leq 9 .$ However, if the state $(v, τ, u) = (9, 0, 1)$ had been reached via an inspection at state $(v, τ, u) = (9, 2, 0),$ then it could be inferred that $8 \leq X \leq 9 .$ That is, the amount of information on X (specifically, the lower bound on the realization of X) is different at state $(v, τ, u) = (9, 0, 1)$ depending on how it is reached.
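The inference in Example 1 can be written as a one-line rule. The function name `inferred_bounds` is a hypothetical helper, and the sketch assumes an inspection at state $(v, τ, 0)$ with τ ≥ 1 that finds the tool defective: the tool was normal after v - τ products and is defective after v products.

```python
def inferred_bounds(v, tau):
    """Bounds on X after an inspection at state (v, tau, 0) finds a defect.

    The last inspection (or installation) at counter v - tau found the tool
    normal, so X >= v - tau + 1; the current inspection finds it defective
    after v products, so X <= v.
    """
    return v - tau + 1, v


print(inferred_bounds(9, 8))   # inspection at state (9, 8, 0)
print(inferred_bounds(9, 2))   # inspection at state (9, 2, 0)
```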

To capture all the available information in the tool’s history, we introduce an additional state variable w, which we refer to as the inferred lower bound on X. More specifically, if an inspection at cumulative counter v and run counter τ finds the tool in the defective phase, it is possible to infer that $X \geq w$ where $w = v - τ + 1 .$ Consequently, the set of states when the tool is last observed in the defective phase (u = 1) is given by $S_{1} ≜ \{(v, τ, w, 1) : v \in \{1, \dots, n_{X} + n_{H} - 2\}, τ \in \{\max (v - n_{X} + 1, 0), \dots, \min (v, n_{H}) - 1\}, w \in \{1, \dots, v - τ\}\} .$ (2)

The cumulative counter v starts from one because u = 1 is only possible after an inspection, and v must be at least one for an inspection to be permitted. Also, v can be at most $n_{X} + n_{H} - 2 .$ This is because an inspection could take place at the latest at cumulative counter $n_{X} - 1,$ and the tool could process at most $n_{H} - 1$ more products before transiting to $\nabla .$ Regarding the possible values of the run counter τ in Equation (2), we first note that τ must be strictly less than v because $τ = v$ would mean the tool has never been inspected (which is not possible for u = 1). Also, τ can be at most $n_{H} - 1$ because the tool could process at most $n_{H} - 1$ more products (after identifying the tool as defective in an inspection) before transiting to $\nabla .$ Thus, the run counter τ is at most $\min (v, n_{H}) - 1 .$ Finally, we note in Equation (2) that the state variable w never exceeds $v - τ .$ This is because, when an inspection is performed at cumulative counter v and the tool is found defective, the maximum value of w is the cumulative counter value v itself (when the inspection is performed at run counter 1), and the value of w remains equal to the difference between the cumulative and run counters as the tool continues production in the defective phase. This is illustrated in an example below.

Example 2. Suppose that $(v, τ, u) = (9, 1, 0)$ and an inspection reveals that the tool is defective. Then, it is inferred that X = 9, and therefore, w is updated as the value of v (i.e., w = 9). This is the maximum value of w possible when an inspection is performed at v = 9 and the tool is found defective. Upon finding the tool defective, the state moves from $S_{0}$ to $S_{1}$ and it becomes $(v, τ, w, u) = (9, 0, 9, 1) .$ Subsequently, the state transits to $(v, τ, w, u) = (9 + i, i, 9, 1)$ after processing i more products in the defective state, where w remains equal to $v - τ .$ That is, for u = 1, at any particular value of v and τ, w can be at most $v - τ .$
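The two state sets can be enumerated directly from Equations (1) and (2). The sketch below is only an illustration; the function names are hypothetical, and the parameters n_X = 10 and n_H = 4 are the ones used in Figure 1.

```python
def states_normal(n_X, n_H):
    """S_0 from Equation (1): tool last observed in the normal phase."""
    return [(v, tau, 0)
            for v in range(n_X + n_H)                      # v = 0, ..., n_X + n_H - 1
            for tau in range(max(v - n_X + 1, 0), v + 1)]  # tau up to v


def states_defective(n_X, n_H):
    """S_1 from Equation (2): tool last observed in the defective phase."""
    return [(v, tau, w, 1)
            for v in range(1, n_X + n_H - 1)               # v = 1, ..., n_X + n_H - 2
            for tau in range(max(v - n_X + 1, 0), min(v, n_H))
            for w in range(1, v - tau + 1)]                # w = 1, ..., v - tau


S0 = states_normal(10, 4)
S1 = states_defective(10, 4)
print(len(S0), len(S1))
# The states visited in Example 2 after the defect is found at v = 9:
print([(9 + i, i, 9, 1) in S1 for i in range(4)])
```

Enumerating the sets this way is also a convenient check that a continuation state such as $(v + 1, τ + 1, w, 1)$ exists before it is looked up in a value table.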

Actions, state transitions and rewards. We let $μ (\cdot)$ denote a function that maps a state into an action. In particular, $μ (s) \in \{P, I, R\}$ for $s \in \{(v, τ, 0) \in S_{0} : v < n_{X}, 0 < τ\} .$ Since inspection is not permitted if the degradation phase of the tool is known with certainty, it follows that $μ (s) \in \{P, R\}$ for $s \in S_{1} \cup \{(v, τ, 0) \in S_{0} : v \geq n_{X} \text{ or } τ = 0\} .$ Let s_k denote the state of the Markov Decision Process (MDP) at the kth decision epoch, $k \in \{0, 1, \dots\} .$ In Table 1, we present the state transitions under each possible action with corresponding probabilities and rewards, and further explain them below.

Table 1. Summary of the state transitions with associated probabilities and rewards.

Suppose that an inspection is done at state $(v, τ, 0) .$ If the tool is found defective, which occurs with the conditional probability $π_{d} (v, τ) ≜ P (X \leq v \mid X > v - τ, X + H > v),$ the state is updated as $(v, 0, v - τ + 1, 1),$ as in Example 2. Otherwise, the state is updated as $(v, 0, 0) .$ Our model allows the inspection to be immediately followed by the retire action if desired.

If the process action is taken at state $(v, τ, 0),$ the tool fails with the conditional probability $π_{f, 0} (v, τ) ≜ P (X + H = v + 1 \mid X > v - τ, X + H > v),$ and the end-of-life state $\nabla$ is reached. If no failure occurs, the state moves to $(v + 1, τ + 1, 0)$ and two scenarios are possible. If the tool is still in the normal phase at the completion of the $(v + 1)$th product, which occurs with probability $\begin{aligned} & P (X > v + 1, X + H > v + 1 \mid X > v - τ, X + H > v) \\ & \quad = P (X > v + 1 \mid X > v - τ, X + H > v + 1) \, P (X + H > v + 1 \mid X > v - τ, X + H > v) \\ & \quad = (1 - π_{d} (v + 1, τ + 1)) (1 - π_{f, 0} (v, τ)), \end{aligned}$ then the reward m is obtained. On the other hand, if the tool is defective at the $(v + 1)$th product, which occurs with probability $π_{d} (v + 1, τ + 1) (1 - π_{f, 0} (v, τ)),$ the reward $m - C_{d}$ is obtained. Consequently, the reward is given by $m - C_{d} π_{d} (v + 1, τ + 1)$ if the tool does not fail. If the process action is taken at state $(v, τ, w, 1),$ then the tool fails with conditional probability $π_{f, 1} (v, τ, w) ≜ P (X + H = v + 1 \mid w \leq X \leq v - τ, X + H > v),$ and the end-of-life state $\nabla$ is reached. Otherwise, the production reward $m - C_{d}$ is gained and the state is updated as $(v + 1, τ + 1, w, 1) .$

If the retire action is taken, the reward C_r is collected and the end-of-life state $\nabla$ is reached. Figure 2 illustrates a sample path with actions that eventually lead to retiring the tool before its failure. The end-of-life state $\nabla$ is absorbing; i.e., it is a state that, once entered, cannot be left.

Figure 2. Illustration of state transitions where the tool is retired before it fails.

Lifetime value of a tool. Let π be a tool usage policy characterized by taking action $μ (\cdot)$ every time a decision needs to be made. The expected total reward under this policy when starting in state $s \in S_{0} \cup S_{1} \cup \{\nabla\}$ is given by $V^{π} (s) = \lim_{N \to \infty} E \left( \sum_{k = 0}^{N} g_{μ} (s_{k}) \mid s_{0} = s \right),$ where $g_{μ} (s_{k})$ is given by $g_{μ} (s_{k}) = \begin{cases} (1 - π_{f, 0} (v, τ)) (m - π_{d} (v + 1, τ + 1) C_{d}) & \text{for } μ (s_{k}) = P \text{ and } U (s_{k}) = 0, \\ (1 - π_{f, 1} (v, τ, w)) (m - C_{d}) & \text{for } μ (s_{k}) = P \text{ and } U (s_{k}) = 1, \\ - C_{i} & \text{for } μ (s_{k}) = I, \\ C_{r} & \text{for } μ (s_{k}) = R, \\ 0 & \text{for } s_{k} = \nabla, \end{cases}$ with $U (s_{k})$ denoting the value of the state variable u in $s_{k} .$ We also refer to $V^{π} (s)$ as the lifetime value of a tool at state $s$ under the policy π; e.g., $V^{π} (0, 0, 0)$ is the lifetime value of a new tool under policy π. Notice that the absorbing end-of-life state must be reached after a finite number of decisions. Therefore, the value function $V^{π} (\cdot)$ is guaranteed to be finite. For notational convenience, we let $V (s)$ denote $V^{π^{*}} (s)$ for $π^{*} ≜ \arg \max_{π} V^{π} (s),$ and refer to it as the maximum lifetime value of a tool at state $s .$ Since the expected total reward of any policy with initial state $\nabla$ is equal to zero, we omit $\nabla$ from the state space of the Markov decision model.

Bellman optimality equations. It can be shown that the maximum lifetime value $V (\cdot)$ satisfies the Bellman optimality equations (Proposition 7.2.1(a) in Bertsekas (2005)): $V (v, τ, w, 1) = \max \{V_{P} (v, τ, w, 1), C_{r}\} \quad \forall (v, τ, w, 1) \in S_{1},$ where $V_{P} (v, τ, w, 1)$ is the expected total reward if we take the process action at state $(v, τ, w, 1)$ and follow the optimal policy thereafter: $V_{P} (v, τ, w, 1) = (1 - π_{f, 1} (v, τ, w)) (m - C_{d} + V (v + 1, τ + 1, w, 1)) .$

Figure 7. The best-fit probability distributions for the random variables X and H.

Furthermore, the following holds:

$V (v, τ, 0) = \begin{cases} \max \{V_{P} (v, τ, 0), V_{I} (v, τ), C_{r}\} & \forall (v, τ, 0) \in \{(v, τ, 0) \in S_{0} : v < n_{X}, 0 < τ\}, \\ \max \{V_{P} (v, τ, 0), C_{r}\} & \forall (v, τ, 0) \in \{(v, τ, 0) \in S_{0} : v \geq n_{X} \text{ or } τ = 0\}, \end{cases}$ where the inspection term appears only in the states in which inspection is permitted, and $V_{P} (v, τ, 0)$ is the expected total reward if we take the process action at state $(v, τ, 0)$ and follow the optimal policy afterwards; that is to say $V_{P} (v, τ, 0) = (1 - π_{f, 0} (v, τ)) (m - π_{d} (v + 1, τ + 1) C_{d} + V (v + 1, τ + 1, 0)),$ and $V_{I} (v, τ)$ represents the expected total reward if we take the inspection action at state $(v, τ, 0)$ and follow the optimal policy afterwards; that is to say $V_{I} (v, τ) = - C_{i} + π_{d} (v, τ) V (v, 0, v - τ + 1, 1) + (1 - π_{d} (v, τ)) V (v, 0, 0) .$

We note that the MDP model has a special structure that allows us to solve the Bellman equations recursively by making a single pass through the state space in a specific order. Algorithm 1 in Section 8 outlines the details.
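One ordering that makes such a single pass possible is to sweep the cumulative counter v downwards and, for each v, to evaluate the defective states first, then the normal state with τ = 0, and then the remaining normal states. The sketch below follows this ordering; it is not a reproduction of Algorithm 1, which is not shown here. The function name, the list-based pmf encoding, the tie-breaking between equally good actions, and the convention of treating a zero-probability conditioning event as a certain failure are all assumptions of this sketch. The transition probabilities are computed from the closed forms stated in Lemmas 1 to 3 of Section 4.1, Equations (3), (5) and (7).

```python
def solve_lifetime_value(f_X, f_H, m, C_d, C_i, C_r):
    """Backward recursion over v for the Bellman equations.

    f_X[k] = P(X = k + 1) for k = 0..n_X - 1, f_H[h] = P(H = h) for h = 0..n_H.
    Returns the value table V and the maximizing action per state.
    """
    n_X, n_H = len(f_X), len(f_H) - 1

    def fX(x):
        return f_X[x - 1] if 1 <= x <= n_X else 0.0

    def fH(h):
        return f_H[h] if 0 <= h <= n_H else 0.0

    def sX(x):                      # P(X >= x)
        return sum(f_X[max(x, 1) - 1:])

    def sH(h):                      # P(H >= h)
        return sum(f_H[max(h, 0):])

    def ratio(num, den):            # assumption: unreachable state -> certain failure
        return num / den if den > 0 else 1.0

    def pi_d(v, tau):               # Equation (3)
        num = sum(fX(x) * sH(v - x + 1) for x in range(v - tau + 1, v + 1))
        return ratio(num, num + sX(v + 1))

    def pi_f0(v, tau):              # Equation (5)
        num = sum(fX(v + 1 - i) * fH(i) for i in range(tau + 1))
        den = sum(sX(v + 1 - i) * fH(i) for i in range(tau + 1))
        return ratio(num, den + sX(v + 1 - tau) * sH(tau + 1))

    def pi_f1(v, tau, w):           # Equation (7)
        xs = range(w, v - tau + 1)
        num = sum(fX(x) * fH(v + 1 - x) for x in xs)
        return ratio(num, sum(fX(x) * sH(v + 1 - x) for x in xs))

    V, policy = {}, {}
    for v in range(n_X + n_H - 1, -1, -1):
        # Defective states at counter v only need values at counter v + 1.
        for tau in range(max(v - n_X + 1, 0), min(v, n_H)):
            for w in range(1, v - tau + 1):
                nxt = V.get((v + 1, tau + 1, w, 1), 0.0)   # missing => failure is certain
                v_p = (1 - pi_f1(v, tau, w)) * (m - C_d + nxt)
                V[v, tau, w, 1], policy[v, tau, w, 1] = max((v_p, "P"), (C_r, "R"))
        # Normal states in increasing tau, so that V(v, 0, 0) is known before
        # it is needed by the inspection option at tau > 0.
        for tau in range(max(v - n_X + 1, 0), v + 1):
            nxt = V.get((v + 1, tau + 1, 0), 0.0)
            v_p = (1 - pi_f0(v, tau)) * (m - pi_d(v + 1, tau + 1) * C_d + nxt)
            options = [(v_p, "P"), (C_r, "R")]
            if v < n_X and tau > 0:                        # inspection permitted
                p = pi_d(v, tau)
                found = V.get((v, 0, v - tau + 1, 1), 0.0)
                options.append((-C_i + p * found + (1 - p) * V[v, 0, 0], "I"))
            V[v, tau, 0], policy[v, tau, 0] = max(options)
    return V, policy


# Hypothetical check: X = 3 and H = 2 with certainty, so the tool can process
# four products (two of them while defective) and should then be retired.
V, policy = solve_lifetime_value([0, 0, 1], [0, 0, 1], m=1.0, C_d=0.5, C_i=0.2, C_r=2.0)
print(V[0, 0, 0], policy[0, 0, 0], policy[4, 4, 0])
```

Each state is evaluated once, so the work grows with the number of states times the cost of evaluating the transition probabilities; caching the probabilities would be a natural refinement for larger instances.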

4. Analysis

4.1. Properties of the probability of being defective and the probability of failure

In this section, we establish the monotonicity properties of the transition probabilities in the MDP model by considering the distributional properties of the random variables X and H. Specifically, Lemma 1 characterizes the probability of being defective and establishes the conditions that guarantee its monotonicity with respect to the cumulative counter.

Lemma 1.

(i) The probability of being in the defective phase $π_{d} (v, τ)$ is given by $π_{d} (v, τ) = \frac{\sum_{x = v - τ + 1}^{v} f_{X} (x) {\bar{F}}_{H} (v - x + 1)}{\sum_{x = v - τ + 1}^{v} f_{X} (x) {\bar{F}}_{H} (v - x + 1) + {\bar{F}}_{X} (v + 1)} \quad \forall (v, τ, 0) \in S_{0},$ (3) where the summation is zero for τ = 0.
(ii) $π_{d} (v, τ)$ is nondecreasing in τ.
(iii) Let $λ_{X} (x) = f_{X} (x) / {\bar{F}}_{X} (x)$ for $x \in \{1, \dots, n_{X}\} .$ If $λ_{X} (\cdot)$ is nondecreasing and the property $\frac{f_{X} (2)}{f_{X} (1)} \geq \frac{f_{X} (3)}{f_{X} (2)} \geq \dots \geq \frac{f_{X} (n_{X})}{f_{X} (n_{X} - 1)}$ (4) holds, then $π_{d} (v, τ)$ is nondecreasing in v.

Notice that $π_{d} (v, τ)$ is equal to one for $v \geq n_{X}$ in Equation (3). In Lemma 1(iii), $λ_{X} (\cdot)$ is the hazard rate of X; i.e., the conditional probability of defect arrival at the xth product given that the tool has processed $x - 1$ products in its normal phase. The condition in Equation (4) is satisfied by many practical distributions such as the discrete uniform distribution, truncated geometric distribution, and truncated discrete Weibull distribution (Nakagawa and Osaki, 1975) with nondecreasing hazard rate. In Lemma 2, we characterize the probability of failure when the tool is last observed as normal, and establish the conditions that guarantee its monotonicity with respect to the run and cumulative counters.

Lemma 2.

(i) The probability of failure $π_{f, 0} (v, τ)$ is given by $π_{f, 0} (v, τ) = \frac{\sum_{i = 0}^{τ} f_{X} (v + 1 - i) f_{H} (i)}{\sum_{i = 0}^{τ} {\bar{F}}_{X} (v + 1 - i) f_{H} (i) + {\bar{F}}_{X} (v + 1 - τ) {\bar{F}}_{H} (τ + 1)} \quad \forall (v, τ, 0) \in S_{0} .$ (5)
(ii) Let $λ_{H} (h) = f_{H} (h) / {\bar{F}}_{H} (h)$ for $h \in \{0, 1, \dots, n_{H}\} .$ If $λ_{H} (\cdot)$ is nondecreasing, then $π_{f, 0} (v, τ)$ is nondecreasing in τ.
(iii) If $λ_{X} (\cdot), λ_{H} (\cdot)$ and $f_{H} (\cdot)$ are nondecreasing, and the property in Equation (4) and the property $\frac{f_{H} (1)}{f_{H} (0)} \geq \frac{f_{H} (2)}{f_{H} (1)} \geq \dots \geq \frac{f_{H} (n_{H})}{f_{H} (n_{H} - 1)}$ (6) hold, then $π_{f, 0} (v, τ)$ is nondecreasing in v.

In Lemma 2(ii), $λ_{H} (h)$ represents the hazard rate of H; i.e., the conditional probability of processing exactly h products in the defective phase given that at least h products are processed in the defective phase. The condition in Equation (6) is the counterpart of condition (4) for the random variable H, and it is satisfied by many of the practical distributions mentioned above. Lemma 2(iii) requires the pmf $f_{H} (\cdot)$ to be nondecreasing. In practice, a nondecreasing pmf for the random variable H is not uncommon (e.g., when the number of products the tool processes in the defective phase is roughly constant with a uniformly distributed error).
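Whether a fitted pmf meets the sufficient conditions of Lemmas 1 and 2 can be checked numerically. The sketch below is a minimal illustration that assumes a strictly positive pmf given as a list over its support; the function names, the tolerance and the example distributions are hypothetical.

```python
def hazard_rates(pmf):
    """lambda(k) = f(k) / F-bar(k) for each point k of the support."""
    return [p / sum(pmf[k:]) for k, p in enumerate(pmf)]


def is_nondecreasing(seq, tol=1e-12):
    return all(b >= a - tol for a, b in zip(seq, seq[1:]))


def has_nonincreasing_ratios(pmf, tol=1e-12):
    """Conditions (4) and (6): the ratio f(k + 1) / f(k) is nonincreasing in k."""
    ratios = [b / a for a, b in zip(pmf, pmf[1:])]
    return all(b <= a + tol for a, b in zip(ratios, ratios[1:]))


uniform = [0.25] * 4
weights = [0.3 * 0.7 ** k for k in range(5)]
truncated_geometric = [w / sum(weights) for w in weights]
zigzag = [0.4, 0.1, 0.4, 0.1]           # made-up pmf that violates both conditions

for name, pmf in [("uniform", uniform),
                  ("truncated geometric", truncated_geometric),
                  ("zigzag", zigzag)]:
    print(name, is_nondecreasing(hazard_rates(pmf)), has_nonincreasing_ratios(pmf))
```

The additional requirement of Lemma 2(iii) on the pmf of H can be checked with `is_nondecreasing(pmf)` applied to the pmf itself.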

In Lemma 3, we characterize the probability of failure when the tool is last observed as defective, and establish the conditions that guarantee its monotonicity.

Lemma 3.

(i) The probability of failure $π_{f, 1} (v, τ, w)$ is given by $π_{f, 1} (v, τ, w) = \frac{\sum_{x = w}^{v - τ} f_{X} (x) f_{H} (v + 1 - x)}{\sum_{x = w}^{v - τ} f_{X} (x) {\bar{F}}_{H} (v + 1 - x)} \forall (v, τ, w, 1) \in S_{1} .$ (7)
(ii) If $λ_{H} (\cdot)$ is nondecreasing, then the following results hold: $π_{f, 1} (v, τ, w)$ is nondecreasing in τ and nonincreasing in w, and $π_{f, 1} (v, τ, w) \geq π_{f, 0} (v, τ)$ for all $(v, τ, w, 1) \in S_{1} .$
(iii) If $λ_{H} (\cdot)$ is nondecreasing and the property in Equation (4) is satisfied, then $π_{f, 1} (v, τ, w)$ is nondecreasing in v.
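A short sketch of Equation (7) under the same hedged assumptions as before (independent X and H, dictionary pmfs, ${\bar{F}} (k) = P (\cdot \geq k)$, and a hypothetical toy instance). It shows how the failure probability reacts to the run counter τ and to the lower bound w.

```python
def sf(pmf, k):
    """Reliability function P(Y >= k) for a pmf given as {value: probability}."""
    return sum(p for y, p in pmf.items() if y >= k)


def pi_f1(v, tau, w, f_X, f_H):
    """Failure probability at state (v, tau, w, 1), following Equation (7)."""
    # The defect arrived at some product x in {w, ..., v - tau}.
    xs = range(w, v - tau + 1)
    num = sum(f_X.get(x, 0.0) * f_H.get(v + 1 - x, 0.0) for x in xs)
    den = sum(f_X.get(x, 0.0) * sf(f_H, v + 1 - x) for x in xs)
    if den == 0:
        raise ValueError("state (v, tau, w, 1) has zero probability")
    return num / den


# Hypothetical toy instance (not taken from the article).
f_X = {x: 0.25 for x in range(1, 5)}  # X uniform on {1, ..., 4}
f_H = {h: 0.5 for h in range(1, 3)}   # H uniform on {1, 2}

print(pi_f1(2, 0, 1, f_X, f_H))  # just found defective at v = 2, X >= 1
print(pi_f1(2, 0, 2, f_X, f_H))  # tighter lower bound w = 2
print(pi_f1(2, 1, 1, f_X, f_H))  # found defective one product earlier
```

In this instance a larger w lowers the failure probability and a larger τ raises it, as stated in Lemma 3(ii).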

In the case of dependence between X and H, the closed-form characterizations in Equations (3), (5), and (7) can be rewritten by replacing ${\bar{F}}_{H} (\cdot)$ with ${\bar{F}}_{H | X} (\cdot)$ and by replacing $f_{H} (\cdot)$ with $f_{H | X} (\cdot),$ which represent the reliability function and pmf of H, respectively, conditional on the realization of X. Finally, it is worth noting that Lemmas 1 to 3 include the sufficient conditions for the monotonicity of the probabilities $π_{d} (v, τ), π_{f, 0} (v, τ),$ and $π_{f, 1} (v, τ, w) .$ For the analysis of the optimal-policy structure in Section 4.2, we will assume that these monotonicity properties hold (see Assumptions 1 and 2).

4.2. Structural results on the optimal policy

We separately analyze the structure of the optimal policy for the two cases: when the tool is last observed as being defective (Section 4.2.1) and when the tool is last observed as being normal (Section 4.2.2).

4.2.1. Tool is last observed as being defective (u = 1)

Proposition 1 characterizes the monotonicity properties of the maximum lifetime value of the tool under the following assumption:

Assumption 1.

$π_{f, 1} (v, τ, w)$ is a nondecreasing function of v.
$π_{f, 1} (v, τ, w)$ is a nondecreasing function of τ.
$π_{f, 1} (v, τ, w)$ is a nonincreasing function of w.

Assumption 1 states that the tool becomes more likely to fail as the product counters increase and as the lower bound on X decreases.

Proposition 1.

The maximum lifetime value of a tool has the following properties:

$V (v, τ, w, 1)$ is a nonincreasing function of v and a nonincreasing function of τ.
$V (v, τ, w, 1)$ is a nondecreasing function of w.

Proposition 1 shows that the maximum lifetime value of a tool, which is last observed in a defective phase, gets lower as the cumulative counter increases. This is intuitive, as then fewer products can be processed until the tool failure. Similarly, a higher run counter increases the risks of working with a defective tool and tool failure, reducing the maximum lifetime value. Finally, the higher the inferred lower bound on the realization of X, the higher the maximum lifetime value of the tool. A tool that enters the defective phase later has more potential.

Theorem 1 shows that a threshold-type policy is optimal on each 45° line within the state space $S_{1}$ at a fixed w. In Figure 3, we visualize the optimal policy when the random variables X and H are uniformly distributed with $n_{X} = 20$ and $n_{H} = 10$ (i.e., all the conditions of Lemmas 1 to 3 are satisfied). Figure 3 (right) illustrates $S_{1}$ for w = 6, consisting of 14 (i.e., $n_{X} - w$ ) 45° lines. Notice that the difference $v - τ$ is fixed on each 45° line. It is practically appealing to have a threshold-type policy at a fixed $v - τ,$ as then the optimal policy can be viewed in terms of the number of products to process with a defective tool. For the 45° line with $v - τ = t$ and lower bound w on X, we let $i^{*} (t, w)$ denote the critical threshold that represents the optimal number of products to process with a tool that is just detected as defective.

Figure 3. Optimal policy for u = 1, m = 2, C_r = 10, $C_{i} = 0.5,$ and $C_{d} = 0.1 .$

Theorem 1.

Suppose that the last inspection has revealed that the tool is defective.

(i) At states $(t + i, i, w, 1) \in S_{1}$ , the "process" action is optimal for $i \in \{0, \dots, i^{*} (t, w) - 1\}$ and the "retire-the-tool" action is optimal for $i \in \{i^{*} (t, w), \dots, n_{H} - 1\}$ , where $i^{*} (t, w) = \min \{i \in \{0, \dots, n_{H} - 1\} : π_{f, 1} (t + i, i, w) \geq \frac{m - C_{d}}{m - C_{d} + C_{r}}\}$ for $t \in \{w, \dots, n_{X} - 1\}$ and $w \in \{1, \dots, n_{X} - 1\} .$
(ii) The critical threshold $i^{*} (t, w)$ is nonincreasing in t and nondecreasing in w.

As an example, notice in Figure 3 (left) that $i^{*} (5, 3) = 3;$ i.e., it is optimal to take the process action at states $(5 + i, i, 3, 1)$ for i = 0, 1, 2 and it is optimal to take the retire action at states $(5 + i, i, 3, 1)$ for $i = 3, 4, \dots, 9 .$ In Proposition 2, we characterize the maximum lifetime value of a tool that is just found defective; i.e., $V (v, 0, w, 1) .$
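The threshold rule of Theorem 1 is simple enough to sketch in a few lines of Python. The sketch uses the cost parameters of Figure 3 (m = 2, C_r = 10, C_d = 0.1) with $n_{X} = 20$ and $n_{H} = 10$. The supports of the uniform distributions are an assumption made here (X on {1, ..., 20} and H on {1, ..., 10}), as are the function names; under these assumptions the search returns the value $i^{*} (5, 3) = 3$ quoted above.

```python
def sf(pmf, k):
    """Reliability function P(Y >= k) for a pmf given as {value: probability}."""
    return sum(p for y, p in pmf.items() if y >= k)


def pi_f1(v, tau, w, f_X, f_H):
    """Failure probability at state (v, tau, w, 1), following Equation (7)."""
    xs = range(w, v - tau + 1)
    num = sum(f_X.get(x, 0.0) * f_H.get(v + 1 - x, 0.0) for x in xs)
    den = sum(f_X.get(x, 0.0) * sf(f_H, v + 1 - x) for x in xs)
    if den == 0:
        raise ValueError("state (v, tau, w, 1) has zero probability")
    return num / den


def i_star(t, w, f_X, f_H, n_H, m, C_d, C_r):
    """Smallest i at which the failure probability reaches the retirement ratio."""
    ratio = (m - C_d) / (m - C_d + C_r)
    for i in range(n_H):
        if pi_f1(t + i, i, w, f_X, f_H) >= ratio:
            return i
    raise ValueError("no threshold found on this 45-degree line")


# Assumed supports for the uniform distributions behind Figure 3.
n_X, n_H = 20, 10
f_X = {x: 1.0 / n_X for x in range(1, n_X + 1)}
f_H = {h: 1.0 / n_H for h in range(1, n_H + 1)}

print(i_star(5, 3, f_X, f_H, n_H, m=2.0, C_d=0.1, C_r=10.0))
```

Because the rule compares a single failure probability with a fixed ratio, it is a one-step comparison along the 45° line and requires no backward recursion once $π_{f, 1}$ is available.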

Proposition 2.

The maximum lifetime value $V (v, 0, w, 1)$ of a tool, which has processed at least w products in the normal phase and just been detected as defective by an inspection at cumulative counter v, is given by $(m - C_{d}) \sum_{i = 1}^{i^{*} (v, w)} \prod_{j = 1}^{i} (1 - π_{f, 1} (v + j - 1, j - 1, w)) + C_{r} \prod_{j = 1}^{i^{*} (v, w)} (1 - π_{f, 1} (v + j - 1, j - 1, w))$ if $i^{*} (v, w) > 0,$ and by C_r if $i^{*} (v, w) = 0 .$

The characterization of $V (v, 0, w, 1)$ in Proposition 2 is in closed form and can easily be calculated after identifying the critical threshold $i^{*} (v, w)$ via Theorem 1. For brevity, we let g(v, w) denote the closed-form characterization of $V (v, 0, w, 1) .$ It will be used in Section 4.2.2 in establishing the conditions for a threshold-type optimal policy structure when the tool is last observed as normal.
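To make the closed form of Proposition 2 concrete, the sketch below takes the failure probabilities along one 45° line as a plain list and returns both the value and the threshold. The list used in the example is hypothetical (it corresponds to what a uniform instance with H on {1, ..., 10} would give for the first few states on the line t = 5, w = 3), and the function name is illustrative only.

```python
def value_when_just_found_defective(failure_probs, m, C_d, C_r):
    """Closed form of Proposition 2 for a tool just detected as defective.

    failure_probs[i] stands for pi_{f,1}(v + i, i, w): the probability of
    failing on the next product after i products have been processed since
    the inspection that found the tool defective.
    Returns the pair (value, i_star).
    """
    ratio = (m - C_d) / (m - C_d + C_r)
    # Threshold of Theorem 1; falls back to the list length if never reached
    # (an assumption of this sketch, for truncated input lists).
    i_star = next((i for i, p in enumerate(failure_probs) if p >= ratio),
                  len(failure_probs))
    if i_star == 0:
        return C_r, 0
    survive = 1.0            # probability of surviving the first i products
    expected_products = 0.0  # sum over i of those survival probabilities
    for p in failure_probs[:i_star]:
        survive *= 1.0 - p
        expected_products += survive
    return (m - C_d) * expected_products + C_r * survive, i_star


# Hypothetical failure probabilities along one 45-degree line.
probs = [1 / 9, 1 / 8, 1 / 7, 1 / 6]
value, threshold = value_when_just_found_defective(probs, m=2.0, C_d=0.1, C_r=10.0)
print(value, threshold)
```

The first term weighs the net reward per product by the probability of surviving that product, and the second term is the salvage value weighted by the probability of surviving all $i^{*}$ products.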

4.2.2. Tool is last observed as normal (u = 0)

We start our analysis by characterizing the monotonicity properties of the maximum lifetime value of the tool under the following assumption:

Assumption 2.

(i) $π_{f, 0} (v, τ)$ and $π_{d} (v, τ)$ are nondecreasing functions of v.
(ii) $π_{f, 0} (v, τ)$ and $π_{d} (v, τ)$ are nondecreasing functions of τ.
(iii) $π_{f, 1} (v, τ, w) \geq π_{f, 0} (v, τ)$ for all $(v, τ, w, 1) \in S_{1} .$
(iv) $π_{f, 1} (v + i, 0, w + i)$ is nondecreasing in i at a given v and w.

Assumptions 2(i)-(iii) state that the tool becomes more likely to be defective and to fail as the product counters increase, and the probability of tool failure is higher if its last observed phase is defective compared with being normal. Assumption 2(iv) holds if the increase in $π_{f, 1} (v, 0, w)$ with a unit increment in v is larger than the increase in $π_{f, 1} (v + 1, 0, w + 1)$ with a unit decrement in w; i.e., $π_{f, 1} (v + 1, 0, w) - π_{f, 1} (v, 0, w) \geq π_{f, 1} (v + 1, 0, w) - π_{f, 1} (v + 1, 0, w + 1) .$

Proposition 3.

The maximum lifetime value of a tool has the following properties:

(i) $V (v, τ, 0)$ is a nonincreasing function of v and a nonincreasing function of τ.
(ii) $V (v, τ, 0) \geq V (v, τ, w, 1)$ for all $(v, τ, w, 1) \in S_{1} .$
(iii) $V (v + i, τ, w + i, 1)$ is nonincreasing in i.

Proposition 3(i) and (ii) show that the maximum lifetime value of a tool, which is last observed as being normal, gets smaller as the number of products processed by the tool and the number of products since its last inspection increase. Furthermore, it cannot be less than the maximum lifetime value of a tool which is last observed as defective at any cumulative and run counter value. Proposition 3(iii) is implied when the maximum lifetime value of a tool, which is last observed as being defective, is more affected by a unit increase in the cumulative counter than a unit decrease in the lower bound on X.

Similar to Section 4.2.1, we study the structure of the optimal policy on each 45° line within the state space. There is a total of n_X 45° lines in $S_{0},$ as illustrated in Figure 4, where n_X = 20. Theorem 2 shows that the optimal policy can be characterized with two critical thresholds on each 45° line within the state space $S_{0} .$ Specifically, for the 45° line with $v - τ = t,$ where the possible states are $(t + i, i, 0)$ for $i \in \{0, 1, \dots, n_{X} + n_{H} - 1 - t\},$ the optimal action is as follows: process the next product if $i < ℓ^{*} (t),$ inspect the tool if $ℓ^{*} (t) \leq i < u^{*} (t),$ and retire the tool if $i \geq u^{*} (t) .$ Notice that $ℓ^{*} (t)$ also represents the optimal number of products to process with a tool that is just inspected and found to be normal.
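The two-threshold rule described above can be written as a small lookup. This is only a sketch of how a shop-floor system might apply thresholds that have already been computed; the function name is hypothetical, and the threshold values in the example are the ones reported in the text for Figure 4 ( $ℓ^{*} (0) = 10,$ $u^{*} (0) = 18,$ and $ℓ^{*} (17) = u^{*} (17) = 2$ ).

```python
def action_on_line(i, l_star, u_star):
    """Action at state (t + i, i, 0) given the thresholds of the 45-degree line t."""
    if not 0 <= l_star <= u_star:
        raise ValueError("thresholds must satisfy 0 <= l_star <= u_star")
    if i < l_star:
        return "process"
    if i < u_star:
        return "inspect"
    return "retire"


# Line t = 0 with the thresholds reported for Figure 4.
print([action_on_line(i, 10, 18) for i in (0, 9, 10, 17, 18)])

# Line t = 17, where both thresholds coincide and inspection is never used.
print([action_on_line(i, 2, 2) for i in (0, 1, 2)])
```

When the two thresholds coincide, the inspection region is empty and the policy moves directly from processing to retirement.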

Figure 4. Optimal policy for u = 0 considering the same instance as in Figure 3 with C_r = 20.

Theorem 2.

Let $r (v) ≜ m (E (X | X > v) - v - 1) + (E (H) + 1) (m - C_{d})$ for $v \in \{0, \dots, n_{X} - 1\},$ $Δ_{d} (v, τ; a, b) ≜ π_{d} (v, τ) - π_{d} (v - a, τ - b),$ and $Δ_{f} (v, τ; a, b) ≜ π_{f, 0} (v, τ) - π_{f, 0} (v - a, τ - b) .$

Suppose that the last inspection has revealed that the tool is in the normal phase:

(i) For $t \in \{0, \dots, n_{X} - 1\}$ , there exists a critical threshold $u^{*} (t)$ such that the "retire-the-tool" action is optimal at states $(t + i, i, 0) \in S_{0}$ for $i \in \{u^{*} (t), \dots, n_{X} + n_{H} - 1 - t\} .$
(ii) There exists a critical threshold $ℓ^{*} (t)$ such that the "process" action is optimal at states $(t + i, i, 0) \in S_{0}$ for $i \in \{0, \dots, ℓ^{*} (t) - 1\},$ assuming that the condition $\begin{matrix} C_{i} + Δ_{f} (n_{X}, n_{X} - t; 1, 1) (m - C_{d} + C_{r}) + C_{r} π_{d} (n_{X} - 1, n_{X} - t - 1) \\ \geq π_{d} (n_{X} - 1, n_{X} - t - 1) g (n_{X} - 1, t + 1) + (1 - π_{d} (n_{X} - 1, n_{X} - t - 1)) r (n_{X} - 1) \end{matrix}$ (8) holds if $t \in \{0, \dots, n_{X} - 2\}$ and the condition $\begin{matrix} Δ_{f} (v, τ; 1, 1) (C_{r} + m) + (π_{d} (v - 1, τ - 1) + Δ_{d} (v + 1, τ + 1; 1, 1) + π_{f, 0} (v - 1, τ - 1) π_{d} (v, τ)) C_{d} \\ \geq Δ_{d} (v, τ; 1, 1) (r (v) + C_{r} - g (v, v - τ + 1)) + m + π_{f, 0} (v, τ) π_{d} (v + 1, τ + 1) C_{d} \end{matrix}$ (9) holds for all $(v, τ) \in \{(t + i, i) : i = 1, \dots, n_{X} - 1 - t\}$ if $t \in \{0, \dots, n_{X} - 3\}$ . For $t = n_{X} - 1,$ $ℓ^{*} (t)$ is equal to $u^{*} (t) .$

Condition (8) guarantees that if the process action is optimal at state $(n_{X}, τ, 0)$ then it is also optimal at state $(n_{X} - 1, τ - 1, 0) .$ Similarly, condition (9) guarantees that if the process action is optimal at state $(v, τ, 0)$ then it is also optimal at state $(v - 1, τ - 1, 0)$ for $v < n_{X} .$ These conditions relate the economic parameters with the rate of increase in the probability of failure and the probability of being defective. In our numerical experiments, conditions (8) and (9) are always satisfied for the probability distributions of X and H estimated from real-world data and under realistic cost parameters, and hence, we consider them as mild sufficient conditions. Figure 4 illustrates the critical thresholds $ℓ^{*} (t)$ and $u^{*} (t)$ for $t = 0, \dots, n_{X} - 1;$ e.g., $ℓ^{*} (0) = 10$ and $u^{*} (0) = 18,$ or $ℓ^{*} (17) = u^{*} (17) = 2 .$ Next, we establish the monotonicity of the critical thresholds.
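Conditions (8) and (9) are built from the quantities $r (v),$ $Δ_{d}$ and $Δ_{f}$ defined in Theorem 2. A minimal sketch of these building blocks follows, assuming pmfs stored as dictionaries; the function names and the uniform example (X on {1, ..., 20}, H on {1, ..., 10}, m = 2, C_d = 0.1) are illustrative assumptions, not the article's code.

```python
def r(v, f_X, f_H, m, C_d):
    """r(v) = m (E[X | X > v] - v - 1) + (E[H] + 1)(m - C_d), as defined in Theorem 2."""
    tail = {x: p for x, p in f_X.items() if x > v}
    mass = sum(tail.values())
    if mass == 0:
        raise ValueError("P(X > v) is zero")
    cond_mean_X = sum(x * p for x, p in tail.items()) / mass
    mean_H = sum(h * p for h, p in f_H.items())
    return m * (cond_mean_X - v - 1) + (mean_H + 1) * (m - C_d)


def delta(pi, v, tau, a, b):
    """Generic difference pi(v, tau) - pi(v - a, tau - b), as used for Delta_d and Delta_f."""
    return pi(v, tau) - pi(v - a, tau - b)


# Assumed uniform instance.
f_X = {x: 1.0 / 20 for x in range(1, 21)}
f_H = {h: 1.0 / 10 for h in range(1, 11)}

print(r(0, f_X, f_H, m=2.0, C_d=0.1))
print(r(19, f_X, f_H, m=2.0, C_d=0.1))

# delta works with any probability function of (v, tau); this one is a stand-in.
print(delta(lambda v, tau: 0.1 * v + 0.01 * tau, 5, 3, 1, 1))
```

Passing the probability as a callable keeps `delta` usable for both $π_{d}$ and $π_{f, 0}$ without duplicating code.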

Theorem 3.

For $t \in \{0, \dots, n_{X} - 2\}$ , the following results hold:
(i) $u^{*} (t) \geq u^{*} (t + 1) .$
(ii) $ℓ^{*} (t) \geq ℓ^{*} (t + 1)$ if the condition $\begin{matrix} C_{i} + Δ_{f} (n_{X}, n_{X} - t - 1; 1, 0) (m - C_{d} + C_{r}) + C_{r} π_{d} (n_{X} - 1, n_{X} - t - 1) \\ \geq π_{d} (n_{X} - 1, n_{X} - t - 1) g (n_{X} - 1, t + 1) + (1 - π_{d} (n_{X} - 1, n_{X} - t - 1)) r (n_{X} - 1) \end{matrix}$ (10) holds and the condition $\begin{matrix} Δ_{f} (v, τ; 1, 0) (C_{r} + m) + (π_{d} (v - 1, τ) + Δ_{d} (v + 1, τ + 1; 1, 0) + π_{f, 0} (v - 1, τ) π_{d} (v, τ + 1)) C_{d} \\ \geq Δ_{d} (v, τ; 1, 0) (r (v) + C_{r} - g (v, v - τ + 1)) + m + π_{f, 0} (v, τ) π_{d} (v + 1, τ + 1) C_{d} \end{matrix}$ (11) is satisfied for all $(v, τ) \in \{(n_{X} - i, n_{X} - t - i - 1) : i = 1, \dots, n_{X} - 2 - t\} .$

Theorem 3(i) shows that the critical threshold $u^{*} (t)$ never increases as t increases. However, this is not always the case for $ℓ^{*} (t)$ (see Figure 4). Theorem 3(ii) establishes the conditions under which the critical threshold $ℓ^{*} (t)$ is guaranteed to be nonincreasing in t. Condition (10) is the counterpart of condition (8), and ensures that if the process action is optimal at state $(n_{X}, τ, 0)$ then it is also optimal at state $(n_{X} - 1, τ, 0) .$ Similarly, condition (11) is the counterpart of condition (9) for $v < n_{X},$ and ensures that if the process action is optimal at state $(v, τ, 0)$ then it is also optimal at state $(v - 1, τ, 0) .$

5. Numerical analysis

The objective of this section is to provide managerial insights on: (i) how the maximum lifetime value of a new tool is affected by various economic and degradation-related parameters; (ii) the benefit of the optimal tool-usage policy compared with the fixed-threshold policy, i.e., the benchmark policy from practice; and (iii) what portion of this benefit comes from postponing the retirement of a defective tool. We let m be equal to one and consider various values for each cost parameter. We consider that X is uniformly distributed on $\{1, \dots, n_{X}\};$ i.e., $f_{X} (x) = 1 / n_{X}$ for $x = 1, \dots, n_{X} .$ This distribution implies that the tool enters the defective phase at any product from one to n_X with equal probability. In the remainder of this section, we let n_X = 32. The random variable H is uniformly distributed on $\{n_{H} - δ + 1, \dots, n_{H}\};$ i.e., $f_{H} (n_{H} + 1 - i) = 1 / δ$ if $i = 1, \dots, δ$ and $f_{H} (i) = 0,$ otherwise. That is, the number of products the tool processes in the defective phase has varying levels of uncertainty controlled by the parameter δ. We let $n_{H} \in \{8, 16, 32\}$ and consider three cases: The distribution H1 has δ = 1, the distribution H2 has $δ = n_{H} / 2,$ and the distribution H3 has $δ = n_{H} .$ The distribution H1 represents that a defective tool processes a constant number of products. As we go from distribution H1 to H3, the mean decreases and the variability increases. Although we focus on stylized distribution functions in this section, we will consider the more flexible discrete Weibull distribution in Section 6 and estimate its parameters from real-world maintenance logs as described in Section EC.3.
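The three stylized distributions of H follow directly from the definition above. The short sketch below builds them for $n_{H} = 16$ and computes their means and variances, which makes the statement about decreasing mean and increasing variability easy to check; the helper names are hypothetical.

```python
def make_H(n_H, delta):
    """Uniform pmf on {n_H - delta + 1, ..., n_H}, stored as {value: probability}."""
    return {h: 1.0 / delta for h in range(n_H - delta + 1, n_H + 1)}


def mean(pmf):
    return sum(h * p for h, p in pmf.items())


def variance(pmf):
    mu = mean(pmf)
    return sum((h - mu) ** 2 * p for h, p in pmf.items())


n_H = 16
H1 = make_H(n_H, 1)         # constant number of products
H2 = make_H(n_H, n_H // 2)  # uniform on the upper half of the range
H3 = make_H(n_H, n_H)       # uniform on {1, ..., n_H}

for name, pmf in (("H1", H1), ("H2", H2), ("H3", H3)):
    print(name, mean(pmf), variance(pmf))
```

The same helper with $n_{H} = 8$ and δ = 1 gives a constant H equal to 8, which is the comparison case used later against H3 with $n_{H} = 16.$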

5.1. The impact of cost parameters on the maximum lifetime value

Figure 5 illustrates the increase in the maximum lifetime value of a new tool as the salvage reward C_r increases. We make three key observations. First, the distinction between the cases with different C_d values becomes more visible for lower C_r. At high C_r values, it is tempting not to use a defective tool at all, and the value of C_d has no effect. Second, as C_r increases, the rate of increase in maximum lifetime value increases and then stabilizes. This means that $V (0, 0, 0) - C_{r}$ (which represents the production value of a tool) decreases at a diminishing rate as C_r increases. In theory, if C_r approaches a very large number, then the production value of the tool would be zero (because then it would be optimal to retire the tool immediately without taking any risk of failure). Third, C_i influences the maximum lifetime value when C_r is sufficiently high. This is intuitive because if C_r is sufficiently high, then it makes sense to use the tool only if we know it is not close to failure, and in this situation, inspection is instrumental under the optimal policy.

Figure 5. Maximum lifetime value of a new tool for the distribution H2 with n_H = 16.

5.2. Comparison with the optimal fixed-threshold policy

We let V_ft denote the lifetime value of a new tool under the optimal fixed-threshold policy; see Section 1 for details on the fixed-threshold policy. In Tables 2 to 4, we let n_H = 32 and compare the values of V_ft with $V (0, 0, 0)$ under various cost parameters and the distributions H1, H2, and H3. To make the comparison easier, we also report Δ_ft, which is the relative improvement over the optimal fixed-threshold policy; i.e., $Δ_{f t} = (V (0, 0, 0) - V_{f t}) / V_{f t} .$ We observe that the difference between the lifetime values of a new tool under the optimal tool-usage policy and the optimal fixed-threshold policy can be substantial. For example, when C_r = 10, C_i = 1, and C_d = 0, the percentage improvement Δ_ft is equal to 38.2% for the distribution H1 (Table 2), and it is equal to 18.8% and 8.4% for the distributions H2 and H3, respectively (Tables 3 and 4).

Table 2. Comparison of the lifetime values of a new tool for the distribution H1 with n_H = 32.

Table 3. Comparison of the lifetime values of a new tool for the distribution H2 with n_H = 32.

Table 4. Comparison of the lifetime values of a new tool for the distribution H3 with n_H = 32.

We make two observations regarding the sensitivity of the percentage improvement Δ_ft. First, how Δ_ft changes with C_r depends on the value of C_d. To be specific, we note that Δ_ft decreases in C_r for low values of C_d. As C_r increases, it becomes less important to postpone the retirement of a defective tool (because the optimal policy tends to retire the tool earlier to save the high salvage reward). Also, if C_d is low, it matters less whether to inspect the tool at fixed or adaptive thresholds. Thus, the two advantages of the optimal tool-usage policy over the fixed-threshold policy disappear, and we observe that Δ_ft decreases in C_r for lower values of C_d. On the other hand, when C_d is high enough, optimally timing the inspection moments also becomes important, and Δ_ft does not necessarily decrease in C_r. Second, we observe that Δ_ft increases as C_i increases for sufficiently high values of C_d. This is the situation in which working with a defective tool is costly and inspections are therefore important. As the cost of inspection increases, it becomes critical to adaptively plan what triggers an inspection, and hence, the improvement potential of the optimal tool-usage policy is larger.

5.3. The value of postponing the tool replacement in the defective phase

In order to assess the value of postponing the retire-the-tool action in the defective phase, we further compare the maximum lifetime value under the optimal tool-usage policy with the lifetime value under the same policy except for one difference: the retire-the-tool action is taken as soon as an inspection finds the tool defective. We denote the lifetime value of a new tool under this no-postponement policy by V_np. The difference $V_{n p} - V_{f t}$ can be interpreted as the value of optimally timing the inspection moments, and the difference $V (0, 0, 0) - V_{n p}$ can be interpreted as the value of postponing the retire action for a defective tool. We observe in Tables 2 to 4 that the difference $V (0, 0, 0) - V_{n p}$ is the most noticeable for lower C_d and C_r values.
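The decomposition described above amounts to simple arithmetic on three lifetime values. The sketch below shows it with made-up numbers; the values 25, 23, and 20 are purely hypothetical placeholders and are not taken from the tables of the article.

```python
def decompose_improvement(V_opt, V_np, V_ft):
    """Split the gain of the optimal policy over the fixed-threshold policy.

    V_opt: lifetime value V(0, 0, 0) under the optimal tool-usage policy
    V_np:  lifetime value under the no-postponement policy
    V_ft:  lifetime value under the optimal fixed-threshold policy
    """
    return {
        "inspection_timing": V_np - V_ft,               # value of timing inspections
        "postponement": V_opt - V_np,                   # value of postponing retirement
        "relative_improvement": (V_opt - V_ft) / V_ft,  # Delta_ft
    }


# Hypothetical values, for illustration only.
parts = decompose_improvement(V_opt=25.0, V_np=23.0, V_ft=20.0)
print(parts)
```

The two absolute components always add up to $V (0, 0, 0) - V_{f t},$ so the split attributes the whole gain to exactly one of the two mechanisms.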

5.4. The impact of the distribution of H on the maximum lifetime value

The expected number of products processed with a defective tool is the highest for the distribution H1 (Table 2) and the lowest for the distribution H3 (Table 4). This explains why the maximum lifetime values are the highest in Table 2 and lowest in Table 4. The results for the distribution H3 with n_H = 16 and for the distribution H1 with n_H = 8 are reported in Tables 5(a) and 5(b), respectively. Notice that the mean values of these two distributions are very close to each other (8.5 and 8, respectively), with the former having a larger variability than the latter. The decrease in the lifetime values under all the policies can be attributed to the higher variability of the distribution H3.

Table 5. Comparison of the lifetime values of a new tool under different policies for C_r = 20:
(a) The distribution H3.

(b) The distribution H1.

6. Case study: Managing ECM tools at Philips Consumer Lifestyle

The case study was conducted at the manufacturing plant of Philips Consumer Lifestyle (PCL) in Drachten, the Netherlands. The plant in Drachten produces high-precision shaver caps (Figure 6, right) for electric shavers by using a micro-machining process called ECM. The ECM process (Figure 6, left) requires expensive machine tools that have an intricate surface to mirror the required geometry of the shaver cap. The objective of our collaboration with PCL is to design an improved tool-usage policy that maximizes the economic potential of each ECM tool. The current tool-usage strategy at PCL is based on the fixed-threshold policy as described in Section 1. When the wear on the tool surface is less than a specific value, the tool is regarded as being healthy. However, if the tool wear exceeds this specific value, then the tool is still operational but produces faulty products, and hence, it is regarded as defective. The defective state of the tool can only be detected with an inspection performed by specialized equipment and technicians. If a defective tool continues production, it eventually fails due to the accumulation of wear. A failure substantially reduces the economic value of the tool.

Figure 6. Illustration of the ECM process.

After analyzing the maintenance logs between January 2015 and February 2017, we identify 49 tools for further analysis based on the availability and similar characteristics of their data. In Table 6, we provide a sample of the maintenance logs. The data include the run counter and cumulative counter values of an individual tool after each production run, as well as the reason for removing the tool from a machine and the follow-up maintenance activity. The maintenance logs do not keep track of the wear level on the tool surface.

Table 6. Example of the maintenance logs on ECM tools (note that the format in this table is day-month-year).

By tracking the reason why a tool has been taken off the production and the subsequent maintenance activity, we categorize each tool in one of the following four groups:

1. The tool is found normal in the last inspection, and it has not failed yet by the time it is retired.
2. The tool is found normal in the last inspection, and it has failed afterwards.
3. The tool is found defective in the last inspection, and it has not failed yet by the time it is retired.
4. The tool is found defective in the last inspection, and it has failed afterwards.

Specifically, if a tool's last inspection has resulted in a maintenance activity that includes just cleaning the tool, the tool is regarded as "last observed as healthy", corresponding to groups 1 and 2. By matching the reason the tool is last taken off production with the four-digit error code that represents the tool failure, we conclude that 24 tools are in group 1 and 14 tools are in group 2. On the other hand, if a tool's last inspection has resulted in more than just cleaning the tool, we regard it as "last observed as defective", corresponding to groups 3 and 4. After matching the reason the tool is last taken off production with the failure code, we find that seven tools are in group 3 and four tools are in group 4.
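The grouping logic can be summarized in a few lines. In the sketch below, the two boolean fields per tool are a hypothetical simplification of the maintenance logs (the actual log format and error codes are not reproduced here); the record counts are set to the group sizes reported in the text so that the totals can be checked against the 49 tools.

```python
from collections import Counter


def tool_group(last_found_defective, failed):
    """Map a tool's last inspection outcome and failure status to groups 1 to 4."""
    if last_found_defective:
        return 4 if failed else 3
    return 2 if failed else 1


# Hypothetical per-tool records (last_found_defective, failed), sized to match
# the group counts reported in the text.
records = (
    [(False, False)] * 24
    + [(False, True)] * 14
    + [(True, False)] * 7
    + [(True, True)] * 4
)

counts = Counter(tool_group(defective, failed) for defective, failed in records)
print(sorted(counts.items()))
print(sum(counts.values()))
```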

In Section EC.3, we provide the details on how we combine the data belonging to each group to estimate the probability distributions of the random variables X and H in our model. For ease of computation, we divide the run counter and cumulative counter by 1000, and then round them to the nearest integer. We assume that the random variables X and H both follow a discrete Weibull distribution (Nakagawa and Osaki, 1975), and estimate the scale parameter of X as $5.52 \times 10^{- 7}$ and the shape parameter of X as 3.1056. We estimate the scale and shape parameters of H as 0.0453 and 1.3833, respectively. The resulting distributional shapes of the random variables X and H are illustrated in the accompanying figure.
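For readers who want to experiment with the reported estimates, the following sketch evaluates a discrete Weibull pmf. The parametrization is an assumption made here, not something stated in this section: the survival function is taken as $P (Y > k) = \exp (- a k^{b})$ with scale a and shape b on the support {1, 2, ...}, which corresponds to the Nakagawa and Osaki form with $q = e^{- a}$ shifted by one. The exact parametrization and any truncation used by the authors are described in the supplement and may differ.

```python
import math


def survival(k, scale, shape):
    """Assumed survival function P(Y > k) = exp(-scale * k**shape)."""
    return math.exp(-scale * k ** shape)


def pmf(k, scale, shape):
    """P(Y = k) for k = 1, 2, ... under the assumed parametrization."""
    return survival(k - 1, scale, shape) - survival(k, scale, shape)


def hazard(k, scale, shape):
    """P(Y = k | Y >= k), computed with expm1 for numerical stability."""
    return -math.expm1(-scale * (k ** shape - (k - 1) ** shape))


# Estimates reported in the text (counters expressed in thousands of products).
X_PARAMS = (5.52e-7, 3.1056)
H_PARAMS = (0.0453, 1.3833)

K = 400  # evaluation horizon chosen for this sketch
for name, (a, b) in (("X", X_PARAMS), ("H", H_PARAMS)):
    total = sum(pmf(k, a, b) for k in range(1, K + 1))
    rates = [hazard(k, a, b) for k in range(1, K + 1)]
    increasing = all(r1 <= r2 for r1, r2 in zip(rates, rates[1:]))
    print(name, total, increasing)
```

Under this assumed form, a shape parameter of at least one gives a nondecreasing hazard rate, which is the kind of property required by the lemmas in Section 4.1.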

Finally, we compare the maximum lifetime value of a new ECM tool under the optimal tool-usage policy (i.e., $V (0, 0, 0)$ ) to the lifetime value under the optimal fixed-threshold policy (i.e., V_ft), and report their percentage difference Δ_ft in Table 7. Since the fixed-threshold policy used in practice is not necessarily the optimal one, our comparison can be considered as a lower bound on the economic benefit of the optimal tool-usage policy in practice. For confidentiality, we do not disclose the economic parameters but run our experiments with a large set of representative cost values. We also do not specify the fixed-threshold policy parameters used by PCL. Table 7 shows that, depending on the cost parameters, the percentage improvement over the optimal fixed-threshold policy can take values from 5.2% to 20.7%.

Table 7. The values of Δ_ft (unit: %) calculated for a new ECM tool.

7. Conclusion

We study the inspection and retirement decisions for machine tools that go through a hidden defective phase that can only be detected via costly inspections. In particular, we identify the right moment to inspect and retire the tool such that the net expected production reward obtained from a specific tool is maximized. The products processed by a defective tool do not necessarily generate the same reward obtained from a normal tool, and a tool failure can be very costly. We establish the structural properties of the optimal policy as a function of practical product counters, the last observed degradation phase of the tool, and an inferred lower bound on the number of products the tool has processed in its normal phase. The structural properties depend on mild conditions on economic parameters as well as the probability distributions of the number of products the machine tool processes in the normal and defective phases. Therefore, the existence of structural properties can easily be verified in practice after estimating these probability distributions. The implementation of our model by using the real-world maintenance logs at PCL shows that the value of the optimal tool-usage policy can be substantial compared with the current alternative. A future research direction is to develop an optimal tool-usage policy that takes advantage of coordinating the inspection decisions for multiple tools in order to maximize their total lifetime value. Another potential research direction is to jointly optimize the usage policy of the tools and the procurement decisions of the spare tools on the production line.

Acknowledgments

The authors would like to thank Bas Tijsma from Philips Consumer Lifestyle for his valuable input in model development and data analysis.

Additional information

Funding

The authors gratefully acknowledge the financial support by European Commission H2020-ECSEL Project MANTIS-662189.

Notes on contributors

Alp Akcay

Alp Akcay is an assistant professor in the Department of Industrial Engineering and Innovation Sciences at Eindhoven University of Technology. He received his Ph.D. in Operations Management from the Tepper School of Business at Carnegie Mellon University. His research interests include statistical decision making under uncertainty, simulation design and analysis, and approximate dynamic programming with applications such as planning and control of manufacturing systems and predictive maintenance for capital goods.

Engin Topan

Engin Topan is an assistant professor of reliability, maintenance, and service logistics at the University of Twente. He received his Ph.D. from the Middle East Technical University, Turkey, in 2010. He worked in different positions at Cankaya University, Turkish Military Academy, Erasmus University Rotterdam, and Eindhoven University of Technology. He took part in several projects on after-sales service logistics and maintenance planning for capital goods. His research interests include spare parts and service logistics, and maintenance planning.

Geert-Jan van Houtum

Geert-Jan van Houtum is a professor of maintenance and reliability at the Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology. His research is focused on predictive maintenance, availability management of capital goods, spare parts management, and the effect of design decisions on the total cost of ownership of capital goods. Much of his research is based on a close collaboration with e.g. ASML, Dutch Railways, Philips, and the Royal Airforce. He is associate editor of Manufacturing and Service Operations Management and area editor at Service Science.

References

Arts, J. and Basten, R. (2018) Design of multi-component periodic maintenance programs with single-component models. IISE Transactions, 50, 606–615.
Web of Science ®Google Scholar
Barlow, R.E., Hunter, L.C. and Proschan, F. (1963) Optimum checking procedures. Journal of the Society for Industrial and Applied Mathematics, 11, 1078–1095.
Google Scholar
Berrade, M.D., Scarf, P.A. and Cavalcante, C.A.V. (2017) A study of postponed replacement in a delay time model. Reliability Engineering & System Safety, 168, 70–79.
Web of Science ®Google Scholar
Bertsekas, D.P. (2005) Dynamic Programming and Optimal Control, vol. 1. 4th ed. Athena Scientific, Belmont, MA.
Bertsekas, D.P. and Tsitsiklis, J.N. (1991) An analysis of stochastic shortest path problems. Mathematics of Operations Research, 16, 580–595.
Christer, A.H. (1982) Modelling inspection policies for building maintenance. Journal of the Operational Research Society, 33, 723–732.
Christer, A.H. and Wang, W. (1995) A simple condition monitoring model for a direct monitoring process. European Journal of Operational Research, 82, 258–269.
de Jonge, B. and Scarf, P.A. (2019) A review on maintenance optimization. European Journal of Operational Research. doi:10.1016/j.ejor.2019.09.047.
Duan, C., Makis, V. and Deng, C. (2019) Optimal Bayesian early fault detection for CNC equipment using hidden semi-Markov process. Mechanical Systems and Signal Processing, 122, 290–306.
Groover, M.P. (2017) Groover’s Principles of Modern Manufacturing: Materials, Processes, and Systems. John Wiley & Sons, Inc., Hoboken, NJ.
Jiang, R. (2017) An efficient quasi-periodic inspection scheme for a one-component system. IMA Journal of Management Mathematics, 28, 373–386.
Kaio, N. and Osaki, S. (1984) Some remarks on optimum inspection policies. IEEE Transactions on Reliability, 33, 277–279.
Khojandi, A., Maillart, L.M. and Prokopyev, O.A. (2014) Optimal planning of life-depleting maintenance activities. IIE Transactions, 46, 636–652.
Kim, M.J. and Makis, V. (2013) Joint optimization of sampling and control of partially observable failing systems. Operations Research, 61, 777–790.
Leung, F.K. (2001) Inspection schedules when the lifetime distribution of a single-unit system is completely unknown. European Journal of Operational Research, 132, 106–115.
MacPherson, A.J. and Glazebrook, K.D. (2011) A dynamic programming policy improvement approach to the development of maintenance policies for 2-phase systems with aging. IEEE Transactions on Reliability, 60, 448–459.
Maillart, L.M. (2006) Maintenance policies for systems with condition monitoring and obvious failures. IIE Transactions, 38, 463–475.
Maillart, L.M. and Pollock, S.M. (2002) Cost-optimal condition-monitoring for predictive maintenance of 2-phase systems. IEEE Transactions on Reliability, 51, 322–330.
Makis, V. and Jiang, X. (2003) Optimal replacement under partial observations. Mathematics of Operations Research, 28, 382–394.
Moghaddass, R. and Ertekin, S. (2018) Joint optimization of ordering and maintenance with condition monitoring data. Annals of Operations Research, 263, 271–310.
Nakagawa, T. and Osaki, S. (1975) The discrete Weibull distribution. IEEE Transactions on Reliability, 24, 300–301.
Ohnishi, M., Kawai, H. and Mine, H. (1986) An optimal inspection and replacement policy for a deteriorating system. Journal of Applied Probability, 23, 973–988.
Ozekici, S. and Pliska, S.R. (1991) Optimal scheduling of inspections: A delayed Markov model with false positives and negatives. Operations Research, 39, 261–273.
Parmigiani, G. (1996) Optimal scheduling of fallible inspections. Operations Research, 44, 360–367.
Probst, E. (2015) 6 factors maximize profitability in high-precision machining. www.mmsonline.com/articles/6-factors-help-maximize-profitability-in-high-precision-machining. Accessed July 15, 2018.
Scarf, P.A., Cavalcante, C., Dwight, R.A. and Gordon, P. (2009) An age-based inspection and replacement policy for heterogeneous components. IEEE Transactions on Reliability, 58, 641–648.
Sengupta, B. (1980) Inspection procedures when failure symptoms are delayed. Operations Research, 28, 768–776.
Van Oosterom, C.D., Elwany, A.H., Çelebi, D. and van Houtum, G. (2014) Optimal policies for a delay time model with postponed replacement. European Journal of Operational Research, 232, 186–197.
Wang, W. (2012) An overview of the recent advances in delay-time-based maintenance modelling. Reliability Engineering and System Safety, 106, 165–178.
Wang, W. and Christer, A.H. (2003) Solution algorithms for a nonhomogeneous multi-component inspection model. Computers and Operations Research, 30, 19–34.
Yang, L., Ma, X., Zhai, Q. and Zhao, Y. (2016) A delay time model for a mission-based system subject to periodic and random inspection and postponed replacement. Reliability Engineering & System Safety, 150, 96–104.
Yang, L., Ye, Z.S., Lee, C.G., Yang, S.F. and Peng, R. (2019) A two-phase preventive maintenance policy considering imperfect repair and postponed replacement. European Journal of Operational Research, 274, 966–977.
