International Journal of Production Research

Sustainable manufacturing using Zero Defect Manufacturing

A multi-domain mixture density network for tool wear prediction under multiple machining conditions

Gyeongho Kim, Department of Industrial Engineering, Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea

Sang Min Yang, Department of Mechanical Engineering, Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea

Sinwon Kim, Department of Mechanical Engineering, Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea

Do Young Kim, Department of Mechatronics Engineering, Chungnam National University, Daejeon, Republic of Korea

Jae Gyeong Choi, Department of Industrial Engineering, Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea

Hyung Wook Park, Department of Mechanical Engineering, Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea

Sunghoon Lim (corresponding author), Department of Industrial Engineering, Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea; Graduate School of Artificial Intelligence, Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea; Industry Intelligentization Institute, Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea

Received 03 Jul 2023, Accepted 20 Nov 2023, Published online: 06 Dec 2023

https://doi.org/10.1080/00207543.2023.2289076



Abstract

Accurate tool wear prediction is an essential task in machining processes because it helps to schedule efficient tool maintenance and maximise the tool's useful life, thus contributing to sustainable production via zero defect manufacturing (ZDM). However, existing methods are limited in that they cannot be used under multiple machining conditions, which is common practice. This limitation not only hinders accurate tool wear monitoring but also necessitates the use of multiple models, which increases operation and modelling costs. Therefore, the multi-domain learning problem should be addressed to enable tool wear prediction under various machining conditions. To this end, this work presents a novel method, a multi-domain mixture density network (MD$^{2}$N). In particular, a Bayesian learning-based feature extractor is proposed to learn domain-invariant representations. Additionally, an adversarial learning approach is developed to lead the predictive model in learning domain-invariant features. Lastly, a mixture density network-based predictor is used to generate probabilistic tool wear outputs. Experiments that use datasets from real-world milling processes under multiple conditions demonstrate the proposed method's efficacy, with the best mean absolute error (MAE), root mean squared error (RMSE), and mean absolute percentage error (MAPE) of 2.1748, 5.6422, and 0.0350, respectively, indicating the ability to learn multi-domain representations.

KEYWORDS:

Bayesian approach
deep learning
smart manufacturing
sustainability
tool wear prediction
zero defect manufacturing

1. Introduction

The machining process is one of the most widely adopted techniques in manufacturing. Among various factors that affect the quality of manufactured products, the machine tool's condition has the greatest impact. In particular, worn tools cause surface roughness and cracking via residual stress. Furthermore, tool wear often leads to tool breakage, which incurs additional costs for tool change and sudden stoppage, leading to decreased productivity. This problem is intensified for various commonly used materials that have desirable characteristics, such as a high strength-to-weight ratio and corrosion resistance. For instance, titanium alloy's thermal properties cause the material to maintain a high temperature at the tool-material interface, which accelerates the tool wear. Therefore, accurately predicting ongoing tool wear during machining processes is essential not only for determining the proper time to change the tool, which can maximise available tool life, but also for preventing tool breakage, leading to improved overall process efficiency and sustainability. Hence, the development of a high-performing tool wear prediction system can contribute to digital manufacturing transformation (Konstantinidis et al. 2023) by lowering defects and increasing productivity, leading to zero defect manufacturing (ZDM).

Despite its importance, developing an effective tool condition monitoring (TCM) method is a challenging task, because tool wear generation is highly complex to model, both analytically and numerically. Traditionally, mechanistic approaches that employ physical model-based methods and analytical solutions have been proposed. However, due to the difficulty of formulation and computational cost, they cannot be easily applied and adapted in real-time TCM (Traini, Bruno, and Lombardi 2021). Furthermore, they cannot consider varying tool conditions during machining processes, which hampers flexible applications in diverse domains. On the other hand, data-driven approaches, which utilise sensor measurement data from machining processes to predict tool wear, have been recently proposed. In particular, machine learning (ML) and deep learning (DL)-based methods that exploit multivariate time-series data have shown superior performance in TCM for various machining processes (G. Kim et al. 2022; Shi et al. 2019; Traini, Bruno, and Lombardi 2021; Yang, Zhou, and Tsui 2016).

However, there are several limitations to the current state of data-driven tool wear prediction approaches. First, most existing methods can only be used in a single machining condition because each predictive model is trained on data from a particular machining process that corresponds to a single machining condition (C. Liu et al. 2021). However, in real-world machining practices, multiple machining conditions are widely used to produce the desired quality products. Various factors that determine the machining condition, such as cutting speed (i.e. a material removal rate), lubricant type, tool material, and machining type (e.g. rough, finishing), are adjusted to yield high product quality and process efficiency. In addition, recent machining technologies, which are used to restrain the progression of tool wear using coolants that lower material temperature, increase the diversity of machining conditions that affect tool wear. Despite the prevalence of machining processes that use multiple machining conditions, existing approaches have serious limitations, because either multiple models must be used for prediction simultaneously or the model must be switched whenever the machining condition changes (Z. Wang et al. 2021). Therefore, the inability to perform tool wear prediction under multiple machining conditions increases not only modelling cost but also operation cost, because as many independent models as there are machining conditions must be employed, thus harming the sustainability of manufacturing processes.

To develop a unified model that can predict the degree of tool wear under multiple machining conditions and thereby greatly improve efficiency in model development and deployment, several problems must be addressed. First, data from multiple machining conditions have different distributions because each dataset comes from a distinct domain. This heterogeneity increases the complexity of tool wear prediction and is the main reason why standard data-driven models perform poorly in multi-domain learning problems (Luo, Wen, and Tao 2017). Therefore, prediction models should learn multi-domain features to perform under multiple machining conditions. In addition, particularly in machining processes with multiple machining conditions, an inverse problem often exists where similar input data have different target values (e.g. degrees of tool wear), which hinders the application of models trained via the principle of maximum likelihood estimation, which produces a conditional average (C. Li and Lee 2019). For instance, in data-driven approaches where real-time sensor measurements are used as inputs to predict tool wear (i.e. output), similar input values may correspond to different output values depending on the machining condition. Due to these problems, tool wear prediction under multiple machining conditions has not been successfully performed in the current literature, and thus most of the existing methods have not proven efficient in realistic machining settings.

This work addresses the aforementioned problems by developing a unified data-driven tool wear prediction method, a multi-domain mixture density network (MD$^{2}$N), that accounts for multiple machining conditions using various techniques. First, a Bayesian learning-based approach to multi-domain feature learning is developed. Using variational inference (VI), a convolution-based feature extractor learns domain-invariant representations. In addition, an adversarial learning approach is developed to lead the predictive model to learn domain-invariant features by optimising the objective function during training. Lastly, a mixture density network (MDN)-based predictor is used to generate probabilistic tool wear prediction outputs. In the experiments, real-world milling processes with various machining conditions are performed and used to validate the proposed method's efficacy. Comparisons with existing data-driven methods and state-of-the-art (SOTA) methods in tool wear prediction demonstrate the proposed method's effectiveness.

The proposed method not only shows superior tool wear prediction performance but also greatly improves efficiency in modelling and deployment, because a single model can serve multiple machining conditions. In particular, the ability to learn domain-invariant features enables the use of the proposed method under various machining conditions. Moreover, although the proposed method can handle data from multiple domains, it requires a short inference time, which enables an effective online prognostic application. The main difference between conventional approaches to multi-domain learning (MDL) in tool wear prediction and the proposed approach is illustrated in Figure 1. The contributions of this work to the literature are as follows.

Figure 1. (a) Conventional approaches and (b) the proposed approach to multi-domain tool wear prediction.

(a) Multiple models take distinct input signals individually and each of them outputs tool wear prediction. (b) Diverse input signals are fed into the proposed model (i.e. MD$^{2}$N) at once and are transformed into predicted tool wear.

A novel tool wear prediction method (i.e. MD$^{2}$N) is proposed, which proves effective and efficient under various machining conditions.
Bayesian and adversarial learning approaches are developed to learn domain-invariant features to achieve multi-domain learning for tool wear prediction under multiple machining conditions.
A mixture density network-based model is proposed to generate probabilistic predictions, and various relevant techniques are presented.
The superior prediction performance of the proposed MD$^{2}$N under various machining conditions indicates its practical applicability and the ability to learn multi-domain features and generate multimodal predictions.

2. Theoretical framework

This section outlines the theoretical framework and existing works in the literature pertaining to tool wear prediction, including conventional and the latest data-driven methods. Various ML and DL-based methods as well as their limitations are also discussed. In addition, the concept of multi-domain learning is presented, as it aligns with the objective of this work (learning a model with domain-invariant characteristics), and the related works in the literature are reviewed in detail. Lastly, a specific probabilistic model architecture, named the mixture density network, is introduced, and its varied usage in existing works is illustrated.

2.1. Data-driven tool wear prediction

Traditional approaches to tool wear prediction rely on analytical solutions and indirect methods given by a concrete formula between the degree of tool wear and the factors that affect it, such as cutting force, speed, and removal rate. This makes finding solutions difficult and also limits real-time TCM (G. Kim, Yang, et al. 2023; García-Ordás et al. 2018). Later, data-driven methods that utilise signals from computer numerical control (CNC) machines and attached sensors enabled faster application and improved accuracy in tool wear prediction (Kuo and Kusiak 2019). Some hybrid approaches that combine analytical physics-based models with data-driven methods have also been proposed (Zhang et al. 2022). Conventional data-driven methods like state-space models rely on manual statistical feature extraction techniques coupled with ML-based predictors. Traini, Bruno, and Lombardi (2021) suggest a TCM framework for predictive maintenance using various time and frequency domain features extracted from sensor measurements for the milling process. Yang, Zhou, and Tsui (2016) employ a differential evolution algorithm with an extreme learning machine using statistical features and wavelet features for online tool wear estimation. Zamudio-Ramírez et al. (2020) propose using stray flux signals with a discrete wavelet transform and a fast Fourier transform. Zhu and Liu (2017) develop a TCM based on a hidden semi-Markov model using cutting force signals. However, conventional data-driven approaches that use manual feature extraction require additional computational time that hinders real-time TCM. In contrast, recent approaches that utilise predictive models with high expressive power, such as ML and DL, have shown superior performance in data-driven TCM (C. Liu, Zheng, and Xu 2021).

Recently presented approaches use deep neural network (DNN) architectures that can handle complex multivariate time-series data from machining processes. In particular, a convolutional neural network (CNN), a recurrent neural network (RNN), and their variants, which are suitable for handling sensor signals (e.g. force, audio, vibration), are widely adopted (G. Kim et al. 2021; J. Wang et al. 2019). Xu et al. (2020) present an integrated tool wear prediction model based on multiple DNNs, including a parallel CNN, a deep residual network, and a bi-directional long short-term memory (LSTM), using multisensory input signals. Shi et al. (2019) use a stacked sparse autoencoder to develop a tool wear prediction method based on vibration signals. Hahn and Mechefske (2021) present a tool wear monitoring method based on a disentangled variational autoencoder (VAE) and a temporal CNN.

Among various DNN architectures, CNN-based methods that have model flexibility and adaptiveness to long time-series data are widely adopted in data-driven TCM. In fact, the convolutional operation is more efficient than other DNN operations, such as recurrent operations, in terms of computational complexity and speed (Kusiak 2020), and is thus appropriate for online data-driven TCM. Ma et al. (2021) utilise a CNN with a bidirectional gated recurrent unit (GRU) to predict tool wear during the milling process based on force signals. Guo et al. (2021) propose a multi-scale convolutional attention network-based tool life prediction method. Sun et al. (2020) propose employing a CNN and an LSTM, which use raw sensor signals to predict tool flank wear. This work also proposes a DL-based prediction method using a CNN that is suitable for sensor signals. However, in contrast to most existing methods, which can only work under a single machining condition, this work presents a domain-invariant model, which performs under multiple conditions and greatly improves prediction performance and efficiency. From the perspective of ZDM, the proposed method can help achieve sustainable and efficient maintenance for digital manufacturing and production systems (Azamfirei, Psarommatis, and Lagrosen 2023; Psarommatis, May, and Azamfirei 2023). In particular, when coupled with existing ZDM strategies using intelligent quality management frameworks (Konstantinidis et al. 2023), the proposed method can improve product quality and reduce cost through accurate tool wear prediction under multiple machining conditions. In addition, informed decisions based on predictions can greatly enhance process efficiency, leading to ZDM.

2.2. Multi-domain learning

Although the task of developing a unified model that can accurately predict tool wear under multiple machining conditions has not been well addressed previously, it is similar to the problem setting of multi-domain learning (MDL) in the ML literature. Closely related to domain-invariant learning and domain generalisation, MDL aims to train predictive models that perform on data drawn from multiple domains. The existence of multiple domains leads to domain shift in both input and output spaces, which renders general model training difficult and impairs predictive performance (Zhao et al. 2023). To overcome this problem, MDL-based methods train models to learn feature representations that are invariant to data-generating distributions. Applying MDL not only provides a representation learning method for DNNs but also improves modelling efficiency, because it eliminates the need to train individual models for every domain. In the task of tool wear prediction, MDL can help develop models that are invariant to the different data domains (i.e. the sensor signals and the degree of tool wear) that arise from changes in machining conditions. Therefore, this work employs an MDL-based approach to developing a tool wear prediction method that can perform well under multiple machining conditions.

Recent MDL approaches have relied on modifying the model architecture and optimisation process to make the feature representations of every domain similar. Rebuffi, Bilen, and Vedaldi (2017) propose using residual adapter modules, which contain domain-agnostic and domain-specific parameters of DNN. This approach is further improved by a universal parameterisation method (Rebuffi, Bilen, and Vedaldi 2018). Bilen and Vedaldi (2017) present a deep information-sharing method using the core parameters of DNN, except those used in normalisation techniques. Berriel et al. (2019) develop an efficient architectural strategy for MDL using budget-aware adapters that learn the most relevant features from novel domains. Y. Li and Vasconcelos (2019) suggest the use of covariance normalisation and an adaptive layer for each domain for MDL, which share the same idea as the MDL approaches mentioned above. Several approaches employ iterative pre-training and fine-tuning strategies for MDL; however, they do not guarantee domain-invariant feature learning (S. He et al. 2020). In addition, MDL approaches based on modifying model architectures limit the application of various DNN types, as they impose constraints on model selection.

Ganin and Lempitsky (2015) propose a seminal approach to MDL that uses a gradient reversal layer (GRL) and standard backpropagation to train models adversarially. During training, the domain classifier loss is maximised with respect to the feature extractor to obtain domain-invariant deep feature representations. Xiao et al. (2021) propose MDL using a Bayesian neural network by applying a partial Bayesian treatment to the last two layers with an explicit objective function that leads to learning invariant representations. Gao et al. (2022) also use a VAE to employ a Bayesian learning approach for conditional domain-invariant feature representation by aligning the class-conditional distributions of feature representations. This work develops an MDL-based tool wear prediction method that combines two key ideas: the Bayesian learning scheme and an adversarial learning-based approach (such as Ganin and Lempitsky 2015). To the best of our knowledge, except for similar but different approaches using meta-learning and transfer learning (J. He et al. 2022; Y. Li et al. 2019; Z. Wang et al. 2021; Zhao et al. 2023), no previous work has been done on an MDL-based approach to tool wear prediction under multiple machining conditions.
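The behaviour of a gradient reversal layer can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation: the class name `GradientReversal`, the scaling factor `lam`, and the explicit `forward`/`backward` methods are assumptions made for illustration, standing in for what an automatic differentiation library would do internally.

```python
class GradientReversal:
    """Illustrative GRL: identity in the forward pass, sign-flipped
    (and scaled) gradient in the backward pass. Names are hypothetical."""

    def __init__(self, lam=1.0):
        # lam is an assumed trade-off weight for the reversed gradient
        self.lam = lam

    def forward(self, features):
        # Features pass through unchanged to the domain classifier
        return list(features)

    def backward(self, grad_from_domain_classifier):
        # The feature extractor receives the negated gradient, so a step that
        # lowers the domain loss for the classifier raises it for the extractor
        return [-self.lam * g for g in grad_from_domain_classifier]


layer = GradientReversal(lam=2.0)
features_out = layer.forward([1.0, -3.0])
reversed_grad = layer.backward([0.5, -1.0])
```

Because the reversal happens only in the backward pass, a single standard optimiser step can update the domain classifier and the feature extractor in opposite directions.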

2.3. Mixture density network

As mentioned previously, in addition to the heterogeneity of multi-domain data, one reason that standard DNN-based tool wear prediction methods cannot perform well under multiple machining conditions is that the output is forced to converge to the conditional average, while in reality, it could be multimodal (C. Li and Lee 2019). For instance, two data samples drawn from different domains might show similar patterns in input features while having dissimilar degrees of tool wear. This type of issue, which is called the inverse problem (Mao et al. 2022), is detrimental to model training because the model simply outputs an average value rather than learning multi-domain features. Therefore, this is one of the reasons that prohibit the direct use of existing tool wear prediction methods in real-world practice under multiple machining conditions. To address this problem, the proposed method exploits a probabilistic model architecture called the mixture density network (MDN). In addition to the aforementioned advantages of the use of MDN in the proposed method, MDN also provides probabilistic prediction and high adaptability, which can be beneficial in other manufacturing domains.

MDN combines DNN with a mixture density model, which assumes that conditional likelihood $p (y | x)$ consists of multiple plausible distributions. Due to its mixture assumption, MDN has high flexibility to model arbitrary distribution functions. In contrast to general deterministic DNNs, which can only output a single value, MDN can represent multimodal outputs and thus is deemed proper in the task of tool wear prediction under multiple machining conditions. In detail, MDN uses DNN to directly estimate the parameters of the mixture distributions from which the data are assumed to be drawn. For computational efficiency and theoretical guarantee, MDN often assumes a mixture of Gaussians, as expressed in Equation (1), to model the conditional density (H. Kim and Kim 2023). Therefore, the MDN outputs consist of three parts: (1) mixing coefficients of the mixture distribution, (2) mean values for every distribution, and (3) standard deviation values. As shown in Equation (1), the conditional probability distribution $p (y | x)$ is modelled by Gaussian mixture components $\mathcal{N} (y | \mu_{k} (x), \sigma_{k} (x))$ with corresponding non-negative weights $\pi_{k} (x)$. In practice, the training of MDN is not trivial due to some numerical instabilities (Choi et al. 2018). For this reason, this work develops practically useful techniques that facilitate the training procedure of MDN. In particular, several novel methods concerning the activation functions and the regularisation are presented in Section 3.

$\begin{aligned} p (y | x) &= \sum_{k = 1}^{K} \pi_{k} (x) \, \mathcal{N} (y | \mu_{k} (x), \sigma_{k} (x)), \\ & \text{where } \sum_{k = 1}^{K} \pi_{k} (x) = 1 \text{ and } \pi_{k} \geq 0 \; \forall k . \end{aligned}$ (1)

MDN is widely used in ML and DL applications, where probabilistic predictive distributions should be generated to account for various hypotheses, which are often multimodal. C. Li and Lee (2019) use MDN on top of a CNN-based feature extractor to generate multiple hypotheses in a three-dimensional human pose estimation. Saunders, Camgoz, and Bowden (2021) use a transformer with MDN to produce various sign language poses. Ji, Ameri, and Cho (2021) employ MDN architecture to represent probabilistic effects on the prediction of non-conformance rates. MDN is also applied in other DL tasks, such as stereo matching (Tosi et al. 2021) and autonomous vehicles (Choi et al. 2018), due to its ability to model complex target distributions. However, despite its effectiveness in modelling multimodal and probabilistic outputs, MDN has not been commonly used in intelligent production domains. Therefore, another primary novelty of this work comes from the proper application and adaptation of MDN in a tool wear prediction task, which can be further developed in other manufacturing domains. In particular, the proposed method employs MDN to generate potential multimodal predictions using multi-domain features from various machining conditions.
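To make the contrast between a conditional average and a multimodal output concrete, the Gaussian mixture of Equation (1) can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function names and the two-component parameter values are hypothetical and are not taken from the experiments.

```python
import math


def gaussian_pdf(y, mu, sigma):
    """Density of a univariate Gaussian; sigma is the standard deviation."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(y, pis, mus, sigmas):
    """Equation (1): p(y | x) = sum_k pi_k * N(y | mu_k, sigma_k)."""
    # The mixing coefficients must be non-negative and sum to one
    assert abs(sum(pis) - 1.0) < 1e-9 and all(p >= 0.0 for p in pis)
    return sum(p * gaussian_pdf(y, m, s) for p, m, s in zip(pis, mus, sigmas))


# Hypothetical MDN outputs for one input x: two equally likely wear levels
pis, mus, sigmas = [0.5, 0.5], [50.0, 150.0], [5.0, 5.0]

# A model trained to output the conditional average would predict this value
conditional_mean = sum(p * m for p, m in zip(pis, mus))

density_at_mean = mixture_pdf(conditional_mean, pis, mus, sigmas)
density_at_mode = mixture_pdf(50.0, pis, mus, sigmas)
```

Under these assumed values the conditional average falls between the two modes, where the mixture assigns almost no probability mass, which is the failure case that motivates a multimodal predictor.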

3. Proposed method

3.1. Model architecture

This work proposes a novel tool wear prediction method named the multi-domain mixture density network (MD$^{2}$N) that can be used under multiple machining conditions. To this end, multiple components are developed and combined to learn multi-domain features and generate multimodal predictions. In particular, a Bayesian learning-based feature extractor is proposed to learn domain-invariant representations. The MDN-based predictor is then used to generate a multimodal predictive distribution for tool wear. In addition, to achieve domain invariance, an adversarial learning-based auxiliary domain classifier with GRL is used to guide model training. The overall MD$^{2}$N model architecture, which is described in detail in the following sections, is shown in Figure 2.

Figure 2. MD$^{2}$N: the proposed model architecture.

The experimental setup using multiple machining conditions that generate multivariate time-series input signals, leading to the Bayesian domain-invariant feature extractor that consists of multiple neural network layers, leading to the mixture density network and the auxiliary domain classifier, simultaneously.

3.2. Bayesian domain-invariant feature extractor

This work proposes a Bayesian domain-invariant feature extractor (BDIFE) that maps input data to representations that can be used for multiple domains. Based on the findings that Bayesian learning is effective in MDL (Gao et al. 2022; Xiao et al. 2021), this work postulates that BDIFE will function as an ensemble of domain-specific representations (i.e. Bayesian model averaging), thereby modelling multi-domain features probabilistically. To this end, BDIFE employs Bayesian convolutional operations to extract domain-invariant features from the input data. VI-based approximate inference (i.e. a reparameterisation technique) is used to realise Bayesian convolutions. Given input data $x \in \mathbb{R}^{L \times C}$, where $L$ and $C$ are the length of the multivariate time-series and the number of channels (i.e. variables), respectively, BDIFE maps the input to the feature representation $g \in \mathbb{R}^{L' \times C'}$, which is assumed to be domain-invariant, by applying multiple Bayesian convolutions, batch normalisation, and pooling. In addition, the convolutional operations are organised following a residual block structure (K. He et al. 2016).

Denoting the Bayesian parameters of BDIFE as $w$, BDIFE $f_{w}$, with a prior $p (w) = \mathcal{N} (0, I)$, is optimised during training to generate domain-invariant representations. This work assumes that during training BDIFE would learn to generate diverse plausible representations for multiple domains via Bayesian inference. Bayesian learning provides a better generalisation ability to various domains because rather than learning deterministic values, Bayesian treatment allows model parameters to capture a diverse range of representations in a probabilistic manner (Ouyang et al. 2016). As mentioned above, since VI is adopted, a factorised Gaussian distribution is used as a variational posterior $q (w | \theta) = \mathcal{N} (\mu, \rho^{2})$, from which $w$ is sampled as expressed in Equation (2). Further, $w$ is reparameterised by $\theta = (\mu, \rho)$, leading to the evidence lower bound (ELBO) objective of Equation (3), written here as a loss to be minimised, where $w^{(i)}$ denotes the $i$-th Monte Carlo sample drawn from $q (w | \theta)$. Therefore, during training, the parameters $\mu$ and $\rho$, which determine the BDIFE parameters $w$ as shown in Equation (2), are optimised. For a more detailed derivation of ELBO, refer to Kingma and Welling (2013).

$\begin{aligned} w &= \mu + \rho \cdot \epsilon, \\ \epsilon &\sim \mathcal{N} (0, I) . \end{aligned}$ (2)

$\begin{aligned} L_{VI} &= \mathrm{KL} (q (w | \theta) \, \| \, p (w)) - \mathbb{E}_{q (w | \theta)} [\log p (y | x, w)] \\ &\approx \sum_{i = 1}^{n} \left[ \log q (w^{(i)} | \theta) - \log p (w^{(i)}) - \log p (y | x, w^{(i)}) \right] . \end{aligned}$ (3)

On top of the extracted representations from the main BDIFE operations, including Bayesian convolutions, a squeeze-excitation (SE) block (Hu, Shen, and Sun 2018) is used to let the model adaptively select informative features by modelling dependencies between features. Hence, the final representation of BDIFE is denoted as $g = f (x; w)$. A detailed architectural structure of BDIFE is illustrated in Figure 3.

Figure 3. The detailed architectural structure of BDIFE.

A series of neural network layers stacked and connected with arrows in an order that constitutes the proposed BDIFE architecture.
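The reparameterised sampling of Equation (2) and the sampled form of the KL term in Equation (3) can be sketched in a few lines. This is an illustrative sketch rather than the BDIFE implementation: the helper names are hypothetical, the weights are a flat list instead of convolution kernels, the likelihood term of Equation (3) is omitted, and the estimate is averaged over samples (an assumption) so that it can be compared with the closed-form KL divergence between two factorised Gaussians.

```python
import math
import random


def sample_weights(mu, rho, rng):
    """Equation (2): w = mu + rho * eps with eps ~ N(0, I), element-wise."""
    return [m + r * rng.gauss(0.0, 1.0) for m, r in zip(mu, rho)]


def log_normal(w, mu, sigma):
    """Sum over elements of log N(w_j | mu_j, sigma_j)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((x - m) / s) ** 2
        for x, m, s in zip(w, mu, sigma)
    )


def mc_kl_estimate(mu, rho, n_samples, rng):
    """Monte Carlo estimate of KL(q(w | theta) || p(w)) with p(w) = N(0, I):
    the sample average of log q(w_i | theta) - log p(w_i)."""
    zeros, ones = [0.0] * len(mu), [1.0] * len(mu)
    total = 0.0
    for _ in range(n_samples):
        w = sample_weights(mu, rho, rng)
        total += log_normal(w, mu, rho) - log_normal(w, zeros, ones)
    return total / n_samples


def closed_form_kl(mu, rho):
    """Analytic KL between a factorised N(mu, rho^2) and N(0, I)."""
    return sum(0.5 * (r * r + m * m - 1.0) - math.log(r) for m, r in zip(mu, rho))


rng = random.Random(0)
mu, rho = [0.5, -1.0], [0.8, 1.2]  # illustrative variational parameters
estimate = mc_kl_estimate(mu, rho, 20000, rng)
exact = closed_form_kl(mu, rho)
```

Because the noise is drawn separately from the parameters, the sampled weights remain a deterministic function of $\mu$ and $\rho$ given the noise, which is what allows gradients to reach the variational parameters during training.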

3.3. MDN-based tool wear predictor

Using the extracted feature representations from BDIFE and the SE block, which are assumed to be domain-invariant due to the benefits of Bayesian learning, an MDN-based predictor is constructed to generate a multimodal predictive distribution of tool wear. The MDN outputs a Gaussian mixture distribution as expressed in Equation (1). In particular, the parameters that constitute the mixture distribution are produced by MDN $f_{\phi} (x) = \{\pi_{k}, \mu_{k}, \sigma_{k}\}_{k = 1}^{K}$ with the learnable parameters $\phi$, where $k$ indexes the mixture component. The optimisation objective of the MDN-based predictor in MD$^{2}$N is the negative log-likelihood over the $N$ training samples $(x_{i}, y_{i})$, as expressed in Equation (4).

$L_{MDN} = - \frac{1}{N} \sum_{i = 1}^{N} \log \left( \sum_{k = 1}^{K} \pi_{k} (x_{i}) \, \mathcal{N} (y_{i} | \mu_{k} (x_{i}), \sigma_{k} (x_{i})) \right) .$ (4)
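The objective in Equation (4) can be computed as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the mixture parameters are passed in as plain lists rather than produced by a network, and the log-sum-exp trick is used as one common way to avoid numerical underflow; the source does not state that this particular trick is used.

```python
import math


def log_gaussian(y, mu, sigma):
    """Log-density of a univariate Gaussian; sigma is the standard deviation."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def mdn_nll(ys, pis, mus, sigmas):
    """Equation (4): mean over samples of -log sum_k pi_k N(y_i | mu_k, sigma_k).
    pis, mus and sigmas hold one list of K mixture parameters per sample."""
    total = 0.0
    for y, pi_i, mu_i, sigma_i in zip(ys, pis, mus, sigmas):
        # Work in log space; components with zero weight contribute nothing
        terms = [
            math.log(p) + log_gaussian(y, m, s)
            for p, m, s in zip(pi_i, mu_i, sigma_i)
            if p > 0.0
        ]
        top = max(terms)
        # log-sum-exp: subtract the largest term before exponentiating
        total -= top + math.log(sum(math.exp(t - top) for t in terms))
    return total / len(ys)


# A single standard Gaussian component evaluated at its mean
loss_unit = mdn_nll([0.0], [[1.0]], [[0.0]], [[1.0]])

# A target far from every component: a naive product of densities would
# underflow to zero here, whereas the log-space form stays finite
loss_far = mdn_nll([1000.0], [[0.5, 0.5]], [[0.0, 1.0]], [[1.0, 1.0]])
```

Evaluating the likelihood in log space is one way to address the kind of numerical instability that arises when a target lies far from all mixture components.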

Despite MDN's practical and theoretical effectiveness, its training requires careful treatment of the components. In particular, numerical instabilities often arise due to the likelihood function (Choi et al. 2018). In this work, several techniques are adopted to stabilise training. For the activation function inside the MDN-based predictor, a modified exponential linear unit is used, as expressed in Equation (5). In addition, Xavier initialisation and L2 regularisation are applied to every MDN parameter. To facilitate learning, two additional regularisation techniques are also developed, as shown in Equations (6) and (7). Equation (6) regularises the negative entropy throughout MDN training, which prevents the generated mixing coefficients from becoming overly imbalanced. Since the loss is minimised during model optimisation (i.e. training), the value of Equation (6) decreases as the distribution of mixing coefficients becomes more uniform (i.e. higher entropy). Equation (7) constrains the MDN from producing overly large standard deviations for any mixture component by directly regularising the magnitude of $σ_{k} (x_{i})$.

$h (x) = \begin{cases} x + 1, & \text{if } x > 0, \\ α \cdot (\exp (x) - 1), & \text{if } x \leq 0. \end{cases}$ (5)

$L_{π} = - \frac{1}{N} \sum_{i = 1}^{N} \sum_{k = 1}^{K} \left( - π_{k} (x_{i}) \log π_{k} (x_{i}) \right) .$ (6)

$L_{σ} = \frac{1}{N} \sum_{i = 1}^{N} \sum_{k = 1}^{K} (σ_{k} (x_{i}))^{2} .$ (7)

As mentioned above, during inference, the tool wear prediction output of the MDN consists of a mixture distribution, with each mixture component having its own distribution parameters (i.e. mean and standard deviation). In this work, one of the most widely used heuristics in the MDN literature is adopted: the mixture component with the highest mixing coefficient, $k^{*} = \arg\max_{k} π_{k}$, is selected to yield the final tool wear prediction $N (μ_{k^{*}}, σ_{k^{*}}^{2})$.
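The activation of Equation (5), the two regularisers of Equations (6) and (7), and the highest-coefficient selection rule can be sketched in a few lines. The function names and the list-based inputs are hypothetical, and the default `alpha=1.0` is an assumption since the text does not state the value of α; the formulas are implemented as written above.

```python
import math


def modified_elu(x, alpha=1.0):
    """Modified exponential linear unit, implemented as written in Equation (5)."""
    return x + 1.0 if x > 0 else alpha * (math.exp(x) - 1.0)


def entropy_regulariser(pis):
    """Equation (6): negative mean entropy of the mixing coefficients.

    pis is a list of N lists, each holding K mixing coefficients.
    More uniform coefficients give a lower (more negative) value.
    """
    total = 0.0
    for pi in pis:
        total += sum(-p * math.log(p) for p in pi if p > 0.0)
    return -total / len(pis)


def sigma_regulariser(sigmas):
    """Equation (7): mean over samples of the summed squared standard deviations."""
    return sum(s * s for sigma in sigmas for s in sigma) / len(sigmas)


def select_component(pi, mu, sigma):
    """Pick the component with the highest mixing coefficient (k* = argmax_k pi_k)."""
    k_star = max(range(len(pi)), key=lambda k: pi[k])
    return k_star, mu[k_star], sigma[k_star]
```

For example, `select_component([0.2, 0.7, 0.1], [10.0, 20.0, 30.0], [1.0, 2.0, 3.0])` picks the second component, so the reported prediction would be the Gaussian with mean 20.0 and standard deviation 2.0.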

3.4. Auxiliary domain classifier

As mentioned above, BDIFE is expected to achieve MDL via Bayesian learning and MDN. In addition, this work further enhances MDL ability by using an adversarial learning-based GRL (Ganin and Lempitsky 2015). During training, an auxiliary domain classifier (ADC), which predicts the domain class $t$ given the representations extracted by BDIFE, is constructed and optimised. Importantly, ADC is optimised in an adversarial manner to maximise the training objective shown in Equation (8), to learn domain-invariant representations. While cross-entropy loss (shown in Equation (8)) is used for ADC in this work, other types of loss functions for classification can also be employed. This idea of guiding model training adversarially for the sake of MDL is realised by GRL, which reverses the sign of the gradient for certain parts of MD $^{2}$ N.

$L_{ADC} = - \frac{1}{N} \sum_{i = 1}^{N} \sum_{d = 1}^{D} t_{d} \log (p (t | x_{i})_{d}) .$ (8)

ADC $f_{ψ}$, parameterised by ψ, consists of several feedforward layers with a softmax activation. For adversarial learning of domain-invariant representations, coupled with the BDIFE and MDN-based predictor, the final objective of MDL training for MD $^{2}$ N is formulated as Equation (9),

$\begin{aligned} E_{w, ϕ, ψ} = & - \frac{1}{N} \sum_{i = 1}^{N} \log \sum_{k = 1}^{K} π_{k} (x_{i}) \, N (y_{i} | μ_{k} (x_{i}), σ_{k} (x_{i})) \\ & + λ_{1} L_{π} + λ_{2} L_{σ} + KL (q (w | θ) \,\|\, p (w)) \\ & - λ_{3} \cdot \left( - \frac{1}{N} \sum_{i = 1}^{N} \sum_{d = 1}^{D} t_{d} \log (p (t | x_{i})_{d}) \right) . \end{aligned}$ (9)

where the Kullback–Leibler (KL) divergence term applies only to the weights of Bayesian convolutions in BDIFE.
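To make the bookkeeping of Equations (8) and (9) concrete, the following sketch computes the domain cross-entropy from softmax outputs and one-hot domain labels and then combines precomputed loss terms into the overall objective. The function names and argument order are illustrative assumptions; the individual loss values are taken as given scalars rather than computed from a network.

```python
import math


def adc_cross_entropy(probs, targets):
    """Equation (8): mean cross-entropy between one-hot domain labels and
    the softmax outputs of the auxiliary domain classifier.

    probs:   N lists of D class probabilities
    targets: N one-hot lists of length D
    """
    total = 0.0
    for p, t in zip(probs, targets):
        total += -sum(t_d * math.log(p_d) for p_d, t_d in zip(p, t) if t_d > 0.0)
    return total / len(probs)


def total_objective(l_mdn, l_pi, l_sigma, kl, l_adc, lam1, lam2, lam3):
    """Equation (9): the ADC loss enters with a negative sign, so minimising
    the objective with respect to the extractor pushes the domain loss up."""
    return l_mdn + lam1 * l_pi + lam2 * l_sigma + kl - lam3 * l_adc
```

A classifier that outputs uniform probabilities over four domains has a cross-entropy of $\log 4$, which is the value the feature extractor is implicitly pushing towards when the representations carry no domain information.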

3.5. Model optimisation

All MD $^{2}$ N components, including BDIFE, the MDN-based predictor, and ADC, are differentiable; therefore, each component is trained using backpropagation in an end-to-end fashion. However, as mentioned in Section 3, since ADC is trained adversarially using GRL, the gradient of $L_{ADC}$ enters the update of the BDIFE parameters $w$ with a reversed sign, as expressed in Equation (10),

$\begin{aligned} Δ w & = - γ \left( \frac{\partial L_{MDN}}{\partial w} + λ_{1} \frac{\partial L_{π}}{\partial w} + λ_{2} \frac{\partial L_{σ}}{\partial w} - λ_{3} \frac{\partial L_{ADC}}{\partial w} \right) , \\ Δ ϕ & = - γ \left( \frac{\partial L_{MDN}}{\partial ϕ} + λ_{1} \frac{\partial L_{π}}{\partial ϕ} + λ_{2} \frac{\partial L_{σ}}{\partial ϕ} \right) , \\ Δ ψ & = - γ \left( λ_{3} \frac{\partial L_{ADC}}{\partial ψ} \right) , \end{aligned}$ (10)

where γ is the learning rate and the KL divergence term is omitted for conciseness.
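The sign pattern in Equation (10) can be seen in a deliberately tiny example. The sketch below assumes a scalar "extractor" $g = w x$ and a two-domain logistic classifier with weight ψ, computes the gradients of the domain loss by hand, and applies only the ADC-related parts of the update. Everything here (the scalar model, the function names, the default step size) is a toy assumption for illustration; in practice the reversal is implemented as a layer inside an automatic differentiation framework.

```python
import math


def adc_loss_and_grads(w, psi, x, t):
    """Binary domain loss for a toy model: feature g = w * x, logit = psi * g.

    Returns (loss, dL/dw, dL/dpsi) for a domain label t in {0, 1}.
    """
    g = w * x
    p = 1.0 / (1.0 + math.exp(-psi * g))
    loss = -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    d_logit = p - t                      # derivative of the loss w.r.t. the logit
    return loss, d_logit * psi * x, d_logit * g


def grl_step(w, psi, x, t, lr=0.1, lam3=1.0):
    """ADC-related part of Equation (10) for one sample."""
    _, d_w, d_psi = adc_loss_and_grads(w, psi, x, t)
    new_psi = psi - lr * lam3 * d_psi    # classifier: ordinary gradient descent
    new_w = w + lr * lam3 * d_w          # extractor: reversed sign (gradient ascent)
    return new_w, new_psi
```

After one step the classifier update lowers the domain loss while the extractor update raises it, which is the adversarial tension that encourages domain-invariant features.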

4. Experiments

4.1. Milling setup

In this work, Ti-6Al-4V milling is conducted under various machining conditions to obtain real-world data that can validate the efficacy of MD $^{2}$ N. Detailed information on the milling experiments is provided in Table 1. A five-axis CNC machine (HTC-1000, HNK Co.) and a dynamometer (Kistler, 9257B) as a cutting force sensor are used. A machining distance of 100 mm is denoted as one pass in the experiments. The diameter of the tool holder (R220.291-0050-06.5A, Seco Tools Co.) is 50 mm, and the relief angle of the tool insert (RPHT1204M0T-6-M13, Seco Tools Co.) is 11$^{\circ}$. To further increase domain diversity (i.e. machining conditions), a cryogenic and minimum quantity lubrication (CryoMQL) setting is applied in addition to the wet setting; in the CryoMQL setting, liquid nitrogen is sprayed on the cutting tool and the work material, leading to different trends in tool wear and sensor measurement. The milling setup is shown in Figure 4, and the schematic diagram of the experimental setup is illustrated in Figure 5.

Figure 4. The experimental setup for the milling process.

The cutting tool and tool holder attached to the work material, performing a milling process with a dynamometer attached to the work stage, the lubricant nozzle facing toward the cutting tool.

Figure 5. The schematic diagram of the experimental setup.

The 5-axis CNC machine with the work stage positioned on left with other experimental settings, leading to the measurement of tool wear and data acquisition, leading to a PC that inputs tool wear length and cutting force.

Table 1. Milling experiment conditions.

4.2. Data description

From the milling experiments, eight distinct datasets with different machining conditions are collected. Among the variety of sensor signals that can be used to monitor the milling process, including force, vibration, and audio, cutting force is used in this work for several reasons. First, cutting force during the milling process increases as tool wear progresses, thus helping to predict tool wear in an online manner. Second, compared to vibration and audio signals, force signals can be used to monitor machining safety and stability and therefore have higher real-world applicability. In addition, as force signals are measured with sensors attached to the work stage (as shown in Figure 4), they are less affected by external noise, unlike audio signals. Moreover, force signals measured in three different axes can be used in other machining processes, hence having higher versatility. The input data used in this work consist of cutting forces sampled at a frequency of 1000 Hz, and the descriptive statistics are shown in Table 2. Figure 6 shows exemplary sensor measurement input data.

Table 2. Descriptive statistics of datasets.

Figure 6. Visualisation of sensor measurements.

A plot of three force signals in the x, y, and z directions, each coloured with different colours, with the x-axis of time measured in milliseconds and the y-axis of the force measured in Newton.

Figure 7. Measured tool wear of (a) Experiment 2 and (b) Experiment 6.

Actual measurement of the surface of work material in wet and CryoMQL settings at 1, 5, 10, 15, and 20 passes positioned from top to bottom, the crack and wear progress as the machining proceeds.

4.3. Tool wear degree

For each pass of the milling process, the degree of tool wear is measured with a microscope (KEYENCE, VK-X200). The actual tool wear measurements from Experiments 2 and 6 are shown in Figure 7. To obtain a continuous degree of tool wear as true target values, the tool wear equation shown in Equation (11) is used. In particular, Usui's tool wear model (Usui, Shirakashi, and Kitagawa 1984) is employed, where tool wear $VB$ is associated with machining time $T$. Based on the Levenberg–Marquardt (LM) method, the fitted equation is used to estimate the degree of tool wear between the machining distances at which the exact wear degree is measured. Detailed equations used in the LM method are provided in Equations (12), (13), and (14).

$VB = d (a + b T^{c})^{- 1} .$ (11)

$p_{k + 1} = p_{k} - \left( J_{r}^{T} J_{r} + μ_{k} \, \mathrm{diag} (J_{r}^{T} J_{r}) \right)^{- 1} J_{r}^{T} r (p_{k}), \quad k \geq 0 .$ (12)

$J_{r} (p) = \begin{bmatrix} \frac{\partial r_{1} (p)}{\partial p_{1}} & \dots & \frac{\partial r_{1} (p)}{\partial p_{m}} \\ \vdots & \ddots & \vdots \\ \frac{\partial r_{n} (p)}{\partial p_{1}} & \dots & \frac{\partial r_{n} (p)}{\partial p_{m}} \end{bmatrix}$ (13)

$r (p) = \begin{bmatrix} r_{1} (p) \\ r_{2} (p) \\ \vdots \\ r_{n} (p) \end{bmatrix} = \begin{bmatrix} y_{1} - f (x_{1}, p) \\ y_{2} - f (x_{2}, p) \\ \vdots \\ y_{n} - f (x_{n}, p) \end{bmatrix}$ (14)
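The LM iteration of Equations (12) to (14) can be sketched for the wear curve of Equation (11) as below. This is an illustrative standard-library implementation under several assumptions that are not stated in the text: $d$ is fixed to 1 (scaling $d$, $a$, and $b$ together leaves the curve unchanged, so only $a$, $b$, $c$ are fitted), the Jacobian is approximated by forward differences, and the damping factor $μ_{k}$ is divided or multiplied by 10 depending on whether a step reduces the squared error. All function names are hypothetical.

```python
import math


def wear_model(T, p, d=1.0):
    """Equation (11): VB = d / (a + b * T**c), with d held fixed (assumption)."""
    a, b, c = p
    return d / (a + b * T ** c)


def residuals(p, Ts, ys):
    """Equation (14): r_i(p) = y_i - f(x_i, p)."""
    return [y - wear_model(T, p) for T, y in zip(Ts, ys)]


def cost(p, Ts, ys):
    return sum(r * r for r in residuals(p, Ts, ys))


def jacobian_columns(p, Ts, ys, eps=1e-7):
    """Forward-difference approximation of Equation (13); cols[j][i] = dr_i/dp_j."""
    r0 = residuals(p, Ts, ys)
    cols = []
    for j in range(len(p)):
        q = list(p)
        q[j] += eps
        cols.append([(after - before) / eps for before, after in zip(r0, residuals(q, Ts, ys))])
    return cols


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[pivot] = M[pivot], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def levenberg_marquardt(p0, Ts, ys, mu=1e-2, n_iter=200):
    """Iterate Equation (12), accepting a step only if it lowers the squared error."""
    p, best = list(p0), cost(p0, Ts, ys)
    m = len(p)
    for _ in range(n_iter):
        try:
            r = residuals(p, Ts, ys)
            J = jacobian_columns(p, Ts, ys)
            JtJ = [[sum(u * v for u, v in zip(J[i], J[j])) for j in range(m)] for i in range(m)]
            Jtr = [sum(u * v for u, v in zip(J[i], r)) for i in range(m)]
            A = [[JtJ[i][j] + (mu * JtJ[i][i] if i == j else 0.0) for j in range(m)] for i in range(m)]
            step = solve(A, Jtr)
            candidate = [pi - si for pi, si in zip(p, step)]
            new = cost(candidate, Ts, ys)
        except (ZeroDivisionError, OverflowError):
            new = float("inf")           # treat numerically invalid steps as rejected
        if new < best:
            p, best, mu = candidate, new, mu / 10.0
        else:
            mu *= 10.0
    return p, best
```

Given wear measurements at a handful of passes, `levenberg_marquardt` returns fitted parameters that can then be plugged into `wear_model` to interpolate the wear degree at intermediate machining times.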

4.4. Data preprocessing

Standardisation is applied to each variable to normalise the data. Unlike conventional data-driven approaches, raw time-series signals are used in this work to enable online prediction using the temporal information inherent in the input data. In addition, a sliding window method is applied to transform the raw data into a format suitable for model input. To reduce training time and downsample the data, a sliding window of size 400 with a stride of 200 is used (Ma et al. 2021).
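The preprocessing described above can be sketched as follows. With the 1000 Hz sampling rate reported in Section 4.2, a window of 400 samples covers 0.4 s of signal and a stride of 200 gives 50% overlap between consecutive windows. The function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not specify which estimator is used.

```python
import math


def standardise(column):
    """Zero-mean, unit-variance scaling of one variable (population std assumed).

    Raises ZeroDivisionError for a constant column.
    """
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std for v in column]


def sliding_windows(signal, size=400, stride=200):
    """Cut a time-ordered sequence of samples into overlapping windows.

    signal is a list of per-time-step samples, e.g. [Fx, Fy, Fz] rows.
    Trailing samples that do not fill a whole window are dropped.
    """
    return [signal[start:start + size]
            for start in range(0, len(signal) - size + 1, stride)]
```

A recording of 1000 time steps therefore yields four windows starting at samples 0, 200, 400, and 600.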

4.5. Evaluation metrics

Since the target variable (i.e. tool wear degree) in this work is continuous, evaluation metrics appropriate for regression analysis are employed to validate the experimental results. In particular, mean absolute error (MAE) and root mean squared error (RMSE), which measure the average magnitude of prediction error in linear and quadratic manners, respectively, are used. In addition, mean absolute percentage error (MAPE), which is one of the most popular regression metrics due to its robustness to outliers (Prestwich et al. 2014), is also used. The formulas for the metrics (i.e. MAE, RMSE, MAPE) are given in Equation (15).

$\begin{aligned} MAE & = \frac{1}{N} \sum_{i = 1}^{N} | y_{i} - \hat{y}_{i} | , \\ RMSE & = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} (y_{i} - \hat{y}_{i})^{2}} , \\ MAPE & = \frac{100}{N} \sum_{i = 1}^{N} \frac{| y_{i} - \hat{y}_{i} |}{| y_{i} |} . \end{aligned}$ (15)
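The three metrics of Equation (15) translate directly into code. The sketch below uses plain Python lists and hypothetical function names; MAPE is undefined when a true value is zero, which this sketch does not guard against.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, with the factor of 100 from Equation (15)."""
    return 100.0 / len(y_true) * sum(abs(y - p) / abs(y) for y, p in zip(y_true, y_pred))
```

For true wear values of 100 and 200 and predictions of 110 and 180, these functions give an MAE of 15, an RMSE of about 15.81, and a MAPE of 10.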

4.6. Implementation details

The experiments in this work were conducted using the TensorFlow library on a graphics processing unit. During model optimisation, Adadelta is used with a batch size of 256, and cosine decay is used as the learning rate scheduler with an initial learning rate of 0.01. In addition, early stopping is used to prevent model overfitting. In terms of computational time, training MD $^{2}$ N for a single epoch takes around 40 seconds, whereas a prediction for a single sample at test time takes 0.0002 seconds on average. For a fair evaluation, all data are divided into training, validation, and test sets, with a ratio of 70%, 10%, and 20%, respectively. Five trials with different random seeds are performed to obtain the error bars of the results.
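For readers unfamiliar with cosine decay, the schedule can be sketched as a single function. Only the initial learning rate of 0.01 comes from the text; the number of decay steps and the final floor value are not reported, so `decay_steps` and `floor` below are hypothetical parameters, and the function is a generic form of the schedule rather than the exact configuration used in the experiments.

```python
import math


def cosine_decay(step, decay_steps, initial_lr=0.01, floor=0.0):
    """Learning rate after `step` updates under a cosine decay schedule.

    The rate starts at initial_lr, follows half a cosine period, and
    stays at `floor` once `decay_steps` updates have been performed.
    """
    progress = min(step, decay_steps) / decay_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return floor + (initial_lr - floor) * cosine
```

With `decay_steps=1000`, the rate is 0.01 at step 0, about 0.005 at step 500, and 0 from step 1000 onwards.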

5. Results

This work provides empirical results on tool wear prediction performance in three parts. First, experimental data from the wet setting with four different machining conditions are used. Second, data from the CryoMQL setting with the same number of machining conditions are used. In the last case, data from all eight machining conditions are used. In each part, training data and test data are composed of multi-domain data, thus the tool wear prediction results indicate whether the predictive model can accomplish MDL.

5.1. Evaluation of wet setting data

Using Datasets 1, 2, 3, and 4 collected in the wet setting, the tool wear prediction performance of the proposed MD $^{2}$ N is validated. Using the same training set consisting of the four machining conditions, four different test sets are used to verify whether MD $^{2}$ N has learned domain-specific representations for tool wear prediction. In addition, a single combined test set containing all machining conditions is used to verify whether MD $^{2}$ N has achieved MDL. As shown in Table 3, MD $^{2}$ N shows good prediction performance in all four machining conditions even when trained on multi-domain data. In particular, the test MAPE ranges from 0.04 to 0.07 across the input domains.

Table 3. Prediction performance for wet and CryoMQL setting data.

5.2. Evaluation of CryoMQL setting data

Tool wear prediction results for MD $^{2}$ N using the training set collected in the CryoMQL setting with four different machining conditions are provided in Table 3. Similar to the results of the wet setting, the proposed MD $^{2}$ N provides accurate predictions in all four machining conditions, showing that it has successfully learned domain-invariant representations, which can be used for multiple machining conditions simultaneously. The magnitude of the error for the CryoMQL setting is smaller than that of the wet setting because the ground-truth tool wear degree is smaller due to the use of CryoMQL.

5.3. Evaluation of wet and CryoMQL setting data

Next, MD $^{2}$ N is trained and tested using a combined dataset containing all datasets from eight different machining conditions (including both wet and CryoMQL settings). As shown in Table 4, the proposed MD $^{2}$ N still performs well under eight machining conditions. Furthermore, compared to the previous results in Table 3, although the number of machining conditions has doubled, the tool wear prediction performance does not change much and even improves in some cases (i.e. Datasets 1, 2, 3, and 7). In terms of MAPE, MD $^{2}$ N shows test values from 0.02 to 0.06, which indicates that it performs well on various test domains as well. Therefore, the capability of the proposed MD $^{2}$ N to learn multi-domain features for tool wear prediction under multiple machining conditions has been empirically verified.

Table 4. Prediction performance for both settings data.

The prediction results of MD $^{2}$ N trained on eight machining conditions are visualised for every dataset in Figure 8. For all test datasets, the proposed MD $^{2}$ N shows good prediction performance, closely following the increasing trend of ongoing tool wear during the milling process. For some datasets, however, the prediction results show peaks where the error is larger than at other time points. This might be caused by the increase in tool wear and the reduction of tool stiffness, which generate abrupt vibrations. Moreover, as the machining process proceeds, the cutting tool might adhere to the work material due to the additional friction force (e.g. ploughing force), which negatively affects stable data acquisition.

Figure 8. Tool wear prediction performance of MD $^{2}$ N on all datasets under wet and CryoMQL settings. (a) Dataset 1. (b) Dataset 2. (c) Dataset 3. (d) Dataset 4. (e) Dataset 5. (f) Dataset 6. (g) Dataset 7 and (h) Dataset 8.

Tool wear prediction results with an increasing plot with a black solid line for the ground-truth tool wear and a coloured line for the predicted tool wear for every dataset, the x-axis represents machining distances measured in millimetres and the y-axis showing tool wear degrees measured in micrometres.

Using the MD $^{2}$ N trained on eight datasets under wet and CryoMQL settings, the learned latent space is visualised with t-distributed stochastic neighbour embedding (t-SNE) (Van der Maaten and Hinton 2008). In particular, learned feature representations from the intermediate layer of MD $^{2}$ N, right after BDIFE $g_{w}$, are extracted. Representations from the layer that comes after BDIFE are used to visualise the latent space because its outputs are fed into both the MDN-based predictor and ADC. Five thousand test samples from each machining condition are used to generate feature representations from MD $^{2}$ N, which are then transformed into t-SNE embeddings for visualisation. The latent space before training and its transformation during training are illustrated in Figure 9. The latent space of MD $^{2}$ N before training seems to have strong separability among different machining conditions, showing few overlapping regions between different datasets. However, as training proceeds, the latent embeddings gradually mingle together, decreasing the separability among machining conditions. In addition, after training, the embeddings become more spread out, with structured patterns and less separation between machining conditions. Therefore, the visualisation of the latent space of MD $^{2}$ N indicates its MDL ability, with domain-invariant latent feature representations that enable tool wear prediction under multiple machining conditions.

Figure 9. t-SNE visualisation of the latent space of MD $^{2}$ N: (a) before training, (b) after 100 epochs, (c) after 200 epochs, (d) after 300 epochs, and (e) after training.

Two-dimensional visualisation of the latent space generated using the proposed method, scatterplots with each point representing a data sample, each sampled coloured differently according to the machining setting.

5.4. Comparison with existing methods

Using the same experimental setting as in previous results, the performance of MD $^{2}$ N is compared with existing data-driven tool wear prediction models, including support vector regression (SVR), random forest (RF), LSTM, GRU, and CNN. As shown in Table 5, in all three cases, namely (1) four machining conditions in the wet setting, (2) four machining conditions in the CryoMQL setting, and (3) all eight machining conditions, the proposed MD $^{2}$ N yields a much higher prediction performance. The prediction results indicate that the proposed MD $^{2}$ N has more successfully learned multi-domain features for tool wear prediction under multiple machining conditions, compared to existing methods. In particular, the MD $^{2}$ N components, including BDIFE, the MDN-based predictor, and ADC, have empirically shown effectiveness.

Table 5. Performance comparison with existing models.

5.5. Comparison with state-of-the-art methods

In addition, the MD $^{2}$ N performance is compared to existing SOTA tool wear prediction methods (Hahn and Mechefske 2021; Sun et al. 2020; J. Wang et al. 2019). For the sake of reproducibility, existing methods with publicly available source code are employed for performance comparison. Although the SOTA methods have shown high predictive performance in tool wear prediction tasks, they have not been used under multiple machining conditions. The experimental results indicate that the proposed MD $^{2}$ N outperforms existing SOTA methods, as provided in Table 6. In particular, the superior performance of the proposed method empirically indicates that MD $^{2}$ N is capable of learning domain-invariant representations under multiple machining conditions.

Table 6. Performance comparison with state-of-the-art methods.

6. Discussion

From the perspective of machining analysis, the precision of tool wear prediction by the proposed MD $^{2}$ N under multiple machining conditions, including wet and CryoMQL settings, is desirable. In particular, for difficult-to-cut materials (e.g. titanium), for which various machining conditions and types of lubricants are applied to meet the desired machining quality, the proposed method can greatly improve the efficiency of the machining process. The online prediction of tool wear using the proposed method can also contribute to the overall productivity of the machining process, as a single model can be used in real-time so that human interruption is minimised during production processes.

As mentioned above, titanium (i.e. Ti-6Al-4V) has poor thermal properties, such as low specific heat and low thermal conductivity, which are directly related to the increment of machining temperature. In the CryoMQL setting, liquid nitrogen (LN $_{2}$ ) and MQL help mitigate the heat generated during milling. In addition, their use leads to a reduction of friction and increased lubricant ability in the tool-chip and tool-work material interface, thus leading to a decrease in tool wear in the case of CryoMQL settings. This is shown in Figure , where the magnitude of tool wear degree is smaller in general for CryoMQL settings than wet settings. Specifically, in the case of Datasets 2 and 6, the difference in tool wear degree is larger than in other conditions because, as the machining speed increased, the generation of machining heat increased, which led to the increment of LN $_{2}$ effects. However, in the case of Datasets 7 and 8, the degree of tool wear is not considerably reduced. The increment of feed per tooth and axial depth is negatively induced by the large machining area, which is a high-contact area. Hence, increasing the cutting speed and decreasing the contact area of tool-work material is a way to maximise the effects of CryoMQL.

From the perspective of a data-driven tool wear prediction approach, the proposed method performs prediction solely based on real-time sensor measurements. Because a mechanistic model is not employed, the proposed method has higher flexibility and adaptability in real-world machining practices. However, as it is a purely data-driven method, when there is noise or abnormalities in sensors in real industrial applications, the quality of tool wear prediction could be affected (García-Ordás et al. 2018). For instance, when sensor precision degrades and measurement noise arises, the input data for the proposed method change. Therefore, the prediction results would differ. While the employed Bayesian learning-based approach to multi-domain feature learning can mitigate the effects of sensor noise and abnormalities on the tool wear prediction performance to a certain degree, a severe amount of noise would potentially deteriorate the proposed method's performance. For these reasons, using other factors of the machining environment that are less influenced by sensor abnormalities, which are contingent upon the state and geometric characteristics of the work material, can improve robustness to sensor noise and precision of prediction (Dreyfus et al. 2022). In addition, incorporating aleatoric and epistemic uncertainty for modelling and prediction can also help develop a robust tool wear prediction method under inherent sensor noise and abnormalities.

From the perspective of sustainable manufacturing and production, the proposed method can contribute to achieving ZDM in various ways. First, due to the accurate online tool wear prediction performed by the proposed MD $^{2}$ N, tool costs, which account for a large part of manufacturing costs, can be reduced. In addition, the efficiency of tool replacement decisions can be improved. For instance, not only can the tool be maximally utilised, but tool breakage that damages machinery can also be prevented in advance. This results in the reduction of production defects and waste, thus contributing to ZDM. Furthermore, for industry decision-makers, the proposed MD $^{2}$ N has advantages in reducing modelling and maintenance costs, because a single model is required for tool wear prediction under various machining conditions. In particular, due to the proposed method's ability to learn domain-invariant features, multiple models are not needed. In addition, when coupled with relevant digital technologies in modern manufacturing ecosystems, such as machine vision (Konstantinidis et al. 2023) and digital twins (Konstantinidis et al. 2022), the proposed method can be broadly applied to other types of production domains. Therefore, the aforementioned advantages of the proposed method contribute to ZDM in the era of Industry 4.0 (Psarommatis, May, and Azamfirei 2023).

7. Conclusion and future work

This work presents a novel tool wear prediction method (i.e. MD $^{2}$ N) that can perform well under multiple machining conditions during the milling process. Considering real-world machining practices, where various machining conditions are used interchangeably and learning domain-invariant feature representations is required, the proposed MD $^{2}$ N can greatly improve efficiency and productivity. In particular, MD $^{2}$ N employs a Bayesian learning-based feature extractor (i.e. BDIFE) to learn multi-domain representations for MDL. An MDN-based predictor is then used to generate a multimodal predictive distribution for ongoing tool wear, which might also handle potential inverse problems in multiple machining conditions. In addition, a GRL-based auxiliary classifier (i.e. ADC) is used in an adversarial learning manner to further enhance the MDL capability of MD $^{2}$ N.

Extensive experiments using datasets from real-world milling processes conducted under multiple machining conditions prove the effectiveness of the proposed MD $^{2}$ N. Using datasets collected under wet and CryoMQL settings consisting of eight different machining conditions, the proposed MD $^{2}$ N's efficacy in tool wear prediction under multiple machining conditions has been verified. Compared to existing data-driven tool wear prediction methods and SOTA methods in the literature, the proposed method shows superior prediction performance. The results indicate that MD $^{2}$ N can effectively capture both domain-invariant and domain-specific representations during training. Furthermore, considering that MD $^{2}$ N takes 0.0002 seconds (on average) for inference, a real-time prognostic application seems feasible.

One of the most prospective research areas in which the proposed method can be integrated into sustainable production domains is ZDM (Azamfirei, Psarommatis, and Lagrosen 2023). Recently, the concept of ZDM, which aims to ensure product and process quality by minimising defects using advanced data-driven technologies, has arisen (Psarommatis et al. 2020). In particular, quality improvement based on digital technologies to improve the sustainability of production systems is one of the most widely studied topics both in academia and industry (Psarommatis et al. 2022). In this regard, the proposed method can contribute to achieving ZDM not only by reducing tool costs but also by improving product quality using accurate tool wear prediction results. In particular, the application of the proposed method in real-world machining processes would help determine proper tool change time and prevent unexpected process stoppage due to tool breakage, thus developing efficient tool maintenance strategies with lower defects (Psarommatis, May, and Azamfirei 2023). Combining the proposed method with existing ZDM strategies in various aspects, such as detection, repair, and prevention (Psarommatis et al. 2020), would also bring improvement in sustainable manufacturing. For instance, using predictions from the proposed method and other virtual defect detection methods can enable more efficient operation for manufacturing systems. Furthermore, using a single model for multiple machining conditions could lower operation costs and improve productivity in practice. Especially for decision-making in the manufacturing industry, this would enhance production efficiency and sustainability. From the perspective of a prevention strategy for ZDM, using a single model can even reduce the possibility of malfunction and maintenance costs.

It is worth noting that the objective of this work is different from that of the literature on tool wear prediction under a novel machining condition, in which domain adaptation and transfer learning techniques can be applied (Oh et al. 2022). In particular, a domain adaptation setting, where a small amount of supervision is provided for the target domain and a covariate shift occurs, is not entirely identical to the problem addressed in this work. However, the proposed method can serve as a basis for it, since related tasks can be further improved when successful MDL is achieved.

Future work includes the application of the proposed method to other predictive maintenance approaches (e.g. condition-based maintenance), quality prediction, and fault diagnosis for industrial machinery (G. Kim, Choi, et al. 2023; Jun and Kim 2017; Konstantinidis et al. 2022). Using the proposed method with the tool recovery process, to determine when to halt the machining process for the effective reuse of the tools (Jun et al. 2012), remains future work. Integrating the proposed method with other defect detection methods (e.g. virtual methods, vision-based methods) for improved inspection quality and reduced waste under the detection strategy of ZDM (Psarommatis et al. 2020) also remains future research. Another direction is to develop uncertainty-aware tool wear prediction under multiple machining conditions, utilising an additional benefit that the MDN structure provides, namely predictive uncertainty without an iterative sampling procedure (Choi et al. 2018). From the perspective of ZDM, developing approaches for automated quality inspection, process control, and maintenance (Azamfirei, Psarommatis, and Lagrosen 2023; Dreyfus et al. 2022) under various manufacturing conditions remains a promising future research area. In addition, to achieve sustainable manufacturing, combining the proposed method with other advanced technologies, such as digital twins and robotics (Azamfirei, Psarommatis, and Lagrosen 2023; Konstantinidis, Mouroutsos, and Gasteratos 2021), would be one of the most prospective research directions for digital transformation in Industry 4.0.

Acknowledgments

The authors would like to express appreciation to the editors and referees for their helpful comments to improve the quality of our work. The authors thank Minjoo Ku at LG Electronics for helpful discussions.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data that support the findings of this work are available from the corresponding author, [SL], upon reasonable request.

Additional information

Funding

This work was supported by the Ministry of Trade, Industry and Energy (MOTIE) of Korea under Grant number 20017932; the National Research Foundation of Korea (NRF) funded by the Korea government (MSIT) under Grant number 2021R1F1A1046416; the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT), Artificial Intelligence Graduate School Program (UNIST), under Grant number 2020-0-01336; the Science and Technology Commercialization Promotion Agency funded by the Korea government in 2023 (Ministry of Science and ICT) under Grant number RS-2023-00254286; and the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT under Grant number 2021H1D8A306520712.

Notes on contributors

Gyeongho Kim

Gyeongho Kim received his B.S. degree in industrial engineering from Ulsan National Institute of Science and Technology (UNIST), Republic of Korea, in 2021. He is currently a Ph.D. candidate in the Department of Industrial Engineering at UNIST. His research interests include semi-supervised learning, Bayesian learning, the application of machine learning in industries (industrial AI), and uncertainty in machine learning.

Sang Min Yang

Sang Min Yang received a bachelor's degree in mechanical engineering from Ulsan National Institute of Science and Technology (UNIST), Korea, in 2020. He is currently pursuing a combined master's and doctoral degree in mechanical engineering at the same university. His research interests include machining processes and analyses.

Sinwon Kim

Sinwon Kim received a bachelor's degree in mechanical engineering from the University of Ulsan, Korea, in 2021. He is currently pursuing a combined master's and doctoral degree in the Department of Mechanical Engineering at Ulsan National Institute of Science and Technology (UNIST), Ulsan, Korea. His research interests include machining processes and analyses.

Do Young Kim

Do Young Kim received the B.S. and Ph.D. degrees from Ulsan National Institute of Science and Technology (UNIST). He is currently an Assistant Professor of Mechatronics Engineering at Chungnam National University. His research interests mainly include multiscale manufacturing processes.

Jae Gyeong Choi

Jae Gyeong Choi received her B.S. in design and human engineering from Ulsan National Institute of Science and Technology (UNIST), Republic of Korea, in 2019. Since 2020, she has been a combined M.S./Ph.D. student in industrial engineering at UNIST. Her research interests include machine learning/deep learning, industrial artificial intelligence, and video/audio processing.

Hyung Wook Park

Hyung Wook Park has been a Professor at Ulsan National Institute of Science and Technology (UNIST) since 2009. He received his Ph.D. in mechanical engineering from Georgia Institute of Technology, USA, in 2008. His research interests include micro/macro machining, machining dynamics, robot arm machining, and multiphysics-based micro/meso-scale manufacturing (MP-M2) processes and systems.

Sunghoon Lim

Sunghoon Lim received his B.S. and M.S. in industrial engineering from KAIST, Republic of Korea, in 2005 and 2009, respectively, and his Ph.D. in industrial engineering from the Pennsylvania State University, University Park, PA, in 2018. He is currently an Associate Professor with the Department of Industrial Engineering, an Adjunct Associate Professor with the Graduate School of Artificial Intelligence, and Head of the Industry Intelligentization Institute, Ulsan National Institute of Science and Technology (UNIST), Republic of Korea. His research interests include machine learning and deep learning, industrial artificial intelligence, and smart manufacturing.

References

Azamfirei, Victor, Foivos Psarommatis, and Yvonne Lagrosen. 2023. “Application of Automation for In-Line Quality Inspection, a Zero-Defect Manufacturing Approach.” Journal of Manufacturing Systems 67:1–22. https://doi.org/10.1016/j.jmsy.2022.12.010.
Berriel, Rodrigo, Stephane Lathuillere, Moin Nabi, Tassilo Klein, Thiago Oliveira-Santos, Nicu Sebe, and Elisa Ricci. 2019. “Budget-Aware Adapters for Multi-Domain Learning.” Paper presented at IEEE/CVF International Conference on Computer Vision, Seoul, KR, October 2019.
Bilen, Hakan, and Andrea Vedaldi. 2017. “Universal Representations: The Missing Link Between Faces, Text, Planktons, and Cat Breeds.” Online preprint. https://arxiv.org/abs/1701.07275.
Choi, Sungjoon, Kyungjae Lee, Sungbin Lim, and Songhwai Oh. 2018. “Uncertainty-Aware Learning from Demonstration Using Mixture Density Networks with Sampling-Free Variance Modeling.” Paper presented at IEEE International Conference on Robotics and Automation, Brisbane, AU, May 2018.
Dreyfus, Paul-Arthur, Foivos Psarommatis, Gokan May, and Dimitris Kiritsis. 2022. “Virtual Metrology as an Approach for Product Quality Estimation in Industry 4.0: A Systematic Review and Integrative Conceptual Framework.” International Journal of Production Research 60 (2): 742–765. https://doi.org/10.1080/00207543.2021.1976433.
Ganin, Yaroslav, and Victor Lempitsky. 2015. “Unsupervised Domain Adaptation by Backpropagation.” Paper presented at International Conference on Machine Learning, Lille, FR, July 2015.
Gao, Zhifan, Saidi Guo, Chenchu Xu, Jinglin Zhang, Mingming Gong, Javier Del Ser, and Shuo Li. 2022. “Multi-Domain Adversarial Variational Bayesian Inference for Domain Generalization.” IEEE Transactions on Circuits and Systems for Video Technology. Early Access. https://doi.org/10.1109/TCSVT.2022.3232112.
García-Ordás, María Teresa, Enrique Alegre-Gutiérrez, Víctor González-Castro, and Rocío Alaiz-Rodríguez. 2018. “Combining Shape and Contour Features to Improve Tool Wear Monitoring in Milling Processes.” International Journal of Production Research 56 (11): 3901–3913. https://doi.org/10.1080/00207543.2018.1435919.
Guo, Liang, Yaoxiang Yu, Hongli Gao, Tingting Feng, and Yuekai Liu. 2021. “Online Remaining Useful Life Prediction of Milling Cutters Based on Multisource Data and Feature Learning.” IEEE Transactions on Industrial Informatics 18 (8): 5199–5208. https://doi.org/10.1109/TII.2021.3118994.
Hahn, Tim Von, and Chris K. Mechefske. 2021. “Self-Supervised Learning for Tool Wear Monitoring with a Disentangled-Variational-Autoencoder.” International Journal of Hydromechatronics 4 (1): 69–98. https://doi.org/10.1504/IJHM.2021.114174.
He, Shuting, Hao Luo, Weihua Chen, Miao Zhang, Yuqi Zhang, Fan Wang, Hao Li, and Wei Jiang. 2020. “Multi-Domain Learning and Identity Mining for Vehicle Re-Identification.” Paper presented at IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Virtual, 2020.
He, Jianliang, Yuxin Sun, Chen Yin, Yan He, and Yulin Wang. 2022. “Cross-Domain Adaptation Network Based on Attention Mechanism for Tool Wear Prediction.” Journal of Intelligent Manufacturing 2022:1–23. https://doi.org/10.1007/s10845-022-02005-z.
He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. “Deep Residual Learning for Image Recognition.” Paper presented at IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, June 2016.
Hu, Jie, Li Shen, and Gang Sun. 2018. “Squeeze-and-Excitation Networks.” Paper presented at IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018.
Ji, Bongjun, Farhad Ameri, and Hyunbo Cho. 2021. “A Non-Conformance Rate Prediction Method Supported by Machine Learning and Ontology in Reducing Underproduction Cost and Overproduction Cost.” International Journal of Production Research 59 (16): 5011–5031. https://doi.org/10.1080/00207543.2021.1933237.
Jun, Hong-Bae, and David Kim. 2017. “A Bayesian Network-Based Approach for Fault Analysis.” Expert Systems with Applications 81:332–348. https://doi.org/10.1016/j.eswa.2017.03.056.
Jun, Hong-Bae, Dong-Ho Lee, Jae-Gon Kim, and Dimitris Kiritsis. 2012. “Heuristic Algorithms for Minimising Total Recovery Cost of End-of-Life Products Under Quality Constraints.” International Journal of Production Research 50 (19): 5330–5347. https://doi.org/10.1080/00207543.2011.624562.
Kim, Gyeongho, Jae Gyeong Choi, Minjoo Ku, Hyewon Cho, and Sunghoon Lim. 2021. “A Multimodal Deep Learning-Based Fault Detection Model for a Plastic Injection Molding Process.” IEEE Access 9:132455–132467. https://doi.org/10.1109/ACCESS.2021.3115665.
Kim, Gyeongho, Jae Gyeong Choi, Minjoo Ku, and Sunghoon Lim. 2023. “Developing a Semi-Supervised Learning and Ordinal Classification Framework for Quality Level Prediction in Manufacturing.” Computers & Industrial Engineering 181:109286. https://doi.org/10.1016/j.cie.2023.109286.
Kim, Hyojoong, and Heeyoung Kim. 2023. “Deep Embedding Kernel Mixture Networks for Conditional Anomaly Detection in High-Dimensional Data.” International Journal of Production Research 61 (4): 1101–1113. https://doi.org/10.1080/00207543.2022.2027040.
Kim, Gyeongho, Sang Min Yang, Dong Min Kim, Sinwon Kim, Jae Gyeong Choi, Minjoo Ku, Sunghoon Lim, and Hyung Wook Park. 2023. “Bayesian-Based Uncertainty-Aware Tool-Wear Prediction Model in End-Milling Process of Titanium Alloy.” Applied Soft Computing 148:110922. https://doi.org/10.1016/j.asoc.2023.110922.
Kim, Gyeongho, Sang Min Yang, Sinwon Kim, Dong Min Kim, Sunghoon Lim, and Hyung Wook Park. 2022. “Tool Wear Prediction in the End Milling Process of Ti-6Al-4V using Bayesian Learning.” Paper presented at International Conference on Advanced Mechatronic Systems, Toyama, Japan, December 2022. https://doi.org/10.1109/ICAMechS57222.2022.10003325.
Kingma, Diederik P., and Max Welling. 2013. “Auto-Encoding Variational Bayes.” Online preprint. https://arxiv.org/abs/1312.6114.
Konstantinidis, Fotios K., Spyridon G. Mouroutsos, and Antonios Gasteratos. 2021. “The Role of Machine Vision in Industry 4.0: An Automotive Manufacturing Perspective.” Paper presented at IEEE International Conference on Imaging Systems and Techniques, Kaohsiung, Taiwan, December 2021. https://doi.org/10.1109/IST50367.2021.9651453.
Konstantinidis, Fotios K., Nikolaos Myrillas, Spyridon G. Mouroutsos, Dimitrios Koulouriotis, and Antonios Gasteratos. 2022. “Assessment of Industry 4.0 for Modern Manufacturing Ecosystem: A Systematic Survey of Surveys.” Machines 10 (9): 746. https://doi.org/10.3390/machines10090746.
Konstantinidis, Fotios K., Nikolaos Myrillas, Konstantinos Tsintotas, Spyridon Mouroutsos, and Antonios Gasteratos. 2023. “A Technology Maturity Assessment Framework for Industry 5.0 Machine Vision Systems Based on Systematic Literature Review in Automotive Manufacturing.” International Journal of Production Research. Early Access. https://doi.org/10.1080/00207543.2023.2270588.
Kuo, Yong-Hong, and Andrew Kusiak. 2019. “From Data to Big Data in Production Research: The Past and Future Trends.” International Journal of Production Research 57 (15-16): 4828–4853. https://doi.org/10.1080/00207543.2018.1443230.
Kusiak, Andrew. 2020. “Convolutional and Generative Adversarial Neural Networks in Manufacturing.” International Journal of Production Research 58 (5): 1594–1604. https://doi.org/10.1080/00207543.2019.1662133.
Li, Chen, and Gim Hee Lee. 2019. “Generating Multiple Hypotheses for 3D Human Pose Estimation with Mixture Density Network.” Paper presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, June 2019.
Li, Yingguang, Changqing Liu, Jiaqi Hua, James Gao, and Paul Maropoulos. 2019. “A Novel Method for Accurately Monitoring and Predicting Tool Wear Under Varying Cutting Conditions Based on Meta-Learning.” CIRP Annals 68 (1): 487–490. https://doi.org/10.1016/j.cirp.2019.03.010.
Li, Yunsheng, and Nuno Vasconcelos. 2019. “Efficient Multi-Domain Learning by Covariance Normalization.” Paper presented at IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, June 2019.
Liu, Changqing, Yingguang Li, Jingjing Li, and Jiaqi Hua. 2021. “A Meta-Invariant Feature Space Method for Accurate Tool Wear Prediction Under Cross Conditions.” IEEE Transactions on Industrial Informatics 18 (2): 922–931. https://doi.org/10.1109/TII.2021.3070109.
Liu, Chao, Pai Zheng, and Xun Xu. 2021. “Digitalisation and Servitisation of Machine Tools in the Era of Industry 4.0: A Review.” International Journal of Production Research 61 (12): 4069–4101. https://doi.org/10.1080/00207543.2021.1969462.
Luo, Yong, Yonggang Wen, and Dacheng Tao. 2017. “Heterogeneous Multitask Metric Learning Across Multiple Domains.” IEEE Transactions on Neural Networks and Learning Systems 29 (9): 4051–4064. https://doi.org/10.1109/TNNLS.2017.2750321.
Ma, Junyan, Decheng Luo, Xiaoping Liao, Zhenkun Zhang, Yi Huang, and Juan Lu. 2021. “Tool Wear Mechanism and Prediction in Milling TC18 Titanium Alloy Using Deep Learning.” Measurement 173:108554. https://doi.org/10.1016/j.measurement.2020.108554.
Mao, Yuwei, Zijiang Yang, Dipendra Jha, Arindam Paul, Wei-keng Liao, Alok Choudhary, and Ankit Agrawal. 2022. “Generative Adversarial Networks and Mixture Density Networks-Based Inverse Modeling for Microstructural Materials Design.” Integrating Materials and Manufacturing Innovation 11 (4): 637–647. https://doi.org/10.1007/s40192-022-00285-0.
Oh, JeongRim, JongJin Park, ChangSoo Ok, ChungHun Ha, and Hong-Bae Jun. 2022. “A Study on the Wind Power Forecasting Model Using Transfer Learning Approach.” Electronics 11 (24): 4125. https://doi.org/10.3390/electronics11244125.
Ouyang, Linhan, Yizhong Ma, Jianxiong Chen, Zhigang Zeng, and Yiliu Tu. 2016. “Robust Optimisation of Nd:YLF Laser Beam Micro-Drilling Process Using Bayesian Probabilistic Approach.” International Journal of Production Research 54 (21): 6644–6659. https://doi.org/10.1080/00207543.2016.1154212.
Prestwich, Steven, Roberto Rossi, S. Armagan Tarim, and Brahim Hnich. 2014. “Mean-Based Error Measures for Intermittent Demand Forecasting.” International Journal of Production Research 52 (22): 6782–6791. https://doi.org/10.1080/00207543.2014.917771.
Psarommatis, Foivos, Gökan May, and Victor Azamfirei. 2023. “Envisioning Maintenance 5.0: Insights From a Systematic Literature Review of Industry 4.0 and a Proposed Framework.” Journal of Manufacturing Systems 68:376–399. https://doi.org/10.1016/j.jmsy.2023.04.009.
Psarommatis, Foivos, Gökan May, Paul-Arthur Dreyfus, and Dimitris Kiritsis. 2020. “Zero Defect Manufacturing: State-of-the-Art Review, Shortcomings and Future Directions in Research.” International Journal of Production Research 58 (1): 1–17. https://doi.org/10.1080/00207543.2019.1605228.
Psarommatis, Foivos, João Sousa, João Pedro Mendonça, and Dimitris Kiritsis. 2022. “Zero-Defect Manufacturing the Approach for Higher Manufacturing Sustainability in the Era of Industry 4.0: A Position Paper.” International Journal of Production Research 60 (1): 73–91. https://doi.org/10.1080/00207543.2021.1987551.
Rebuffi, Sylvestre-Alvise, Hakan Bilen, and Andrea Vedaldi. 2017. “Learning Multiple Visual Domains with Residual Adapters.” Paper presented at Advances in Neural Information Processing Systems, Long Beach, USA, December 2017.
Rebuffi, Sylvestre-Alvise, Hakan Bilen, and Andrea Vedaldi. 2018. “Efficient Parametrization of Multi-Domain Deep Neural Networks.” Paper presented at IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, June 2018.
Saunders, Ben, Necati Cihan Camgoz, and Richard Bowden. 2021. “Continuous 3D Multi-Channel Sign Language Production Via Progressive Transformers and Mixture Density Networks.” International Journal of Computer Vision 129 (7): 2113–2135. https://doi.org/10.1007/s11263-021-01457-9.
Shi, Chengming, Bo Luo, Songping He, Kai Li, Hongqi Liu, and Bin Li. 2019. “Tool Wear Prediction Via Multidimensional Stacked Sparse Autoencoders with Feature Fusion.” IEEE Transactions on Industrial Informatics 16 (8): 5150–5159. https://doi.org/10.1109/TII.2019.2949355.
Sun, Huibin, Jiduo Zhang, Rong Mo, and Xianzhi Zhang. 2020. “In-Process Tool Condition Forecasting Based on a Deep Learning Method.” Robotics and Computer-Integrated Manufacturing 64:101924. https://doi.org/10.1016/j.rcim.2019.101924.
Tosi, Fabio, Yiyi Liao, Carolin Schmitt, and Andreas Geiger. 2021. “SMD-Nets: Stereo Mixture Density Networks.” Paper presented at IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, June 2021.
Traini, Emiliano, Giulia Bruno, and Franco Lombardi. 2021. “Tool Condition Monitoring Framework for Predictive Maintenance: A Case Study on Milling Process.” International Journal of Production Research 59 (23): 7179–7193. https://doi.org/10.1080/00207543.2020.1836419.
Usui, E., T. Shirakashi, and T. Kitagawa. 1984. “Analytical Prediction of Cutting Tool Wear.” Wear 100 (1–3): 129–151. https://doi.org/10.1016/0043-1648(84)90010-3.
Van der Maaten, Laurens, and Geoffrey Hinton. 2008. “Visualizing Data Using t-SNE.” Journal of Machine Learning Research 9 (11): 2579–2605.
Wang, Zheng, Qingxiu Liu, Hansi Chen, and Xuening Chu. 2021. “A Deformable CNN-DLSTM Based Transfer Learning Method for Fault Diagnosis of Rolling Bearing Under Multiple Working Conditions.” International Journal of Production Research 59 (16): 4811–4825. https://doi.org/10.1080/00207543.2020.1808261.
Wang, Jinjiang, Jianxing Yan, Chen Li, Robert X. Gao, and Rui Zhao. 2019. “Deep Heterogeneous GRU Model for Predictive Analytics in Smart Manufacturing: Application to Tool Wear Prediction.” Computers in Industry 111:1–14. https://doi.org/10.1016/j.compind.2019.06.001.
Xiao, Zehao, Jiayi Shen, Xiantong Zhen, Ling Shao, and Cees Snoek. 2021. “A Bit More Bayesian: Domain-Invariant Learning with Uncertainty.” Paper presented at International Conference on Machine Learning, Virtual, July 2021.
Xu, Xingwei, Zhengrui Tao, Weiwei Ming, Qinglong An, and Ming Chen. 2020. “Intelligent Monitoring and Diagnostics Using a Novel Integrated Model Based on Deep Learning and Multi-Sensor Feature Fusion.” Measurement 165:108086. https://doi.org/10.1016/j.measurement.2020.108086.
Yang, Wen-An, Qiang Zhou, and Kwok-Leung Tsui. 2016. “Differential Evolution-Based Feature Selection and Parameter Optimisation for Extreme Learning Machine in Tool Wear Estimation.” International Journal of Production Research 54 (15): 4703–4721. https://doi.org/10.1080/00207543.2015.1111534.
Zamudio-Ramírez, Israel, Jose Alfonso Antonino-Daviu, Miguel Trejo-Hernandez, and Roque Alfredo Osornio-Rios. 2020. “Cutting Tool Wear Monitoring in CNC Machines Based in Spindle-Motor Stray Flux Signals.” IEEE Transactions on Industrial Informatics 18 (5): 3267–3275. https://doi.org/10.1109/TII.2020.3022677.
Zhang, Yuqing, Min Xie, Yihai He, and Xiao Han. 2022. “Capability-Based Remaining Useful Life Prediction of Machining Tools Considering Non-Geometry and Tolerancing Features with a Hybrid Model.” International Journal of Production Research, 1–17. https://doi.org/10.1080/00207543.2022.2152126.
Zhao, Zetian, Bingtao Hu, Yixiong Feng, Bin Zhao, Chen Yang, Zhaoxi Hong, and Jianrong Tan. 2023. “Multi-Surface Defect Detection for Universal Joint Bearings Via Multimodal Feature and Deep Transfer Learning.” International Journal of Production Research 61 (13): 4402–4418. https://doi.org/10.1080/00207543.2022.2138613.
Zhu, Kunpeng, and Tongshun Liu. 2017. “Online Tool Wear Monitoring Via Hidden Semi-Markov Model with Dependent Durations.” IEEE Transactions on Industrial Informatics 14 (1): 69–78. https://doi.org/10.1109/TII.2017.2723943.

