4,184
Views
24
CrossRef citations to date
0
Altmetric
Articles

Multi-objective optimization and comparison of surrogate models for separation performances of cyclone separator based on CFD, RSM, GMDH-neural network, back propagation-ANN and genetic algorithm

ORCID Icon, , &
Pages 180-201 | Received 16 Jul 2019, Accepted 01 Nov 2019, Published online: 09 Dec 2019

Abstract

Pressure drop (Δp) and collection efficiency (η) are used to evaluate the separation performance of the cyclone separator. In this study, we conducted comparative study of cyclone models using response surface methodology (RSM), back propagation neural network (BPNN), and group method of data handling (GMDH) networks to develop optimal predictive cyclone models. Also, we conducted multi-objective optimization for maximizing model and minimizing model using genetic algorithm (GA). CFD was performed instead of experimental method to get the estimated values for modeling of Δp and η. The validation results of CFD showed 0.5% and 2% errors for Δp and η, respectively, compared with the experimental data. Second, design of experiment (DOE) analysis for 10 cyclone geometrical parameters was executed to obtain the significant geometrical parameters. Vortex finder diameter Dx, inlet width a, inlet height b and cone height Hco have a significant effect on η and Δp. However, interaction effects between the geometrical parameters have small effects. The cyclone models by RSM, BPNN and GMDH based on 25 CFD training set were developed. The predictive performance results by the cyclone models were compared by 25 CFD test set. The GMDH method achieved the best prediction for Δp (R2=99.7, RMSE = 0.102) Radjusted2=98.99, RMSE = 0.0119) than the RSM, BPNN cyclone models. Additionally, uncertainty analysis was performed to estimate the quantitative performance of cyclone models. The results show that the uncertainty width of GMDH models achieved the best prediction (η: ±0.0065, Δp: ±0.0188). Finally, GA was applied to optimize the GMDH models simultaneously. GA generated 70 non-dominant solutions. Reproducibility of five optimal points was validated by using CFD. The trade-off optimal point showed improvement by 24.31%, 18% and 8.79% for Δp d50 and η, respectively.

1. Introduction

1.1. Research background

In many industries, the contaminants are removed using various dust collectors. In comparison with other type of dust collectors, such as settling chamber, wet scrubbers and electrostatic precipitators etc., gas–solid cyclone separator using a centrifugal force were widely used in industrial processes due to the advantages of its relatively low maintenance cost, simple structure and robustness of operating conditions under high temperature and pressure (Hoffmann & Stein, Citation2008). Pressure drop ‘Δp’ (also expressed as Euler number) and collection efficiency ‘η’ (also known as the separation efficiency and cut-off diameter) are used to evaluate the separation performance of the cyclone separator. The cut-off diameter is a particle diameter for which η is 50% and Euler number is the dimensionless measure of the pressure drop. The performance parameters are defined as follow; (1) η=1Nout(d)Nin(d)(1) Where Nin is the total number of particles, and Nout is the total number of particles discharged in the cyclone vortex finder. (2) ΔP=PinPout(2) Where Pin is the pressure in the cyclone inlet, and Pout is the pressure in the cyclone outlet. The performance parameters are closely dependent of the geometric parameters of cyclone. The configuration and nomenclature of 10 geometric parameters of cyclone is shown in Figure . Ten parameters are diameter of vortex finder Dx, inlet heighta, inlet width b, cylinder height Hcy, cone height Hco, cone tip diameter Bc, tube height Ht, collector height Hc and collector diameter Dc, vortex finder length S.

Figure 1. Geometry and nomenclature of 10 geometric parameters of cyclone.

Figure 1. Geometry and nomenclature of 10 geometric parameters of cyclone.

Generally, as Δp increases, η also increases. However, the optimal cyclone should be designed to have minimum Δp and maximum η. Therefore, many studies have been reported for several decades to model and optimize the performance parameters according to the geometric parameters of cyclone.

Early researches developed empirical models for Δp and η according to massive experiments and mathematical equations (Avci & Karagoz, Citation2001; Barth, Citation1956; Dietz, Citation1981; Leith, Citation1990; Muschelknautz & Krambrock, Citation1970; Shephered & Lapple, Citation2005). The empirical model was developed by conservation law's equation and experimental data. Gimbun, Chuah, Choong, and Fakhru’l-Razi (Citation2005) showed by comparing the CFD results with the cyclone performances predicted from the various empirical models.

With the advance of high-performance computers and CFD technology, CFD has been considered as another approach to optimize Δp and η recently. CFD have advantages of time saving and high accuracy as well as the ability to calculate physical quantities that are difficult to be measured experimentally (Elsayed & Lacor, Citation2012). Therefore, many CFD studies have been performed for optimizing Δp and η depending on the geometrical parameters. The effects of pressure drop and collection efficiency on the changes of the cyclone vortex finder shape were investigated using CFD (De Souza, Salvo, & Martins, Citation2015; Raoufi, Shams, Farzaneh, & Ebrahimi, Citation2008). The CFD simulation of cyclone flow was performed according to changing the cyclone inlet shape (Bernardo, Mori, Peres, & Dionísio, Citation2006; Elsayed & Lacor, Citation2011; Yang, Sun, & Gao, Citation2013). The cyclone performance parameters were investigated according to changing the cyclone body shape (Brar, Sharma, & Elsayed, Citation2015; Hamdy, Bassily, El-Batsh, & Mekhail, Citation2017; Safikhani, Akhavan-Behabadi, Shams, & Rahimyan, Citation2010). The cyclone flow was investigated by changing the cyclone dust collector shape (Ganegama Bogodage & Leung, Citation2015; Kaya & Karagoz, Citation2009; Qian, Zhang, & Zhang, Citation2006). These studies performed by changing only one geometrical parameter and others were fixed. This did not consider the interactive effect among the geometric parameters. This brings about local optimization.

Recently, ANN techniques and GA method has been successfully applied to many industries (Ardabili et al., Citation2018; Ebtehaj & Bonakdari, Citation2016; Fotovatikhah et al., Citation2018; Gholami et al., Citation2018; Moazenzadeh, Mohammadi, Shamshirband, & Chau, Citation2018; Najafi, Faizollahzadeh Ardabili, Shamshirband, Chau, & Rabczuk, Citation2018; Taherei Ghazvinei et al., Citation2018). Also, cyclone optimization studies have been performed by combing Design of Experiment (DOE) with various modeling methods such as Response Surface Methodology (RSM) and Artificial Neural Network (ANN) methods. A number of optimization studies are relatively limited, compared to the experiment and simulation studies (Sun & Yoon, Citation2018).

Elsayed and Lacor (Citation2012) conducted a multi-objective optimization study for seven geometrical parameters using non-dominated sorting genetic algorithm (NSGA-II). They obtained the radial basis function neural network (RBFNN) models for Euler number (Δp) and cut-off diameter (η) based on results of the empirical model. The empirical models can yield the estimated values faster than using CFD. However it is not suitable for cyclonic optimization because the predictive performance of empirical model is less accurate for complex geometry.

Sun, Kim, Yang, Kim, and Yoon (Citation2017) conducted multi-objective optimization for Δp and η based on CFD, the desirability function method and response surface method (RSM). As optimization results, Δp and η were improved by 20.7% and 24.2%, respectively, compared to the reference model. Also, Sun and Yoon (Citation2018) added the cyclone surface roughness as a new design variable and increased the level range of the existing design variables and optimized it by using NSGA-II. However, it was found that cited reference values of Δp and η are significantly different at the optimization study using NSGA-II. Therefore, a reasonable comparison of optimization results cannot be taken ahead.

Khalkhali and Safikhani (Citation2012) developed the models of Δp and η according to the four cyclone geometric parameters based on CFD and GMDH neural network. The developed cyclone models were optimized simultaneously using NSGA-II. Multi-objective optimization was executed based on the NSGA-II. However, the reproducibility of the optimized results was not confirmed by CFD simulations. This cannot confirm the proper optimization results.

Elsayed and Lacor (Citation2013) optimized the four geometric parameters of the Stairmand design cyclone using CFD, RSM and desirability functions. The new design has achieved high performance. However, the errors between RSM results and CFD results for Δp and η were 17.20% and 20.23%, respectively. In addition, a cyclone performance models were developed using RBFNN as an alternative to RSM. However, the predicted performance of the developed RBFNN model was evaluated quantitatively based on the training data set. That approach might result in over fitting. A different test set should have been used to evaluate the prediction performance of model instead of the training set that used to train the model. In addition, this study did not verify the physical justification for optimized reasons. It cannot explain why the optimized model outperforms than the existing baseline model.

1.2. Study objectives

The objectives of present study are summarized in three:

  1. Analysis of effect of significant geometrical parameters out of 10 geometrical parameters on the cyclone performance parameters using DOE analysis with CFD results: The significant geometrical parameters assist to reduce the number of DOE data sets required for surrogate modeling, as well as to solve local optimizations that occur when only one design variable is changed. Previous work of cyclone using CFD overlooks better optimization performance due to local optimization problems (Elsayed & Lacor, Citation2011; Ganegama Bogodage & Leung, Citation2015; Raoufi et al., Citation2008).

  2. Development and Comparison of predictive models based on RSM, GMDH technique and BPNN technique for predicting cyclone performance ?>parameter with respect to the cyclone geometric parameters: Each cyclone performance prediction model is compared quantitatively using AIC, R2, RMSE, and R2_adj, and uncertainty analysis is performed on the developed models. To the best of our knowledge, there are no studies comparing simultaneously the predictive performance of Δp and η models obtained by using the various ANN methodologies and the RSM using the same training dataset and test dataset in many cyclone studies.

  3. Multi-objective optimization of the cyclone models that have the best predictive performance using NSGA-II: The optimized results compare not only the improvement of the cyclone performance models (Δp, η) with the reference model, but also analyze from the viewpoint of physical justification of the optimization result. The limitations of previous optimization researches lack an analysis of why the optimization was made (Elsayed & Lacor, Citation2013; Khalkhali & Safikhani, Citation2012).

1.3. Reference model: Muschelknautz method of modeling

For this optimization study, Muschelknautz method (MM) of modeling (Muschelknautz & Krambrock, Citation1970) was selected as a reference model because it contains the shape of the cyclone dust collector as a design parameter, compared to the other cyclone mathematic model, and many experimental results (Obermair & Staudinger, Citation2001; Obermair, Woisetschläger, & Staudinger, Citation2003) are available for CFD validation. The reference model will be compared to the optimized cyclone model. The 10 geometric parameters and geometric ratio of the MM model presents in Table .

Table 1. Geometrical parameters values for Obermair et al. (Citation2003). (D = 0.4 m).

1.4. Study outline

The flow chart relationship in this study is shown in Figure . This optimization study followed the following steps. First, CFD simulations are performed on the cyclone flow using the turbulence models and the DPM. For time and convenience, CFD is used instead of experimental method to get estimated values for modeling of Δp and η. The CFD results are validated by comparison with experiment results of the reference model available in the literature (Muschelknautz & Krambrock, Citation1970; Obermair et al., Citation2003; Obermair & Staudinger, Citation2001).

Figure 2. The flow-chart relationship of present work.

Figure 2. The flow-chart relationship of present work.

Next, the DOE analysis for the 10 cyclone geometric parameters is performed to get the significant geometric parameters. And then, the models based on RSM, GMDH-NN and BPNN combined with the DOE method have been developed to predict η and Δp with respect to the significant geometric parameters. The prediction performances of the cyclone performance models are compared to analyze the advantages of each methodology and differences between the models. And then, an optimal predictive model for η and Δp is selected.

Finally, the optimal predictive models were simultaneously optimized to get maximum η and minimum Δp using NSGA-II. The ‘trade-off’ solution between optimal models was selected in optimal Pareto front, and the reproducibility was verified by using CFD with the Reynold stress model. The improvement rates of optimized results were compared with the reference model quantitatively.

2. Numerical simulation of cyclone separator

2.1. Reynolds averaged Naiver-Stokes (RANS) for turbulence models

An appropriate turbulence model should be used to simulate the turbulence behavior in CFD. The Reynolds averaged Naiver-Stokes (RANS) turbulence models have been successfully applied to predict the turbulence flow in many industrial applications. Many studies have been conducted on the application of turbulence models to capture the behavior of cyclone flow. Boyan, Ayers, and Swithenbank (Citation1982) found the kε models cannot predict the turbulence behavior of cyclone with significant non-equilibrium effects of turbulent transport. Qian et al. (Citation2006) showed that the cyclone flow can capture using the Reynold stress model (RSM) which assumes a Reynolds stress tensor with isotropic eddy-viscosity hypothesis. In addition to the RANS model, LES (Large Eddy Simulation) shows relatively better prediction results about the cyclone than RSM, but LES requires computational resources several times larger than RSM (Elsayed & Lacor, Citation2013). Therefore, in this study, the RSM was used to simulate the cyclone flow. The RSM are expressed as follow: (3) tRij+u¯kxkRij=DTij+Pij+Fij+Φijεij(3) (4) DT,ij(turbulent diffusion)=xk[ρuiujuk¯+ρ(δkjui+δikuj)(4) (5) Pij(stress production)=ρuiuk¯ujxk+uiuk¯uixk(5) (6) Fij(rotation production)=2ρΩk(ujum¯εikm+uiuk¯εjkm)(6) (7) Φij(pressure strain)=pui¯xj+uj¯xi(7) (8) εij(pressure strain)=2μui¯xkuj¯xk(8) where p and μ are, respectively, pressure, the molecular viscosity, and δ is Kronecker delta.

2.2. Discrete phase models (DPM)

The particle trajectory in cyclone was calculated by Lagrangian-based discrete phase model (DPM). The DPM assumes that the particle phase does not affect the continuous phase when the volume occupied by the particles in the continuous phase was less than 10% in DPM (ANSYS FLUENT, Citation2016). So, DPM can be applied in cyclone simulation. Transport momentum equation for each particle is expressed as: (9) dupdt=FD(uk+ukup)+g(ρpρ)ρp+F(9) where the term FD(uk+ukup) is the drag force per unit particle mass, uk is the time-averaged fluid velocity in the k-th direction obtained by solving the RANS model, uk is the instantaneous fluctuation velocity in the k-th direction and F is an additional acceleration (force/unit particle mass), ρp, ρ are the particle density and the flow density, respectively. The drag force and drag coefficient of the spherical particles is calculated as (10) FD=18μρpdp2CDRep24(10) (11) CD=24ReRep1,24(1+0.15Rep0.687)Rep1<Rep1000,0.441000<Rep,(11) where the term CD is drag coefficient of the spherical particle, Rep is the particle Reynold number. The turbulence fluctuations are random functions of space and time. A discrete random walk (DRW) model is applied to simulate the instantaneous fluctuation velocity (uk) in this study. The uk affects to the particle transport for small particles. The values of uk that prevail during the lifetime of the turbulent eddy are sampled by assuming that they obey a Gaussian probability distribution as (ANSYS FLUENT, Citation2016). (12) uk=ξukuk¯(12) Where, ξ is a normally distributed random number. ukuk¯ is the local RMS value of the velocity fluctuations. The characteristic lifetime of the eddy (τe) is defined either as a constant given by: (13) τe=2TL(13) where TL is the eddy turn over time given as, τe=0.3k/e. The other option allows for a long–normal random variation of eddy lifetime that is given by: (14) τe=TLlog(r)(14) where r is a uniform random number between 0 and 1. The option of random calculation of τe yields a more realistic description of the correlation function.

2.3. CFD simulation setup

Since cyclone flows have the characteristics of strong vortex and steep pressure distribution, it is necessary to accurately predict cyclone flows using powerful and efficient algorithms. Shukla, Shukla, and Ghosh (Citation2011) conducted a case study to select the optimal algorithm that can accurately simulate the cyclone flow. According to the results of the case study, in this study, the velocity and pressure was connected by SIMPLE algorithm, for static pressure distribution and the pressure drop, it was interporated by the PRESTO method. Advection terms were discretized by using the QUICK. The second-order upwind scheme was used for the momentum, turbulence dissipation rates and turbulence kinetic energy. To reduce the numerical instability and to improve convergence, the first-order upwind scheme was used for the Reynolds stress.

The boundary conditions in this study were identical to the conditions used in the reference model. The cyclone inlet was set to uniform velocity and the air flow rate is 800 m3/h (12.7 m/s). The cyclone outlet was set to atmospheric pressure. The cyclone wall was set to no-slip conditions. For the convergence of the cyclone flow, the convergence criteria for optimization process are until there is no fluctuation of pressure drop. The all simulation are conducted as follow, First, the steady-state solution reached 3,000 iterations for stability of the solution. And then, the iteration number per time step and the time step size were selected as 1500 and 0.001 s, respectively. This creates a physical time of 1.5 s. The boundary conditions for the CFD simulation were listed in Table . The commercial FVM simulation software ANSYS Fluent 16.2.was used to simulate the cyclone performance parameters.

Table 2. Boundary condition for the cyclone CFD simulation.

2.4. DPM setting

For the discrete phase modeling (DPM) condition, the collusion condition between the wall and the particles was selected as elastic conditions. The collection of particles is calculated if the particles reach the dust collector bottom. The particles density is 2770 kg/m3. The number of injected particles is 104 in the all simulation. The particle size distribution is divided to 10 class based on the Rosin-Rammler theory (ANSYS FLUENT, Citation2016). (15) Yd=e(d/d¯)n(15) Where the n is the diffusion parameter. The d is the mean diameter. For simulation, the d, n is defined to 5 µm and 3.5, respectively. Also, the distribution of particle diameters is set from 1 to 10 µm. The particles are divided into 10 classes. So, the particles can be mixed sufficiently. The injected 104particles are tracked to calculate the collection efficiency of the cyclone about all simulation.

2.5. Comparison between CFD results and reference model

2.5.1. CFD grid independence test

As the grid density is increased, the precise solution could be obtained. However, as the number of grid increases, the CPU time also increases. Figure  shows grid for cyclone generated by using cut-cell meshing of ANSYS Meshing 16.1. The geometric ratio of cyclone was used as the reference model. The near-wall treatment was achieved by using scalable wall functions considering the grid refinement with y+ < 11 (ANSYS FLUENT, Citation2016).

Figure 3. Mesh for the simulation of the cyclone.

Figure 3. Mesh for the simulation of the cyclone.

In this study, the grid independence test was conducted until η and Δp were converged according to increasing the grid edge size. Figure  presents that the results of grid test for η (cf. Figure (a)) and Δp (cf. Figure (b)). It showed that the Δp and η converged at the 6.5 mm of maximum grid size. Thus, in the all simulation, the maximum grid of 6.5 mm was used.

Figure 4. (a) Results of the mesh independence test of the separation efficiency. (b) Result of the mesh independence test of the pressure drop.

Note: unit % is multiplied by 100.

Figure 4. (a) Results of the mesh independence test of the separation efficiency. (b) Result of the mesh independence test of the pressure drop.Note: unit % is multiplied by 100.

2.5.2. Validation of results for velocity distribution

To validate the CFD simulation of cyclone, CFD simulation of cyclone velocity distribution was compared with the results of available experimental data. Figure  presented the consistency between the experimental data using Laser Doppler anemometry (LDA) method (Obermair et al., Citation2003) and simulated results for tangential and axial velocities at y = 0.75D and y = 0.368D. As shown in Figure , The kε models cause excessive turbulent viscosities and unrealistic tangential velocities due to the Boussinesq eddy viscosity assumption. Therefore, the kε models cannot capture the cyclone flow. Whereas, the Reynolds Stress Model (RSM) can accurately capture the cyclone flow because the RSM assumes a Reynolds stress tensor with three-dimensional anisotropic property (ANSYS FLUENT, Citation2016). Furthermore, the collection efficiency curve from CFD results was compared with the experimental results in Figure . The CFD results were consistent with those obtained experimentally. The error between experiment result and CFD results for η and Δp were 1.1% and 0.5% respectively. Therefore, η and Δp resulted from CFD showed reasonable results by comparing the experiment data. This verified CFD simulation can be reasonably used for various surrogate model method, such as response surface methodology and artificial neural network systems.

Figure 5. Comparison of CFD result for distribution of velocity with experiment data. The three rows present comparison against Moazenzadeh et al. (Citation2018) from left to right: first column – tangential velocity; second column – axial velocity; from top to bottom: axial location in 0.75D (A-A′) and 0.368D (B-B′), respectively.

Figure 5. Comparison of CFD result for distribution of velocity with experiment data. The three rows present comparison against Moazenzadeh et al. (Citation2018) from left to right: first column – tangential velocity; second column – axial velocity; from top to bottom: axial location in 0.75D (A-A′) and 0.368D (B-B′), respectively.

Figure 6. Comparison of CFD result for separation efficiency curve with experiment data Moazenzadeh et al. (Citation2018).

Note: unit % is multiplied by 100.

Figure 6. Comparison of CFD result for separation efficiency curve with experiment data Moazenzadeh et al. (Citation2018).Note: unit % is multiplied by 100.

3. Surrogate model development for η and Δp

3.1. Design of experiment

Design of Experiment (DOE) is a systematic method, which can be effectively used to investigate the effect of change of design variables on the response(s) and to provide reasonable answers about the behavior of a system (Mathews, Citation2005). In particular, the modeling techniques, such as artificial neural networks (ANN) and response surface techniques (RSM), are successfully combined with DOE methods (Khayet, Cojocaru, & Essalhi, Citation2011; Rahimpour, Hatti-Kaul, & Mamo, Citation2016). The DOE method can generate a matrix that provides maximum information by minimizing the combination of variables. In general, the fractional factorial design (FFD) method and the central composite design (CCD) method are most commonly used. The DOE method can efficiently apply not only to generate train set for determining the parameters of model but also to generate test set for evaluating the predictive performance of the model.

3.1.1. Fractional factorial design

As the number of design variables increases, the number of physical/numerical experiments is increased exponentially to generate the DOE matrix. If a lot of design variables are considered to model ANN and RSM based on DOE, it takes lots of time to obtain the physical/ numerical experiment results. For solving the problem of many variables, fractional factorial design (FFD) method is introduced. It is effective to reduce the number of runs by compounding high-order interaction effects. It is advisable to use the FFD to investigate design variables that have a statistically significant effect before implementing modeling techniques to reduce the experiments size (Sun et al., Citation2017). Table  presents the selected cyclone geometric parameters for the FFD with IV resolution and the corresponding range of levels (−1, +1) (Hoffmann & Stein, Citation2008). The FFD of IV resolution with 32 CFD runs was performed. The statistical significance for cyclone geometric parameters was analyzed by Pareto chart with analysis of variance (ANOVA).

Table 3. Levels and factors for factorial experiment.

Figure  shows the main effect plot and Pareto chart for Δp (cf. Figure (a)) and η (cf. Figure (b)) at the α = 0.05 significance level. Higher-order terms are not included in the chart because they have very small effects. Therefore, the higher term has been ignored. The vortex finder diameter Dx(x1), the inlet width a (x2), the inlet height b (x3) and the cone height Hco (x5) have a significant effect on η and Δp. Also, the result of the main effects for η and Δp showed the same tendency as Pareto chart results. In the case of the x5, it showed a negative effect on and Δp and a positive effect on- η. Therefore, the four geometric parameters (Dx, a,b,Hco) are selected as design parameter for modeling Δp and η at next section. The other parameters are set as average values between minimum and maximum values.

Figure 7. Analysis of design of experiment for the pressure drop (a) and separation efficiency (b); from top to bottom: Pareto chart, main effect plot.

Figure 7. Analysis of design of experiment for the pressure drop (a) and separation efficiency (b); from top to bottom: Pareto chart, main effect plot.

3.2. Response surface methodology (RSM)

RSM uses a strong statistical method based on the least square method that fits of the estimated values obtained from experiment/simulation to the quadratic polynomial model of RSM. In order to predict the system behavior, the quadratic or higher-order polynomial model is applied in many industrial applications. For modeling the cyclone performance parameters according to geometric parameters, the polynomial equation of RSM is expressed as a general form: (16) yk=β0+i=14βixi+i=14.j=14βijxixj+i=14βiixi2+ε(16) where yk is the response surface models (k = 1 for collection efficiency and k = 2 for pressure drop), and xi and xj are the cyclone geometric parameters. β0, βi, βij, βii are the regression coefficients for each constant, linear, interaction, and square term. ε is the prediction error of the model. The regression coefficients were obtained by the ordinary least square (OLS) method. The OLS is represented as follow, (17) βOLS=(XTX)1XTY(17) Where βOLS is a regression coefficient vector, X is the matrix about the tested four geometrical parameters. The matrix is generated using DOE method, which includes all combinations of the four geometrical parameters. Y is the column vector of the experiment/simulation results in the DOE matrix. In other words, the main advantage of RSM is that it is simple and fast to develop predictive models, but the nonlinear nature of the model is relatively small compared to ANN.

A typical DOE matrix for obtaining RSM model is the central composite design (CCD) method (Mathews, Citation2005). The CCD enables to estimate the square terms of the second-order model efficiently. The CCD consists of a factorial experiment point, a center point and an axial point. If the number of the design variables is k, the total number of experiments of CCD can be written as (18) N=2k+2k+nc(18) where 2k is the factorial experiment points, and 2k is the axial points, and nc is the number of iterations at the center point. A value of α depends on the number of experimental results in the factorial experiment point: (19) α=2(kp)/4(19) where if k is 0, that is full factorial design. And if k is constants, that is fractional factorial design, α is the rotatability. All factors are investigated in five levels (−α, −1, 0, 1, +α). In this work, the DOE matrix for CCD is generated by using the statistical analysis program (Minitab 17, MINITAB Inc.).

3.3. Group method of data handling (GMDH) type neural network

The GMDH algorithm pertains to computer-based mathematical modeling techniques. One of the characteristics of GMDH is the fully automatic architecture and parametric optimization of the model (Onwubolu, Citation2015). The GMDH algorithm consists of several layers containing neurons. Each neuron has two inputs (xi, xj) and on single output (yˆn). The output of each neuron can be represented through a quadratic polynomial as Eq.20. (20) yˆn=w0+w1xin+w2xjn+w3xinxjn+w4xin2+w5xjn2(20) Where n is the total number of datasets. wi is weigh of model. The weight of a neuron is obtained by minimizing the square of the difference between the estimated values (yn) and the predicted values (yˆn) using model. (21) e=minn=1N(yˆnyn)2(21) (22) n=1N(yˆnyn)2wi=0(22) That is, the gradient of e is minimized about wi. It called the gradient descent method. Test set is then used to calculate the coefficient of determination (R2) and Akaike information criterion (AIC) of the obtained each neuron. The R2 is the quantitative criteria of predictive performance for the model. The AIC evaluates the overfit risk and overfit risk of the model. If the R2 and AIC of neurons in calculating layer are higher than a predefined value, the neurons are transferred as new neurons in the next layer. If the best criteria of the current layer are no longer higher than the best quantitative criteria of the previous layer, the GMDH algorithm will stop.

In other words, the optimal output neuron is selected automatically. Therefore, the GMDH algorithm can capture not only the mathematical model with nonlinear, but also the model with higher-order terms without instability problems due to the inductive self-organizing (Onwubolu, Citation2015). However, the GMDH is difficult to change the network structure such as the layer size, node size. The final optimal neuron can be represented as Eq. 23. (23) yˆ=w0+i=1nwixi+i=1nj=1nwijxixj+i=1nj=1nk=1nwijkxixjxk+(23) which is known as a Volterra-Kolmogorov-Gabor polynomial (Onwubolu, Citation2015). The GMDH-type models of Δp and η will be determined and evaluated in the following sections. All GMDH calculations were carried out using MATLAB (R2018b) code.

3.4. Artificial neural network technique based on back propagation neural network (BPNN) algorithm

An artificial neural network (ANN) is a powerful computational modeling technique for solving the multivariate regression problems and for modeling of non-linear characteristic. The prediction accuracy of the ANN is affected by the ANN architecture, which consists of an input layer, hidden layers, and an output layer (Khayet et al., Citation2011). The input layer receives the values of input variables and connects to the hidden layer by multiplying the weight (wi) as Eq. 24 and assigns those to an activation function as Eq. 25. The result of activation function is transferred as new input to neurons in the next layer. (24) yj(k)=b0+i=1nwixi(24) (25) y(k)j_out=11+exp(y(k)j)(25) Where k is layer number, j is node number. The most common training algorithm for feed-forward neural network is back-propagation (BP) method. ANN training through the BP algorithm is an iterative optimization process that adjusts the weights appropriately to minimize a performance function. A commonly used performance function is defined as the root mean square error (RMSE) as follow: (26) RMSE=1ni=1n(yiyj_out(k))(26) Where yiˆ is i-th prediction value using the obtained model. According to the BP algorithm, weights and biases are iteratively updated in the fastest decreasing direction of the performance function, RMSE. In general, a single iteration of the BP algorithm can be written as: (27) W(k+1)=W(k)η grad(k)(RMSE)(27) Where W(k) is a vector of the current weight and bias, grad(k) (RMSE) is the current gradient of the RMSE as performance function and η is the learning rate. The update of the algorithm is terminated when the MSE does not decrease or exceeds a certain number of iterations. In case of BPNN, the number of hidden nodes and hidden layers of optimal model are not predetermined. Therefore, it is important to find the optimal number of node and layer. It called ‘hyperparameter tuning’. In other words, BPNNs can improve the prediction performance by adjusting the network structure, learning algorithm, and learning rate. However, not only it takes a long time to find the optimal learning condition, but also it takes a long time to converge the model due to iterative method. Therefore, in this study, we found the optimal network structure through greedy search method that increases the variable at regular intervals. For learning the BPNN model, the learning rate was set to 0.001 and the running algorithm was used to the Adam optimizer. The training of the BPNN model was terminated when the RMSE was lower than 1E-05 or the iterations reached about 100,000. The learning parameters for BPNN which selected in this work were summarized in Table . More details about mathematical aspects of BP training algorithms are described to Ref. (Hagan, Demuth, & Beale, Citation1997). All BPNN calculations were carried out using PYTHON 3.6.

Table 4. Training parameter for BPNN.

3.5. Train set and test set for comparing the predictive model of Δp and η

Twenty-five combinations (hereafter, CFD train set) were generated using CCD as Table . The CFD train set is used to obtain predictive model for Δp and η using RSM, GMDH and BPNN. Also, 81 combinations were generated using the full factorial design. Twenty-five out of 81 were randomly selected as CFD test set. The CFD test set presents in Table . The CFD test set is used to evaluate the prediction performance of the predictive models for Δp and η. The new combination of test sets not only efficiently grasps the prediction performance of the model, but also overcomes the models over train problem (Onwubolu, Citation2015). The range of four geometric parameters, x1, x2, x3 and x5 used for the modeling methods is the same as Table . The other parameters (x4, x6, x7, x8, x9 and x10) are fixed as the average of their parameter.

Table 5. CFD train set by using central composite design (CCD) method.

Table 6. CFD test set by using full factorial design method (partial).

3.6. Comparison of prediction performance and modeling results for η and Δp using RSM, GMDH and BPNN

3.6.1. Modeling for η and Δp using RSM

The coefficient of RSM polynomial for η and Δp is obtained as Tables  and , respectively. The RSM cyclone models were evaluated by the analysis of variance (hereafter, ANOVA). The ANOVA showed that each term of the cyclone geometric parameters has significant effect on performance parameters when the p-values of each terms are less than 0.05. The p-values of all the terms of Δp and η were similar, except for x2* x2 term and x1 * x5 term. This indicates that Δp is sensitively affected by width length of the cyclone inlet. The interactive effect between the cone length of cyclone and the outlet diameter can affect- Δp. Also, the interactive effects between x1 and x2 have the highest among the interaction term and that the interaction of x3 and x4 has the least effect on Δp and η. The interactive effects of pair of x1, x2 (cf. Figure (a)) and pair of x3, x5 (cf. Figure (b)) are visualized in Figure . Figure (a) showed a steep curve when the interactive effect is high. On the other hands, when the interactive effect is low, the curve is the similar as the straight line as shown Figure (b).

Figure 8. 3D plot and 2D contour of the response surface about separation efficiency (up) and pressure drop (down); (a) combination x1 and x2; (b) combination x3 and x5.

Note: all other variables are hold at mean values.

Figure 8. 3D plot and 2D contour of the response surface about separation efficiency (up) and pressure drop (down); (a) combination x1 and x2; (b) combination x3 and x5.Note: all other variables are hold at mean values.

Table 7. Results for fitting of regression coefficient of RSM η model and ANOVA results for RSM term.

Table 8. Results for fitting of regression coefficient of RSM Δp model and ANOVA results for RSM term.

The ANOVA showed that the coefficient of determination (R2) of the Δp and the η are 99.83% and 99.21% using CCD results (train set), respectively. The adjusted coefficient of determination (Radj2) of the Δp and the η considering the degree of freedom of the variables were 99.62% and 98.29% about train set, respectively. However, the RSM model should be evaluated for accurate prediction performance by using a test set that is not used for model parameter estimation rather than train set.

Therefore, the predictive performance of the RSM cyclone models was evaluated based on the CFD test set as shown Figure . The R2 values of the Δp and η were 91.61% and 94.5%, respectively. In particular, the RSM model has relatively poor predictive performance for collection efficiency data> 92% and pressure drop data> 4000 Pa. Consequently, this indicates that the RSM cyclone models have relatively low predictability for the new data.

Figure 9. Prediction performance results of cyclone RSM model for (a) separation efficiency and (b) pressure drop.

Figure 9. Prediction performance results of cyclone RSM model for (a) separation efficiency and (b) pressure drop.

3.6.2. Modeling for η and Δp using GMDH-type neural network

The coefficient of GMDH polynomial for η and Δp is obtained as Tables  and , respectively. The adequacy of GMDH model for η and Δp was evaluated by analyzing agreement between the cyclone GMDH output and the CFD test sets output. Figure  (a, b) present comparison of the prediction results of the performance parameters using GMDH with CFD results including x-y line. The predictive performance of the obtained the GMDH cyclone models were evaluated quantitatively based on the correlation coefficient. The correlation coefficients of η and Δp were 98.9%, 99.7%, respectively. The predictive performance of the obtained the GMDH cyclone models were evaluated quantitatively using the correlation coefficient. As shown Figure , the GMDH cyclone models predicted effectively the estimated values by CFD.

Figure 10. Prediction performance results of GMDH cyclone model for (a) separation efficiency and (b) pressure drop.

Figure 10. Prediction performance results of GMDH cyclone model for (a) separation efficiency and (b) pressure drop.

Table 9. GMDH η model.

Table 10. GMDH Δp model.

3.6.3. Modeling for η and Δp using BPNN

The prediction performance of the ANN model depends on the number of neurons and the number of layers. If the number of datasets is small, larger the number of layers, and larger the overfitting that causes problem in prediction performance of new dataset (Onwubolu, Citation2015).

So, in this study, the best network architecture has been investigated by grid search method (trial and error method). Based on the grid search method, the root mean square error (RMSE) of each BPNN model for the cyclone performance parameters was measured according to increasing the number of nodes and number of layers as shown Figure . As shown in the Fig. (a), RMSE of η model is overfitted when the number of nodes is more than 14 and the number of layers is more 3.

Figure 11. Results of hyperparameter tuning (a) separation efficiency and (b) pressure drop; from up to down: number of node and number of layer.

Figure 11. Results of hyperparameter tuning (a) separation efficiency and (b) pressure drop; from up to down: number of node and number of layer.

Also, Figure (b) presents that the RMSE of the Δp model is overfitted when the number of nodes is more than 8 nodes and the number of layers is more 3. Therefore, the RMSE of the BPNN model of Δp and η with optimal structure was 0.12 and 0.16, respectively. Figure  present the prediction performance for BPNN cyclone model based on the CFD test set. It showed the result of the prediction performance in the optimal BPNN structure, which is a good agreement, and the R2 for Δp and η was 98.1 and 98.5, respectively.

Figure 12. Prediction performance results of BPNN cyclone model for (a) separation efficiency and (b) pressure drop.

Figure 12. Prediction performance results of BPNN cyclone model for (a) separation efficiency and (b) pressure drop.

3.6.4. Comparison of prediction performance among RSM, GMDH and BPNN for Δp and η

In this study, the predictive model for Δp and η were obtained using RSM, GMDH and BPNN algorithm. In order to compare the prediction accuracy for the predictive models, the histogram of the error distributions and the prediction behavior for the three η models and three Δp models were analyzed using the CFD test set as shown Figures  and , respectively. The histogram of the error distributions can give better comparison for the prediction performance in terms of the possible error ranges (Onwubolu, Citation2015). The frequency of error between the prediction results by the developed cyclone models and the estimated results by CFD was normalized for convenient.

Figure 13. Comparison analysis of prediction performance test of separation efficiency model using RSM, GMDH and BPNN based the CFD test set; (a) Error histogram and probability density function for difference between CFD result and prediction value by models (b) Error value between CFD result and prediction value by models.

Figure 13. Comparison analysis of prediction performance test of separation efficiency model using RSM, GMDH and BPNN based the CFD test set; (a) Error histogram and probability density function for difference between CFD result and prediction value by models (b) Error value between CFD result and prediction value by models.

Figure 14. Comparison analysis of prediction performance test of pressure drop model using RSM, GMDH and BPNN based the CFD test set; (a) Error histogram and probability density function for difference between CFD result and prediction value by models (b) Error value between CFD result and prediction value by models

Figure 14. Comparison analysis of prediction performance test of pressure drop model using RSM, GMDH and BPNN based the CFD test set; (a) Error histogram and probability density function for difference between CFD result and prediction value by models (b) Error value between CFD result and prediction value by models

As shown and , the probability density function and the frequency the error were widely distributed in RSM cyclone model for η and Δp, compared to the neural network approaches. It showed low predictive performance of the RSM cyclone model. On the other hand, the GMDH results and BPNN results have higher prediction accuracy because the neural network approaches are considered for high-order nonlinearities unlike RSM approaches. Also, the predictive performance of RSM is especially poor at data points (3, 10, 22) for collection efficiency and pressure drop. In case of the data points, the flow rate is the same, but the inlet area is reduced, and the exit diameter of the cyclone is reduced, which increases the inlet velocity, resulting in a strong vortex with a physically sharp pressure gradient inside the cyclone. The GMDH and BPNN models with higher terms than the RSM show relatively good results in order to predict this sudden increase in pressure gradient. When comparing the BPNN results and the GMDH results, the variance of the error histogram of the BPNN model and the GMDH model was quite similar. However, training in the GMDH is based on performing linear algebra which requires short time for convergence rather than an iterative method of minimizing the error like the back-propagation algorithm which requires a large amount of time for convergence (Hagan et al., Citation1997). The convergence speed of the GMDH model of Δp and η (41 sec) was 15 times lower than that of BPNN model (621 sec).

Moreover, the GMDH cyclone models are mathematically expressed as a higher-order polynomial model. However, in the case of the BPNN cyclone model, the obtained model is still hidden as the black box model. The neural structure (the number of layer) for the BPNN cyclone models were 4. The neural structure for GMDH's Δp, η model were 4, 3, respectively. The BPNN algorithm needs to select the optimum neural structure by adjusting the hyperparameter, while the GMDH algorithm can adaptively optimize the structure to avoid overfitting problem.

Errors between predicted values by each model and estimated values at each CFD test set are shown in Figures  (b) and (b). It shown that the error of GMDH models is close to zero at all data point, compared to RSM and BPNN. Also, Table  presents the predictive performance of the three models using R2, RMSE, Radjusted2 and AIC based on the CFD test sets. The AIC and Radjusted2 evaluate the overfit risk and underfit risk of the model. As shown Table , the GMDH cyclone models (Δp: R2=99.7, RMSE = 0.102, Radjusted2=98.4, AIC = 36.794 and η: R2=98.99, RMSE = 0.0119, Radjusted2=94.45, AIC = 186.64) showed the best prediction performance than other models.

Table 11. Predictive model comparison of η and Δp parameters obtained by RSM, GMDH and BPNN.

Furthermore, in order to quantitatively estimate and compare the uncertainties of the cyclone models, uncertainty analysis (Gholami et al., Citation2018) was performed with the 25 CFD test set. To estimate the uncertainties, prediction error (PE), mean prediction error (MPE), standard deviation of prediction error (SDPE) and width of uncertainty band (WUB) and prediction error interval of 95% (PEI) are introduced. The value of MPE indicates an overestimated model for positive numbers and an underestimated model for negative numbers. The WUB and PEI can be expressed by the SDPE. (28) MPE=i=1nej=i=1n(PiTi)(28) (29) SDPE=i=1n(ejMPE)n1(29) (30) PEI=±1.96SDPE(30)

Where, Pi is the predicted value, Ti is the estimated value. The uncertainty analysis results for the all cyclone models are shown in Table . The WUB of GMDH models showed the lowest values (η = ±0.0065, Δp = ±0.0188) compared to the RSM and the BPNN results. Also, it was seen that the model of smallest MPE and 95% PEI are the GMDH model. Furthermore, the width of uncertainty band between the estimated values by CFD and the predicted values by the three models were visualized in Figure . The upper line and lower line of 95% PEI for η and Δp predicted by GMDH are best distributed in the exact line. Therefore, the GMDH cyclone models for Δp and η was reasonably selected as the optimal predictive model in this paper. The GMDH cyclone models for Δp and η are used in next section for the Pareto multi-objective optimization.

4. Multi-objective optimization for η and Δp by using genetic algorithm

4.1. Genetic algorithm

GA (Genetic Algorithms) are one of the optimization methods to solve local optimization problem. The GA inspired by the natural evolution. The population evolves towards the optimal form under the natural conditions. The GA repeatedly updates the population of individual solutions. In each step of the GA process, the GA produces initial random population and evaluates fitness for the individual. The best individuals are selected as the parent, and then the GA produces the offspring by applying crossover and mutation for the next generation. The GA is terminated when it reaches to the certain criterion.

Figure 15. Comparison analysis of qualitative scatter plot about width of uncertainty band of (a) Separation efficiency model and (b) Pressure drop model by three proposed method (BPNN, GMDH and RSM).

Figure 15. Comparison analysis of qualitative scatter plot about width of uncertainty band of (a) Separation efficiency model and (b) Pressure drop model by three proposed method (BPNN, GMDH and RSM).

Table 12. Uncertainty analysis results for cyclone performance models of GMDH, BPNN and RSM.

However, in the many multi-optimization problems, the objective functions conflicted with each other. Therefore, it is hard to obtain the best optimized points to optimize all objective functions simultaneously. For solving the problem, Den, Agrawal, Pratap, and Meyarivan (Citation2000) proposed the non-dominant genetic algorithm-II (NSGA-II). The non-dominated solutions set (also expressed as Pareto optimal front) provide the optimal solution by determining the point among solution set. For more details about the GA, see the Ref (Elsayed & Lacor, Citation2013).

4.2. Results of multi-objective optimization for Δp and η using NSGA-II

For multi-objective optimization for Δp and η, the non-dominated solutions set was obtained by NSGA-II using MATLAB 2018a. Table  presents a setting condition for multi-objective optimization. The conditions in NSGA-II were set up to search the entire design area. The fitness functions from GMDH-cyclone model for Δp and η were used for the GA optimization. The constraints of geometrical parameters used in the GA optimization are same as range of parameters at Table .

Table 13. Genetic algorithm setup for multi-objective optimization.

Figure  presents the results of 70 Pareto optimal front for η and Δp. The η is expressed as negative for the optimization purpose. The five design points (A,B,C,D,E) from 70 Pareto optimal front were determined to decide the optimal Δp and η. Point A have the high pressure drop compared to other point, it may require the high power to operate the manufacturing system. Point E has the low collection efficiency compared to other point. It cannot function properly as a cyclone. Point C was chosen to optimization cycle. The separation efficiency and pressure drop of Point C are 91.1% and 98.5 Pa, respectively, it has good performance compared the reference data. Also, Table  shows the reproducibility results. The reproducibility results show that GMDH cyclone models have high predictive performance.

Figure 16. Pareto front for multi-objective optimization for cyclone performance parameters.

Figure 16. Pareto front for multi-objective optimization for cyclone performance parameters.

Table 14. Validation of Pareto optimal front for the cyclone performance.

Finally, Table  shows the improvement rate of Δp and η of the optimized cyclone and the dimension ratio of the optimized cyclone. The η, d50 and Δp of optimized cyclones improved 8.79%, 18%, and 24.31% respectively over the performance of the reference cyclone. Figure  presents the comparison of the collection efficiency curve between the reference cyclone and the optimization cyclone. It presented that the collection efficiency for particles less than 1 micron was improved by about 20%. Figure  presents the particle behavior in the optimized cyclone. Therefore, the optimized cyclone demonstrates a reasonable approach in the GMDH-type neural network combined DOE and NSGA-II optimization processes in the present work.

Figure 17. Comparison of the separation efficiency curve between the optimized cyclone and the reference model.

note: unit % is multiplied by 100.

Figure 17. Comparison of the separation efficiency curve between the optimized cyclone and the reference model.note: unit % is multiplied by 100.

Figure 18. The behavior when the particles of different sizes (1–10 μm)are injected into the optimized cyclone.

Figure 18. The behavior when the particles of different sizes (1–10 μm)are injected into the optimized cyclone.

Table 15. Comparison of results for cyclone performance parameters between optimized cyclones and reference model.

4.3. Results of numerical simulation comparison between optimized cyclone and reference model

In order to analyze why the optimized cyclone model could achieve better performance than the reference model, the flow pattern of optimized cyclone was compared with the reference model by using CFD. The secondary circulation in cyclone decreases the cyclone separation performance due to particle re-entrainment phenomenon. Figure (a) presents the comparison for velocity vector field of the optimized cyclone and reference cyclone. The secondary circulation occurred at the reference cyclone. However, the optimized cyclone showed that the area of secondary circulation was decreased as shown Figure (a), compared to reference model. Therefore, when compared with reference cyclone, the separation efficiency of optimization cyclone outperform.

Figure 19. Results of flow pattern analysis; from left to right: Reference cyclone and optimized cyclone; (a) comparison for velocity vector field. (b) comparison for pressure drop. (c) comparison for turbulent kinetic energy.

Figure 19. Results of flow pattern analysis; from left to right: Reference cyclone and optimized cyclone; (a) comparison for velocity vector field. (b) comparison for pressure drop. (c) comparison for turbulent kinetic energy.

The contour of pressure drops of the optimized cyclone and reference cyclone presents in Figure (b), respectively. The maximum pressure of the optimized cyclone was about 200 lowers than the reference cyclone. This high pressure drop requires the high power to operate the manufacturing system and affects exposure of the particles in the cyclone called the phenomenon of ‘lip leakage’. This resulted in an improvement in the cyclone separation performance.

Figure (c) presents the turbulent kinetic energy (TKE) of the reference cyclone and the optimized cyclone. As shown in the Figure (e), the TKE was observed relatively high near the cyclone vortex finder of the reference model. As the TKE increases, the collection efficiency decreases. The maximum TKE of the optimized cyclone decreased by 25%, compared to the reference cyclone.

5. Conclusion

This study performed a comparative study among RSM, BPNN and GMDH algorithm for predictive models of Δp and η, and multi-objective cyclone optimization was successfully performed by using NSGA-II. In this study, MM model was selected as the reference model, and the optimization objectives were defined to minimize Δp and maximize η. After the DOE analysis for the cyclone geometric parameters using the validated CFD, the predictive models for Δp and η were obtained based on RSM, GMDH technique and BPNN technique. As a result of comparing the obtained models, the GMDH cyclone models predicted the Δp and η most efficiently. The R2 of η and Δp was 98.9%, 99.7%, respectively. Therefore, the GMDH cyclone models were used in the GA as the fitness functions. A total of 70 non-dominated optimal design points were obtained by GA. The five optimal point were selected to validate accuracy of GA results using CFD. The reproducibility results showed less than 10% error and proved the reliability of the GA optimization process. The optimized cyclone (Point C) provided a more reasonable improvement than the MM model. The optimized cyclone reduced the pressure drop by 24.31% (from 1300 to 984 Pa), the cut-off diameter by 18% (from 1.83 μm to 1.5 μm) and increased the overall efficiency by 8.79% (from 83% to 91%). In addition, the optimized reason was demonstrated by analysis of the flow part of the cyclone (flow field, pressure drop, turbulence kinetic energy).

The results presented can help with selection of appropriate prediction modeling techniques about the cyclone, and the obtained 70 Pareto front design points can assist designers and decision makers to optimize cyclone design. However, the optimization results need to be verified using experimental methods in addition to CFD. Therefore, future studies will further demonstrate the feasibility of the research by fabricating the cyclone based on optimized cyclone dimensions and conducting experiments on separation performance and pressure drop.

Disclosure statement

No potential conflict of interest was reported by the authors.

References