Abstract
The problem of the Andersen cascade impactor data inversion is studied using the maximum entropy method to recover the particle size distribution function from a set of measured masses collected on the impactor stages. The classical entropy maximization and its modification to accommodate data errors are reviewed using the techniques of convex optimization. Numerical experiments were performed to test the method on several unimodal and multimodal distributions commonly observed in practice. The Maximum Entropy Method for data inversion produced excellent agreement between the recovered and test Particle Size Distribution Functions without the need for a priori assumptions.
1. MOTIVATION AND PROBLEM STATEMENT
One of the major purposes of the Andersen cascade impactor (ACI) is to determine the particle size distribution of the particles suspended in the air flowing through the device. Cascade impactors consist of a number of stages in series each of which is designed to retain particles about a particular size. As the aerosol of interest is convected through the device, the particle size that each consecutive stage is designed to retain is decreased. The probability that a particle of a certain size will impact on a particular stages is known as the stages collection efficiency curve. Traditionally, cascade impactor stages have been designed to achieve a maximally steep efficiency curves. However, in practice curves take on a sigmoidal type shape (). After enough material is collected the amount retained on each can be determined by weight or chemical assay and then the particle size distribution function (PSDF) can be recovered using the measured data. The PSDF recovery, usually regarded as the impactor data inversion is, in general, mathematically ill posed—there are many possible distributions consistent with the observed masses (CitationFuchs 1978).
If an impactor did have maximally sharp efficiency curves it would allow for a simplified data inversion (CitationAndersen 1958; CitationMarple and Willeke 1976). In practice, an estimation of an ideal efficiency curve can be associated with a 50% probability of impaction (d 50) for each curve (). Indeed, assuming the ideally sharp cuts at 50% cut points (d 50), the resulting probability distribution can be obtained by dividing the mass fraction g i = m i /M, M = ∑ j = 0 7 m i of the collected particles on the stage i by the particle size interval:
This distribution is a piecewise constant function and usually presented as a histogram; the corresponding smooth distribution can be approximated by drawing a curve through the mid-points of the size intervals.
In practice, the accuracy of this simplified approach is compromised because multiple size ranges get collected on a single impactor stage. Consequently more complex distributions, such as multimodal ones, can be completed missed. The collection efficiency curves are typically S-shaped and show a large degree of overlap. While such overlaps make the simplistic inversion approach mentioned above more inaccurate, they provide additional information about the distribution that if properly taken into account, can help recover the most “realistic” distribution, consistent with the measured masses.
A formal description of the inverse problem can be given as follows. Assuming that a set of S-shaped efficiency curves S i (d) for stages i = 0,1,…,7 is available from the impactor calibration, the PSDF P(d) sought is a solution of the first-kind Fredholm integral equation with a discrete right hand side
Problem (2) is a classical case of an underdetermined ill-posed problem. A number of techniques have been developed to recover the distribution from Equation (Equation2). One group of methods is based on an a priori assumption about the functional form of the distribution, for instance a superposition of log-normal and normal, with further determination of the coefficients of the assumed form. The disadvantage of this approach is that the parametric form of the PSDF is not usually known; in fact, it is the very thing that we seek to determine.
In more recent years, several non-parametric regularization methods have been proposed to overcome the ill-posedness of the data inversion problem. The idea of regularization is to incorporate some additional information about the solution in order to compute a unique and stable solution. Thus, the well-known Tikhonov regularization scheme (CitationHansen 1998) reduces to a minimization of the residual with an additional “smoothing” constraint Ω (P) ≥ 0
Besides the Tikhonov regularization, there are many other inversion techniques that have been used in aerosol science to determine the aerosol PSDF. A critical survey and comparison of several popular methods is presented in CitationKandlikar and Ramachandran (1999), which includes constrained least squares, truncated singular value decomposition, Twomey's nonlinear approach, and Bayesian methods. The conclusion of the survey indicates that there is no single algorithm that can be considered superior to others, and applicable for all types of aerosol spectrometers such as impactors, elutrilators, nephelometers, and diffusion batteries. Also, while different algorithms produce PSDF that match the experimental data well, the resulting distributions may be quite different from each other.
In the present work, we examine the ACI data inversion problem using a maximum entropy method, which was originally proposed by CitationJanes (1957) as a variational technique to calculate equilibrium states in statistical physics, as well as for solving inverse problems (CitationJanes 1984). This method has proven in many other disciplines to be useful and efficient for recovering probability distributions from information on a few of its moments (CitationKapur 1989). In the field of aerosol spectrometry, the entropy maximization was introduced two decades ago in CitationYee (1989) to reconstruct the PSDF from diffusion battery measurements, but to our knowledge, has not been applied to the inversion of the ACI data since then.
From a technical viewpoint, one of the advantages of the maximum entropy method is that it can be considered as a special case of the convex optimization problem, which allows us to rely on general existence and duality theory for this problem (CitationBorwein and Zhu 2005) and also simplifies its computational treatment. However, since the aim of this article is to describe a reliable and efficient data inversion procedure with a particular application in mind, we provide only formal (heuristic) derivations that lead directly to computational schemes. Other interesting details on the rigorous problem formulations, solution functional spaces, the analysis and proofs of the existence, and characterization of the maximum entropy solution can be found in CitationHiriart-Urruty and Lemaréchal (1993), CitationBorwein and Zhu (2005), and references therein.
The structure of the article is the following. In Section 2, we first discuss a classical maximum entropy maximization (the primal problem) using Lagrange multipliers. Then, exploiting the convexity of the original problem, an alternative computational scheme based on dual formulation is described. As a result, the original infinite-dimensional problem reduces to a solution of finite dimensional nonlinear system in the former case, or to the maximization of a dual function of finitely many variables in the latter case. Section 3 presents a modification of primal-dual computational models that allows imprecise data values in the constraints. Section 4 gives details of the algorithmic implementation and computational results.
2. MAXIMUM ENTROPY METHOD
The basic idea of the maximum entropy approach to obtain a unique and robust particle size distribution P(d) from the data g i is to maximize the Boltzmann-Shannon-Jaynes entropy
The multiplier ν can be eliminated using the normalization constraint in Equation (Equation8) and the resulting PSDF has the form
Now, the unknown multipliers λ i , i = 0, …,7 can be found as solutions of a nonlinear system obtained by substituting this expression for P(x) into the remaining constraints in Equation (Equation8). This gives a nonlinear system of 8 equations for 8 unknowns
While the formal derivation above usually leads to the solution, an alternative approach that uses convex duality is often preferred. It allows detailed, rigorous analysis of the maximum entropy solution and, in certain cases, dual problems are easier to solve numerically. To illustrate the basic idea of the duality, we define the (Lagrange) dual function as the minimum value of the Lagrangian (Equation (Equation9)) over P
To obtain a practical formulation of the dual problem (Equation (Equation14)), the dual function has to be evaluated by minimizing the Lagrangian (Equation (Equation9))
Thus, the dual problem is the unconstrained maximization of the expression (18) with respect to λ and ν
It can be further simplified by maximizing over ν for fixed λ, which results in
3. MAXIMUM ENTROPY WITH NOISY DATA
One problem with the analysis presented above is that the stage loadings g i , in most practical applications, cannot be determined precisely since all measurements will have some level of uncertainty. A classical way to account for such errors is to apply a χ2 statistic to define a confidence region about the expected value, assuming that the observation noise is Gaussian. In the frame of the maximum entropy method, this can be accomplished by replacing exact constraints on the stage loadings g i in Equation (Equation8) by a single constraint (Skilling and Bryan 1984; Yee 1989)
Here L(P;λ,ν) is the Lagrangian (Equation (Equation9)) of the problem with the exact constraints. Note, that [Ltilde] can be minimized separately with respect to P and ξ. The variation of [Ltilde] with respect to P gives the same expression (11) for the PSDF (however, with different values of λ i ), and with respect to ξ:
To find the dual function for the problem (Equation (Equation24)) we minimize [Ltilde] over P and ξ, which can also be done separately:
The dual problem now reduces to the maximization of the last expression with respect to λ, ν, and μ. The maximization in μ can be carried out explicitly, and using the result of Equation (Equation21) the final form of the dual problem is an unconstrained maximization with respect to λ:
4. IMPLEMENTATION AND RESULTS OF COMPUTATIONS
As follows from the above discussion, the infinite-dimensional constrained entropy maximization can be reduced to the solution of the finite (low) dimensional nonlinear system (primal problem) or unconstrained minimization (dual problem). These finite-dimensional problems can be solved directly using standard numerical analysis techniques, such as the Newton's method. We implemented both the primal and dual problems in Matlab (The MathWorks Inc., Natick, MA) utilizing the Optimization toolbox that provides standard routines for the unconstraint minimization and nonlinear system solvers. We also employed an excellent, recently developed Chebfun system (CitationTrefethen et al. 2008), which allows practically symbolic manipulation of Matlab data and functions. This system is particularly convenient for handling recursive computations with the impactor stage response functions internally represented as Chebyshev polynomials.
To test how well different PSDF can be reconstructed using the maximum entropy approach, our computational experiments were performed in the following sequence:
1. | assume some initial distribution P(d), such as log-normal, Rosin-Rammler, etc.; | ||||
2. | calculate the corresponding stage loadings g i = ∫ a b P(d) R i (d) d(d), i = 0,1,…,7 where the particle size limits are taken as a = 0.2, b = 15; | ||||
3. | perform the inversion and compare with the initial distribution. |
Note that in step 2, we initially assume accurate measurements of the stage loadings without adding noise to the data.
In our first test case, we use the ideal, piecewise constant collection efficiency/response functions to recover a log-normal initial distribution. shows the initial (smooth line), as well as the resulting PSDF (dashed line), which is also a piecewise constant function. This result can be confirmed analytically in this case, and the maximal entropy solution is identical to Equation (Equation1). This test illustrates both the strength and the weakness of the entropy maximization method. Indeed, the recovered distribution is not smooth; however, as often argued, it is the one most objectively consistent with the incomplete available information.
In further tests, we use the manufacturer's calibrated efficiency/response data shown on . We experiment next with the log-normal and the Rosin-Rammler initial distributions. The Rosin-Rammler (also known as Weibull) distribution has a density function
Next, we consider more complicated multimodal initial distributions. The bimodal distribution is a superposition of two log-normal distributions. The trimodal consists of two log-normal and one normal distribution. shows that the bimodal distribution is recovered quite satisfactorily, giving reasonable estimates of the distribution peak positions. However, two peaks of the trimodal distribution on the interval between 6 and 10 micron have not been properly resolved, which reveals that the resolution quality of the PSDF might not be uniform on the considered particle size interval. To examine this issue, we define a bimodal distribution and translate it along the d axis from left to right, applying the maximum entropy method at several intermediate points. The results are presented in and demonstrate the loss of resolution and larger shifts of the distribution peaks as the original distribution moves to the right.
On the last graph of , the second peak cannot be identified at all, which corresponds to a 2 micron translation of the original PSDF, which appeared on the first graph.
The observed resolution problem can be understood by analyzing the approximation properties of the maximum entropy solution (Equation (Equation11)), which indicates that the logarithm of the PSDF is approximated by a linear combination of the impactor stage response functions. The response functions, however, are not evenly distributed on the considered particle size interval, as shown on . Clearly, a larger gap between them can be seen on the interval from 6 and 10 micron, which results in a growth of the PSDF approximation error on this interval.
The consequences of this observation are clear: if the maximum entropy method is to be used to invert the distribution, then instead of minimizing overlap between S-curves, one needs uniform overlap between S-curves.
In the next test we perform the reconstruction of the PSDF from the stage loadings data that contains random measurement errors. Such errors can be simulated by adding noise to the calculated exact values of the stage loadings g i obtained on step 2 of our numerical procedure. To examine the sensitivity of the inversion, we used the stage loadings calculated for the initial log-normal distribution shown on , superimposed with the up to 10% uniformly distributed random noise. However, we assume that the thus obtained values of g i represent accurate stage loading measurements. shows several PSDF recovered from the corresponding noisy data sets. The perturbation of the stage loadings clearly affects the recovered distributions, but does not produce high-frequency instabilities and preserves the shape of the original PSDF. We noticed, however, a slower convergence of the nonlinear system, as well as the minimization solvers in the course of these computations.
Finally, we performed numerical experiments using the maximum entropy inversion procedures for imprecise data, based on the formulations (27) and (29), assuming the uniform noise level σ i = σ for all impactor stages. Here we again attempt to reconstruct the log-normal distribution shown on from the data obtained by adding the Gaussian noise (with a mean and standard deviation of 0 and 0.02, respectively) to the theoretically exact stage loadings data. A particular noisy data set, generated this way, is presented in along with the corresponding “exact” fractional stage loadings.
TABLE 1 Fractional impactor stage loadings
For this noisy loadings data, the classical maximum entropy approach fails to converge. shows the PSDF, recovered using different values of the noise level σ. Choosing large values of the parameter σ results in over-smoothing of the distribution; however, reasonably accurate recovery can be achieved even by taking σ smaller that the actual noise level.
5. SUMMARY
The results of our study show that the maximum entropy method is an accurate and reliable technique of reconstructing the aerosol PSDF from the ACI measurements. The method performed well on the distributions commonly observed in practice, such as log-normal, Rosin-Rammler, and bimodal distributions. The main advantages of the entropy maximization are the following:
• | The method inverts Cascade Impactor data using no a priori assumption about the shape of the distribution. | ||||
• | The method is based on sound physical and information theory principles, in contrast with many other regularization techniques. | ||||
• | The mathematical theory of the entropy maximization is well developed and allows to transform the original infinite-dimensional problem into a finite dimensional one. | ||||
• | The numerical implementation of the method reduces to readily available, efficient algorithms. |
REFERENCES
- Andersen , A. A. 1958 . New Sampler for the Collecting, Sizing, and Enumeration of Viable Airborne Particles . J. Bacteriol. , 76 ( 5 ) : 471 – 484 .
- Borwein , J. and Zhu , Q. J. 2005 . Techniques of Variational Analysis , Springer .
- Fuchs , N. A. 1978 . “ Aerosol Impactors: A Review ” . In Fundamentals of Aerosol Science , Edited by: Shaw , D. T. 1 – 83 . New York : Wiley & Sons .
- Hansen , P. C. 1998 . Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion , Philadelphia : SIAM .
- Hiriart-Urruty , J.-B. and Lemaréchal , C. 1993 . Convex Analysis and Minimization Algorithms. II , Berlin : Springer-Verlag .
- Jaynes , E. T. 1957 . Information Theory and Statistical Physics . Phys. Rev. , 106 : 620 – 630 .
- Jaynes , E. T. 1984 . “ Prior Information and Ambiguity in Inverse Problems ” . In Inverse Problems , Edited by: McLaughlin , D. W. Providence, RI : Am. Math. Soc. .
- Kandlikar , M. and Ramachandran , G. 1999 . Inverse Methods for Analyzing Aerosol Spectrometer Measurements: A Critical Review . J. Aerosol Sci. , 30 : 413 – 437 .
- Kapur , J. N. 1989 . Maximum-Entropy Models in Science and Engineering , New York : John Wiley & Sons .
- Marple , V. A. and Willeke , K. 1976 . Impactor Design . Atmos. Environ. , 10 : 891 – 896 .
- Rader , D. J. , Mondy , L. A. , Brockmann , J. E. , Lucero , D. A. and Rubow , K. L. 1991 . Aero. Sci. Tech. , 14 : 365 – 379 .
- Skilling , J. and Bryan , R. K. 1984 . Maximum Entropy Reconstruction: General Algorithm . Mon. Not. R. Astr. Soc. , 211 : 111 – 124 .
- Trefethen , L. N. , Pachón , R. , Platte , R. B. and Driscoll , T. A. 2008 . Chebfun Version 2 , Oxford University . http://www.comlab.ox.uk/chebfun
- Vaughn , N. P. 1989 . The Andersen Cascade Impactor: Calibration, Wall Loses and Numerical Simulation . J. Aerosol Sci. , 20 : 213 – 221 .
- Yee , E. 1989 . On the Interpretation of Diffusion Battery Data . J. Aerosol Sci. , 20 ( 7 ) : 797 – 811 .