Original Article

Pattern recognition analysis on nutritional profile and chemical composition of edible bird’s nest for its origin and authentication

Pages 1680-1696 | Received 08 May 2018, Accepted 18 Jul 2018, Published online: 06 Aug 2018

ABSTRACT

Authenticity of food is of great importance to ensure food safety and quality, and to protect consumer rights. A rapid and accurate method for authentication of edible bird’s nest (EBN) is proposed, using nutritional profile and chemical composition together with pattern recognition analysis. The authentication of EBN includes identification and classification of EBN by production origin (houses or caves), species origin (Aerodramus fuciphagus or Aerodramus maximus) and geographical origin (Peninsular Malaysia or East Malaysia) based on their active compositional content. Three pattern recognition methods, principal component analysis (PCA), hierarchical cluster analysis (HCA) and linear discriminant analysis (LDA), were employed to develop classification models for authentication of EBN origins. Compared to PCA and HCA, LDA was more accurate and efficient in distinguishing EBN by production, species, and geographical origins, with a classification ability of 100% and a prediction ability of 92% as validated by cross-validation. The key chemical markers for production origin differentiation are total phenolic content, zinc, valine, and calcium; for species origin discrimination, sialic acid, serine, phenylalanine and valine; and for geographical origin differentiation, arsenic and mercury. The findings suggest that nutritional and chemical profiles combined with pattern recognition analysis are a promising strategy for rapid authentication of EBN and its products.

Introduction

The composition of food has been used for food classification and identification, as in the Codex Alimentarius.[Citation1] Recently, the classification and identification of specific compounds in food have also been applied to origin identification and authentication studies.[Citation2–Citation5] Knowing the origin of food is important for quality control and food traceability. It ensures consumer protection and fair competition in the food industry against fraudulent substitution and adulteration practices. Foods with high economic value are often substituted and/or adulterated with cheaper materials in order to gain higher profit.[Citation6] Such fraudulent and adulterated foods can cause adverse health impacts to consumers and lead to economic losses to the food industry.[Citation7]

Edible bird’s nest (EBN) is a type of food built from the salivary secretion of swiftlets. In Malaysia, EBN is mainly produced by two swiftlet species, namely Aerodramus fuciphagus and Aerodramus maximus.[Citation8] It is one of the common targets for food fraud due to its high economic value and commercial importance. Many studies have reported medicinal and therapeutic effects of EBN, such as rejuvenating skin, inhibiting influenza virus and inflammation, promoting cell proliferation, enhancing bone strength and dermal thickness, reducing tumor production, and treating erectile dysfunction and osteoporosis.[Citation9–Citation15] Owing to its potential benefits and high price, EBN is increasingly substituted with lower quality EBN and adulterated with cheaper ingredients such as Tremella fungus, karaya gum, red seaweed and fried porcine skin, and then marketed at a premium price for greater financial gain.[Citation16] Visual inspection is unlikely to detect these substitutions and adulterants because they resemble genuine EBN in appearance. Thus, for the sake of consumers’ health and the industry’s economic interests, it is important to establish an accurate and reliable approach to examine EBN.

Physicochemical, proximate, elemental, fatty acid, triacylglycerol, sialic acid, saccharide, peptide, and amino acid data have been used previously to classify EBN by coloration, region, production site, and harvesting season,[Citation16–Citation21] but there has been no report on EBN species origin except for the recent study by Quek et al.[Citation4] A holistic representation of the characteristics and properties of EBN requires many parameters, including physicochemical properties, proximate content, antioxidant activities, elemental compounds, amino acids, nitrite, nitrate, and sialic acid. Measuring all of them often results in a very large dataset that is challenging to analyze and interpret. Conventional univariate data analysis is inadequate for handling large datasets with highly complex food composition. Thus, pattern recognition analysis, also known as multivariate data analysis, is introduced. Pattern recognition analysis is widely used in agricultural and food analysis to efficiently interpret large and complex datasets,[Citation22] and to examine the failure risk of processes and products in a wide range of industries including EBN.[Citation23–Citation26] It uses mathematical and statistical procedures to obtain useful information from the dataset.[Citation27] Among the pattern recognition methods, the unsupervised algorithms of principal component analysis (PCA) and hierarchical cluster analysis (HCA), and the supervised algorithm of linear discriminant analysis (LDA), are most commonly used and compared. The unsupervised algorithms excel at discovering unexpected patterns in a dataset, while the supervised algorithms excel at classification.[Citation28] PCA was previously employed in analyzing active components (protein and sialic acid) of EBN to classify EBN by color, production site, and geographical origin.[Citation21,Citation29] In addition, Fourier transform infrared spectroscopy (FTIR) data of EBN were analyzed by LDA with a satisfactory classification ability of 94.12%.[Citation30]

This research aimed to apply the pattern recognition analyses of PCA, HCA, and LDA to the nutritional profile and chemical composition of EBN to differentiate EBN by production, swiftlet species and geographical origins. This approach has been used in previous works, including the classification of olive oils, fish, wine, honey, and dairy products,[Citation31–Citation35] but there are no reports on EBN. The pattern recognition models developed and the chemical markers identified from the nutritional and chemical profiles are intended to provide a rapid and accurate EBN origin identification method that ensures safe and genuine products are sold in the market, and thus helps to prevent fraud in the bird nest industry and protect consumers from the danger of adulterated products.

Materials and methods

Edible bird’s nest preparation

Thirteen EBN samples were collected in Malaysia from 2013 to 2014. These samples originated from two different production origins (house and cave), species origins (A. fuciphagus and A. maximus) and geographical origins (Peninsular Malaysia and East Malaysia) (Table 1). The production and geographical origins of the EBN samples were guaranteed by the respective farmers, except for two samples (EBN 12 and 13) that were purchased from local markets. The species origin of the EBN samples was preliminarily confirmed using a molecular technique, DNA sequencing of the cytochrome b gene.[Citation36] The EBN samples were ground into powder using a MFM-202 high-speed grinder (Ta Feng Electrical Appliances Co. Ltd., Taoyuan, Taiwan) at 20,000 rpm and screened through a 1 mm mesh. The ground EBN samples were kept in airtight containers and stored at −20°C until analysis.

Table 1. Information of edible bird’s nest samples.

Compositional analysis

A total of 45 compositional properties were assessed on each EBN sample, including the physicochemical properties of water activity and color. Water activity was determined using a Fast-lab water activity meter (GBX, Romans-sur-Isere, France). Color was measured using a CR-10 colour reader (Konica Minolta Sensing Inc., Tokyo, Japan) and results were expressed as L* for lightness, a* for redness and b* for yellowness. Hue and chroma values were calculated using the following equations, respectively.

\[ \mathrm{Hue} = \tan^{-1}\left(\frac{b^{*}}{a^{*}}\right) \]
\[ \mathrm{Chroma} = \sqrt{a^{*2} + b^{*2}} \]
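
The hue and chroma definitions (the arctangent of b*/a*, and the square root of a*² + b*²) can be sketched in a few lines. This is only an illustration: the function name `hue_and_chroma`, the choice to report hue in degrees, and the example readings are assumptions, not details from the study.

```python
import math

def hue_and_chroma(a_star, b_star):
    """Return (hue in degrees, chroma) from CIELAB a* and b* readings.

    Assumes a_star is non-zero; reporting hue in degrees is an assumption.
    """
    # Hue = arctan(b*/a*)
    hue_deg = math.degrees(math.atan(b_star / a_star))
    # Chroma = sqrt(a*^2 + b*^2)
    chroma = math.hypot(a_star, b_star)
    return hue_deg, chroma

# Hypothetical readings, chosen only to make the arithmetic easy to follow.
hue, chroma = hue_and_chroma(3.0, 4.0)
print(f"hue = {hue:.2f} degrees, chroma = {chroma:.2f}")
```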

Proximate analysis was conducted using the established AOAC Official Methods.[Citation37] Moisture content, protein, ash, and fat were determined following the AOAC Official Methods 950.46, 981.10, 923.03, and 991.36, respectively. Carbohydrate was determined by difference method using the following equation.

\[ \mathrm{Carbohydrate}\ (\mathrm{g\,kg^{-1}}) = 100 - (\mathrm{moisture} + \mathrm{protein} + \mathrm{ash} + \mathrm{fat}) \]

The antioxidant properties determined were total phenolic content (TPC), 1,1-diphenyl-2-picrylhydrazyl (DPPH) radical scavenging activity and ferric ion reducing antioxidant power (FRAP). These antioxidant analyses were performed following the procedures in Zhang et al.[Citation38] and Thaipong et al.[Citation39]

Elemental analysis was performed following the AOAC Official Methods 968.08 and 965.09. Sodium, calcium, magnesium, potassium, iron, and zinc were measured using a GBC906 atomic absorption spectrophotometer (AAS) (GBC Scientific Equipment, Victoria, Australia), while copper, phosphorus, lead, mercury, arsenic, and cadmium were determined using an ELAN9000 inductively coupled plasma-mass spectrometer (ICP-MS) (Perkin Elmer, Massachusetts, USA).

Amino acids analysis was performed using a Waters 2695 HPLC system (Waters Co., Milford, USA) equipped with a Waters 2475 multi-wavelength fluorescence detector (Waters Co., Milford, USA).[Citation40] Sixteen amino acids were determined including lysine, threonine, leucine, phenylalanine, valine, isoleucine, histidine, methionine, serine, aspartic acid, glutamic acid, proline, arginine, tyrosine, glycine, and alanine.

Nitrite and nitrate were determined using a Dionex ICS-90 ion chromatography system (Dionex Co., Sunnyvale, USA) coupled with a conductivity detector following the Malaysian Standard MS 2509:2012 (P)[Citation41] and Paydar et al.[Citation42] Sialic acid was determined using a Dionex UltiMate 3000 HPLC system (Dionex Co., Sunnyvale, USA) coupled with a FLD-3400RS fluorescence detector (Dionex Co., Sunnyvale, USA).[Citation43] All analyses were conducted with at least two replicates to ensure reproducibility.

Data analysis

A data matrix of 26 observations (13 EBN samples × 2 replicates) and 45 compositional variables (physicochemical, proximate, antioxidant, elemental, amino acid, nitrite, nitrate, and sialic acid) was used in this study. The EBN replicates were used as observations in order to enlarge the sample size. Because the number of variables is larger than the number of observations, a variable selection step is required.[Citation27] This step removes variables that contain redundant and noisy information from the dataset to minimize overfitting in classification.[Citation44] The variable selection was conducted using analysis of variance (ANOVA) and Pearson correlation analysis.

One-way ANOVA was first performed on each variable to determine which variables significantly differentiate EBN between classes. A variable with a P-value below 0.05 was considered statistically significant. Next, the variables with significant P-values were subjected to Pearson correlation analysis to determine how strongly they were correlated with one another. Highly correlated variables were pruned by retaining only the variable with the higher F-ratio value. By definition, the F-ratio is the ratio of between-group to within-group variance; it indicates how strongly a variable differentiates between classes and is used to rank the candidate variables.[Citation6] The selected variables then proceeded to pattern recognition analysis.
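
The two-step selection (ANOVA screening, then pruning of correlated variables by F-ratio) can be sketched as follows. This is a minimal illustration with SciPy rather than the Statistica workflow used in the study. The function name `select_variables`, the correlation cut-off `r_max = 0.8` and the synthetic data are all assumptions; the text does not state a numeric threshold for "high correlation".

```python
import numpy as np
from scipy.stats import f_oneway

def select_variables(X, y, alpha=0.05, r_max=0.8):
    """Return (kept column indices, F-ratios, P-values).

    alpha follows the P < 0.05 rule in the text; r_max is a hypothetical
    cut-off for treating two variables as highly correlated.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    f_vals = np.zeros(X.shape[1])
    p_vals = np.ones(X.shape[1])
    for j in range(X.shape[1]):
        groups = [X[y == c, j] for c in classes]
        f_vals[j], p_vals[j] = f_oneway(*groups)
    # Step 1: keep significant variables, ordered from largest to smallest F-ratio.
    significant = [j for j in np.argsort(-f_vals) if p_vals[j] < alpha]
    # Step 2: walk down the ranking and skip any variable that is highly
    # correlated with one already kept (which has the larger F-ratio).
    corr = np.corrcoef(X, rowvar=False)
    kept = []
    for j in significant:
        if all(abs(corr[j, k]) < r_max for k in kept):
            kept.append(int(j))
    return kept, f_vals, p_vals

# Synthetic example: 22 observations in two classes, three variables.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 11)
x0 = 3.0 * y + rng.normal(size=22)            # discriminating variable
x1 = x0 + rng.normal(scale=0.1, size=22)      # near-duplicate of x0
x2 = rng.normal(size=22)                      # unrelated noise
X = np.column_stack([x0, x1, x2])
kept, f_vals, p_vals = select_variables(X, y)
print("kept columns:", kept)
```

Only one of the two near-duplicate columns survives the pruning, mirroring how TPC was retained over DPPH and FRAP.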

Three pattern recognition methods were employed to analyze the selected variables, in order to classify and differentiate EBN by production origin, species origin, and geographical origin. The three methods were two unsupervised algorithms, PCA and HCA, and one supervised algorithm, LDA. All the compositional data were standardized (auto-scaled or unit-variance scaled) prior to pattern recognition analysis. Standardization was performed on each variable individually by subtracting its mean value and then dividing by its standard deviation, so that all variables contribute on an equal scale. Statistical analyses were performed using Statistica software version 10.0 for Windows (StatSoft Inc., Oklahoma, USA).
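
Auto-scaling as described (subtract each variable's mean, divide by its standard deviation) might look like the sketch below. The function name `autoscale`, the use of the sample standard deviation (`ddof=1`) and the example values are assumptions.

```python
import numpy as np

def autoscale(X):
    """Standardize each column to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # sample standard deviation (an assumption)
    return (X - mean) / std

# Hypothetical values on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])
Z = autoscale(X)
print(Z)
```

After scaling, both columns carry the same values even though the raw numbers differ by two orders of magnitude, which is the point of the step.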

Principal Component Analysis (PCA)

PCA is a commonly used unsupervised pattern recognition method that reduces data dimensionality and reveals unsuspected relationships in large datasets. PCA transforms a large number of original variables into a smaller set of new uncorrelated variables, known as principal components.[Citation27] Principal components retain most of the information from the original data in terms of variance. The first principal component (PC1) accounts for the maximum variance of the data and the subsequent principal components (PC2, PC3 … PCn) account for progressively smaller proportions of the remaining variance. The number of principal components to retain was determined by keeping those with eigenvalues greater than 1.[Citation45] The first two or three principal components often represent the main structure of the data, while the remaining principal components contain noise or less relevant information.[Citation46] The EBN samples were differentiated based on the correlation matrix and the results were presented on the PCA score plot and loading plot. The score plot portrays the similarities or differences between observations, while the loading plot illustrates the correlations between the principal components and the variables.[Citation47]
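
A correlation-matrix PCA with the eigenvalue-greater-than-1 rule can be sketched with NumPy alone. This is an illustrative reconstruction, not the Statistica procedure; the function name `pca_correlation` and the synthetic data are assumptions. Loadings are computed as correlations between variables and components, which is why they fall inside the unit circle on a loading plot.

```python
import numpy as np

def pca_correlation(X):
    """PCA on the correlation matrix of X (rows = observations)."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # auto-scaled data
    R = np.corrcoef(X, rowvar=False)                   # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)               # R is symmetric
    order = np.argsort(eigvals)[::-1]                  # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()                # proportion of variance
    n_keep = int(np.sum(eigvals > 1.0))                # eigenvalue > 1 rule
    scores = Z @ eigvecs                               # coordinates for the score plot
    # Variable-component correlations for the loading plot.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, explained, n_keep, scores, loadings

# Synthetic data: 22 observations x 5 variables, two of them correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(22, 5))
X[:, 1] += 2.0 * X[:, 0]
eigvals, explained, n_keep, scores, loadings = pca_correlation(X)
print("components retained:", n_keep)
print("cumulative variance of PC1 + PC2:", explained[:2].sum())
```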

Hierarchical Cluster Analysis (HCA)

HCA is an unsupervised pattern recognition method that forms natural groupings of observations. The observations are grouped into clusters using a similarity or distance metric, without a priori information about class membership.[Citation48] The EBN samples were grouped using single Euclidean distance as the distance metric and complete linkage as the clustering method, and the results were presented as a dendrogram.
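
Complete-linkage clustering on Euclidean distances, followed by cutting the tree at a chosen linkage distance, can be sketched with SciPy. The helper name `cluster_samples`, the toy coordinates and the cut distance of 3.0 are assumptions for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_samples(Z, cut_distance):
    """Cluster the rows of Z and cut the tree at cut_distance."""
    # Complete linkage on Euclidean distances between observations.
    tree = linkage(Z, method="complete", metric="euclidean")
    # Observations joined below cut_distance share a cluster label.
    labels = fcluster(tree, t=cut_distance, criterion="distance")
    return tree, labels

# Toy data: two well-separated groups of three observations each.
Z = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
tree, labels = cluster_samples(Z, cut_distance=3.0)
print(labels)
```

The `tree` array can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the dendrogram.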

Linear Discriminant Analysis (LDA)

LDA is the most frequently used supervised pattern recognition method; it uses variables and observations with prior known information to build a classification model. The model built is cross-validated using a new independent set of observations with prior known information. After cross-validation, the model can be used to estimate the class memberships of unknown samples.[Citation49] LDA is also able to identify marker variables, which contribute to the differentiation between the classes. The LDA model for classification of EBN samples was built using a forward stepwise analysis. Variables that contribute most to differentiating EBNs were sorted and included in the model step by step based on the Fisher criterion test (F to enter/remove values). In general, the F-value of a variable reflects its statistical significance in the differentiation between groups. The F to enter value sets how significant a variable's contribution must be for it to be added to the model, while the F to remove value sets how insignificant its contribution must be for it to be removed from the model. The most significant and differentiating variables for each origin differentiation were identified from their P-values (P < 0.05). The performance of the LDA model was examined and the results were expressed as classification ability and prediction ability. Classification ability is the capability to assign the observations used to establish the classification rule to the correct category, while prediction ability is the capability to assign new observations with prior known information to the correct category.[Citation47] A 10-fold cross-validation (internal validation) was performed to validate the model and estimate its predictive ability.
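
The difference between classification ability (accuracy on the observations used to build the model) and prediction ability (accuracy on held-out observations in 10-fold cross-validation) can be illustrated with a small two-class LDA written in NumPy. This sketch omits the forward stepwise variable selection and assumes equal class priors; the function names and the synthetic data are hypothetical, and it is not the Statistica implementation used in the study.

```python
import numpy as np

def fit_lda(X, y):
    """Two-class LDA for labels 0 and 1: return (weights, threshold)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance.
    Sw = ((X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)) / (len(X) - 2)
    w = np.linalg.pinv(Sw) @ (m1 - m0)
    threshold = w @ (m0 + m1) / 2.0      # midpoint rule, equal priors assumed
    return w, threshold

def predict_lda(model, X):
    w, threshold = model
    return (X @ w > threshold).astype(int)

def cross_validate(X, y, k=10, seed=0):
    """Fraction of held-out observations classified correctly over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    correct = 0
    for test_idx in folds:
        train = np.ones(len(X), dtype=bool)
        train[test_idx] = False          # hold this fold out of model building
        model = fit_lda(X[train], y[train])
        correct += np.sum(predict_lda(model, X[test_idx]) == y[test_idx])
    return correct / len(X)

# Synthetic, well-separated data: 26 observations, 2 variables.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 13)
X = rng.normal(size=(26, 2)) + 20.0 * y[:, None]
classification_ability = np.mean(predict_lda(fit_lda(X, y), X) == y)
prediction_ability = cross_validate(X, y, k=10)
print(classification_ability, prediction_ability)
```

Because every observation is held out exactly once, the cross-validated figure is usually the more conservative of the two.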

For evaluation of the LDA models, sensitivity, specificity, and accuracy were measured. Sensitivity and specificity are statistical measures of a binary classification test.[Citation50] Sensitivity, also known as the true positive rate, measures how well the test correctly predicts one condition, whereas specificity, also called the true negative rate, measures how well the test correctly predicts the other condition. Sensitivity is the proportion of true positives among all positive cases, while specificity is the proportion of true negatives among all negative cases. Accuracy measures how well the test correctly predicts both conditions, that is, the proportion of true results among all results. Sensitivity, specificity, and accuracy are calculated using the following equations.

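\[ \mathrm{Sensitivity} = \frac{\mathrm{True\ Positive}}{\mathrm{True\ Positive} + \mathrm{False\ Negative}} \]
\[ \mathrm{Specificity} = \frac{\mathrm{True\ Negative}}{\mathrm{True\ Negative} + \mathrm{False\ Positive}} \]
\[ \mathrm{Accuracy} = \frac{\mathrm{True\ Positive} + \mathrm{True\ Negative}}{\mathrm{True\ Positive} + \mathrm{False\ Positive} + \mathrm{True\ Negative} + \mathrm{False\ Negative}} \]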
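
The three measures follow directly from the four confusion-matrix counts. The sketch below uses hypothetical counts, not results from this study, and the function name `binary_metrics` is an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # all correct / all cases
    return sensitivity, specificity, accuracy

# Hypothetical counts for illustration only.
sens, spec, acc = binary_metrics(tp=8, fn=2, tn=9, fp=1)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, accuracy={acc:.2f}")
```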

Results and discussion

Table 2 summarizes the chemical composition of the EBN samples, namely physicochemical properties, proximate composition, antioxidant activities, elemental content, amino acid profile, nitrite and nitrate, and sialic acid contents, grouped by production, species and geographical origins. The compositional data of the EBN samples were combined with the pattern recognition analyses PCA, HCA, and LDA to identify the different origins of EBN.

Table 2. Physicochemical properties, proximate composition, antioxidant activities, elemental content, amino acid profile, nitrite, nitrate and sialic acid contents in EBN samples with respect to production, species and geographical origins.

Production origin differentiation

The EBN samples studied were from two different production origins: house and cave. The house EBNs were obtained from man-made buildings or houses, while the cave EBNs were collected from natural limestone caves. A dataset consisting of 22 observations (11 EBN samples × 2 replicates) and 45 compositional variables was used to differentiate the house and cave EBNs.

After one-way ANOVA, 19 variables with a significant effect (P < 0.05) in differentiating EBN were selected and ranked by F-ratio value, as listed in Table 3(a). A higher F-ratio value indicates a greater difference between the house and cave EBNs. The Pearson correlation analysis showed that TPC, DPPH, and FRAP were highly correlated (0.82 ≤ r ≤ 0.89). TPC was retained for subsequent analysis as it had the largest F-ratio value. Hue and color a* had a strong correlation (r = −0.83), while nitrite and nitrate had a good correlation (r = 0.59). For the same reason, hue and nitrite were retained. Hence, the 15 remaining compositional variables proceeded to pattern recognition analysis.

Table 3. Ranking of variables based on F-ratio value in differentiating EBN following (a) production origin, (b) species origin and (c) geographical origin.

PCA revealed that the first four principal components, which together explained 87.39% of the total variance, were able to differentiate EBN samples by production origin. PC1 explained 51.11% of the total variance, and the three subsequent principal components explained 16.80%, 11.75% and 7.73%, respectively. Because a four-dimensional plot is impractical to display, only the first two principal components were plotted. Figure 1 shows the PC1 and PC2 plots, which together explained 67.91% of the total variance. The score plot in Figure 1(a) shows that the house and cave EBNs were separated distinctly into two groups, as indicated by the two ellipses. The house EBN samples grouped more tightly than the cave EBN samples, implying a higher degree of similarity within the house EBN. On the loading plot, the variables that contributed to each principal component were identified (Figure 1(b)). Generally, the closer a variable lies to the unit circle, the higher its contribution to the principal components.[Citation47] Variables 1, 2, 5, 6, 11, 12, 13, and 15 (color L*, hue, TPC, mercury, valine, isoleucine, glutamic acid and sialic acid), and variables 4, 8, 9, 10, and 14 (ash, calcium, magnesium, zinc, and nitrite) had the highest negative and positive loadings on PC1, respectively. This indicates that these variables have higher values in the observations with the most negative scores (house EBN samples) and the most positive scores (cave EBN samples) on PC1, respectively. Thus, it could be deduced that the house EBN is whiter and less yellowish in color, and has higher concentrations of TPC, mercury, valine, isoleucine, glutamic acid, and sialic acid. The ash, calcium, magnesium, and nitrite contents were higher in the cave EBN. Variables 3 and 7 (carbohydrate and cadmium) had the highest negative and positive loadings on PC2. These results indicate that the variables on PC1 were responsible for the separation between the house and cave EBNs, while carbohydrate and cadmium accounted for the variability within the house or cave EBN.

Figure 1. PCA plots of EBN classification based on production origin. (a) Score plot of house (H) and cave (C). (b) Loading plot of 15 compositional variables, (1) color L*; (2) hue; (3) carbohydrate; (4) ash; (5) TPC; (6) mercury; (7) cadmium; (8) calcium; (9) magnesium; (10) zinc; (11) valine; (12) isoleucine; (13) glutamic acid; (14) nitrite; (15) sialic acid. The grey circle indicates unit circle.

In addition, PCA was also conducted using all 45 compositional variables without the variable selection step. As expected, the separation between the house and cave EBNs was poor. This is probably because less relevant variables, which provided no useful information for differentiating EBN, introduced noise into the PCA model and reduced the accuracy of the results. Figure 2 shows the dendrogram from the HCA performed to confirm the clustering of EBN samples by production origin. Cutting the dendrogram at a linkage distance of 6.2 gave three clusters. The first cluster consisted of house EBN samples, while the other two clusters contained cave EBN samples. This result was in agreement with the PCA results.

Figure 2. HCA dendrogram of EBN clustering based on production origin. The observations, H, house; C, cave; 1–11, EBN samples; a/b, replicates.

LDA was employed to build a model with a high percentage of correct classification based on EBN production origin. A classification model was developed with 9 compositional variables using forward stepwise analysis. The variables that contributed to differentiating the house and cave EBN samples were, in descending order, TPC, zinc, valine, calcium, isoleucine, cadmium, magnesium, carbohydrate, and ash. The first four variables were the most significant and differentiating. The LDA model achieved 100% classification ability, with all observations correctly classified. It can be concluded that the 9 variables were significant in classifying EBNs by production origin. A 10-fold cross-validation was performed to examine the reliability of this LDA model. All observations were correctly predicted, giving an excellent prediction ability of 100% in cross-validation. These results suggest that the LDA model was reliable and valid for classifying EBNs by production origin. The classification and prediction abilities of the LDA model, including sensitivity, specificity, and accuracy, are shown in Table 4.

Table 4. Classification and prediction abilities of the LDA models developed for different origin classifications.

Although TPC, zinc, valine, and calcium contents were identified here as responsible for EBN classification by production origin, Seow et al.[Citation51] reported glutamic acid and tyrosine as the differentiating markers between house and cave nests. This may be because Seow et al. used the entire set of amino acids for analysis. Including amino acids that carry no relevant information for differentiating EBN could have introduced noise into the analysis and thus affected the accuracy of the results. The classification rate achieved in this study is believed to be better than that of Seow et al. because more comprehensive compositional data, involving more differentiating variables, were used for the analysis.

Species origin differentiation

One-way ANOVA and Pearson correlation analysis were applied to the dataset containing 26 observations (13 EBN samples × 2 replicates) and 45 compositional variables to eliminate less relevant and redundant variables for differentiating EBN samples by swiftlet species origin. The two species, A. fuciphagus and A. maximus, are commonly known by locals in Malaysia as the white-nest swiftlet and black-nest swiftlet, respectively. Twenty-six variables with a significant effect (P < 0.05) were selected and are listed in Table 3(b). The Pearson correlation analysis showed strong correlations between TPC, DPPH and FRAP (0.77 ≤ r ≤ 0.84), and between color L*, color b* and chroma (0.58 ≤ r ≤ 1.00). Only TPC and color L* were retained for subsequent analysis because they had larger F-ratio values. A good correlation was also obtained between nitrite and nitrate (r = 0.62). Similarly, nitrite, with the larger F-ratio value, was retained. The 21 remaining compositional variables were then used for pattern recognition analysis.

For PCA, the first five principal components, which accounted for 90.73% of the total variance, were used to differentiate EBN samples by species origin. PC1 accounted for 47.26% of the total variance, and the following principal components accounted for 19.50%, 11.59%, 6.74%, and 5.64%, respectively. The PC1 and PC2 plots, which together accounted for 66.76% of the total variance, are shown in Figure 3. The score plot shows that EBN samples produced by A. fuciphagus and A. maximus overlapped slightly, as illustrated by the ellipses (Figure 3(a)). This indicates that EBNs from the two swiftlet species were not well differentiated. In the loading plot in Figure 3(b), variables 1, 4, 9, 10, 12, 15, 18, and 21 (color L*, TPC, threonine, leucine, valine, serine, tyrosine and sialic acid), and variables 3, 6 and 20 (ash, calcium and nitrite) showed the highest negative and positive loadings on PC1, respectively. This implies that these variables have higher values in the observations with the most negative scores (A. fuciphagus) and the most positive scores (A. maximus) on PC1, respectively. Thus, it could be inferred that EBN produced by A. fuciphagus was whiter in appearance and had higher concentrations of TPC, threonine, leucine, valine, serine, tyrosine, and sialic acid, while EBN from A. maximus had higher ash, calcium and nitrite contents. Variables 11 and 13 (phenylalanine and isoleucine) had the highest negative and positive loadings on PC2. These results suggest that the variables on PC1 contributed to differentiating EBN by A. fuciphagus and A. maximus, whereas the variability within the A. fuciphagus EBN was defined by phenylalanine and isoleucine.

Figure 3. PCA plots of EBN classification based on species origin. (a) Score plot of Aerodramus fuciphagus (AF) and Aerodramus maximus (AM). (b) Loading plot of 21 compositional variables, (1) color L*; (2) hue; (3) ash; (4) TPC; (5) mercury; (6) calcium; (7) magnesium; (8) zinc; (9) threonine; (10) leucine; (11) phenylalanine; (12) valine; (13) isoleucine; (14) methionine; (15) serine; (16) aspartic acid; (17) glutamic acid; (18) tyrosine; (19) glycine; (20) nitrite; (21) sialic acid. The gray circle indicates unit circle.

HCA was performed to confirm the clustering of EBN samples by species origin. The dendrogram (Figure 4) shows two predominant clusters, A. fuciphagus and A. maximus, obtained by cutting at a linkage distance of 8.4. The two observations from EBN sample 6, which was produced by A. fuciphagus, were incorrectly clustered with A. maximus. This result was consistent with the PCA results.

Figure 4. HCA dendrogram of EBN clustering based on species origin. The observations, AF, Aerodramus fuciphagus; AM, Aerodramus maximus; 1–13, EBN samples; a/b, replicates.

A forward stepwise LDA was used to construct a model with a high correct classification rate based on EBN species origin. A classification model with 100% classification ability was developed using 9 compositional variables. The variables that contributed most to differentiating EBN samples by A. fuciphagus and A. maximus were sialic acid, serine, phenylalanine and valine, followed by calcium, color L*, mercury, ash, and zinc, in descending order. The reliability of the LDA model was further examined using a 10-fold cross-validation. The LDA model obtained a satisfactory prediction ability of 92%. Out of the 26 observations, two observations (the two replicates of EBN sample 6) originating from A. fuciphagus were misclassified as A. maximus. This could be explained by the close relationship between A. fuciphagus and A. maximus, which belong to the same family (Apodidae) and genus (Aerodramus). Table 4 shows the abilities of the LDA model in classifying and predicting EBN by species origin.
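
As a quick check on the reported figure, and assuming that prediction ability is the fraction of the 26 observations assigned to the correct species during cross-validation, the two misclassified replicates of EBN sample 6 give:

\[ \mathrm{Prediction\ ability} = \frac{26 - 2}{26} = \frac{24}{26} \approx 0.923 \]

which rounds to the 92% stated above.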

Sialic acid, serine, phenylalanine, and valine had the most significant impact on the differentiation of A. fuciphagus and A. maximus. Sialic acid is produced by the salivary glands of swiftlets, and amino acids are coded by nucleotides containing information unique to each swiftlet. These variables are therefore closely related to the genetic information of swiftlets. Thus, it is suggested that genetic factors could be a promising alternative tool for differentiating EBN from A. fuciphagus and A. maximus.

Geographical origin differentiation

The EBN samples studied were from two geographical origins, Peninsular Malaysia and East Malaysia (Sabah and Sarawak). A dataset containing 22 observations (11 EBN samples × 2 replicates) and 45 compositional variables was used to differentiate EBN samples from Peninsular and East Malaysia. Fewer observations were used in this classification than the 26 used in the species origin classification, because the EBN samples with unknown geographical origin were excluded.

Fifteen variables having a significant effect (P < 0.05) in differentiating EBN samples by geographical origin were selected using one-way ANOVA and ranked by F-ratio value, as shown in Table 3(c). The Pearson correlation analysis revealed that TPC, DPPH and FRAP (0.82 ≤ r ≤ 0.89), and hue and color a* (r = −0.83), had strong correlations. Nitrite was also positively correlated with nitrate (r = 0.59). Following the rule of thumb of keeping the variable with the higher F-ratio value, TPC, hue, and nitrite were retained for subsequent analysis. The 11 remaining compositional variables were used for pattern recognition analysis.

PCA showed that the first two principal components, which together explained 71.37% of the total variance, were able to differentiate EBN samples by geographical origin. PC1 and PC2 explained 56.80% and 14.57% of the total variance, respectively. Figure 5 illustrates the PCA plots of PC1 and PC2. The score plot in Figure 5(a) shows a distinct separation between the Peninsular Malaysia and East Malaysia EBN samples. The EBN samples from Peninsular Malaysia were more closely grouped together than those from East Malaysia, implying higher similarity within the Peninsular Malaysia EBN. The loading plot in Figure 5(b) shows the contribution of each variable to each principal component. For PC1, variables 3, 8, 9, and 10 (ash, calcium, magnesium, and nitrite) had the highest negative loadings, and variables 1, 2, 5, 6, and 11 (color L*, hue, TPC, mercury, and sialic acid) had the highest positive loadings. These variables were used to characterize the Peninsular Malaysia and East Malaysia EBNs. The results indicate that the Peninsular Malaysia EBN appeared whiter with a slight yellowish tint and had higher concentrations of TPC, mercury, and sialic acid, while the East Malaysia EBNs had higher ash, calcium, magnesium, and nitrite contents. Variable 7 (arsenic) had the highest negative loading on PC2. These results suggest that the variables on PC1 were accountable for the separation between the Peninsular Malaysia and East Malaysia EBNs, while the high variability within the East Malaysia EBN was explained by arsenic content.
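
The percentage of variance explained by each principal component is the corresponding eigenvalue of the correlation matrix divided by the number of standardized variables. The sketch below illustrates that calculation on invented toy data using power iteration with deflation. It is an assumption that PCA was run on standardized variables, and the numbers produced here are not the 56.80% and 14.57% reported above.

```python
from statistics import mean, pstdev

def correlation_matrix(columns):
    """Correlation matrix of variables given as equal-length columns."""
    z = []
    for c in columns:
        m, s = mean(c), pstdev(c)
        z.append([(x - m) / s for x in c])
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]

def leading_eigenvalues(matrix, count, iterations=500):
    """Largest eigenvalues of a symmetric matrix by power iteration,
    removing each found component (deflation) before the next."""
    m = [row[:] for row in matrix]
    size = len(m)
    values = []
    for _component in range(count):
        v = [1.0 / (i + 1) for i in range(size)]  # arbitrary nonzero start
        for _step in range(iterations):
            w = [sum(m[i][j] * v[j] for j in range(size)) for i in range(size)]
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        value = sum(v[i] * m[i][j] * v[j]
                    for i in range(size) for j in range(size))
        values.append(value)
        m = [[m[i][j] - value * v[i] * v[j] for j in range(size)]
             for i in range(size)]
    return values

# Toy data (invented): three variables measured on six observations.
columns = [
    [5.0, 5.2, 4.8, 2.0, 2.2, 1.9],
    [4.0, 4.3, 3.9, 2.5, 2.9, 2.4],
    [1.0, 1.3, 0.9, 1.2, 1.0, 1.1],
]
eigenvalues = leading_eigenvalues(correlation_matrix(columns), count=2)
# The trace of a correlation matrix equals the number of variables.
explained = [100.0 * value / len(columns) for value in eigenvalues]
print([f"{p:.2f}%" for p in explained], f"total {sum(explained):.2f}%")
```

The cumulative figure is obtained by summing the per-component percentages, exactly as 56.80% + 14.57% = 71.37% in the analysis above.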

Figure 5. PCA plots of EBN classification based on geographical origin. (a) Score plot of Peninsular Malaysia (P) and East Malaysia (E). (b) Loading plot of 11 compositional variables: (1) color L*; (2) hue; (3) ash; (4) fat; (5) TPC; (6) mercury; (7) arsenic; (8) calcium; (9) magnesium; (10) nitrite; (11) sialic acid. The gray circle indicates the unit circle.

HCA was conducted to confirm that the EBN samples clustered according to their geographical origin. Figure 6 shows the dendrogram obtained by HCA. At a linkage distance of 6.3, two clusters were obtained. The first cluster contained predominantly Peninsular Malaysia EBN samples, while the second cluster contained East Malaysia EBN samples.
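
Cutting a dendrogram at a given linkage distance amounts to merging the closest clusters repeatedly and stopping once the nearest pair is farther apart than the cut. The sketch below shows this on invented two-dimensional points. Average linkage with Euclidean distance is an assumption (the linkage rule is not stated in this section), and the cut value of 2.0 belongs to the toy data, not to the 6.3 of the actual dendrogram.

```python
from math import dist

def cluster_at_distance(points, cut):
    """Agglomerative clustering (average linkage, Euclidean distance).
    Merging stops when the two closest clusters are farther than `cut`."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise distance between members of clusters a and b.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        d, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > cut:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Toy observations (invented): three "P" and three "E" samples.
names = ["P1", "P2", "P3", "E1", "E2", "E3"]
points = [(0.0, 0.0), (0.3, 0.0), (0.1, 0.5),
          (5.0, 5.2), (5.5, 4.8), (6.0, 5.9)]

clusters = cluster_at_distance(points, cut=2.0)
named = sorted(sorted(names[i] for i in c) for c in clusters)
print(named)
```

Lowering the cut yields more, smaller clusters, and raising it eventually merges everything into one, which is why the chosen linkage distance determines how many groups the dendrogram is read as showing.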

Figure 6. HCA dendrogram of EBN clustering based on geographical origin. Observations: P, Peninsular Malaysia; E, East Malaysia; 1–11, EBN samples; a/b, replicates.

A classification model was developed using forward stepwise LDA to classify EBN samples by geographical origin. Eight compositional variables were used in developing the LDA model, which achieved an excellent classification ability of 100%. Among these variables, arsenic and mercury contributed most significantly to the geographical origin differentiation, followed by color L*, nitrite, TPC, magnesium, hue, and ash. The significant variations in arsenic and mercury (P < 0.05) between geographical origins were probably linked to environmental conditions. Peninsular Malaysia has a high concentration of rapidly expanding industrial activities, for instance around the Straits of Malacca, one of the busiest shipping lanes in the world.[Citation52,Citation53] These activities emit toxic metal pollutants that contaminate the environment and ecosystem, including the forage of swiftlets. Swiftlets are aerial insectivores that usually forage for insects around their habitats.[Citation54] Swiftlets from Peninsular Malaysia mostly forage around industrial areas, so their diets are often enriched with toxic metal pollutants. As expected, the EBNs they produce are higher in toxic metal contents. The diets of swiftlets may thus have a significant effect on the marker variables and influence the differentiation of EBN by geographical origin. This result is consistent with the findings of Chua et al.,[Citation55] who stated that the swiftlet’s diet significantly affected the variables that differentiate EBN from different countries.

A 10-fold cross-validation was employed to examine the reliability of the LDA model in terms of its prediction ability. All 22 observations were correctly predicted, giving 100% prediction ability in cross-validation. This LDA model is therefore recommended for classifying EBN by geographical origin owing to its high reliability and validity. presents the results of classification and prediction abilities of the LDA model.
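
The idea of 10-fold cross-validation for a two-class discriminant model can be sketched as follows: each fold is held out in turn, the model is fitted on the remaining observations, and the held-out observations are predicted. This is a simplified illustration under stated assumptions: a fixed two-variable Fisher linear discriminant with equal priors stands in for the forward stepwise LDA (no variable selection is performed), folds are assigned by index rather than at random, and the 22 observations, their 11/11 class split, and the marker values are invented.

```python
from statistics import mean

def fit_lda(rows, labels):
    """Two-class, two-variable Fisher discriminant with equal priors."""
    low, high = sorted(set(labels))
    means = {}
    for c in (low, high):
        pts = [r for r, l in zip(rows, labels) if l == c]
        means[c] = (mean(p[0] for p in pts), mean(p[1] for p in pts))
    # Pooled within-class scatter matrix [[sxx, sxy], [sxy, syy]].
    sxx = sxy = syy = 0.0
    for r, l in zip(rows, labels):
        dx, dy = r[0] - means[l][0], r[1] - means[l][1]
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
    det = sxx * syy - sxy * sxy
    mx = means[high][0] - means[low][0]
    my = means[high][1] - means[low][1]
    # Discriminant direction w = S^-1 (mean_high - mean_low).
    w = ((syy * mx - sxy * my) / det, (sxx * my - sxy * mx) / det)
    # Decision threshold: projection of the midpoint of the class means.
    mid = (w[0] * (means[low][0] + means[high][0])
           + w[1] * (means[low][1] + means[high][1])) / 2
    return w, mid, low, high

def predict(model, row):
    w, mid, low, high = model
    return high if w[0] * row[0] + w[1] * row[1] > mid else low

def cross_validate(rows, labels, k=10):
    """Percentage of held-out observations predicted correctly."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(rows)) if i % k != fold]
        held_out = [i for i in range(len(rows)) if i % k == fold]
        model = fit_lda([rows[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(model, rows[i]) == labels[i] for i in held_out)
    return 100.0 * correct / len(rows)

# Invented data: 22 observations, two hypothetical marker variables.
rows, labels = [], []
for i in range(11):
    rows.append((3.0 + 0.1 * i, 1.0 + 0.05 * ((7 * i) % 11)))
    labels.append("P")
    rows.append((1.0 + 0.1 * i, 2.0 + 0.05 * ((3 * i) % 11)))
    labels.append("E")

prediction_ability = cross_validate(rows, labels, k=10)
print(f"Prediction ability: {prediction_ability:.0f}%")
```

Because the toy classes are well separated, every held-out observation is predicted correctly. With real data, the gap between classification ability (fitted on all observations) and this cross-validated prediction ability indicates how much the model depends on the particular samples used to build it.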

Conclusion

The PCA, HCA, and LDA models all classified EBN well, with good agreement and consistency among them. Comparison of the three models indicated that the supervised stepwise LDA model achieved a more effective classification of EBN with regard to reliability, time, and cost of analysis. The developed LDA models required fewer variables (8–9) and were more time-saving and cost-effective than the PCA and HCA models in identifying EBN origins. The LDA models were highly reliable and valid for EBN identification, achieving 100% classification abilities and at least 92% prediction abilities. TPC, zinc, valine, and calcium were identified as key markers for differentiating EBN by production origin; sialic acid, serine, phenylalanine, and valine for species origin; and arsenic and mercury for geographical origin. It is suggested that nutritional and chemical profiles coupled with pattern recognition analysis are a viable approach to rapidly determine EBN origins for food safety, quality control, traceability, and authenticity purposes.

Highlights

  • Rapid and accurate authentication method of edible bird’s nest (EBN) origins

  • Pattern recognition analysis based on nutritional profile and chemical composition

  • EBN can be distinguished by their production, species, and geographical origins

  • Key discriminating parameters are antioxidants, sialic acid, amino acids, and elements

  • Developed LDA model is accurate and reliable with excellent classification ability

  • First report to identify swiftlet species of EBN using chemical pattern recognition

Additional information

Funding

This work was supported by the Ministry of Education Malaysia [ERGS/1/2013/TK05/UPM/02/6].

References

  • Ireland, J. D.; Moller, A. Review of International Food Classification and Description. Journal of Food Composition and Analysis 2000, 13, 529–538. DOI: 10.1006/jfca.2000.0921.
  • Kek, S. P.; Chin, N. L.; Yusof, Y. A.; Tan, S. W.; Chua, L. S. Classification of Entomological Origin of Honey Based on Its Physicochemical and Antioxidant Properties. International Journal of Food Properties 2017, 20, Sup3. DOI: 10.1080/10942912.2017.1359185.
  • Kek, S. P.; Chin, N. L.; Yusof, Y. A.; Tan, S. W.; Chua, L. S. Classification of Honey from Its Bee Origin via Chemical Profiles and Mineral Content. Food Analytical Methods 2017, 10, 19–30. DOI: 10.1007/s12161-016-0544-0.
  • Quek, M. C.; Chin, N. L.; Yusof, Y. A.; Law, C. L.; Tan, S. W. Characterisation of Edible Bird’s Nest of Different Production, Species and Geographical Origins Using Nutritional Composition, Physicochemical Properties and Antioxidant Activities. Food Research International 2018, 109, 35–43. DOI: 10.1016/j.foodres.2018.03.078.
  • Zhang, J.; Li, D.; Lv, Q.; Ye, F.; Jing, X.; Masters, E. T.; Shimizu, N.; Abe, M.; Akihisa, T.; Feng, F. Compositions and Melanogenesis-Inhibitory Activities of the Extracts of Defatted Shea (Vitellaria Paradoxa) Kernels from Seven African Countries. Journal of Food Composition and Analysis 2018, 70, 89–97. DOI: 10.1016/j.jfca.2018.04.010.
  • Marini, F.; Balestrieri, F.; Bucci, R.; Magrì, A. L.; Marini, D. Supervised Pattern Recognition to Discriminate the Geographical Origin of Rice Bran Oils: A First Study. Microchemical Journal 2003, 74, 239–248. DOI: 10.1016/S0026-265X(03)00028-6.
  • Nurjuliana, M.; Man, Y. B. C.; Hashim, D. M.; Mohamed, A. K. S. Rapid Identification of Pork for Halal Authentication Using the Electronic Nose and Gas Chromatography Mass Spectrometer with Headspace Analyzer. Meat Science 2011, 88, 638–644. DOI: 10.1016/j.meatsci.2011.02.022.
  • Lim, C. K.; Cranbrook, E. Swiftlets of Borneo: Builders of Edible Nests; Natural History Publications (Borneo): Malaysia, 2002.
  • Chua, K. H.; Lee, T. H.; Nagandran, K.; Yahaya, N. H. M.; Lee, C. T.; Tan, T. T. E.; Aziz, R. A. Edible Bird’s Nest Extract as a Chondro-Protective Agent for Human Chondrocytes Isolated from Osteoarthritic Knee: In Vitro Study. BMC Complementary and Alternative Medicine 2013, 13, 1–9. DOI: 10.1186/1472-6882-13-1.
  • Ma, F. C.; Liu, D. C.; Dai, M. X. The Effects of the Edible Bird’s Nest on Sexual Function of Male Castrated Rats. African Journal of Pharmacy and Pharmacology 2012, 6, 2875–2879. DOI: 10.5897/AJPP12.307.
  • Guo, C. T.; Takahashi, T.; Bukawa, W.; Takahashi, N.; Yagi, H.; Kato, K.; Hidari, K. I. P. J.; Miyamoto, D.; Suzuki, T.; Suzuki, Y. Edible Bird’s Nest Extract Inhibits Influenza Virus Infection. Antiviral Research 2006, 70, 140–146. DOI: 10.1016/j.antiviral.2006.02.005.
  • Matsukawa, N.; Matsumoto, M.; Bukawa, W.; Chiji, H.; Nakayama, K.; Hara, H.; Tsukahara, T. Improvement of Bone Strength and Dermal Thickness Due to Dietary Edible Bird’s Nest Extract in Ovariectomized Rats. Bioscience, Biotechnology, and Biochemistry 2011, 75, 590–592. DOI: 10.1271/bbb.100705.
  • Kong, Y. C.; Keung, W. M.; Yip, T. T.; Ko, K. M.; Tsao, S. W.; Ng, M. H. Evidence that Epidermal Growth Factor Is Present in Swiftlet’s (Collocalia) Nest. Comparative Biochemistry and Physiology Part B: Comparative Biochemistry 1987, 87, 221–226. DOI: 10.1016/0305-0491(87)90133-7.
  • Vimala, B.; Hussain, H.; Nazaimoon, W. M. W. Effects of Edible Bird’s Nest on Tumour Necrosis Factor-Alpha Secretion, Nitric Oxide Production and Cell Viability of Lipopolysaccharide-Stimulated RAW 264.7 Macrophages. Food and Agricultural Immunology 2012, 23, 303–314. DOI: 10.1080/09540105.2011.625494.
  • Roh, K. B.; Lee, J.; Kim, Y. S.; Park, J.; Kim, J. H.; Lee, J.; Park, D. Mechanisms of Edible Bird’s Nest Extract-Induced Proliferation of Human Adipose-Derived Stem Cells. Evidence-Based Complementary and Alternative Medicine 2011, 2012, 1–11. DOI: 10.1155/2012/797520.
  • Marcone, M. F. Characterization of the Edible Bird’s Nest the “Caviar of the East”. Food Research International 2005, 38, 1125–1134. DOI: 10.1016/j.foodres.2005.02.008.
  • Huda, M. Z. N.; Zuki, A. B. Z.; Azhar, K.; Goh, Y. M.; Suhaimi, H.; Hazmi, A. J. A.; Zairi, M. S. Proximate, Elemental and Fatty Acid Analysis of Pre-Processed Edible Birds’ Nest (Aerodramus Fuciphagus): A Comparison between Regions and Type of Nest. Journal of Food Technology 2008, 6, 39–44.
  • Ma, F. C.; Liu, D. C. Extraction and Determination of Hormones in the Edible Bird’s Nest. Asian Journal of Chemistry 2012, 24, 117–120.
  • Norhayati, M. K.; Azman, O.; Nazaimoon, W. M. W. Preliminary Study of the Nutritional Content of Malaysian Edible Bird’s Nest. Malaysian Journal of Nutrition 2010, 16, 389–396.
  • Saengkrajang, W.; Matan, N.; Matan, N. Nutritional Composition of the Farmed Edible Bird’s Nest (Collocalia Fuciphaga) in Thailand. Journal of Food Composition and Analysis 2013, 31, 41–45. DOI: 10.1016/j.jfca.2013.05.001.
  • Wong, C. F.; Chan, G. K. L.; Zhang, M. L.; Yao, P.; Lin, H. Q.; Dong, T. T. X.; Li, G.; Lai, X. P.; Tsim, K. W. K. Characterization of Edible Bird’s Nest by Peptide Fingerprinting with Principal Component Analysis. Food Quality and Safety 2017, 1, 83–92. DOI: 10.1093/fqsafe/fyx002.
  • Legin, A.; Rudnitskaya, A.; Lvova, L.; Vlasov, Y.; Natale, C. D.; D’amico, A. Evaluation of Italian Wine by the Electronic Tongue: Recognition, Quantitative Analysis and Correlation with Human Sensory Perception. Analytica Chimica Acta 2003, 484, 33–44. DOI: 10.1016/S0003-2670(03)00301-5.
  • Berrueta, L. A.; Alonso-Salces, R. M.; Héberger, K. Supervised Pattern Recognition in Food Analysis. Journal of Chromatography A 2007, 1158, 196–214. DOI: 10.1016/j.chroma.2007.05.024.
  • Chai, K. C.; Jong, C. H.; Tay, K. M.; Lim, C. P. A Perceptual Computing-Based Method to Prioritize Failure Modes in Failure Mode and Effect Analysis and Its Application to Edible Bird Nest Farming. Applied Soft Computing 2016, 49, 734–747. DOI: 10.1016/j.asoc.2016.08.043.
  • Chang, W. L.; Tay, K. M.; Lim, C. P. Clustering and Visualization of Failure Modes Using an Evolving Tree. Expert Systems with Applications 2015, 42, 7235–7244. DOI: 10.1016/j.eswa.2015.04.036.
  • Tay, K. M.; Jong, C. H.; Lim, C. P. A Clustering-Based Failure Mode and Effect Analysis Model and Its Application to the Edible Bird Nest Industry. Neural Computing and Applications 2015, 26, 551–560. DOI: 10.1007/s00521-014-1647-4.
  • Jong, C. H.; Tay, K. M.; Lim, C. P. Application of the Fuzzy Failure Mode and Effect Analysis Methodology to Edible Bird Nest Processing. Computers and Electronics in Agriculture 2013, 96, 90–108. DOI: 10.1016/j.compag.2013.04.015.
  • Boutros, P. C.; Okey, A. B. Unsupervised Pattern Recognition: An Introduction to the Whys and Wherefores of Clustering Microarray Data. Briefings in Bioinformatics 2005, 6, 331–343.
  • Shi, J. Y.; Zhang, F.; Li, Z. H.; Huang, X. W.; Zou, X. B.; Zhang, W.; Holmes, M.; Chen, Y. Rapid Authentication of Indonesian Edible Bird’s Nests by Near-Infrared Spectroscopy and Chemometrics. Analytical Methods 2017, 9, 1297–1306. DOI: 10.1039/C6AY03352K.
  • Guo, L.; Wu, Y.; Liu, M.; Wang, B.; Ge, Y.; Chen, Y. Determination of Edible Bird’s Nests by FTIR and SDS-PAGE Coupled with Multivariate Analysis. Food Control 2017, 80, 259–266. DOI: 10.1016/j.foodcont.2017.05.007.
  • Pizarro, C.; Rodríguez-Tecedor, S.; Pérez-del-Notario, N.; González-Sáiz, J. M. Recognition of Volatile Compounds as Markers in Geographical Discrimination of Spanish Extra Virgin Olive Oils by Chemometric Analysis of Non-Specific Chromatography Volatile Profiles. Journal of Chromatography A 2011, 1218, 518–523. DOI: 10.1016/j.chroma.2010.11.045.
  • Manzanares, A. B.; García, Z. H.; Galdón, B. R.; Rodríguez, E. R.; Romero, C. D. Differentiation of Blossom and Honeydew Honeys Using Multivariate Analysis on the Physicochemical Parameters and Sugar Composition. Food Chemistry 2011, 126, 664–672. DOI: 10.1016/j.foodchem.2010.11.003.
  • Rodríguez, N.; Ortiz, M. C.; Sarabia, L.; Gredilla, E. Analysis of Protein Chromatographic Profiles Joint to Partial Least Squares to Detect Adulterations in Milk Mixtures and Cheeses. Talanta 2010, 81, 255–264. DOI: 10.1016/j.talanta.2009.11.067.
  • Fasolato, L.; Novelli, E.; Salmaso, L.; Corain, L.; Camin, F.; Perini, M.; Antonetti, P.; Balzan, S. Application of Nonparametric Multivariate Analyses to the Authentication of Wild and Farmed European Sea Bass (Dicentrarchus Labrax). Results of a Survey on Fish Sampled in the Retail Trade. Journal of Agricultural and Food Chemistry 2010, 58, 10979–10988.
  • Bellomarino, S. A.; Parker, R. M.; Conlan, X. A.; Barnett, N. W.; Adams, M. J. Partial Least Squares and Principal Components Analysis of Wine Vintage by High Performance Liquid Chromatography with Chemiluminescence Detection. Analytica Chimica Acta 2010, 678, 34–38. DOI: 10.1016/j.aca.2010.08.021.
  • Quek, M. C. Pattern Recognition Models for Identification of Malaysian Edible Bird’s Nest Origin [Thesis]; Universiti Putra Malaysia: Malaysia, 2017.
  • AOAC. Official Methods of Analysis of AOAC International, 17th ed.; Association of Analytical Communities: Gaithersburg, Maryland, USA, 2000.
  • Zhang, Q.; Zhang, J.; Shen, J.; Silva, A.; Dennis, D. A.; Barrow, C. J. A Simple 96-Well Microplate Method for Estimation of Total Polyphenol Content in Seaweeds. Journal of Applied Phycology 2006, 18, 445–450. DOI: 10.1007/s10811-006-9048-4.
  • Thaipong, K.; Boonprakob, U.; Crosby, K.; Cisneros-Zevallos, L.; Byrne, D. H. Comparison of ABTS, DPPH, FRAP, and ORAC Assays for Estimating Antioxidant Activity from Guava Fruit Extracts. Journal of Food Composition and Analysis 2006, 19, 669–675. DOI: 10.1016/j.jfca.2006.01.003.
  • Azilawati, M. I.; Hashim, D. M.; Jamilah, B.; Amin, I. Validation of a Reverse-Phase High-Performance Liquid Chromatography Method for the Determination of Amino Acids in Gelatins by Application of 6-Aminoquinolyl-N-Hydroxysuccinimidyl Carbamate Reagent. Journal of Chromatography A 2014, 1353, 49–56. DOI: 10.1016/j.chroma.2014.04.050.
  • Malaysian Standard MS 2509:2012 (P). Test Method for Edible-Birdnest (EBN)-Determination of Nitrite (NO2) and Nitrate (NO3) Contents; Department of Standards Malaysia: Selangor, Malaysia, 2012.
  • Paydar, M.; Wong, Y. L.; Wong, W. F.; Hamdi, O. A. A.; Kadir, N. A.; Looi, C. Y. Prevalence of Nitrite and Nitrate Contents and Its Effect on Edible Bird Nest’s Color. Journal of Food Science 2013, 78, T1940–T1947. DOI: 10.1111/1750-3841.12313.
  • Wang, H.; Ni, K. Y.; Wang, Y. Determination of Sialic Acid in Edible Bird’s Nest. Chinese Journal of Pharmaceutical Analysis 2006, 26, 1251–1253.
  • Marini, F.; Magrì, A. L.; Balestrieri, F.; Fabretti, F.; Marini, D. Supervised Pattern Recognition Applied to the Discrimination of the Floral Origin of Six Types of Italian Honey Samples. Analytica Chimica Acta 2004, 515, 117–125. DOI: 10.1016/j.aca.2004.01.013.
  • Kaiser, H. F. The Varimax Criterion for Analytic Rotation in Factor Analysis. Psychometrika 1958, 23, 187–200. DOI: 10.1007/BF02289233.
  • Sacco, A.; Brescia, M. A.; Liuzzi, V.; Reniero, F.; Guillou, G.; Ghelli, S.; Meer, P. Characterization of Italian Olive Oils Based on Analytical and Nuclear Magnetic Resonance Determinations. Journal of the American Oil Chemists’ Society 2000, 77, 619–625. DOI: 10.1007/s11746-000-0100-y.
  • Abdi, H.; Williams, L. J. Principal Component Analysis. Wiley Interdisciplinary Reviews: Computational Statistics 2010, 2, 433–459. DOI: 10.1002/wics.101.
  • Møller, S. F.; Frese, J. V.; Bro, R. Robust Methods for Multivariate Data Analysis. Journal of Chemometrics 2005, 19, 549–563. DOI: 10.1002/(ISSN)1099-128X.
  • Tistaert, C.; Dejaegher, B.; Heyden, Y. V. Chromatographic Separation Techniques and Data Handling Methods for Herbal Fingerprints: A Review. Analytica Chimica Acta 2011, 690, 148–161. DOI: 10.1016/j.aca.2011.02.023.
  • Antonogeorgos, G.; Panagiotakos, D. B.; Priftis, K. N.; Tzonou, A. Logistic Regression and Linear Discriminant Analyses in Evaluating Factors Associated with Asthma Prevalence among 10- to 12-Years-Old Children: Divergence and Similarity of the Two Statistical Methods. International Journal of Pediatrics 2009, 952042, 1–6. DOI: 10.1155/2009/952042.
  • Seow, E. K.; Ibrahim, B.; Muhammad, S. A.; Lee, L. H.; Cheng, L. H. Differentiation between House and Cave Edible Bird’s Nests by Chemometric Analysis of Amino Acid Composition Data. LWT-Food Science and Technology 2016, 65, 428–435. DOI: 10.1016/j.lwt.2015.08.047.
  • Hajeb, P.; Jinap, S.; Ismail, A.; Mahyudin, N. A. Mercury Pollution in Malaysia. Reviews of Environmental Contamination & Toxicology 2012, 220, 45–66.
  • Shazili, N. A. M.; Yunus, K.; Ahmad, A. S.; Abdullah, N.; Rashid, M. K. A. Heavy Metal Pollution Status in the Malaysian Aquatic Environment. Aquatic Ecosystem Health & Management 2006, 9, 137–145. DOI: 10.1080/14634980600724023.
  • Lourie, S. A.; Tompkins, D. M. The Diets of Malaysian Swiftlets. Ibis 2000, 142, 596–602. DOI: 10.1111/j.1474-919X.2000.tb04459.x.
  • Chua, Y. G.; Bloodworth, B. C.; Leong, L. P.; Li, S. F. Y. Metabolite Profiling of Edible Bird’s Nest Using Gas Chromatography/Mass Spectrometry and Liquid Chromatography/Mass Spectrometry. Rapid Communications in Mass Spectrometry 2014, 28, 1387–1400. DOI: 10.1002/rcm.6914.