Research Article

The application of statistical and novel unsupervised machine learning methodology to forensic hair analysis

Received 31 Dec 2023, Accepted 10 Apr 2024, Published online: 22 Apr 2024

ABSTRACT

Hair colour is a valuable feature in forensic hair analysis and hair comparisons. Five hundred microscopic images of hair shafts were taken and, for each image, values for three colour models (RGB, CIE XYZ and CIE L*a*b*) were determined. The discriminating power of the three colour models was evaluated using statistical and unsupervised machine learning methods. Additionally, the discriminating power of the images of the same hair shafts was evaluated using unsupervised machine learning methods. All methods were compared to determine which had the greatest discriminating power, best distinguished between participants, most accurately assigned hair to an individual and could best assist in forensic hair comparisons. The RGB colour model demonstrated the highest discriminating power of the colour model values, while the CIE L*a*b* model achieved complete discrimination with a reduced number of participants. The unsupervised k-means model yielded similar results. Unsupervised PCA/k-means and agglomerative clustering models demonstrated low discriminating power for the images, suggesting the existence of additional features within the data beyond colour. This research highlights the significance of incorporating colour values in forensic hair comparisons and the need for further exploration of other hair features beyond colour.

1. Introduction

Subjective assessments, inherent in forensic methodology, can create a vulnerability in forensic examination, as such assessments are open to varied interpretation and susceptible to bias, opinion and belief as opposed to fact and empirical evidence. Forensic scientists have acknowledged this vulnerability and the need for more objective feature-comparison methodology [1]. Forensic hair analysis requires the macroscopic and microscopic assessment of hair features for routine analysis and detailed morphological comparisons [2,3]. Hair features that can be objectively measured and assessed as discrete variables include hair length, shaft diameter, medullary index, scale count index and nuclear staining [2,4]. Of these features, unusual hair lengths have been reported to increase the evidential value of hair evidence, whereas the medullary index and scale count index are the least useful characteristics except in non-human hair analysis [5]. In contrast, hair colour, although subjectively categorized, has been reported as the most valuable characteristic [5]. Forensic hair analysis uses five nominal descriptive hair colour categories: colourless, yellow, brown, reddish and black [2]. Light, medium and dark groupings can also be used to characterize hair colour further [2]. The use of descriptive categories is prone to discrepancies: hair analysts have reported using fewer descriptive categories to minimize subjectivity, but more descriptive categories to allow greater discrimination [6]. Given the inherently subjective nature of forensic hair analysis, a comprehensive evaluation of methodologies and more recent technologies may provide the most objective approach to forensic hair analysis.

For hair colour to be assessed more objectively, it can be assigned numerical coordinates within a colour space. Colour spaces are mathematical geometric spaces used to describe and standardize colour [7]. The use of colour coordinates is of particular value given that subjective visual characterization of hair colour can be inadvertently influenced by lighting, contamination and hair properties [2]. The RGB (red, green, blue) colour model is an additive colour space that defines colour by the proportional mixing of three monochromatic spectra of light [7]. Every colour can be recorded as a set of three numbers (from 0 to 255) that describe the saturation of each component colour [8]. The CIE (Commission Internationale de l'Eclairage) is an internationally recognized standard of colour and renowned as the basis of colourimetry [7]. The CIE XYZ colour model represents colours as three chromaticity coordinates corresponding to three theoretical primary colours (XYZ), or portions of the three primary colours (xyz), and can account for brightness/illumination but not colour intensity [7]. The later CIE L*a*b* colour model represents colours as three coordinates for luminance (L*), red to green (a*) and yellow to blue (b*), and is the most accurate representation of colour [7]. Research into the use of colour models to establish more objective forensic applications includes chromatogram scanning and the generation of RGB colour model values, with a discriminating power of 92.8%, to differentiate blue ballpoint pen inks for objective document examination [9], and the generation of mean RGB colour model values from digital images of uranium oxide powder to determine heating temperatures for objective nuclear forensic analysis [10]. In the context of forensic hair analysis, an evaluation of the RGB and CIE colour models aimed to determine which colour model best distinguished between participants with brown hair and most accurately assigned hair to an individual [11,12]. Mean colour model values were generated from montaged microscopic images of hair segments [11,12], and statistical analysis was conducted to assess the power of each colour model [11,12]. When discrimination between more than two groups is sought using data with similar features, canonical discriminant analysis (CDA) is used to determine how best to discriminate groups of individuals based on quantitative measurements [13]. CDA involves the linear combination of variables to derive an axis of discrimination between the predefined groups and to determine whether the mean values of the groups along that axis differ significantly [13]. Furthermore, CDA computes discriminant functions that maximize the ratio of between-group variance to within-group variance and are used to classify new samples into groups [14,15]. Brooks (2007) found all three colour models distinguished hairs into two groups (light or dark), and the CIE XYZ colour model best assigned hair to an individual, 88.9% of the time. These results were encouraging for the use of colour models to objectively describe, compare and distinguish hair. In addition to CDA, machine learning (ML) can be used to evaluate the discriminating power of the three colour models. Unsupervised ML is an exploratory approach in which models are trained on unlabelled data to uncover otherwise unseen groups within that data [16]. The application of unsupervised ML is a novel approach to discriminating colour model values from microscopic images of hairs; in this research the k-means clustering algorithm was used [17].
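As a minimal illustration of the three colour models, the sketch below expresses a single arbitrary brown pixel in RGB, CIE XYZ and CIE L*a*b* coordinates using the scikit-image library; the pixel value is an assumption chosen for demonstration only, not a value from the study.

```python
# Illustrative sketch (not from the study): one arbitrary brown pixel
# expressed in all three colour models with scikit-image.
import numpy as np
from skimage import color

rgb = np.array([[[101, 67, 33]]]) / 255.0   # 8-bit sRGB scaled to [0, 1]

xyz = color.rgb2xyz(rgb)   # sRGB -> linear RGB -> CIE XYZ
lab = color.rgb2lab(rgb)   # sRGB -> CIE XYZ -> CIE L*a*b*

print("RGB:    ", rgb.ravel())
print("CIE XYZ:", xyz.ravel())
print("CIE Lab:", lab.ravel())
```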

Unsupervised k-means clustering of colour model values can similarly be used to determine the power to assign hair to an individual. CDA differs from k-means clustering in that CDA discriminates between predefined groups (participants), while k-means clustering is an unsupervised method that discriminates groups without predefined group labels [17]. CDA reduces dimensionality by creating canonical variates, whereas k-means clustering indirectly reduces dimensionality by assigning data points to clusters [14,18,19]. The k-means clustering algorithm partitions variables into k distinct groups or clusters [17,18]. The fit function trains the model on a predetermined number of clusters by iteratively assigning variables to the nearest centroid and updating the centroids until convergence is reached [19,20]. The predict function uses the trained model's stored cluster centroids and corresponding group labels to classify new variables [19,20]. In addition to the use of colour model values, unsupervised ML can be used to distinguish images [21].
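The fit/predict cycle described above can be sketched minimally with scikit-learn's KMeans on toy two-dimensional points; the points are illustrative, not the study data.

```python
# Conceptual sketch of the k-means fit/predict cycle on toy 2-D points.
import numpy as np
from sklearn.cluster import KMeans

points = np.array([[0.1, 0.2], [0.2, 0.1],    # one apparent cluster
                   [0.9, 0.8], [1.0, 0.9]])   # a second apparent cluster

# fit: iteratively assign points to the nearest centroid and update the
# centroids until convergence, for a predetermined k.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)

# predict: classify new points using the stored cluster centroids.
new_points = np.array([[0.15, 0.15], [0.95, 0.85]])
print(model.predict(new_points))   # two different labels, e.g. [1 0]
```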

Computer vision, a subset of ML, involves training models on features such as colour, brightness, contrast, outline and shape to recognize and classify (label) images [22,23]. This process mirrors human judgement based on learning and experience [22]. Forensic disciplines such as anthropology, ballistics, odontology and pathology have developed computer vision models, for example for age, ethnicity and gender determinations from images of bone features and range determinations from images of shotgun patterns, with mixed results [24]. While these models have been developed using images of distinct features, few have used microscopic features. Notably, HairNet is a computer vision model trained on microscopic images of hair for routine forensic hair analysis [25]. HairNet classifies low-magnification images of hair shafts and hair roots for human/non-human classification and suitability for DNA analysis with high accuracy, attributed to the application of transfer learning and data augmentation [25]. Extending HairNet to high-magnification image classification is a novel approach and important for determining the extent to which computer vision models can accurately analyze and categorize the detail and variability inherent in high-magnification images. Principal component analysis (PCA) is a dimensionality reduction technique whereby patterns and correlations are extracted from unlabelled data [21,26]. The number of features or labels found is referred to as the dimensionality of the data [26]. Dimensionality reduction reduces the data to the most relevant features with minimal information loss [18]. The PCA algorithm extracts the features with the most variance as the independent principal components [26]. The fit and transform functions can call different clustering algorithms, such as k-means clustering and agglomerative clustering [26]. The agglomerative clustering algorithm uses hierarchical clustering [26]. The clustering of images proceeds by pre-processing the images, extracting features and evaluating the optimal features [26].
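A minimal sketch of PCA dimensionality reduction over flattened images is shown below; the folder path is hypothetical and the 95% variance threshold is our assumption, not a setting from the study.

```python
# Sketch: flatten images into pixel-feature rows and reduce with PCA.
import glob
import numpy as np
from skimage import io
from sklearn.decomposition import PCA

files = sorted(glob.glob("cropped/*.png"))   # hypothetical folder of RGB images
X = np.array([io.imread(f)[..., :3].ravel() / 255.0 for f in files])

pca = PCA(n_components=0.95)   # keep components explaining 95% of variance
X_reduced = pca.fit_transform(X)
print(X.shape, "->", X_reduced.shape)
```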

This research had two primary objectives: firstly, to assess the discriminating power of hair colour model values, obtained from microscopic images of hair shafts, in distinguishing between participants and accurately attributing hair to an individual. The change in the discriminating power of the statistical methods was also assessed by removing groups through backward group elimination, which improves group prediction by reducing the complexity of the data and improving the generalization of the model [27,28]. Secondly, to assess the accuracy of assigning hair to a participant based solely on microscopic images of hair shafts. These objectives were assessed with CDA, backward group elimination and unsupervised ML models. The application of unsupervised ML both to colour model values derived from microscopic images of hair and to the microscopic images of hair shafts themselves is a novel approach to forensic hair analysis. All methods were evaluated to determine potential applications within forensic science.

2. Materials and methods

Hairs were used from a previously collected sample set (UC Ethics Committee Project ID: 1771). A total of 100 hairs were used, consisting of five scalp hairs from each of 20 participants. The hairs were mounted onto glass slides in Histomount mounting medium and sealed with glass coverslips. Hair shafts were visualized using a Leica DM2500 compound light microscope at high (400×) magnification and imaged with a Leica digital camera and Leica imaging software. Images ranged in size from 1–6 KB and were of 300 × 300 pixel resolution. Brown hair was selected given the amount of variation in brown hair colour. The hairs were imaged at five sections along each hair shaft. A total of 500 white-balanced images were taken (25 images from 5 hairs from each of the 20 participants). Each image was cropped to include only the hair shaft using the imaging tool GIMP [29] (Figure 1). All images were numbered with the convention participant number_hair number_region along hair; for example, the image of the fifth participant's first hair at the second region along that hair was numbered 5_1_2. The visual perception of each participant's hair as light brown, medium brown or dark brown and the microscopic perception of hair as light yellow, light brown, medium yellow, medium brown, dark red brown or dark brown was noted and is outlined in Table 1. Only the images and the extracted image values were used in the analysis. In addition to the results from each analysis being compared to the true labels, the results were also evaluated against the subjective perception of each participant's hair colour as an additional layer of analysis.
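The naming convention can be parsed back into its components with a small helper; this is our illustrative snippet, not part of the study's scripts.

```python
# Parse the participant_hair_region naming convention, e.g. "5_1_2"
# (illustrative helper, not from the study).
def parse_image_name(name: str) -> dict:
    participant, hair, region = (int(part) for part in name.split("_"))
    return {"participant": participant, "hair": hair, "region": region}

print(parse_image_name("5_1_2"))
# {'participant': 5, 'hair': 1, 'region': 2}
```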

Figure 1. Example of final cropped images of light, medium and dark brown hair.


Table 1. Hair sample perceived colours.

2.1. Canonical discriminant analysis – colour model values

Mean RGB, CIE XYZ and CIE L*a*b* values for each image were calculated with a custom Python script utilizing image processing from the Python scikit-image library [30]. The script read the hair images from a file, converted each 8-bit sRGB image to floating point values, linearized the sRGB values and converted the linear RGB values to CIE XYZ and then CIE L*a*b* values. The script wrote the results to an output file. Canonical discriminant analysis of the output was conducted in SPSS [31] to assess the discriminating power of the three colour models. The independent variables (predictors) were the three mean colour model values calculated for each image, and the dependent variable (group) was the participant (25 images from each of 20 participants, 500 images in total). The discriminating power of the three colour models was further assessed by backward group elimination.
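A minimal sketch of this extraction step is given below, reconstructed with scikit-image rather than taken from the authors' script; the input folder and output filename are assumptions.

```python
# Reconstruction sketch (not the authors' script): compute per-image mean
# RGB, CIE XYZ and CIE L*a*b* values and write them to a CSV file.
import csv
import glob
import numpy as np
from skimage import io, color
from skimage.util import img_as_float

with open("colour_values.csv", "w", newline="") as out:   # hypothetical output
    writer = csv.writer(out)
    writer.writerow(["image", "R", "G", "B", "X", "Y", "Z", "L", "a", "b"])
    for path in sorted(glob.glob("cropped/*.png")):       # hypothetical folder
        rgb = img_as_float(io.imread(path)[..., :3])      # 8-bit sRGB -> float
        xyz = color.rgb2xyz(rgb)                          # -> CIE XYZ
        lab = color.rgb2lab(rgb)                          # -> CIE L*a*b*
        means = np.concatenate([rgb.mean(axis=(0, 1)),
                                xyz.mean(axis=(0, 1)),
                                lab.mean(axis=(0, 1))])
        writer.writerow([path] + [f"{v:.4f}" for v in means])
```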

2.2. Unsupervised machine learning – colour model values

A k-means clustering unsupervised model was derived as a Python script utilizing the Python scikit-image library [30]. The script loaded the previously calculated colour model values of each hair image as unlabelled values into an array, and the k-means clustering model was implemented with the same number of clusters as participants. The model was trained to fit and predict the hairs to the optimal cluster centres (labels) assigned by the model. The script wrote the results to an output file. The labels assigned by the model to the unlabelled values were compared with their true labels (participants) to assess the discriminating power of unsupervised ML applied to the three colour models. The discriminating power of the three colour models was similarly assessed with the same groups as used in the backward group elimination.
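A minimal sketch of this step follows, assuming scikit-learn's KMeans (a general-purpose k-means implementation; the paper itself cites scikit-image) and a majority-vote scoring of clusters against the true labels; the file layout and scoring scheme are our assumptions.

```python
# Sketch (our assumptions): cluster mean colour values with k-means and
# score each cluster against the true participant labels by majority vote.
import numpy as np
from sklearn.cluster import KMeans

values = np.loadtxt("rgb_values.csv", delimiter=",")   # hypothetical (500, 3) file
participants = np.repeat(np.arange(20), 25)            # 25 images per participant

clusters = KMeans(n_clusters=20, n_init=10,            # one cluster per participant
                  random_state=0).fit_predict(values)

# Each cluster is credited with its most common participant.
correct = sum(np.bincount(participants[clusters == c]).max()
              for c in np.unique(clusters))
print(f"accuracy: {correct / len(participants):.1%}")
```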

2.3. Unsupervised machine learning – images

A PCA unsupervised model was derived as a Python script utilizing the Python clustimage library [26]. The script loaded and extracted the unlabelled images by reshaping the images, converting the image pixel values to an array, normalizing the pixel values within the array and reducing the array by PCA. The script loaded the array, and the model was instantiated with the same number of clusters as participants. The model was trained to fit and predict clusters (labels) in the image data with both the k-means clustering algorithm and the agglomerative clustering algorithm. The script wrote the results to an output file. The labels assigned by the model to the images were compared with their true labels (participants) to assess the clustering of the unsupervised ML methods. Clustering was similarly assessed with the same groups as used in the backward group elimination CDA.
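A minimal sketch of this image pipeline is given below using scikit-learn equivalents (PCA followed by k-means and agglomerative clustering) rather than the clustimage API itself; the paths, preprocessing and PCA dimension are assumptions, not the authors' configuration.

```python
# Sketch with scikit-learn equivalents of the clustimage pipeline
# (our assumptions; not the authors' clustimage settings).
import glob
import numpy as np
from skimage import io
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, AgglomerativeClustering

files = sorted(glob.glob("cropped/*.png"))             # hypothetical folder
X = np.array([io.imread(f)[..., :3].ravel() / 255.0 for f in files])

X_pca = PCA(n_components=50).fit_transform(X)          # reduce dimensionality

# Cluster the reduced features with both algorithms, using 6 clusters
# for the 6 retained participants.
kmeans_labels = KMeans(n_clusters=6, n_init=10,
                       random_state=0).fit_predict(X_pca)
agglo_labels = AgglomerativeClustering(n_clusters=6).fit_predict(X_pca)
```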

3. Results

3.1. Canonical discriminant analysis – colour model values

CDA was conducted to assess the discriminating power of the three colour model values in distinguishing between 20 participants and accurately assigning hair to each participant. For the RGB colour model, two discriminant functions accounted for 93.1% of the between-group variability and, from the cross-validated classifications, 63.2% of hairs were correctly classified. For the CIE XYZ and CIE L*a*b* models, two discriminant functions accounted for 87.8% and 92.7% of the between-group variability and 60.8% and 63.0% of the hairs were correctly classified, respectively. For each colour model, Box's M was significant (p < 0.001) and the assumption of equality of covariance matrices was therefore rejected. In instances where Box's M is significant and there are more than three groups, the rejection of the null hypothesis can be accepted, as it indicates that variance differs between the groups. The removal of the groups with the greatest within-group variance by backward group elimination was conducted from 20 down to 6 participants to improve group prediction by reducing the complexity of the data.

The corresponding visual perception of participants' hair as light brown, medium brown or dark brown and the microscopic observations of hair as light yellow, light brown, medium yellow, medium brown, dark red brown or dark brown for the iterations of 16, 12 and 6 participants following backward group elimination are outlined in Table 2. Of note, the final six participants included two of each visual perception of hair colour and one of each microscopic observation of hair colour.

Table 2. Hair colours following backward group elimination.

For each of the colour models, the percentage of between-group variability accounted for by the functions and the percentage of cross-validated classifications are outlined in Table 3.

Table 3. CDA – colour model values.

The only colour model able to achieve 100% discrimination was the CIE L*a*b* colour model, with 6 participants. The colour model with the greatest discriminating power overall was the RGB colour model. The canonical discriminant functions for the colour model values of each hair image for the 6 participants showing the greatest discrimination are represented in Graphs 1–3.

Graph 1. Canonical discriminant functions – RGB colour model values.


Graph 2. Canonical discriminant functions – CIE XYZ colour model values.


Graph 3. Canonical discriminant functions – CIE L*a*b* colour model values.


For all colour models, as the groups with the greatest within-group variance were removed during backward group elimination, the percentage of variability accounted for by the discriminant functions increased, as did the number of correct classifications. The increase in correct classifications can be attributed to colour differences becoming more distinct, with within-group variability decreasing while between-group variability increased. The high accuracy obtained could be due to underlying features in the data attributable to the visual difference in hair colour as light, medium or dark, to differences in the microscopic observations of hair colour (light yellow, light brown, medium yellow, medium brown, dark red brown and dark brown) or to other unknown features. An unsupervised ML model was applied to the same, but unlabelled, values to assess and compare the discriminating power of the three colour models.

3.2. Unsupervised machine learning – colour model values

The unsupervised learning model was trained to fit and predict hairs to labels assigned by the model using k-means clustering, and the predictions were then compared with their true labels (participants). Each iteration of the model was implemented with the same number of clusters as participants. The power to distinguish between participants and accurately assign hair to each participant was determined with the same groups as the CDA. For each of the colour models, the percentages of accurate classifications are outlined in Table 4.

Table 4. Unsupervised ML – colour model values.

The results from the unsupervised learning model were similar to those obtained with the CDA. The greatest accuracy was 92.0%, for the RGB colour model with 6 participants. For all colour models, as the number of participants decreased, the percentage of correct classifications increased, as expected. That the unlabelled data groupings aligned with the predefined groups confirms the discriminant functions from the CDA and supports the significance and robustness of the features within the data. Additionally, the power to distinguish hair as light, medium or dark was determined with the group of 6 participants; for this analysis, the unsupervised learning model was instantiated with three clusters. The percentage of accurate classifications was 97.3% for the RGB colour model, 96.0% for the CIE XYZ colour model and 100% for the CIE L*a*b* colour model. That the unlabelled data groupings aligned with the light, medium and dark perception of hair supports the presence of these features within the data. An unsupervised learning model was then applied to the actual images.

3.3. Unsupervised learning – images

The PCA model was trained to extract, fit and predict images of hairs to labels assigned by the model using k-means clustering and agglomerative clustering. The clusters were compared with their true labels (participants). Clustering was determined for the 6 participants that gave the greatest discriminating power in the CDA and in the unsupervised k-means clustering model previously used. The PCA model was instantiated with 6 clusters. The percentages of accurate classifications for the k-means and agglomerative algorithms are outlined in Table 5.

Table 5. Unsupervised ML – images.

The percentage of accurate classifications by the unsupervised learning model applied to the actual images was significantly lower than that of the two methods used with the three colour model values. The greatest accuracy was 76.0%, for the PCA k-means model with 6 participants. That the unlabelled image data did not align with the predefined groups derived from the values indicates additional features in the data not attributable to colour discrimination. The lack of alignment implies greater complexity and multi-dimensionality in the images relative to the colour model values for forensic hair classification. Hair features in addition to colour, such as pigment and pigment features, may contribute to this difference, and further analysis of these features is necessary.

4. Discussion

Hair features have been assessed for their potential as objective measures in forensic hair analysis [11]. Shaft length and profile, colour, pigmentation features, medulla features and the presence of ovoid bodies, cortical fusi and cortical texture are such features observed and used in hair analysis [2]. Hair colour, though subjectively categorized, has been reported as the most valuable characteristic [5] and was the hair feature explored here.

This research aimed, firstly, to evaluate the efficacy of hair colour model values, derived from microscopic images of hair shafts, in distinguishing between individuals and accurately attributing hair to individuals, and secondly, to assess the precision of assigning hair to a specific individual based solely on microscopic images of hair shafts. These objectives were assessed with CDA, backward group elimination and unsupervised ML. The application of unsupervised ML to colour model values and microscopic images of hair is a novel approach to forensic hair analysis. This research holds significant importance in determining the capacity of computer vision models to effectively analyze and categorize the intricate details and variations present in high-magnification images.

4.1. Colour model values

The RGB colour model was found to have the greatest between-group variability and discriminating power of the colour model values. The high between-group variability obtained could be attributed to the visual colour classifications of the brown hair colour as light, medium and dark. Our findings complement previous research that found all three colour models similarly categorized brown hairs into light or dark groups and that the colour models had relative accuracy in assigning hair to individuals [16,17]. The CIE L*a*b* model achieved complete discrimination, but with a significantly reduced dataset. The increase in correct classifications can be attributed to a reduction in within-group variability and an increase in between-group variability. Of note, the final six groups included two groups of each of the visually classified hair colours (light, medium and dark brown) and one group of each of the microscopically classified hair colours (light yellow and brown, medium yellow and brown, dark red brown and dark brown). Greater diversity in colour representation is therefore essential for ensuring more accurate colour classifications based on colour model values, as supported by research into hair colour that found differentiation between hair samples from individuals of diverse ancestries when using RGB colour model values derived from digital microscopy of hair [32]. Additionally, research into hair colour comparing digital images to reflective spectrophotometry found that accuracy declined with an increasing number of clusters, indicating that the application of hair colour model values has similar limitations [33]. An unsupervised learning model that used k-means clustering yielded results similar to the CDA of the colour model values. This convergence of results confirms the significance of hair colour values, highlighting their ability to distinguish between participants and, to a lesser extent, to assign hair to individuals, and the robustness of colour model values with fewer groups.

4.2. Colour images

The PCA unsupervised model that used k-means clustering and agglomerative clustering produced significantly reduced discrimination, even with only six participants. The difference in these results suggests the existence of additional features in the data that are not solely attributable to colour. The misalignment of the unlabelled images with the predefined groups also suggests greater complexity and multi-dimensionality of features within the images. Similar complexity was found in research applying spectral imaging of hair, using a SpectraCube system, to the identification of individuals [34]. The results indicated that individuals whose hair could not be differentiated based on hair morphology also could not be accurately distinguished, and that perceived hair colour is a composite of many naturally distinct individual colours [34].

This research has highlighted the complexity of forensic hair analysis, including the integration of features and the alignment of different methodologies. The application of unsupervised ML to colour model values derived from microscopic images of hair, and to the microscopic images of hair shafts themselves, is a novel approach to forensic hair analysis. The assessment of these methods to distinguish between participants and accurately assign hair to an individual is a step towards more objective methodology in forensic hair analysis. Beyond colour, the discriminative power of other hair features, such as pigment and pigment-related characteristics, should be determined. Ultimately, the integration of colour model values with other relevant hair features would greatly enhance discrimination accuracy and provide more objective forensic methodology.

Ethics statement

This research was conducted in accordance with the principles embodied in the Declaration of Helsinki and with local statutory requirements. Human hairs were used from a previously collected sample set (UC Ethics Committee Project ID: 1771).

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

Due to the size of the image dataset, please contact the corresponding author for access to the data related to this research.

Additional information

Funding

This research forms part of a PhD for which the first author is the candidate and was supported by an Australian Postgraduate Award granted by the University of Canberra.

References