Abstract
Forensic ink analysts are frequently asked to determine the source of questioned ink entries. Ultra-performance liquid chromatography (UPLC) is useful for profiling the organic compounds of pen inks. However, interpreting UPLC chromatograms requires statistical approaches to ensure the objectivity and reliability of the results. This work compares the performance of univariate and multivariate statistical techniques in determining the source of ballpoint pen inks based on UPLC chromatograms. Two sets of UPLC chromatograms, comprising four blue and black ballpoint pen inks, were each treated as a mini ink database for predicting unknown blue and black pen inks, respectively. The pitfalls and merits of descriptive and inferential statistics, principal component analysis (PCA) and hierarchical clustering analysis (HCA) in interpreting the source of pen inks based on the mini database were critically discussed. The results strongly indicated the superiority of HCA over the other statistical techniques. Descriptive statistics gave the worst prediction results, whereas the one-way ANOVA-LSD test rarely produced a definite identification. In conclusion, HCA appears to be the most robust statistics-based approach for inferring the source of pen ink from UPLC chromatograms.
Graphical Abstract
![](/cms/asset/88670cfa-bbed-4948-89e5-f49c2ecc3961/ljlc_a_1858867_uf0001_c.jpg)
Acknowledgements
We are grateful to the Faculty of Health Sciences, Universiti Kebangsaan Malaysia, for providing the facilities and resources to conduct this research.
Disclosure statement
The authors declare that no competing interests exist.
Data availability statement
All relevant data are within the paper.