Research Article

Interpretability is in the Eye of the Beholder: Human Versus Artificial Classification of Image Segments Generated by Humans Versus XAI

Received 21 Nov 2023, Accepted 21 Feb 2024, Published online: 06 Mar 2024
