Research Article

Estimating Level of Engagement from Ocular Landmarks

Zeynep Yücel, Serina Koyama, Akito Monden & Mariko Sasakura
Pages 1527-1539 | Published online: 26 May 2020
 

ABSTRACT

E-learning offers many advantages, such as being economical, flexible and customizable, but it also has challenging aspects, such as the lack of social interaction, which results in contemplation and a sense of remoteness. To overcome these and sustain learners’ motivation, various stimuli can be incorporated. Nevertheless, such adjustments initially require an assessment of engagement level. In this respect, we propose estimating engagement level from facial landmarks, exploiting the facts that (i) perceptual decoupling is promoted by blinking during mentally demanding tasks; (ii) eye strain increases blinking rate, which also scales with task disengagement; (iii) eye aspect ratio is in close connection with attentional state; and (iv) users’ head position is correlated with their level of involvement. Building empirical models of these actions, we devise a probabilistic estimation framework. Our results indicate that high and low levels of engagement are identified with considerable accuracy, whereas medium levels are inherently more challenging, which is also confirmed by the inter-rater agreement of expert coders.
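As a point of reference for the ocular features mentioned above, the following is a minimal sketch of the widely used eye aspect ratio (EAR) computed from the six standard eye landmarks (the common formulation of Soukupová and Čech). It is an illustration under that assumption, not necessarily the exact feature definition adopted in the paper.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """Eye aspect ratio from six (x, y) eye landmarks ordered p1..p6 around
    the eye contour, as in the common 68-point facial landmark template.
    Illustrative only; the paper's exact feature definition may differ."""
    eye = np.asarray(eye, dtype=float)
    v1 = np.linalg.norm(eye[1] - eye[5])   # vertical distance (upper/lower lid)
    v2 = np.linalg.norm(eye[2] - eye[4])   # second vertical distance
    h = np.linalg.norm(eye[0] - eye[3])    # horizontal distance (canthus to canthus)
    return (v1 + v2) / (2.0 * h)

# Hypothetical landmark coordinates in pixels; smaller EAR ~ more closed eye
left_eye = [(36, 40), (40, 37), (45, 37), (49, 40), (45, 43), (40, 43)]
print(round(eye_aspect_ratio(left_eye), 3))
```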

Acknowledgments

We would like to thank our volunteer participants for their help in the experiments. We would also like to thank Dr. Francesco Zanlungo for his invaluable discussion.

Disclosure of potential conflict of interest

No potential conflict of interest was reported by the authors.

Notes

1. D’Mello et al. also call attention to biological or physical indicators, such as skin conductance or mouse pressure. However, these require a specific sensor configuration and cannot easily be incorporated into existing systems.

2. The participants speak English as a foreign language and have followed similar academic curricula; thus, they are assumed to have a similar level of proficiency in English. In addition, NASA TLX surveys carried out after the semi-active task, as well as an inspection of the correctness of the participants’ answers to the questions on the narrations, reveal that they do not experience any problems due to insufficient English proficiency.

3. The participants listen to 35 stories narrated on the average for sec. After each narration, they are given 15 sec to answer a question on the story.

4. There are one male and four female participants with age .

5. The participants performed only a single task on each day and finished all tasks within a time window of 2 weeks.

6. The video footage has a resolution of and a frame rate of 30 fps, which are in line with the specifications of most off-the-shelf recording products or built-in computer hardware.

7. The start instants of the video clips are determined (from the beginning of the task) in minutes as follows: .

8. In particular, we consider as “fully engaged,” as “moderately engaged,” as “fairly engaged,” as “poorly engaged,” and as “disengaged.” However, while contrasting the extremities, we use the terms “engaged” and “disengaged” for the sake of brevity.

9. The last 35 clips coded by the teachers are found to have unreliable labels, most probably due to confusion on the part of one of the coders; in addition, one clip is found to involve no learning task and is thus discarded.

10. Taking a closer look at the most popular landmark estimation methods, one may notice that it is quite common to use templates involving around 60 points.

11. In other words, we exclude any reflex or voluntary blinks. The reason for this exclusion is twofold. First, since tactile stimuli (to the face or other body parts) are not present in our experiments, and the optical or auditory stimuli are neither significant in degree nor subject to large variations, we assume that no reflex blinks take place. In addition, since the participants are neither aware that we study their blinking patterns nor instructed to blink intentionally, they are assumed not to perform any voluntary blinks.

12. In our specific set, since the frame rate is 30 fps and the clip duration is 10 sec, each clip comprises 300 frames.

13. Obviously, one may as well opt for replacing the blink onset with the blink offset. Since the duration of the clips (i.e. 10 sec) is considered to ensure a uniform level of engagement over this course, even though the value associated with a particular clip may change (i.e. increase or decrease by 1), the distribution of the number of blinks relating to a particular level of engagement is expected not to be affected by this choice. In addition, the integration of the blink count with the other features is expected to improve the resiliency and stability of the estimation. (A minimal illustrative sketch of counting blink onsets per clip is given after these notes.)

14. The biocular breadth, i.e. the distance between the two landmarks representing the lateral canthi, can also be used to replace .

15. For optimization, a general-purpose method based on the Nelder-Mead algorithm is used (Nelder & Mead, 1965). In the implementation, we used the rpy2 package, which provides a Python interface to the R programming language (Gautier, 2008). (A hedged, scipy-based illustration of such a Nelder-Mead fit is given after these notes.)

16. Since the matrices in are symmetric, only the upper triangular part is presented.

17. There is no established guideline for assessing independence based on relative entropy distance, but various studies consider values over 0.90 to indicate a sufficient degree of independence (Zanlungo et al., 2017).
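Building on notes 11-13, the following is a minimal sketch of counting spontaneous blink onsets within a 10-sec, 300-frame clip by thresholding a per-frame eye-aspect-ratio series. The threshold and the minimum run length are illustrative assumptions, not parameters calibrated in the paper.

```python
import numpy as np

def count_blink_onsets(ear, threshold=0.2, min_frames=2):
    """Count blink onsets in a per-frame EAR series (e.g. 300 frames for a
    10-sec clip at 30 fps). A blink onset is the first frame of a run of at
    least `min_frames` consecutive frames with EAR below `threshold`.
    Both parameters are illustrative assumptions."""
    run = 0
    onsets = 0
    for value in np.asarray(ear, dtype=float):
        if value < threshold:
            run += 1
            if run == min_frames:   # run just became long enough: count one onset
                onsets += 1
        else:
            run = 0
    return onsets

# Synthetic 300-frame clip with two brief eye closures
ear_series = np.full(300, 0.30)
ear_series[50:55] = 0.10
ear_series[200:204] = 0.12
print(count_blink_onsets(ear_series))  # -> 2
```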
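Note 15 mentions fitting the empirical models with a general-purpose Nelder-Mead optimizer through rpy2. Purely as an illustration, the sketch below performs an analogous derivative-free fit with scipy instead of the authors' rpy2/R back-end, and assumes a Gaussian model with made-up blink counts; neither reflects the actual model or data of the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical per-clip blink counts (illustrative data, not from the paper)
counts = np.array([3, 5, 4, 6, 2, 5, 4, 3, 7, 5], dtype=float)

def neg_log_likelihood(params, x):
    """Negative log-likelihood of a Gaussian model; the Gaussian form is an
    illustrative assumption, not the model used in the paper."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)   # parameterize by log(sigma) to keep sigma positive
    z = (x - mu) / sigma
    return np.sum(0.5 * z**2 + np.log(sigma) + 0.5 * np.log(2.0 * np.pi))

# Derivative-free Nelder-Mead search, the general-purpose method cited in note 15
result = minimize(neg_log_likelihood, x0=[counts.mean(), 0.0],
                  args=(counts,), method="Nelder-Mead")
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])
print(round(mu_hat, 2), round(sigma_hat, 2))
```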

Additional information

Funding

This work was supported by Japan Society for the Promotion of Science KAKENHI Grant Number J18K18168.

Notes on contributors

Zeynep Yücel

Zeynep Yücel is an assistant professor at Okayama University, Japan. She obtained her B.S. degree from Bogazici University, Istanbul, Turkey, and her M.S. and Ph.D. degrees from Bilkent University, Ankara, Turkey in 2005 and 2010, all in electrical engineering. She was a postdoctoral researcher at ATR labs in Kyoto, Japan for 5 years, before being awarded a JSPS fellowship in 2016. Her research interests include robotics, signal processing, computer vision, and pattern recognition.

Serina Koyama

Serina Koyama received a B.E. degree in Information Technology from Okayama University in 2016. Her research interests include pattern recognition with applications in affective computing.

Akito Monden

Akito Monden is a professor in the Graduate School of Natural Science and Technology at Okayama University, Japan. He received the B.E. degree (1994) in electrical engineering from Nagoya University, and the M.E. and D.E. degrees in information science from Nara Institute of Science and Technology (NAIST) in 1996 and 1998, respectively. His research interests include software measurement and analytics, and software security and protection. He is a member of the IEEE, ACM, IEICE, IPSJ and JSSST.

Mariko Sasakura

Mariko Sasakura is an assistant professor at Okayama University, Japan. Her research interests include visualization (especially the visualization of program structures), animation systems, human-computer interaction on mobile devices, origami simulators, and migration simulator systems.
