gazeNet: End-to-end eye-movement event detection with deep neural networks

Springer Science and Business Media LLC - Volume 51 - Pages 840-864 - 2018
Raimondas Zemblys1, Diederick C. Niehorster2, Kenneth Holmqvist3,4,5
1Research Institute, Šiauliai University, Šiauliai, Lithuania
2Humanities Laboratory and Department of Psychology, Lund University, Lund, Sweden
3Department of Psychology, Regensburg University, Regensburg, Germany
4Department of Computer Science, University of the Free State, Bloemfontein, South Africa
5Faculty of Arts, Masaryk University, Brno, Czech Republic

Abstract

Existing event detection algorithms for eye-movement data almost exclusively rely on thresholding one or more hand-crafted signal features, each computed from the stream of raw gaze data. Moreover, this thresholding is largely left to the end user. Here we present gazeNet, a new framework for creating event detectors that require neither hand-crafted signal features nor signal thresholding. It employs an end-to-end deep learning approach that takes raw eye-tracking data as input and classifies it into fixations, saccades, and post-saccadic oscillations. Our method thereby challenges an established tacit assumption that hand-crafted features are necessary in the design of event detection algorithms. The downside of the deep learning approach is that it requires a large amount of training data. We therefore first develop a method to augment hand-coded data, so that we can substantially enlarge the data set used for training while minimizing the time spent on manual coding. Using this extended hand-coded data, we train a neural network that produces eye-movement event classifications from raw eye-movement data without any predefined feature extraction or post-processing steps. The resulting classification performance is at the level of expert human coders. Moreover, an evaluation of gazeNet on two other datasets showed that it generalizes to data from different eye trackers and consistently outperforms several other event detection algorithms that we tested.
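To make the end-to-end idea concrete, below is a minimal sketch in PyTorch of a network that maps raw gaze samples directly to per-sample event labels, with no velocity computation, feature extraction, or thresholding in between. The combination of convolutional and recurrent layers echoes the design described in the paper, but the class name, layer sizes, and kernel widths here are illustrative assumptions, not the published gazeNet architecture.

import torch
import torch.nn as nn

class GazeEventNet(nn.Module):
    """Sketch of an end-to-end per-sample gaze event classifier.
    Input: raw (x, y) gaze coordinates per sample.
    Output: logits for fixation, saccade, post-saccadic oscillation."""

    def __init__(self, n_features=2, n_classes=3, hidden=64):
        super().__init__()
        # 1-D convolutions learn local temporal features from the raw
        # signal, standing in for hand-crafted features such as velocity.
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, hidden, kernel_size=11, padding=5),
            nn.BatchNorm1d(hidden),
            nn.ReLU(),
        )
        # A bidirectional GRU integrates context over the whole sequence,
        # so each sample's label can depend on what precedes and follows it.
        self.rnn = nn.GRU(hidden, hidden, batch_first=True,
                          bidirectional=True)
        # Linear head produces one class score per sample.
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        # x: (batch, time, n_features) raw gaze coordinates
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.rnn(h)
        return self.head(h)  # (batch, time, n_classes)

# Usage: label a batch of 1-s windows of (hypothetical) 500-Hz gaze data.
net = GazeEventNet()
gaze = torch.randn(8, 500, 2)        # 8 sequences, 500 samples, (x, y)
events = net(gaze).argmax(dim=-1)    # per-sample labels in {0, 1, 2}

Training such a model against the augmented hand-coded labels reduces event detection to ordinary sequence labeling, which is why no post-processing or user-set thresholds are needed at inference time.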
