Eye Tracking in Virtual Reality

Status: established
Last updated: 2026-06-07
Sources: s10055-022-00738-z.pdf
Tags: [eye-tracking, virtual-reality, VR, HMD, foveated-rendering, gaze-interaction, calibration, data-quality, privacy, clinical, review]

Summary

Adhanom et al. (2023) review the applications, challenges, and limitations of eye tracking in head-mounted virtual reality (VR). They first set out the background concepts — the types of eye movement, the three methods used to track the eye inside head-mounted displays (HMDs), and the data-quality and calibration factors that govern tracking performance — then organise the applied literature into seven areas: display and rendering, user interaction, collaborative virtual environments, education and training, security and privacy, marketing and consumer research, and clinical use. Their overall position is that integrated eye tracking is becoming standard in consumer HMDs and is driving a rapid expansion of cross-disciplinary applications, but that data quality, calibration drift, latency, privacy, and safety remain unresolved obstacles to real-time and clinical use.

Body

Context

This article rests on a single source: Adhanom, MacNeilage and Folmer's (2023) open-access broad review in Virtual Reality, written at the University of Nevada, Reno, which surveys current and seminal literature on eye tracking applications for VR HMDs and identifies areas for future research (PDF p. 1, orig. p. 1481). Within this knowledge base it is the VR-specific counterpart to the head-worn extended-reality survey compiled in Gaze Interaction In Extended Reality (Plopski et al., 2022): where that source classifies gaze interaction across AR, VR, and mixed reality, this one spans the full range of VR applications, of which interaction is one. It draws on the eye-movement fundamentals detailed in Fixational Eye Movements, depends on the classification problem examined in Fixation Saccade Detection, connects to the pupillometric workload measures in Pupil Dilation Cognitive Load, and extends the interaction and usability themes of Gaze Based Hci And Usability into immersive settings.

Key Points

Scope and structure. The review covers eye tracking specific to head-mounted VR rather than general or desktop eye tracking, motivated by the recent commercial availability of HMD-integrated eye trackers (Tobii, Pupil Labs, Varjo, Fove) and the consequent surge in publications (PDF p. 1, orig. p. 1481). It distinguishes itself from earlier reviews that addressed eye tracking generally or focused on single VR applications by aiming for breadth across application areas (PDF p. 1, orig. p. 1481). The applied literature is organised into seven application areas, preceded by background concepts and followed by a treatment of challenges and limitations (PDF p. 4, orig. p. 1484).

Eye-movement background. Because visual acuity is highest at the fovea and falls off toward the periphery, the eye must move to sample a scene. The review distinguishes saccades (velocities of 400–600 deg/s, durations of 30–120 ms, amplitudes of 1–45°, related by the saccadic main sequence), fixations (the relatively still 200–300 ms intervals between saccades, during which small fixational drifts and sub-1° microsaccades persist), smooth pursuit (which cannot track targets faster than about 30 deg/s), the vestibulo-ocular reflex, and vergence (PDF p. 2, orig. p. 1482). Several movement types can occur simultaneously, and eye data measured in head coordinates is ambiguous without scene and head-movement information. This complicates classification, an active research problem also treated in Fixation Saccade Detection (PDF pp. 2–3, orig. pp. 1482–1483).
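The velocity ranges above suggest why a simple speed threshold is a common first pass at separating saccades from fixations. The sketch below is a minimal velocity-threshold classifier, not a method taken from the review. The function names, the (time, x, y) sample layout in degrees, and the 100 deg/s threshold are all assumptions made for illustration, and the planar distance is only a small-angle approximation of angular distance.

```python
import math

def angular_velocity(samples):
    """Sample-to-sample gaze speed in deg/s.

    `samples` is a list of (t_seconds, x_deg, y_deg) tuples in head
    coordinates (an assumed layout, not one defined by the review).
    """
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Small-angle approximation: treat degrees as planar coordinates.
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return speeds

def classify_ivt(samples, saccade_threshold=100.0):
    """Label each inter-sample interval as 'saccade' or 'fixation'.

    The 100 deg/s default is a hypothetical threshold chosen to sit well
    below the 400-600 deg/s saccade velocities quoted above.
    """
    return ["saccade" if v > saccade_threshold else "fixation"
            for v in angular_velocity(samples)]

# A 100 Hz trace: near-still gaze, a 5 degree jump, then still again.
trace = [(0.00, 0.0, 0.0), (0.01, 0.01, 0.0), (0.02, 5.0, 0.0), (0.03, 5.0, 0.01)]
labels = classify_ivt(trace)
```

A threshold on eye-in-head velocity alone cannot tell smooth pursuit or the vestibulo-ocular reflex from fixation on a moving or head-relative target, which is the ambiguity the review points to when head and scene information are missing.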

Tracking methods and data quality. Three methods have been used inside HMDs: electro-oculography (imprecise, but the only method that works with closed eyes), the scleral search coil (highly accurate, with spatial resolution finer than 0.1° and sampling rates above 1 kHz per Whitmire et al. (2016), but impractical for general use), and video oculography (VOG), which is by far the most common and underlies all current commercial HMD eye trackers (PDF p. 3, orig. p. 1483). Tracking performance is governed by spatial precision, spatial accuracy, latency, and sampling rate; Andersson et al. (2010) recommend an RMS precision below 0.03° for fixational-movement work and, citing the sampling theorem, a sampling rate at least twice the speed of the movement to be recorded (PDF p. 4, orig. p. 1484). Most HMD eye trackers use a point-based calibration of 5–16 targets that requires deliberate user cooperation, though smooth-pursuit-based methods can calibrate without explicit fixation (PDF p. 4, orig. p. 1484).
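The two spatial measures can be made concrete with a short sketch. It assumes gaze samples recorded while the user fixates a target at a known position, all in degrees. The function names are hypothetical, precision is taken as the RMS of sample-to-sample distances, and accuracy as the mean offset from the target. These are common definitions, but the review does not prescribe a formula.

```python
import math

def rms_s2s_precision(samples):
    """RMS of sample-to-sample distances (deg) within one fixation.

    `samples` is a list of (x_deg, y_deg) gaze points (assumed layout).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    squared = [(x1 - x0) ** 2 + (y1 - y0) ** 2
               for (x0, y0), (x1, y1) in zip(samples, samples[1:])]
    return math.sqrt(sum(squared) / len(squared))

def accuracy(samples, target):
    """Mean offset (deg) between gaze samples and a known target position."""
    if not samples:
        raise ValueError("need at least one sample")
    tx, ty = target
    return sum(math.hypot(x - tx, y - ty) for x, y in samples) / len(samples)

# Gaze that sits 1 degree right of the target and never moves:
# perfectly precise (0 deg RMS) but inaccurate (1 deg offset).
steady = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0)]
# Gaze that jitters by 0.03 deg between samples around the target.
jitter = [(0.0, 0.0), (0.03, 0.0), (0.0, 0.0)]
```

The `steady` trace shows why the two measures are reported separately: a tracker can be precise without being accurate, and the 0.03° recommendation above concerns precision only.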

Display and rendering. Knowing where the user looks improves both rendering efficiency and comfort. The fovea covers only about 4% of the pixels rendered on consumer HMDs, so gaze-contingent (foveated) rendering degrades resolution with eccentricity, achieving reported performance savings of 50–70% (Guenter et al., 2012; Patney et al., 2016) (PDF pp. 4–5, orig. pp. 1484–1485). Gaze-adaptive streaming of 360° video can cut bandwidth by up to 83% (Lungaro et al., 2018) (PDF p. 5, orig. p. 1485). Eye tracking also addresses the vergence-accommodation conflict, a source of eye strain, blurred vision, and headaches, through varifocal and other accommodation-supporting displays (Matsuda et al., 2017) and renders the ocular-parallax depth cue that conventional stereoscopic displays omit (Konrad et al., 2020) (PDF pp. 5–6, orig. pp. 1485–1486). Foveated field-of-view restriction, which moves the restrictor with the gaze, has been used to reduce VR sickness while preserving more of the visual scene than fixed restrictors (Adhanom et al., 2020b) (PDF p. 6, orig. p. 1486).
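The core of gaze-contingent rendering is a mapping from eccentricity (the angle between the gaze direction and the direction to a pixel) to a rendering quality level. The sketch below shows that mapping in its simplest form. The three-tier scheme, the 5° and 20° boundaries, and the function names are assumptions for illustration and are not parameters reported by the review or by the systems it cites.

```python
import math

def eccentricity_deg(gaze_dir, pixel_dir):
    """Angle in degrees between the gaze direction and a pixel's view direction.

    Both arguments are 3D vectors in the same (eye or head) coordinate frame.
    """
    dot = sum(g * p for g, p in zip(gaze_dir, pixel_dir))
    norm = (math.sqrt(sum(g * g for g in gaze_dir))
            * math.sqrt(sum(p * p for p in pixel_dir)))
    if norm == 0:
        raise ValueError("direction vectors must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

def resolution_scale(ecc_deg, fovea_deg=5.0, mid_deg=20.0):
    """Resolution multiplier for a pixel at the given eccentricity.

    Hypothetical tiers: full resolution near the gaze point, half in a
    middle ring, quarter in the far periphery.
    """
    if ecc_deg <= fovea_deg:
        return 1.0
    if ecc_deg <= mid_deg:
        return 0.5
    return 0.25

straight_ahead = (0.0, 0.0, 1.0)
scale_at_gaze = resolution_scale(eccentricity_deg(straight_ahead, (0.0, 0.0, 1.0)))
scale_far_right = resolution_scale(eccentricity_deg(straight_ahead, (1.0, 0.0, 0.0)))
```

Because the mapping depends on the current gaze direction, any tracker latency or inaccuracy shifts the high-resolution region away from where the user is actually looking, which ties this technique to the data-quality limits discussed under challenges.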

User interaction. Following the taxonomy of LaViola Jr et al. (2017), interaction is split into selection and manipulation, virtual locomotion, and system control. Gaze pointing is faster than hand or head pointing but less accurate, and gaze-only selection runs into the Midas touch problem — because the eyes always look somewhere, unintended gaze triggers commands (Jacob, 1990) (PDF pp. 6–7, orig. pp. 1486–1487). Selection-confirmation techniques therefore add dwell time, head support, blinks, muscle contraction, or multimodal input such as gaze pointing combined with a pinch gesture (Pfeuffer et al., 2017) (PDF p. 7, orig. p. 1487). Locomotion interfaces map gaze to steering, point-and-fly, and orbital navigation, and exploit saccadic and blink suppression to hide viewpoint manipulations in redirected walking (PDF pp. 7–8, orig. pp. 1487–1488). System control uses blink-as-switch and gaze typing, though gaze typing remains slow at roughly 10 words per minute; the review notes scant literature on gaze-based manipulation tasks (PDF pp. 8–9, orig. pp. 1488–1489).
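Dwell time is the simplest of the confirmation techniques listed above: a target is selected only after gaze has rested on it continuously for a set period, so brief glances do not trigger commands. The sketch below is one minimal way to implement that idea. The class name, the 0.8 s default, and the convention that `None` means "no target under gaze" are assumptions, not details from the review.

```python
class DwellSelector:
    """Fire a selection once gaze has stayed on one target for `dwell_s` seconds."""

    def __init__(self, dwell_s=0.8):
        self.dwell_s = dwell_s      # hypothetical default dwell threshold
        self._target = None         # target currently under gaze, or None
        self._since = None          # time gaze first landed on that target
        self._fired = False         # whether this dwell already selected

    def update(self, t, target):
        """Feed one gaze sample; return the target id when dwell completes, else None."""
        if target != self._target:
            # Gaze moved to a different target (or to empty space): restart the timer.
            self._target, self._since, self._fired = target, t, False
            return None
        if (target is not None and not self._fired
                and t - self._since >= self.dwell_s):
            self._fired = True      # select once per continuous dwell
            return target
        return None

selector = DwellSelector(dwell_s=0.5)
events = [selector.update(t, target) for t, target in [
    (0.0, "A"), (0.3, "A"), (0.5, "A"), (0.6, "A"),   # dwell on A completes at 0.5 s
    (0.7, "B"), (0.9, None), (1.0, "B"), (1.5, "B"),  # a glance at B is interrupted, then completes
]]
```

The trade-off is speed against false activations: a longer dwell suppresses more unintended selections but slows every intended one, which is part of why multimodal confirmation such as gaze plus pinch is attractive.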

Collaboration, training, and consumer research. In collaborative virtual environments, eye tracking supports realistic avatar representation (including pupil and eye-movement animation, recently via deep multimodal models), deictic communication through shared gaze, and cooperative object manipulation (PDF pp. 9–10, orig. pp. 1489–1490). In education and training, gaze indices measure cognitive skills, affective states, visual attention, learning outcomes, immersion, and usability — but the review records that expert–novice fixation-duration findings are inconsistent across studies, and warns that eye data shows what a user perceives, not whether they comprehend it (PDF pp. 10–12, orig. pp. 1490–1492). For marketing and consumer-experience research, VR with eye tracking gives experimental control while approximating real shopping, and is applied across the pre-purchase and purchase stages, with the post-purchase stage largely unexplored (PDF pp. 13–14, orig. pp. 1493–1494).

Security, privacy, and clinical use. Idiosyncratic eye-movement features support both explicit gaze passwords and implicit, continuous authentication, with one implicit method reaching 86.7% accuracy from 90 seconds of data (Zhang et al., 2018) (PDF pp. 12–13, orig. pp. 1492–1493). The same richness is a privacy hazard. Gaze data can reveal interests, cognitive state, mental and neurological disorders, and demographic traits (Kröger et al., 2020). Eye images contain iris patterns usable as a biometric, although degrading the images in hardware cut the correct-recognition rate from 79% to 7% (John et al., 2020). Differential-privacy methods have also been proposed to protect users (Steil et al., 2019) (PDF pp. 17–18, orig. pp. 1497–1498). Clinically, eye-tracked VR is used for diagnosis (eliciting eye-movement abnormalities associated with Parkinson's disease, Alzheimer's disease, and strabismus, as well as contrast-sensitivity testing), for therapy (objective measurement during exposure therapy for anxiety and phobias), and for interaction by patients who cannot use hand controllers, as in VR analgesia for burn-wound care (PDF pp. 14–16, orig. pp. 1494–1496).
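To show what a differential-privacy protection can look like in practice, the sketch below releases a noisy mean of one gaze feature across users using the textbook Laplace mechanism. This is a generic illustration and is not the specific method of Steil et al. (2019). The function names, the choice of feature (for example, each user's mean fixation duration in ms), and the clamping bounds are assumptions, and the floating-point sampling here is not hardened for production use.

```python
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def privatize_mean(values, lower, upper, epsilon, rng):
    """Release an epsilon-differentially-private mean of per-user values.

    Each entry in `values` is one user's feature. Values are clamped to
    [lower, upper] so that one user's influence on the mean is bounded.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not values:
        raise ValueError("values must be non-empty")
    clamped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clamped) / len(clamped)
    # Replacing one user's value moves the mean by at most this much.
    sensitivity = (upper - lower) / len(clamped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)  # seeded only so the example is repeatable
durations_ms = [200.0, 250.0, 300.0]
noisy = privatize_mean(durations_ms, lower=100.0, upper=500.0, epsilon=1.0, rng=rng)
```

Smaller values of `epsilon` add more noise and give stronger privacy at the cost of utility, which is the trade-off any such protection has to balance against applications that need fine-grained gaze data.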

Challenges and limitations. Data quality is the largest technological obstacle: the most-used HMD eye trackers report accuracy of 0.5°–1.1° (and only within a small central region of about 20°), precision often worse than the 0.03° needed for fixational work, sampling rates of only 100–200 Hz, and latencies of 45–81 ms that hamper real-time use (PDF pp. 16–17, orig. pp. 1496–1497). Calibration decays over a session — one report found about 30% drift within the first four and a half minutes — mainly because the headset shifts relative to the eyes (PDF p. 17, orig. p. 1497). Beyond data privacy and security, the review flags a safety concern unique to the hardware: HMD eye trackers use multiple near-infrared sources (around 880 nm) close to the eye for prolonged periods, and the long-term ocular hazard of this exposure has not been thoroughly investigated (PDF p. 18, orig. p. 1498).
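A rough calculation shows why the quoted latencies and sampling rates matter. While a gaze sample is in transit, the eye keeps moving, so the reported position lags the true one by roughly velocity times latency. The sketch below applies that relation to the figures quoted in this article. The function names are hypothetical and the constant-velocity model is a simplifying assumption.

```python
def lag_error_deg(velocity_deg_s, latency_ms):
    """Angular lag (deg) of reported gaze behind true gaze at constant velocity."""
    return velocity_deg_s * latency_ms / 1000.0

def samples_during(duration_ms, rate_hz):
    """Number of samples a tracker at `rate_hz` collects in `duration_ms`."""
    return duration_ms * rate_hz / 1000.0

# Smooth pursuit at its ~30 deg/s limit, with the 45-81 ms latency range.
pursuit_lag_best = lag_error_deg(30.0, 45.0)    # about 1.35 deg
pursuit_lag_worst = lag_error_deg(30.0, 81.0)   # about 2.43 deg

# A short 30 ms saccade captured at 100 Hz and at 200 Hz.
short_saccade_100hz = samples_during(30.0, 100.0)  # 3 samples
short_saccade_200hz = samples_during(30.0, 200.0)  # 6 samples
```

Under these assumptions, latency alone can add a lag during pursuit that exceeds the 0.5°–1.1° static accuracy figure, and the shortest saccades are described by only a handful of samples, which is consistent with the review's view that data quality limits real-time use.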

Conclusion

The review concludes that eye tracking is positioned to become a standard feature of consumer VR HMDs, and that the availability of high-precision, low-latency, low-cost trackers has already produced applications spanning many disciplines (PDF pp. 18–19, orig. pp. 1498–1499). The authors expect research and development to keep accelerating, with foveated rendering the most prominent gain in efficiency and comfort, and accessible hands-free interaction a strong motivation for further work. Against this they set a consistent list of unresolved problems: the perceptual effects of gaze-contingent rendering, end-to-end latency, robust drift-resistant calibration, transferable training metrics, the privacy and security of always-on eye data, and the safety of prolonged near-infrared illumination. Each is named as a direction for future research rather than a solved matter (PDF pp. 18–19, orig. pp. 1498–1499). This breadth-first VR view complements the depth-first interaction focus of Gaze Interaction In Extended Reality: both identify data quality, the lack of standard evaluation, and gaze-data privacy as the field's recurring weaknesses.

References

Adhanom, I. B., MacNeilage, P. & Folmer, E. (2023) 'Eye tracking in virtual reality: a broad review of applications and challenges', Virtual Reality, 27, pp. 1481–1505. doi: 10.1007/s10055-022-00738-z. adhanom2023etvr

Adhanom, I. B., Navarro Griffin, N., MacNeilage, P. & Folmer, E. (2020b) 'The effect of a foveated field-of-view restrictor on VR sickness', in 2020 IEEE Conference on Virtual Reality and 3D User Interfaces (VR). IEEE, pp. 645–652. doi: 10.1109/VR46266.2020.1581314. To be validated.

Andersson, R., Nyström, M. & Holmqvist, K. (2010) 'Sampling frequency and eye-tracking measures: how speed affects durations, latencies, and more', Journal of Eye Movement Research, 3(3), pp. 1–12. doi: 10.16910/jemr.3.3.6. To be validated.

Guenter, B., Finch, M., Drucker, S., Tan, D. & Snyder, J. (2012) 'Foveated 3D graphics', ACM Transactions on Graphics, 31(6), pp. 1–10. doi: 10.1145/2366145.2366183. To be validated.

Jacob, R. J. K. (1990) 'What you look at is what you get: eye movement-based interaction techniques', in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 11–18. doi: 10.1145/97243.97246. To be validated.

John, B., Jörg, S., Koppal, S. & Jain, E. (2020) 'The security-utility trade-off for iris authentication and eye animation for social virtual avatars', IEEE Transactions on Visualization and Computer Graphics, 26(5), pp. 1880–1890. doi: 10.1109/TVCG.2020.2973052. To be validated.

Konrad, R., Angelopoulos, A. & Wetzstein, G. (2020) 'Gaze-contingent ocular parallax rendering for virtual reality', ACM Transactions on Graphics, 39(2), pp. 1–12. doi: 10.1145/3361330. To be validated.

Kröger, J. L., Lutz, O. H. M. & Müller, F. (2020) 'What does your gaze reveal about you? On the privacy implications of eye tracking', in IFIP Advances in Information and Communication Technology. doi: 10.1007/978-3-030-42504-3_15. To be validated.

LaViola Jr, J. J., Kruijff, E., McMahan, R. P., Bowman, D. & Poupyrev, I. P. (2017) 3D User Interfaces: Theory and Practice. Addison-Wesley Professional. To be validated.

Lungaro, P., Sjöberg, R., Valero, A. J. F., Mittal, A. & Tollmar, K. (2018) 'Gaze-aware streaming solutions for the next generation of mobile VR experiences', IEEE Transactions on Visualization and Computer Graphics, 24(4), pp. 1535–1544. doi: 10.1109/TVCG.2018.2794119. To be validated.

Matsuda, N., Fix, A. & Lanman, D. (2017) 'Focal surface displays', ACM Transactions on Graphics, 36(4), pp. 1–14. doi: 10.1145/3072959.3073590. To be validated.

Patney, A., Salvi, M., Kim, J., Kaplanyan, A., Wyman, C., Benty, N., Luebke, D. & Lefohn, A. (2016) 'Towards foveated rendering for gaze-tracked virtual reality', ACM Transactions on Graphics, 35(6), pp. 1–12. doi: 10.1145/2980179.2980246. To be validated.

Pfeuffer, K., Mayer, B., Mardanbegi, D. & Gellersen, H. (2017) 'Gaze + pinch interaction in virtual reality', in Proceedings of the 5th Symposium on Spatial User Interaction. ACM, pp. 99–108. doi: 10.1145/3131277.3132180. To be validated.

Steil, J., Hagestedt, I., Huang, M. X. & Bulling, A. (2019) 'Privacy-aware eye tracking using differential privacy', in Eye Tracking Research and Applications Symposium (ETRA). doi: 10.1145/3314111.3319915. To be validated.

Whitmire, E., Trutoiu, L., Cavin, R., Perek, D., Scally, B., Phillips, J. & Patel, S. (2016) 'EyeContact: scleral coil eye tracking for virtual reality', in International Symposium on Wearable Computers, Digest of Papers. doi: 10.1145/2971763.2971771. To be validated.

Zhang, Y., Hu, W., Xu, W., Chou, C. T. & Hu, J. (2018) 'Continuous authentication using eye movement response of implicit visual stimuli', Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 1(4), pp. 1–22. doi: 10.1145/3161410. To be validated.

Open Questions

  • How do gaze-contingent (foveated) rendering and field-of-view manipulation affect attentional behaviour and task performance in VR, given that non-immersive gaze-contingent displays have reduced reading speed?
  • What calibration procedure would stay accurate across a long VR session despite headset slippage, the main cause of the ~30% calibration drift the review reports?
  • What is the long-term ocular effect of prolonged exposure to the multiple near-infrared sources used by HMD eye trackers — a hazard the review flags as not yet thoroughly investigated?
  • Why do expert–novice fixation-duration findings disagree across studies, and which task and environment factors moderate them?