Description:
Keywords
Gaze correction, machine learning, predictor, decision forest
Applications
Videoconferencing and other remote video communication systems
Problem statement
The problem of gaze in videoconferencing has long attracted the attention of researchers and engineers. It manifests itself as the inability of the people engaged in a videoconference (the proverbial “Alice” and “Bob”) to maintain gaze contact. The lack of gaze contact is due to the physical offset between Bob’s camera and the image of Alice’s face on Bob’s screen (and vice versa): when Bob looks at Alice’s face, he is not looking into the camera, so Alice sees him looking away.
Technology
We revisit the problem of gaze correction and present a solution based on supervised machine learning. At training time, our system observes pairs of images, where each pair contains the face of the same person with a fixed angular difference in gaze direction. It then learns to synthesize the second image of a pair from the first one. After training, the system can redirect the gaze of a previously unseen person by the same angular difference (10 or 15 degrees upwards in our experiments). Unlike many previous solutions to the gaze problem in videoconferencing, ours is purely monocular, i.e., it requires no hardware apart from the built-in webcam of a laptop. Because it is based on efficient machine learning predictors such as decision forests, the system is fast: it runs in real time on a single core of a modern laptop.
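The training and prediction flow described above can be illustrated with a toy sketch. It is not the authors' implementation. It assumes, purely for illustration, that the second image of a pair is synthesized by copying each output pixel from a displaced location in the first image, and that a small forest of randomized one-split trees votes on that displacement from local intensity differences. The names `CANDIDATES`, `OFFSETS`, `train_forest` and `redirect` are hypothetical, and the toy images are plain lists of intensities rather than faces.

```python
import random
from collections import Counter

# Hypothetical candidate displacements (dx, dy); not taken from the source.
CANDIDATES = [(0, 0), (0, 1), (0, 2)]
# Hypothetical neighbour offsets used as appearance features.
OFFSETS = [(0, -1), (0, 1), (-1, 0), (1, 0)]


def pixel(img, x, y):
    """Read a pixel, clamping coordinates to the image border."""
    h, w = len(img), len(img[0])
    return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]


def features(img, x, y):
    """Intensity differences between each neighbour and the pixel itself."""
    return [pixel(img, x + dx, y + dy) - pixel(img, x, y) for dx, dy in OFFSETS]


def best_label(src, dst, x, y):
    """Index of the candidate displacement that best explains the target pixel."""
    errors = [abs(pixel(src, x + dx, y + dy) - dst[y][x]) for dx, dy in CANDIDATES]
    return errors.index(min(errors))


def training_set(pairs):
    """Turn (input, target) image pairs into per-pixel (features, label) samples."""
    samples = []
    for src, dst in pairs:
        for y in range(len(src)):
            for x in range(len(src[0])):
                samples.append((features(src, x, y), best_label(src, dst, x, y)))
    return samples


def train_stump(samples, rng):
    """One randomized depth-1 tree: random feature and threshold, majority label per leaf."""
    f = rng.randrange(len(OFFSETS))
    t = rng.choice(samples)[0][f]
    fallback = Counter(lab for _, lab in samples).most_common(1)[0][0]

    def majority(labels):
        counts = Counter(labels)
        return counts.most_common(1)[0][0] if counts else fallback

    left = majority(lab for feat, lab in samples if feat[f] <= t)
    right = majority(lab for feat, lab in samples if feat[f] > t)
    return f, t, left, right


def train_forest(samples, n_trees=25, seed=0):
    """Train a small forest of independently randomized stumps."""
    rng = random.Random(seed)
    return [train_stump(samples, rng) for _ in range(n_trees)]


def predict(forest, feat):
    """Majority vote of the trees on the displacement label."""
    votes = Counter(left if feat[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


def redirect(forest, img):
    """Synthesize the output by copying each pixel from its predicted source location."""
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            dx, dy = CANDIDATES[predict(forest, features(img, x, y))]
            row.append(pixel(img, x + dx, y + dy))
        out.append(row)
    return out


# Toy training pair: the target is the input shifted up by one row.
src = [[10 * y for _ in range(4)] for y in range(5)]
dst = [src[min(y + 1, 4)] for y in range(5)]
forest = train_forest(training_set([(src, dst)]))
result = redirect(forest, src)
```

In this sketch the one-row shift stands in for the fixed angular difference between the two images of a training pair. Once trained, `redirect` can be applied to an image that was not in the training set, mirroring the way the system described above is applied to a previously unseen person.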
Advantages
High realism
High computational efficiency (runs in real time on a single core of a modern laptop)
Requires no extra hardware
Publications
Daniil Kononenko, Victor Lempitsky / Learning To Look Up: Realtime Monocular Gaze Correction Using Machine Learning / IEEE Computer Vision and Pattern Recognition (CVPR), Boston MA, 2015