Emotion Sensing In VR: The New Face of Digital Interaction

Since the first mainframe computers were created, designers and engineers have sought more intuitive ways to interact with communication technologies. Charles Babbage's Analytical Engine, conceived in the 19th century, used punch cards and levers for input. Fast forward a hundred years and the keyboard, and later the mouse, became the dominant input methods for decades.

The advent of the laptop, and then the smartphone, popularised the touchpad and the touchscreen. As the devices have become ever more personal, our methods of interaction have become more intuitive and naturalistic. The overweight computational partner once had a whole room to itself; eventually it slimmed down and sat on your desk. Next it hopped onto your lap, and you decided to take it home. Thereafter you were never more than a few feet away, and you spent much of your time holding it in your hand. The next transition, the leap onto your face, is when computing really gets personal.

Interaction methods have thus progressed from levers and buttons that you pull and push, to keys that you tap, mice that you click, touchpads that you slide and touchscreens that you swipe.

With face-worn computers, the challenges and potential benefits are significant. As the computer's screen has diminished from the 19-inch desktop monitor to virtually no screen at all (Magic Leap), the need for new methods of interaction has grown.

The four generations of interaction

I propose a classification of VR/AR input devices based on their level of interactivity and their chronological introduction. Viewed this way, the missing link that stops VR/AR from becoming truly immersive becomes clear.

First generation AR/VR input

The first generation input device is the motion sensor incorporated into the head-mounted display (HMD), which translates head movement into changes in the rendered scene.

For VR this enables basic interaction by creating a pointer (reticle) on the screen.
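As a rough illustration (not based on any particular headset SDK; the scene objects and angular threshold below are invented), a head-orientation reticle amounts to casting a ray from the head pose and selecting whichever object lies closest to that ray:

```python
# Minimal sketch of first-generation input: a reticle that follows head
# orientation. Given the HMD's yaw and pitch, cast a ray from the head and
# report which scene object it is pointing at. Illustrative only.
import math

def gaze_direction(yaw, pitch):
    """Unit forward vector derived from head yaw/pitch (radians)."""
    return (
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
        math.cos(pitch) * math.cos(yaw),
    )

def pick_target(yaw, pitch, objects, max_angle=math.radians(5)):
    """Return the object closest to the gaze ray, if within max_angle."""
    d = gaze_direction(yaw, pitch)
    best, best_angle = None, max_angle
    for name, pos in objects.items():
        length = math.sqrt(sum(c * c for c in pos))
        cos_a = sum(dc * pc for dc, pc in zip(d, pos)) / length
        angle = math.acos(max(-1.0, min(1.0, cos_a)))
        if angle < best_angle:
            best, best_angle = name, angle
    return best

# Example: the user turns their head slightly to the right, towards a 'door'.
objects = {"door": (1.0, 0.0, 5.0), "lamp": (-2.0, 1.0, 4.0)}
print(pick_target(yaw=math.radians(11), pitch=0.0, objects=objects))  # door
```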

Second generation AR/VR input

VR systems literally and metaphorically took a step forward when the software could interact with the user's limbs.

A variety of pointing devices were introduced in the 1990s, notably gloves and treadmills, which let the wearer move around and see a representation of their hands in the virtual scene. Even without tactile feedback, the introduction of wireless and camera-based limb tracking, such as the Leap Motion for VR and similar technologies for AR, considerably improved interactivity.

Third generation VR input

Until recently, wearable eye-tracking has been a niche and comparatively expensive technology, mostly confined to academic use and market research.

However, the potential for foveated rendering has increased interest, with the promise of a marked reduction in the computational demands of high-resolution, low-latency image display. Because the eye resolves fine detail only at the fovea, the scene needs to be rendered at full resolution only around the gaze point, with progressively coarser rendering in the periphery.
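A minimal sketch of the idea follows, using an illustrative falloff function and normalised tile coordinates rather than any vendor's actual scheme:

```python
# Sketch of foveated rendering: shade screen tiles near the gaze point at
# full resolution and reduce the shading rate with eccentricity. The radius
# and falloff values are illustrative, not taken from any real renderer.
import math

def shading_rate(tile_center, gaze_point, fovea_radius=0.1, falloff=2.0):
    """Relative shading rate (1.0 = full resolution) for a screen tile.

    Coordinates are in normalised screen space [0, 1].
    """
    dx = tile_center[0] - gaze_point[0]
    dy = tile_center[1] - gaze_point[1]
    eccentricity = math.hypot(dx, dy)
    if eccentricity <= fovea_radius:
        return 1.0
    # Smoothly reduce resolution outside the foveal region, with a floor.
    return max(0.125, 1.0 / (1.0 + falloff * 10 * (eccentricity - fovea_radius)))

# A tile in the corner of the screen is shaded at a fraction of the full rate.
print(shading_rate((0.9, 0.9), gaze_point=(0.5, 0.5)))  # 0.125
```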

The other benefit of adding eye-tracking to VR is that it enables more realistic interactions between the user and virtual characters. Speech recognition, a technology that has also benefited from the smartphone revolution, can complement eye-tracking by enabling categorical commands, such as looking at a door and saying 'open'.
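A toy sketch of how gaze targeting and a recognised voice command might be combined; the object model, action table and command names here are hypothetical stand-ins for whatever engine and speech service an application actually uses:

```python
# Combine the currently gazed-at object with a recognised spoken command.
def handle_voice_command(command, gazed_object, actions):
    """Apply a spoken command to the object the user is currently looking at."""
    if gazed_object is None:
        return "No target in view"
    handler = actions.get((gazed_object["type"], command))
    if handler is None:
        return f"'{command}' does not apply to a {gazed_object['type']}"
    return handler(gazed_object)

# Illustrative action table: (object type, spoken word) -> behaviour.
actions = {
    ("door", "open"): lambda obj: f"Opening {obj['name']}",
    ("lamp", "on"): lambda obj: f"Switching on {obj['name']}",
}

door = {"type": "door", "name": "cellar door"}
print(handle_voice_command("open", door, actions))  # Opening cellar door
```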

The major players have all purchased eye-tracking companies. Google acquired Eyefluence, Facebook purchased the Eye Tribe, and Apple has bought SensoMotoric Instruments (SMI).

Fourth generation VR input

Facial expressions are the important missing element in VR and AR interactions. HMDs with depth cameras attached have been used to visualise the lower face (e.g. Binary VR), but whether this approach proves popular remains to be seen.

Three potential reasons why this approach may be problematic relate to i) the way humans interact, ii) ergonomic concerns and iii) computational and battery life considerations.

One lesson from eye-tracking research is that during face-to-face interactions we infer information from the eye region. Surprise, anger, disgust and a genuine (Duchenne) smile all require visibility of the brow area and the skin typically covered by the HMD. Hao Li, working with Oculus Research, has incorporated stretch sensors into the foam interface of the HMD to derive information from behind the headset, and it will be interesting to see how this performs when the final version is released.

MindMaze have revealed their Mask prototype, which requires the user to wear a clip on the ear and, according to one account, “conductive gel” on the skin. Samsung have also announced a development nicknamed “FaceSense”, although details are still limited.

Emteq’s solution, FaceTeq, is a platform technology that uses novel sensor modalities to detect the minute electrical changes that occur when facial muscles contract. With each facial expression, a characteristic wave of electrical activity washes over the skin, and this can be detected non-invasively, without the need for cameras.
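To make the idea concrete, here is a generic, illustrative signal-processing outline (not Emteq's actual FaceTeq pipeline): rectify each channel's signal, smooth it into an envelope and flag channels whose activity exceeds a threshold. The channel names and numbers are invented for the example:

```python
# Illustrative facial-muscle activity detection from an EMG-style signal:
# rectify, compute a moving-average envelope, and compare per-channel
# envelopes against activation thresholds. Generic sketch only.
import math

def envelope(samples, window=50):
    """Moving-average envelope of a rectified signal."""
    rectified = [abs(s) for s in samples]
    return [
        sum(rectified[max(0, i - window + 1):i + 1]) / min(i + 1, window)
        for i in range(len(rectified))
    ]

def active_channels(channels, thresholds):
    """Return the channels whose envelope exceeds their activation threshold."""
    return [
        name for name, samples in channels.items()
        if max(envelope(samples)) > thresholds[name]
    ]

# Hypothetical two-channel recording: a burst on the 'zygomaticus' channel
# (smiling muscle) and a quiet baseline on the 'corrugator' channel (frowning).
channels = {
    "zygomaticus": [0.02 * math.sin(i / 3) * (5 if 100 < i < 300 else 1)
                    for i in range(500)],
    "corrugator": [0.01 * math.sin(i / 3) for i in range(500)],
}
thresholds = {"zygomaticus": 0.03, "corrugator": 0.03}
print(active_channels(channels, thresholds))  # ['zygomaticus']
```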

Our lightweight, low-cost open platform will herald the 4th generation of VR input. Researchers, developers and market researchers will undoubtedly be the initial adopters, but the real advance will be the ability to enable face-to-face social experiences. There are so many areas where facial expressions in VR could improve communication and interactivity. We will be opening our platform and are excited to see what ideas developers come up with. At Emteq, we’re passionate about fostering the 4th generation of AR/VR interaction, and we look forward to partnering with headset manufacturers and content creators.

Follow us to learn more about the possibilities and to stay up to date with developments. You can, of course, also follow VRFocus for ongoing developments in the technology space at large.