XRI: Cross-Reality Interaction

Widespread consumer adoption of XR devices will redefine how humans interact with both technology and each other. In coming decades, the standard mouse and QWERTY keyboard may fade as the dominant computing UX, giving way to holographic UI, precise hand/eye/body-tracking and, eventually, powerful brain-to-computer interfaces. One key UX question designers and developers must answer: How to input?

That is, by what means does a user communicate and interact with your software and to what end? Aging 2D input paradigms are of limited use, while new ones are little understood or undiscovered altogether. Further, XRI best practices will vary widely per application, use case and individual mechanic.

The mind reels. Though these interaction patterns will become commonplace in time, right now we’re very much living through the “Cinema of Attractions” era of XR tech. As such, we’re privileged to witness the advent of a broad range of wildly creative immersive design solutions, some as fantastic as they are impractical. How have industry best practices evolved?

Controllers

These may seem pedestrian, but it’s easy to forget that the first controllers offering room-scale, six degrees-of-freedom (6-DoF) tracking only hit the market in 2016 (first Vive’s Wands then Oculus’ more ergonomic Touch, followed by Windows’ muddled bastardization of the two in 2017). With 6-DoF XR likely coming to mobile and standalone systems in 2018, where are controller interfaces headed?

Well, Vive’s been developing its “Grip controllers” (aka the “knuckles controllers”) — which are worn as much as held, allowing users freer gestural tracking and expression — for over a year, but they were conspicuously excluded from the CES launch announcement of the Vive Pro.

One controller trend we did see at CES: haptics. Until now, handheld inputs have largely relied on general vibration for haptic feedback. The strength of the rumble can be throttled up or down, but with just one vibratory output to work with, developers’ power to express information through physical feedback has been limited. It’s a challenging problem: how do you simulate physical resistance where there is none?

VR Controllers
Left: the HaptX Glove, Right: the Tactical Haptics Reactive Grip Motion Controller

HaptX Inc. is one firm leading advances in this field with their HaptX Gloves, a pair of Nintendo Power Glove-style offerings featuring tiny air pockets that dynamically expand and contract to simulate touch and pressure in VR in real time. All reports indicate some truly impressive tech demos, though perhaps at the cost of form factor — the hardware involved looks heavy-duty, and removing the glove appears several degrees more difficult than setting down a Vive Wand, by contrast.

Theirs strikes me as a specialty solution, perhaps more suited to location-based VR or commercial/industrial applications. (Hypothetical: would a Wand/Touch-like controller with this type of actuator built into the grips provide any UX benefit at the consumer level?) Meanwhile, Tactical Haptics is exploring this tech through a different lens, using a series of sliding plates and ballasts in their Reactive Grip Motion Controller, which tries to simulate some of the physical forces and resistance one feels wielding objects with mass in meatspace. This is perhaps a more practical haptics approach for consumer adoption — they’re still simple controllers, but the added illusion of physical force could be a truly compelling XRI mechanic (for more, check out their white paper on the tech).

Hand-Tracking

Who needs a controller? For some XR applications, the optimal UX will take advantage of the same built-in implements with which humans have explored the material world for thousands of years: their hands.

Tracking a user’s hands in real time with 27 degrees of freedom (four per finger, five in the thumb, six in the wrist), absent any handheld implement, allows them to interact with physical objects in their environment as one normally would (useful in MR contexts) — or to interact with virtual assets and UI in a more natural, frictionless and immersive way than, say, pulling a trigger on a controller.

And of course, I defy you to test such software without immediately making rude gestures with it.

Pricier AR/MR rigs like Microsoft’s HoloLens will have hand-tracking technology baked in — though reliability, field of view and latency vary. However, most popular VR headsets on the market don’t offer this integration natively thus far. Thankfully, the Leap Motion hand-tracking sensor, available as a desktop peripheral for years, is being retrofitted by XR developers with compelling results. For additional reading, and to see some UX possibilities in action, I’d recommend checking out this great series by Leap Motion designer Martin Schubert.
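
To make that concrete, here’s a minimal sketch of reading Leap Motion hand data in Unity and treating a pinch as a “select” action. The class name and threshold are mine, and I’m using the raw Leap Controller API to keep it self-contained; in practice you’d usually go through the LeapProvider component that ships with the Unity Core Assets.

```csharp
using Leap;           // Leap Motion C# API
using UnityEngine;

// Hypothetical example: fire a "select" when the right hand pinches.
public class PinchSelect : MonoBehaviour
{
    private Controller leap;                               // connects to the Leap service
    [SerializeField] private float pinchThreshold = 0.8f;  // 0 = open hand, 1 = full pinch

    void Start() { leap = new Controller(); }

    void Update()
    {
        Frame frame = leap.Frame();   // most recent tracking frame
        foreach (Hand hand in frame.Hands)
        {
            // PinchStrength runs from 0 (open) to 1 (thumb and index touching)
            if (hand.IsRight && hand.PinchStrength > pinchThreshold)
            {
                Debug.Log("Right-hand pinch: treat as a click/select.");
            }
        }
    }
}
```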

These hand-eye interaction patterns have been entrenched in our brains over thousands of years of evolution and (for most of us) decades of first-hand experience. This makes them feel real and natural in XR. As drawbacks go, the device adds yet another USB peripheral and extension cable to my life (surely I will drown in a sea of them), and there are still field of view and reliability issues. But as the technology improves, this set of interactions works so well that it can’t help but become an integral piece of XRI. To allow for the broadest range of use cases, I’d argue that all advanced/future XR HMDs need to feature hand-tracking natively (though optionally, per application, of course).

Interestingly enough, the upcoming Vive Pro features dual forward-facing cameras in addition to its beefed-up pixel density. Vive has since confirmed that hand-tracking can be done using these cameras. Developers and designers would do well to start grokking XR hand-tracking principles now.

Eye-Tracking

Though the state of the art has advanced, too much of XRI has been relegated to holographic panels attached at the wrist. While this is no doubt an extremely useful practice, endless new possibilities for UI and gameplay mechanics emerge once you add high-quality, low-latency eye tracking to any HMD-relative heads-up display UI and/or any XR environment beyond it.

Imagine browsing menus more effortlessly than ever, using only your eyes to make selections, or targeting distant enemies more precisely in shooters. Consider also the effects of eye-tracking in multiplayer VR and the possibilities that unlocks. Once combined with 3D photogrammetry scans of users’ faces or hyper-expressive 3D avatars, we’ll be looking at real-time, photorealistic telepresence in XR spaces (if you’re into that sort of thing).
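
As a rough illustration of the interaction model, here’s a hedged “dwell to select” sketch in Unity. Since today’s consumer HMDs don’t expose a true gaze vector, the head camera’s forward ray stands in for eye data; a proper eye-tracking SDK would supply the ray instead. All names here are hypothetical.

```csharp
using UnityEngine;

// Hypothetical sketch: hold your gaze on an object for a moment to "click" it.
public class GazeDwellSelect : MonoBehaviour
{
    [SerializeField] private float dwellSeconds = 1.0f;
    private GameObject current;
    private float dwellTimer;

    void Update()
    {
        // Stand-in gaze ray: the HMD camera's forward vector. A real eye tracker
        // would provide a per-eye gaze origin and direction instead.
        Transform head = Camera.main.transform;
        Ray gaze = new Ray(head.position, head.forward);

        RaycastHit hit;
        if (Physics.Raycast(gaze, out hit, 20f))
        {
            if (hit.collider.gameObject == current)
            {
                dwellTimer += Time.deltaTime;
                if (dwellTimer >= dwellSeconds)
                {
                    Debug.Log("Selected " + current.name + " by gaze dwell.");
                    dwellTimer = 0f;
                }
            }
            else
            {
                current = hit.collider.gameObject;  // gaze moved to a new target
                dwellTimer = 0f;
            }
        }
        else
        {
            current = null;
            dwellTimer = 0f;
        }
    }
}
```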

Wrist-Mounted UI
Wrist-mounted UI has proliferated in XR — but only goes so far. Eye-tracking will usher in many HMD-relative UI possibilities.

Eye-tracking isn’t just promising as an input mechanism. This tech will also allow hardware and software developers to utilise a technique called foveated rendering. Basically, the human eye only sees sharply near the very center of your gaze — things get blurrier further out into your visual periphery. Foveated rendering takes advantage of this wetware limitation by precisely tracking the position of your eyes from frame to frame and rendering whatever you’re looking at in full detail on (theoretically) higher-resolution screens. Simultaneously, the quality of everything you’re not looking directly at is downgraded — which you won’t notice, because your pathetic human eyes literally can’t. This will enable richer XR on lower-powered systems and let high-end systems stretch possibilities even further with higher-resolution screens.

Tobii & HTC Vive
Tobii’s eye-tracking technology embedded in a custom Vive

While Oculus and Google have acquired eye-tracking companies in recent years, the current industry leader appears to be Tobii. Their CES demos were reportedly extremely impressive; but considering they retrofit a new Vive for each devkit, their solution is not mass-market at this point — and likely pricey, since you have to seek approval just to receive a quote. Still, the potential benefits of eye-tracking for XRI are so great that surely we’ll see native adoption of this tech by major HMD manufacturers in coming hardware generations (hopefully through a licensing deal with Tobii).

Voice & Natural Language Processing

As the explosive growth of Alexa use has taught us, many users love interacting with technology using their voices. Frankly, the tech to implement keyword and phrase recognition at relatively low cost is already there for developers to utilise — it’s officially low-hanging fruit in 2018.

On the local processing side, Windows 10 voice recognition tech runs on any PC with that OS — though it currently fares better with shorter keywords and a low confidence threshold. (Check out this great tutorial for Unity implementation on Lightbuzz.com.) Alternatively, you can offload more complex phrases and vocal data to powerful, highly-optimized Google or Amazon processing centers. At their most basic, these services transform vocal data into string values you can store and program logic against — but certainly many other kinds of analyses of and programmatic responses to the human voice are possible through the lens of machine learning: stress signals, emotional cues, sentiment evaluation, behavior anticipation, etc.
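
For the keyword case, the built-in Windows path really is just a few lines in Unity. A minimal sketch, with placeholder phrases and handler names of my own choosing:

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;   // Windows 10 only

// Minimal local keyword recognition in Unity. Swap in your own commands.
public class VoiceCommands : MonoBehaviour
{
    private KeywordRecognizer recognizer;
    private readonly string[] keywords = { "fire", "go", "menu" };

    void Start()
    {
        // A lower confidence level triggers more easily but misfires more often.
        recognizer = new KeywordRecognizer(keywords, ConfidenceLevel.Medium);
        recognizer.OnPhraseRecognized += OnPhraseRecognized;
        recognizer.Start();
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        // args.text is the recognized keyword as a plain string value.
        Debug.Log("Heard: " + args.text + " (" + args.confidence + ")");
    }

    void OnDestroy()
    {
        if (recognizer != null && recognizer.IsRunning) recognizer.Stop();
        if (recognizer != null) recognizer.Dispose();
    }
}
```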

At the OS/always-on level, some Alexa-like voice-controlled task rabbit has to be in the pipeline (Rift OS Core 2.0 already gives me access to my Windows desktop, and therefore Cortana) — that’s assuming Amazon’s automated assistant doesn’t grace the XR app stores herself. At the individual app level, this powerful input may be the most widely available yet underutilised in XR (though for the record, I do see it as primarily an optional mechanic, not one that should be required for many experiences). When I’m dashing starboard to take on space pirates in From Other Suns, I want to be able to yell “Computer, fire!” so badly — this would be so pure. In Fallout 4 VR, I want to yell, “Go!” and point to exactly where Dogmeat should run (I pulled this off with my buddy BB-8 in a recent project). Developers and designers should look for more chances to use voice recognition as implementation costs continue to fall.

Brain-Computer Input

Will we eventually arrive at a point where the most human of inputs — our physical and vocal communications — are no longer necessary to order each and every task? Can we interact with a computer using our minds alone? Proponents of a new generation of brain-computer interfaces (BCIs) say yes.

At a high level, the current generation of such technology exists as helmet- or headband-like devices that generally use safe, portable electroencephalography (EEG) sensors to monitor various brain waves. These sensors typically output floating-point values for each type of wave tracked, and developers can program different responses to that data as they please.
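
Something like the following, purely hypothetical sketch: the IEegSource interface below is invented for illustration and stands in for whatever a given vendor’s SDK actually exposes — real devices (Neurable, Muse, etc.) each have their own APIs and band definitions.

```csharp
using UnityEngine;

// Hypothetical stand-in for a BCI vendor SDK that reports normalized band power.
public interface IEegSource
{
    float GetBandPower(string band);   // e.g. "alpha", "beta" — assumed 0..1
}

public class CalmMeter : MonoBehaviour
{
    public MonoBehaviour eegSourceBehaviour;   // assign a component implementing IEegSource
    [SerializeField] private float calmThreshold = 0.6f;

    void Update()
    {
        IEegSource eeg = eegSourceBehaviour as IEegSource;
        if (eeg == null) return;

        // Sample one tracked band and respond to it like any other axis input.
        float alpha = eeg.GetBandPower("alpha");
        if (alpha > calmThreshold)
        {
            Debug.Log("User reads as calm/focused: " + alpha.ToString("F2"));
            // ...drive a meditation visual, unlock an interaction, etc.
        }
    }
}
```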

Neurable HTC Vive
Neurable’s Vive integration

Though studied for decades, this technology has not yet reached maturity. The major caveat right now is that a given person’s ability to project and/or manipulate the specific brainwaves each device’s array of EEG sensors tracks will vary, and can sometimes require lots of calibration and practice.

Still, recent advances appear promising. Neurable is perhaps the leader in integrating an array of EEG and other BCI sensors with a Vive VR headset. On the content side, the Midwest US-based StoryUp XR is using another BCI, the Muse, to drive a mobile VR app with the user’s “positivity,” which they say corresponds to a particular brainwave picked up by the headset that users can learn to manipulate. StoryUp, who are part of the inaugural Women In XR Fund cohort, hope to bring these kinds of therapeutic and meditative XR experiences to deployed military personnel, combat veterans and the general public, using BCIs as both a critical input and a monitor of user progress.

It will likely be decades before you’re able to dictate an email via inner monologue or directly drive a cursor with your thoughts — and who knows whether such sensitive operations will even be possible without invasive surgery to hack directly into the wetware. (Yes, that was a fun and terrifying sentence to write). I would wager, however, that an eye-tracking-based cursor combined with “click” or “select” actions driven by an external BCI will become possible within a few hardware generations, and may well end up being the fastest, most natural input in the world.

Machine Learning

Imagine an AI-powered XR OS a decade from now: one that can utilise and analyse all the above inputs, divining user intent and taking action on their behalf. One that, if unsure of itself, can seek clarification in natural language or in a hundred other ways. It can acquire your likes and dislikes through experience and observation as easily as you might for a new friend, constructing a model of your overall XR interaction preferences — with the AI itself, with other humans, and with the virtual realities you visit and the physical ones you augment. Such a system will, at the very least, be able to model and emulate human social graces and friendship.

Any such system will also have unparalleled access to your most sensitive personal and biometric data. The security, privacy and ethical concerns involved will be enormous and should be given all due consideration. In his talk on XR UX at Unity HQ last fall, Unity Labs designer and developer Dylan Urquidi said he sees blockchain technology as a possible medium for context-aware, OS-level storage of these kinds of permissions or preferences. This would allow ultimate ownership and decision-making power over this data to remain with the user, who can allow or deny access to individual applications and subsystems as desired.

I’m currently working on a VR mechanic using a neural net trained from Google QuickDraw data to recognize basic shapes drawn with Leap Motion hand-tracking — check out my next piece for more.

Machine learning is likely the most important yet least understood technology coming to XR and computing at large. It’s on designers and developers to educate themselves and the public on how they’re leveraging these technologies and their users’ data safely and responsibly. For myself, machine learning is the first problem domain I’ve encountered in programming where I don’t grok all the mathematics involved.

As such, I’m currently digging through applied linear algebra coursework and Andrew Ng’s great machine learning class on Coursera.org in an effort to better understand this most arcane frontier (look out for my next piece, where I’ll apply some of these concepts and train a neural net to identify shapes drawn in VR spaces). While I’m not ready to write the obituary for the QWERTY keyboard just yet, these advances make it clear that in terms of XRI, the times are a-changin’.

Accessible XR Development After VRTK

As a developer moving from the web and app world into 3D and XR, I’ve had to constantly re-evaluate my platform and tool choices as the industry evolves at tweetstorm velocity. Today’s XR development pipeline is clogged by a glut of proprietary hardware and software APIs and SDKs from competing firms and platforms like Oculus, HTC Vive, Microsoft, Google, Apple, Sony and SteamVR — to say nothing of emerging third-party peripherals like Logitech’s VR-tracked keyboard, the new AR-enabling ZED Mini dual-eye camera for the Rift or Vive, or any other industry-disrupting Kickstarters that might’ve sprung up since I started typing this paragraph.

Left to right: a bunch of cool stuff I want.

Each platform’s fine — even technologically stunning, one might argue — with respective strengths, weaknesses and use cases. But the distinctions force XR developers to ask hard questions: Where is the market going? How do I invest my skill-building time? What devices should my app support? What platform can I get a job working on? Developers must be business analysts as much as creative technologists to stay relevant. It’s easy to suffer choice paralysis with such a wide array of options, and easier still to bet on the wrong technology and lose.

Personally, I also face certain technical, logistic and financial realities as an independent XR developer in the Midwest (US), where the industry hasn’t proliferated as it has in major coastal cities. Thankfully, game engines like Unity and Unreal are rapidly democratizing this space. Both engines seek to bridge the gaps between the various XR SDKs, employing thousands of engineers to ensure their software plays nicely with just about any significant third-party API. For example, as I wrote about in August, the Oculus SDK integrates beautifully with Unity and comes equipped with many of the scripts and prefabs needed to quickly prototype, develop and deploy a custom Rift app.
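
As a taste of that scripting surface, here’s a small sketch reading Touch controller input through the Oculus Utilities’ OVRInput class. It assumes an OVRCameraRig/OVRManager prefab is already in the scene; the component name is mine.

```csharp
using UnityEngine;

// Sketch: poll Touch controller state via OVRInput each frame.
public class TouchInputProbe : MonoBehaviour
{
    void Update()
    {
        // Analog index-trigger value on the right Touch controller, 0..1.
        float trigger = OVRInput.Get(OVRInput.Axis1D.PrimaryIndexTrigger,
                                     OVRInput.Controller.RTouch);

        // A-button pressed this frame?
        if (OVRInput.GetDown(OVRInput.Button.One, OVRInput.Controller.RTouch))
        {
            Debug.Log("A pressed; trigger at " + trigger.ToString("F2"));
        }
    }
}
```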

I miss bossing around my hand-modeled #MadeWithBlocks BB-8. Check out my deep dive on this project, The Future of VR Creation Tools.

That’s fantastic, but it’s still non-standard. To port the same Unity app to the HTC Vive or a Windows HMD is non-trivial — not impossible or even terribly difficult, but non-trivial. Maintaining your app for multiple SDKs over the long haul is similarly non-trivial. Non-trivial costs money and time and we’re all short on both.

Instead imagine if XR practitioners had to worry less about betting on the right platform or device and could instead focus on creating unique and compelling experiences, content and UX. The first step down that path was VRTK — but sadly, one of the best tools to combat the VR SDK surplus will soon be hobbled by the loss of its founder.

VRTK: The Open Source Approach

This free, open source Unity toolkit aims to knit together a single workflow for a variety of VR APIs. It comes with the same stock prefabs and scripted mechanics you might find included in any single proprietary SDK, but makes each piece of functionality identical whether deployed to Oculus, SteamVR (read: Vive and, with v3.3.0, Windows HMDs) or Daydream — covering all major VR HMD manufacturers today.

It’s a boon to anyone wanting to dip their toes in the waters of VR development. Think of it: Want to implement teleportation locomotion over a Unity NavMesh? Just drop the component onto your player prefab. Want to test out grab mechanics, or a quick bezier pointer? VRTK’s demo scenes have you covered, and they’ll work easily on a variety of devices. Since it’s open source, you’re also free to dive in and customize the code. Struggling to get a feature working in your own project? Check out VRTK’s implementation across a variety of SDKs — not a bad way to grok new XR coding concepts.
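
For a sense of the moving parts, here’s a rough sketch of making an object grabbable with VRTK 3.x components, wired up in code rather than the Inspector (which is how you’d normally do it). Component and namespace names are as I recall them from VRTK 3; double-check against the release you’re using.

```csharp
using UnityEngine;
using VRTK;
using VRTK.GrabAttachMechanics;

// Sketch: configure a GameObject as a grabbable VRTK interactable at runtime.
// The object also needs a Collider (and usually a Rigidbody) to be grabbed.
public class MakeGrabbable : MonoBehaviour
{
    void Awake()
    {
        // The core component VRTK controllers interact with.
        VRTK_InteractableObject interactable =
            gameObject.AddComponent<VRTK_InteractableObject>();
        interactable.isGrabbable = true;

        // A grab attach mechanic on the same GameObject tells VRTK how the
        // object should follow the controller while held.
        gameObject.AddComponent<VRTK_ChildOfControllerGrabAttach>();
    }
}
```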

Sadly, VRTK’s creator is sunsetting the woefully underfunded project. The UK-based developer TheStoneFox — who until recently was actively seeking contributors, partnerships and support — announced that he will be stepping back from the project post-version 3.3.0. Though VRTK boasts an active Slack community, a growing list of “made with” titles and a recent Kickstarter, TheStoneFox was unable to attract the support necessary to sustain it for the long term.

Now, as the opportunity to contribute to and utilize a premier open-source VR development pipeline expediter fades, what, if anything, will replace it?

OpenXR: One API to Rule Them All

The VRTK approach — using Unity scripting to knit together similar mechanics across a spectrum of VR SDKs — is necessary in the current fragmented development landscape, but there are downsides. Someone in the community still has to monitor the various proprietary SDK updates, and your end-user VRTK app still has to be mindful of VRTK’s changes over time. In this way, VRTK treated the symptoms of VR SDK overload, but was not equipped to address the root cause. Enter OpenXR, The Khronos Group’s upcoming industry standard:

The standard, announced in December 2016, is being written now and is quickly gaining traction among industry players (with the notable exception of Magic Leap). Instead of forcing developers to grapple with variable proprietary SDKs and all the accompanying business consequences, companies will tailor their hardware and software to comply with OpenXR’s spec. Khronos, the non-profit responsible for shepherding the Vulkan, OpenGL, OpenGL ES and WebGL standards, is leading the charge. Cue the infographics!

On the left, the problem — on the right, the solution:

Images courtesy of https://www.khronos.org/openxr.

“Each VR device can only run the apps that have been ported to its SDK. The result is high development costs and confused customers — limiting market growth,” reads some fairly accurate marketing copy on their website. “The cross-platform VR standard eliminates industry fragmentation by enabling applications to be written once to run on any VR system, and to access VR devices integrated into those VR systems to be used by applications.”

A working group of industry heavyweights has agreed the standard should be extensible to allow for future innovation and should support a range of experiences — anything from a 3-DoF controller all the way up to high-end, room-scale devices.

The only thing missing is a realistic timetable for when this standard will start shaping the development community’s day-to-day workflow. Until the market-movers get their act together, we’ll be left scrambling (and patching up VRTK projects, in many cases).

OpenXR supporters: everyone except Magic Leap.

The Cinema of Attractions: Slow Your Reel

But should we so quickly welcome industry standardization while the technology is still so new and full of possibilities? That’s the question asked in a recent Voices of VR podcast by Kent Bye and Rebecca Rouse. The two discussed the early days of cinema — when exploration and experimentation were the status quo — and Rouse drew striking parallels between that era and the current period in XR production and development.

Pure spectacle then and now. Left: a Cinema of Attractions-era still. Right: Chocolate VR.

“[Scholars of early film] came up with this term ‘cinema of attractions’ because they saw an incredible wealth of diversity and kind of range of exuberant experimentation in those early pieces, so they were very hard to sort of clump them together — there was such diversity — but this ‘attraction’ idea was a large enough umbrella, because all of those early pieces are in some way showing off the technology’s capabilities and generate this experience of wonder or amazement for the viewer. And the context in which they were shown is that of attractions, so they were shown at world’s fairs and as a part of vaudeville shows with other kinds of performances and displays.”

 — Rebecca Rouse, assistant professor of communication & media at Rensselaer Polytechnic Institute

Sounds eerily familiar, huh? The whole podcast is well worth a listen, but tl;dr: while there are obvious consumer and market advantages to XR standards, Rouse argues that perhaps we shouldn’t jump the gun here — not during this era of frenetic, often avant-garde XR experimentation across art, science, cinema and gaming. Looking around the industry, it’s hard to disagree.

EditorXR

One man-eating-the-camera-brilliant new application of XR technology is Unity Labs’ EditorXR. Created by Unity’s far-future R&D team (whose roles often find them working on projects and products five-to-ten years away from consumer adoption), EditorXR offers you an interface to create custom XR Unity scenes entirely within virtual reality.

Oh! And there’s flying, among other superpowers — soar through your scene like Superman or scale the whole thing down to a pinhole. They’ve literally ported the Unity inspector, hierarchy and project windows (again among others) to an increasingly user-friendly VR UI pane on your wrist. With the latest update, you’re able to:

  • hook into Google’s Poly asset database web API in real-time inside VR
  • create multiplayer EditorXR sessions for editing Unity scenes with friends and collaborators
  • run EditorXR with Unity’s primary version 2017.x editor

It’s still new and I’ve encountered bugs, but it’s a foregone conclusion that this tech will become a standard feature of Unity’s scene creation process as XR technology matures and proliferates. Even their alpha and beta efforts evoke the same sense of wonder and possibility that early Cinema of Attractions-era moviegoers must have felt.

For more insight on the design side, check out this deep dive on the future of XR UX design by Unity Labs’ Dylan Urquidi or the Twitter feed of Authoring Tools Group Lead Timoni West.

ML-Agents

Another experimental Unity project, ML-Agents, explores one of the most promising avenues for the future of XR development, design and UX: machine learning. Using so-called “reinforcement learning” techniques, which expressly don’t feed the AI model any sample data or rules for analysis, ML-Agents instead applies simple rewards and punishments (in the form of tiny float values) based on the outcomes of the agents’ (usually very narrowly defined) behaviors.

Stretched out over hundreds of thousands, if not millions, of trial-and-error training sessions, the computer experiments with its abilities and forms a model for how best to achieve the desired goal. In this way, your Agents become their own teachers — you just write the rubric.
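
To give a flavor of what “writing the rubric” looks like, here’s an illustrative reach-the-target Agent. Note that the ML-Agents API has shifted considerably between releases; the callback names below follow a more recent version, so treat this as a sketch of the reward loop rather than something version-exact.

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

// Illustrative reward loop for a trivial "reach the target" agent.
public class ReachTargetAgent : Agent
{
    public Transform target;   // assigned in the Inspector (hypothetical setup)

    public override void OnEpisodeBegin()
    {
        // Reset the agent at the start of each trial-and-error episode.
        transform.localPosition = Vector3.zero;
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        // What the agent "sees": its own position and the target's.
        sensor.AddObservation(transform.localPosition);
        sensor.AddObservation(target.localPosition);
    }

    public override void OnActionReceived(float[] act)
    {
        // Actions arrive as floats; here they nudge the agent around the plane.
        transform.localPosition += new Vector3(act[0], 0f, act[1]) * 0.1f;

        float distance = Vector3.Distance(transform.localPosition, target.localPosition);
        AddReward(-0.001f);            // tiny per-step punishment: hurry up
        if (distance < 0.5f)
        {
            AddReward(1.0f);           // big reward for reaching the goal
            EndEpisode();
        }
    }
}
```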

The original GitHub commit contained some basic demo scenes and the development community quickly took up the torch from there. Unity’s Alessia Nigretti followed up the original blog with one describing how to integrate ML-Agents into a 2D game. On Twitter, @PunchesBears has been demonstrating similar concepts — and showing that often enough, Agents respond to developers’ carefully calculated reward system in ways they don’t anticipate. Similar to actual gamers, no?

In one of my favorite applications of ML-Agents, the developer Blake Schreurs actually brings a 6-DoF robo-arm Agent trained to seek a moving point in space into virtual reality — with slightly terrifying results once he assigns that moving target to his face.

Imagine someone applying this training model to actual robotics and fat-fingering the wrong key. Or don’t, whatever. 

He’s down for the count! I was immediately reminded of the audiences pouring out of theaters in 1895, afraid they’d be run down by the Lumière brothers’ Arrival of a Train at La Ciotat. We’re still in the salad days of both machine learning and XR development compared to where we hope to be 10 or even 50 years from now. In that time, some combination of traditional or procedural AI with these new machine learning approaches will doubtless lead to great developments in gaming and XR at large — or even in the very design process and daily workflow of computing itself.

Rift OS Core 2.0

With Rift’s new Core 2.0 OS, your entire Windows PC is accessible from your right-hand menu button. Being able to view and use your desktop apps, as well as pin windows inside other VR apps, introduces new possibilities for XR workflows (and even for traditional computing workflows) in VR.

While working on my next project, entirely within VR, I can watch Danny Bittman’s great Unity rendering and lighting tutorial on YouTube in a pinned browser while messing with those same settings on my wrist in EditorXR. I can watch @_naam craft original assets in Google Blocks at the same time I do, or I could gather assets from the Poly database and deploy them to my Unity scene in real-time VR, pulling up Visual Studio to code some game logic as I please.

That sounds pretty goddamn metaversal to me — and before long, we likely won’t even need code.

The XR Developer of the Future Is Not a Developer

If XR technology is to go mainstream, the development process must be as efficient and accessible as possible — and likely even open to non-developers through content creation and machine learning applications. Spanning sciences and disciplines, there’s so much more to talk about and speculate over that this piece hasn’t even touched on (next time I’ll examine WebVR and A-Frame as viable XR development pathways). More and more pieces of this accessible, standardized XR development pipeline will fall into place as the immersive computing revolution rolls on, though I’m thankful the XR industry isn’t ready to ditch its Cinema of Attractions ethos quite yet.