nvidia research – Augmented & Virtual Reality Confabulation

April 27, 2018

Researchers Exploit Natural Quirk of Human Vision for Hidden Redirected Walking in VR

Researchers from Stony Brook University, NVIDIA, and Adobe have devised a system which hides so-called ‘redirected walking’ techniques using saccades, natural eye movements which act like a momentary blindspot. Redirected walking changes the direction that a user is walking to create the illusion of moving through a larger virtual space than the physical space would allow.

Update (4/27/18): The researchers behind this work have reached out with the finished video presentation for the work, which has been included below.

Original Article (3/28/18): At NVIDIA’s GTC 2018 conference this week, researchers Anjul Patney and Qi Sun presented their saccade-driven redirected walking system for dynamic room-scale VR. Redirected walking uses novel techniques to steer users in VR away from real-world obstacles like walls, with the goal of creating the illusion of traversing a larger space than is actually available to the user.

There are a number of ways to implement redirected walking, but the strengths of this saccade-driven method are that it’s hidden from the user, widely applicable to VR content, and dynamic, allowing the system to direct users away from objects newly introduced into the environment, and even moving objects, the researchers say.

The basic principle behind their work is an exploitation of a natural quirk of human vision—saccadic suppression—to hide small rotations to the virtual scene. Saccades are quick eye movements which happen when we move our gaze from one part of a scene to another. Instead of moving in a slow continuous motion from one gaze point to the next, our eyes quickly dart about, when not tracking a moving object or focused on a singular point, a process which takes tens of milliseconds.

An eye undertaking regular saccades

Saccadic suppression occurs during these movements, essentially rendering us blind for a brief moment until the eye reaches its new point of fixation. With precise eye-tracking technology from SMI and an HTC Vive headset, the researchers are able to detect and exploit that temporary blindness to hide a slight rotation of the scene from the user. As the user walks forward and looks around the scene, it is slowly rotated, just a few degrees per saccade, such that the user reflexively alters their walking direction in response to the new visual cues.

This method allows the system to steer users away from real-world walls, even when it seems like they’re walking in a straight line in the virtual world, creating the illusion that the virtual space is significantly larger than the corresponding physical space.
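The gating idea described above can be sketched in a few lines of Python. This is only an illustration, not the researchers’ implementation: the velocity threshold, the per-saccade rotation cap, and the function names are all assumptions, and each detection is treated as a single saccade for simplicity.

```python
import math

# Assumed values for illustration only; the article gives no thresholds.
SACCADE_SPEED_DEG_PER_S = 180.0      # gaze speed above which we assume a saccade
MAX_ROTATION_PER_SACCADE_DEG = 2.0   # stands in for "a few degrees per saccade"


def gaze_speed_deg_per_s(prev_gaze_deg, curr_gaze_deg, dt_s):
    """Angular speed between two eye-tracker samples.

    Gaze is a (horizontal, vertical) pair in degrees; for small angles the
    Euclidean distance between samples approximates the angle travelled.
    """
    dx = curr_gaze_deg[0] - prev_gaze_deg[0]
    dy = curr_gaze_deg[1] - prev_gaze_deg[1]
    return math.hypot(dx, dy) / dt_s


def hidden_rotation_deg(prev_gaze_deg, curr_gaze_deg, dt_s, desired_rotation_deg):
    """Scene yaw to apply now: zero unless the eye appears to be mid-saccade."""
    speed = gaze_speed_deg_per_s(prev_gaze_deg, curr_gaze_deg, dt_s)
    if speed < SACCADE_SPEED_DEG_PER_S:
        return 0.0  # the user can see clearly, so leave the scene alone
    # Clamp the requested rotation so each hidden step stays small.
    limit = MAX_ROTATION_PER_SACCADE_DEG
    return max(-limit, min(limit, desired_rotation_deg))
```

With a hypothetical 250 Hz eye tracker, a 0.5 degree jump between samples (125 degrees per second) leaves the scene untouched, while a 2 degree jump (500 degrees per second) lets a clamped rotation through.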

A VR backpack allows a user at GTC 2018 to move through the saccadic redirected walking demo without a tether. | Photo by Road to VR

The researchers have devised a GPU accelerated real-time path planning system, which dynamically adjusts the hidden scene rotation to redirect the user’s walking. Because the path planning routine operates in real-time, Patney and Sun say that it can account for objects newly introduced into the real world environment (like a chair), and can even be used to steer users clear of moving obstacles, like pets or potentially even other VR users inhabiting the same space.
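As a rough illustration of what dynamic steering means, the sketch below picks a turn direction that moves the walker away from the nearest obstacle ahead. It is a deliberately naive stand-in, not the GPU-accelerated planner described here; the function name, the coordinate convention, and the step size are assumptions.

```python
import math


def steering_step_deg(position, heading_deg, obstacles, max_step_deg=2.0):
    """Signed turn that steers the walker away from the nearest obstacle.

    Assumed convention: heading is measured counter-clockwise from +x, and a
    positive result means "turn the walker's path to the left".
    """
    if not obstacles:
        return 0.0
    nearest = min(obstacles, key=lambda o: math.dist(position, o))
    bearing = math.degrees(
        math.atan2(nearest[1] - position[1], nearest[0] - position[0])
    )
    # Signed angle from the heading to the obstacle, wrapped into [-180, 180).
    relative = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    if abs(relative) >= 90.0:
        return 0.0  # obstacle is beside or behind the walker; nothing to avoid
    # Obstacle on the left: steer right (negative). Otherwise steer left.
    return -max_step_deg if relative > 0 else max_step_deg
```

Because the obstacle list is re-read on every call, a chair that appears mid-session or a pet that wanders into the room changes the result on the next update, which is the property the researchers emphasise.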

The research is being shown off in a working demo this week at GTC 2018. An academic paper based on the work is expected to be published later this year.

The post Researchers Exploit Natural Quirk of Human Vision for Hidden Redirected Walking in VR appeared first on Road to VR.

March 29, 2018

Want More Space for Roomscale VR? NVIDIA Research can do This Virtually

Roomscale virtual reality (VR) technology is a wonderful thing. It allows you to explore a virtual world with your own feet, being able to wander round a room and interact with objects as if you were really there. While the HTC Vive system, for example, can cover an area of 15ft x 15ft, not everyone has that amount of space to work with, meaning walls and furniture can quickly be bumped into if you’re not careful. So during the GPU Technology Conference (GTC) 2018, NVIDIA Research demonstrated a new technique it’s been working on in collaboration with Adobe and Stony Brook University to make physical areas seem much bigger in VR.

NVIDIA redirected walking path1

Called Saccadic Redirected Walking, the technique utilises a quirk in your eyes where involuntary movements temporarily blind you a few times per second. These movements, known as saccades, are imperceptible because they last only tens of milliseconds.

So during those fractions of a second the technique rotates the scene ever so slightly. What this does, without the user noticing it, is guide them on a physical path that’s ever so slightly different to the one they’re viewing in the virtual world. As shown in the image above, this means the physical path of the player can be small whilst in VR it can seem far larger. Imagine walking round a grand hall just in your living room. This also helps with avoiding objects like walls, or other players if systems are set up nearby.

NVIDIA is demoing the technique this week at the VR Village using Quadro GPUs, HTC Vive and SMI eye tracking, with guests able to walk around a huge virtual Alice in Wonderland-like chess board with pieces the size of people, all within a 15×15 foot booth.


The teams will be taking the research to SIGGRAPH later this year to present a paper on the technique. As development progresses and further details are released, VRFocus will keep you updated.

November 30, 2017

Exclusive: How NVIDIA Research is Reinventing the Display Pipeline for the Future of VR, Part 2

In Part 1 of this article we explored the current state of CGI, game, and contemporary VR systems. Here in Part 2 we look at the limits of human visual perception and show several of the methods we’re exploring to drive performance closer to them in VR systems of the future.

Guest Article by Dr. Morgan McGuire

Dr. Morgan McGuire is a scientist on the new experiences in AR and VR research team at NVIDIA. He’s contributed to the Skylanders, Call of Duty, Marvel Ultimate Alliance, and Titan Quest game series published by Activision and THQ. Morgan is the coauthor of The Graphics Codex and Computer Graphics: Principles & Practice. He holds faculty positions at the University of Waterloo and Williams College.

Note: Part 1 of this article provides important context for this discussion; consider reading it before proceeding.

Reinventing the Pipeline for the Future of VR

We derive our future VR specifications from the limits of human perception. There are different ways to measure these, but to make the perfect display you’d need roughly the equivalent to 200 HDTVs updating at 240 Hz. This equates to about 100,000 megapixels per second of graphics throughput.

Recall that modern VR is around 450 Mpix/sec today. This means we need a 200x increase in performance for future VR. But with factors like high dynamic range, variable focus, and current film standards for visual quality and lighting in play, the more realistic need is a 10,000x improvement… and we want this with only 1ms of latency.
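The figures above can be checked with a few lines of arithmetic. The only assumption in this sketch is that "HDTV" means a 1920×1080 panel; the other numbers are the ones quoted in the text.

```python
HDTV_PIXELS = 1920 * 1080   # assumption: "HDTV" means a 1080p panel
DISPLAYS = 200              # "roughly the equivalent to 200 HDTVs"
REFRESH_HZ = 240            # "updating at 240 Hz"
CURRENT_VR_MPIX_PER_S = 450 # "modern VR is around 450 Mpix/sec"

target_mpix_per_s = DISPLAYS * HDTV_PIXELS * REFRESH_HZ / 1e6
speedup_needed = target_mpix_per_s / CURRENT_VR_MPIX_PER_S

print(f"target throughput: {target_mpix_per_s:,.0f} Mpix/s")  # 99,533 Mpix/s
print(f"increase over today: {speedup_needed:.0f}x")          # 221x
```

Both results round to the article’s headline numbers of about 100,000 megapixels per second and a 200x increase.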

We could theoretically accomplish this by committing increasingly greater computing power, but brute force simply isn’t efficient or economical. Brute force won’t get us to pervasive use of VR. So, what techniques can we use to get there?

Rendering Algorithms

Foveated Rendering
Our first approach to performance is foveated rendering. This technique, which reduces the quality of images in a user’s peripheral vision, takes advantage of an aspect of human perception to generate an increase in performance without a perceptible loss in quality.

Because the eye itself only has high resolution right where you’re looking, in the fovea centralis region, a VR system can undetectably drop the resolution of peripheral pixels for a performance boost. It can’t just render at low resolution, though. The above images are wide field of view pictures shrunk down for display here in 2D. If you looked at the clock in VR, then the bulletin board on the left would be in the periphery. Just dropping resolution as in the top image produces blocky graphics and a change in visual contrast. This is detectable as motion or blurring in the corner of your eye. Our goal is to compute the exact enhancement needed to produce a low-resolution image whose blurring matches human perception and appears perfect in peripheral vision (Patney, et al. and Sun et al.).
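To make the resolution falloff concrete, here is a minimal sketch of choosing a relative shading resolution from the angle between a pixel and the gaze point. The falloff curve and every parameter value are assumptions for illustration; the sketch covers only the resolution drop, not the contrast-preserving enhancement the paragraph above describes.

```python
def relative_resolution(eccentricity_deg,
                        fovea_radius_deg=5.0,
                        falloff_per_deg=0.03,
                        floor=0.2):
    """Fraction of full resolution to render at a given angle from the gaze.

    All parameter defaults are hypothetical. Inside the foveal radius the
    image is rendered at full resolution; outside it, resolution falls off
    smoothly and is never allowed to drop below `floor`.
    """
    if eccentricity_deg <= fovea_radius_deg:
        return 1.0
    scale = 1.0 / (1.0 + falloff_per_deg * (eccentricity_deg - fovea_radius_deg))
    return max(floor, scale)


for angle in (0, 5, 20, 45, 90):
    print(f"{angle:>2} deg from gaze -> {relative_resolution(angle):.2f}x resolution")
```

A renderer would evaluate something like this per screen tile, using the eye tracker’s latest gaze point, and spend its shading budget accordingly.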

Light Fields
To speed up realistic graphics for VR, we’re looking at rendering primitives beyond just today’s triangle meshes. In this collaboration with McGill and Stanford we’re using light fields to accelerate the lighting computations. Unlike today’s 2D light maps that paint lighting onto surfaces, these are a 4D data structure that stores the lighting in space at all possible directions and angles.

They produce realistic reflections and shading on all surfaces in the scene and even dynamic characters. This is the next step of unifying the quality of ray tracing with the performance of environment probes and light maps.

Real-time Ray Tracing
What about true run-time ray tracing? The NVIDIA Volta GPU is the fastest ray tracing processor in the world, and its NVIDIA Pascal GPU siblings are the fastest consumer ones. At about 1 billion rays/second, Pascal is just about fast enough to replace the primary rasterizer or shadow maps for modern VR. If we unlock the pipeline with the kinds of changes I’ve just described, what can ray tracing do for future VR?

The answer is: ray tracing can do a lot for VR. When you’re tracing rays, you don’t need shadow maps at all, thereby eliminating a latency barrier. Ray tracing can also natively render red, green, and blue separately, and directly render barrel-distorted images for the lens. So, it avoids the need for the lens warp processing and the subsequent latency.

In fact, when ray tracing, you can completely eliminate the latency of rendering discrete frames of pixels so that there is no ‘frame rate’ in the classic sense. We can send each pixel directly to the display as soon as it is produced on the GPU. This is called ‘beam racing’ and eliminates the display synchronization. At that point, there are zero high-latency barriers within the graphics system.

Because there’s no flat projection plane as in rasterization, ray tracing also solves the field of view problem. Rasterization depends on preserving straight lines (such as the edges of triangles) from 3D to 2D. But the wide field of view needed for VR requires a fisheye projection from 3D to 2D that curves triangles around the display. Rasterizers break the image up into multiple planes to approximate this. With ray tracing, you can directly render even a full 360 degree field of view to a spherical screen if you want. Ray tracing also natively supports mixed primitives: triangles, light fields, points, voxels, and even text, allowing for greater flexibility when it comes to content optimization. We’re investigating ways to make all of those faster than traditional rendering for VR.
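The reason a ray tracer is indifferent to the projection is that each pixel simply gets its own ray direction. The sketch below generates directions for an equidistant fisheye, one common fisheye model chosen here for illustration; the function name and axis conventions are assumptions.

```python
import math


def fisheye_ray(px, py, width, height, fov_deg):
    """Unit ray direction for image position (px, py) under an equidistant fisheye.

    Assumed conventions: the camera looks along +z with x to the right and
    y up; (0, 0) is the top-left corner of the image. Returns None for
    positions outside the fisheye's image circle.
    """
    u = 2.0 * px / width - 1.0    # -1 at the left edge, +1 at the right edge
    v = 1.0 - 2.0 * py / height   # +1 at the top edge, -1 at the bottom edge
    r = math.hypot(u, v)
    if r > 1.0:
        return None
    theta = r * math.radians(fov_deg) / 2.0   # angle away from the view axis
    phi = math.atan2(v, u)                    # direction around the view axis
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))
```

Nothing in this function breaks when `fov_deg` exceeds 180: at 360 degrees the rim of the image circle maps to rays pointing straight backwards, a case a single flat projection plane cannot represent.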

In addition to all of the ways that ray tracing can accelerate VR rendering latency and throughput, a huge feature of ray tracing is what it can do for image quality. Recall from the beginning of this article that the image quality of film rendering is due to an algorithm called path tracing, which is an extension of ray tracing. If we switch to a ray-based renderer, we unlock a new level of image quality for VR.

Real-time Path Tracing
Although we can now ray trace in real time, there’s a big challenge for real-time path tracing. Path tracing is about 10,000x more computationally intensive than ray tracing. That’s why movies take minutes per frame to generate instead of milliseconds.

Under path tracing, the system first traces a ray from the camera to find the visible surface. It then casts another ray to the sun to see if that surface is in shadow. But, there’s more illumination in a scene than directly from the sun. Some light is indirect, having bounced off the ground or another surface. So, the path tracer then recursively casts another ray at random to sample the indirect lighting. That point also requires a shadow ray cast, and its own random indirect light…the process continues until it has traced about 10 rays for each single path.

But if there’s only one or two paths at a pixel, the image is very noisy because of the random sampling process. It looks like this:

Film graphics solves this problem by tracing thousands of paths at each pixel. All of those paths at ten rays each are why path tracing is a net 10,000x more expensive than ray tracing alone.
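The link between sample count and noise can be demonstrated with a toy Monte Carlo estimator. This is not a path tracer: each "path" is replaced by a uniform random number standing in for the light it would return, which is an assumption made purely to show how the pixel-to-pixel variation shrinks as more paths are averaged.

```python
import random
import statistics


def estimate_pixel(paths, rng):
    """Average `paths` random samples; each sample stands in for one traced path."""
    return sum(rng.random() for _ in range(paths)) / paths


def noise(paths, pixels=500, seed=0):
    """Standard deviation of the estimate across many identical pixels."""
    rng = random.Random(seed)
    estimates = [estimate_pixel(paths, rng) for _ in range(pixels)]
    return statistics.pstdev(estimates)


for paths in (1, 10, 100, 1000):
    print(f"{paths:>4} paths per pixel -> noise {noise(paths):.4f}")
```

The noise falls roughly with the square root of the number of paths, so a 10x cleaner image costs about 100x more paths. At around 10 rays per path, a thousand paths per pixel gives the 10,000x cost quoted above.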

To unlock path tracing image quality for VR, we need a way to sample only a few paths per pixel and still avoid the noise from random sampling. We think we can get there soon thanks to innovations like foveated rendering, which makes it possible to only pay for expensive paths in the center of the image, and denoising, which turns the grainy images directly into clear ones without tracing more rays.

We released three research papers this year towards solving the denoising problem. These are the result of collaborations with McGill University, the University of Montreal, Dartmouth College, Williams College, Stanford University, and the Karlsruhe Institute of Technology. These methods can turn a noisy, real-time path traced image like this:

Into a clean image like this:

Using only milliseconds of computation and no additional rays. Two of the methods use the image processing power of the GPU to achieve this. One uses the new AI processing power of NVIDIA GPUs. We trained a neural network for days on denoising, and it can now denoise images on its own in tens of milliseconds. We’re increasing the sophistication of that technique and training it more to bring the cost down. This is an exciting approach because it is one of several new methods we’ve discovered recently for using artificial intelligence in unexpected ways to enhance both the quality of computer graphics and the authoring process for creating new, animated 3D content to populate virtual worlds.

Computational Displays

The displays in today’s VR headsets are relatively simple output devices. The display itself does hardly any processing, it simply shows the data that is handed to it. And while that’s fine for things like TVs, monitors, and smartphones, there’s huge potential for improving the VR experience by making displays ‘smarter’ about not only what is being displayed but also the state of the observer. We’re exploring several methods of on-headset and even in-display processing to push the limits of VR.

Solving Vergence-Accommodation Disconnect
The first challenge for a VR display is the focus problem, which is technically called the ‘vergence-accommodation disconnect’. All of today’s VR and AR devices force you to focus about 1.5m away. That has two drawbacks:

When you’re looking at a very distant or close up object in stereo VR, the point where your two eyes converge doesn’t match the point where they are focused (‘accommodated’). That disconnect creates discomfort and is one of the common complaints with modern VR.
If you’re using augmented reality, then you are looking at points in the real world at real depths. The virtual imagery needs to match where you’re focusing or it will be too blurry to use. For example, you can’t read augmented map directions at 1.5m while you’re looking 20m into the distance while driving.
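Both drawbacks come down to a mismatch between where the eyes converge and where the optics force them to focus. A short sketch can put numbers on it, using the 1.5m focal distance from the text; the interpupillary distance is an assumed typical value.

```python
import math

FOCAL_DISTANCE_M = 1.5   # fixed focus distance quoted in the article
IPD_M = 0.063            # assumed interpupillary distance


def vergence_angle_deg(distance_m, ipd_m=IPD_M):
    """Angle between the two eyes' lines of sight when fixating at distance_m."""
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))


def focus_mismatch_diopters(object_distance_m, focal_distance_m=FOCAL_DISTANCE_M):
    """Gap between where the eyes converge and where they must focus, in diopters."""
    return abs(1.0 / object_distance_m - 1.0 / focal_distance_m)


for d in (0.3, 1.5, 20.0):
    print(f"object at {d:>4} m: vergence {vergence_angle_deg(d):5.2f} deg, "
          f"focus mismatch {focus_mismatch_diopters(d):.2f} D")
```

The mismatch is zero only at 1.5m. For the driving example, looking 20m ahead leaves the overlay about 0.62 diopters out of focus, and a close-up object at 0.3m is off by about 2.67 diopters.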

We created a prototype computational light field display that allows you to focus at any depth by presenting light from multiple angles. This display represents an important break with the past because computation is occurring directly in the display. We’re not sending mere images: we’re sending complex data that the display converts into the right form for your eye. Those tiny grids of images that look a bit like a bug’s view of the world have to be specially rendered for the display, which incorporates custom optics—a microlens array—to present them in the right way so that they look like the natural world.

That first light field display was from 2013. Next week, at the ACM SIGGRAPH Asia 2018 conference, we’re presenting a new holographic display that uses lasers and intensive computation to create light fields out of interfering wavefronts of light. It is harder to visualize the workings here, but relies on the same underlying principles and can produce even better imagery.

We strongly believe that this kind of in-display computation is a key technology for the future. But light fields aren’t the only approach that we’ve taken for using computation to solve the focus problem. We’ve also created two forms of variable-focus, or ‘varifocal’ optics.

This display prototype projects the image using a laser onto a diffusing hologram. You look straight through the hologram and see its image as if it was in the distance when it reflects off a curved piece of glass:

We control the distance at which the image appears by moving either the hologram or the sunglass reflectors with tiny motors. We match the virtual object distance to the distance that you’re looking in the real world, so you can always focus perfectly naturally.

This approach requires two pieces of computation in the display: one tracks the user’s eye and the other computes the correct optics in order to render a dynamically pre-distorted image. As with most of our prototypes, the research version is much larger than what would become an eventual product. We use large components to facilitate research construction. These displays would look more like sunglasses when actually refined for real use.

Here’s another varifocal prototype, this one created in collaboration with researchers at the University of North Carolina, the Max Planck Institute, and Saarland University. This is a flexible lens membrane. We use computer-controlled pneumatics to bend the lens as you change your focus so that it is always correct.

Hybrid Cloud Rendering
We have a variety of new approaches for solving the VR latency challenge. One of them, in collaboration with Williams College, leverages the full spread of GPU technology. To reduce the delay in rendering, we want to move the GPU as close as possible to the display. Using a Tegra mobile GPU, we can even put the GPU right on your body. But a mobile GPU has less processing power than a desktop GPU, and we want better graphics for VR than today’s games… so we team the Tegra with a discrete GeForce GPU across a wireless connection, or even better, to a Tesla GPU in the cloud.

This allows a powerful GPU to compute the lighting information, which it then sends to the Tegra on your body to render final images. You get the benefit of reduced latency and power requirements while actually increasing image quality.

Reducing the Latency Baseline
Of course, you can’t push latency below the time between frames. If the display updates at 90 FPS, then it is impossible to have latency less than 11 ms in the worst case, because that’s how long the display waits between frames. So, how fast can we make the display?
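The 11 ms figure is simply the time between refreshes. A one-line helper shows how that floor moves with the display rate; the 60 and 240 entries are included only for comparison.

```python
def frame_time_ms(frames_per_second):
    """Time between display updates, which bounds worst-case display latency."""
    return 1000.0 / frames_per_second


for fps in (60, 90, 240):
    print(f"{fps:>3} FPS -> {frame_time_ms(fps):.1f} ms between frames")
```

At 90 FPS this gives 11.1 ms, matching the figure above. Halving the floor means doubling the refresh rate, which is why the display rate itself becomes the next target.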

We collaborated with scientists at the University of North Carolina to build a display that runs at sixteen thousand binary frames per second. Here’s a graph from a digital oscilloscope showing how well this works for the crucial case of a head turning. When you turn your head, latency in the screen update causes motion sickness.

In the graph, time is on the horizontal axis. When the top, green line jumps, that is the time at which the person wearing the display turned their head. The yellow line is when the display updated. It jumps up to show the new image only 0.08 ms later…that’s about 500 times better than the 20 ms you experience in the worst case on a commercial VR system today.

The renderer can’t run at 16,000 fps, so this kind of display works by Time Warping the most recent image to match the current head position. We speed that Time Warp process up by running it directly on the head-mounted display. Here’s an image of our custom on-head processor prototype for this:

Unlike regular Time Warp, which distorts the 2D image, or the more advanced Space Warp, which uses 2D images with depth, our method works on a full 3D data set as well. The picture on the far right shows a case where we’ve warped a full 3D scene in real-time. In this system, the display itself can keep updating while you walk around the scene, even when temporarily disconnected from the renderer. This allows us to run the renderer at a low rate to save power or increase image quality, and to produce low-latency graphics even when wirelessly tethered across a slow network.

The Complete System

As a reminder, in Part 1 of this article we identified the rendering pipeline employed by today’s VR headsets:

Putting together all of the techniques just described, we can sketch out not just individual innovations but a completely new vision for building a VR system. This vision removes almost all of the synchronization barriers. It spreads computation out into the cloud and right onto the head-mounted display. Latency is reduced by 50-100x and images have cinematic quality. There’s a 100x perceived increase in resolution, but you only pay for pixels where you’re looking. You can focus naturally, at multiple depths.

We’re blasting binary images out of the display so fast that they are indistinguishable from reality. The system has proper focus accommodation, a wide field of view, low weight, and low latency…making it comfortable and fashionable enough to use all day.

By breaking ground in the areas of computational displays, varifocal optics, foveated rendering, denoising, light fields, binary frames and others, NVIDIA Research is innovating for a new system for virtual experiences. As systems become more comfortable, affordable and powerful, this will become the new interface to computing for everyone.

All of the methods that I’ve described can be found in deep technical detail on our website.

I encourage everyone to experience the great, early-adopter modern VR systems available today. I also encourage you to join us in looking to the bold future of pervasive AR/VR/MR for everyone, and recognize that revolutionary change is coming through this technology.

The post Exclusive: How NVIDIA Research is Reinventing the Display Pipeline for the Future of VR, Part 2 appeared first on Road to VR.

November 29, 2017

Exclusive: How NVIDIA Research is Reinventing the Display Pipeline for the Future of VR, Part 1

Virtual experiences through virtual, augmented, and mixed reality are a new frontier for computer graphics. This frontier state is radically different from modern game and film graphics. For those, decades of production expertise and stable technology have already realized the potential of graphics on 2D screens. This article describes comprehensive new systems optimized for virtual experiences we’re inventing at NVIDIA.

Guest Article by Dr. Morgan McGuire

Dr. Morgan McGuire is a scientist on the new experiences in AR and VR research team at NVIDIA. He’s contributed to the Skylanders, Call of Duty, Marvel Ultimate Alliance, and Titan Quest game series published by Activision and THQ. Morgan is the coauthor of The Graphics Codex and Computer Graphics: Principles & Practice. He holds faculty positions at the University of Waterloo and Williams College.

NVIDIA Research sites span the globe, with our scientists collaborating closely with local universities. We cover a wide domain of applications, including self-driving cars, robotics, and game and film graphics.

Our innovation on virtual experiences includes technologies that you’ve probably heard a bit about already, such as foveated rendering, varifocal optics, holography, and light fields. This article details our recent work on those, but most importantly reveals our vision for how they’ll work together to transform every interaction with computing and reality.

NVIDIA works hard to ensure that each generation of our GPUs is the best in the world. Our role in the research division is thinking beyond that product cycle of steady evolutionary improvement, in order to look for revolutionary change and new applications. We’re working to take virtual reality from an early adopter concept to a revolution for all of computing.

Research is Vision

Our vision is that VR will be the interface to all computing. It will replace cell phone displays, computer monitors and keyboards, televisions and remotes, and automobile dashboards. To keep terminology simple, we use VR as shorthand for supporting all virtual experiences, whether or not you can also see the real world through the display.

We’re targeting the interface to all computing because our mission at NVIDIA is to create transformative technology. Technology is truly transformative only when it is in everyday use. It has to become a seamless and mostly transparent part of our lives to have real impact. The most important technologies are the ones we take for granted.

If we’re thinking about all computing and pervasive interfaces, what about VR for games? Today, games are an important VR application for early adopter power users. We already support them through products and are releasing new VR features with each GPU architecture. NVIDIA obviously values games highly and is ensuring that they will be fantastic in VR. However, the true potential of VR technology goes far beyond games, because games are only one part of computing. So, we started with VR games but that technology is now spreading with the scope of VR to work, social, fitness, healthcare, travel, science, education, and all other tasks for which computing now plays a role.

NVIDIA is in a unique position to contribute to the VR revolution. We’ve already transformed consumer computing once before, having introduced the modern GPU in 1999, and with it high-performance computing for consumer applications. Today, not only your computer, but also your tablet, smartphone, automobile, and television have GPUs in them. They provide a level of performance that once would have been considered a supercomputer only available to power users. As a result, we all enjoy a new level of productivity, convenience, and entertainment. Now we’re all power users, thanks to invisible and pervasive GPUs in our devices.

For VR to become a seamless part of our lives, the VR systems must become more comfortable, easy to use, affordable, and powerful. We’re inventing new headset technology that will replace modern VR’s bulky headsets with thin glasses driven by lasers and holograms. They’ll be as widespread as tablets, phones, and laptops, and even easier to operate. They’ll switch between AR/VR/MR modes instantly. And they’ll be powered by new GPUs and graphics software that will be almost unrecognizably different from today’s technology.

All of this innovation points to a new way of interacting with computers, and this will require not just new devices or software but an entirely new system for VR. At NVIDIA, we’re inventing that system with cutting-edge tools, sensors, physics, AI, processors, algorithms, data structures, and displays.

Understanding the Pipeline

NVIDIA Research is very open about what we’re working on, sharing our results through scientific publications and open source code. In Part 2 of this article, I’m going to present a technical overview of some of our recent inventions. But first, to put them and our vision for future AR/VR systems in context, let’s examine how current film, game, and modern VR systems work.

Film Graphics Systems

Hollywood-blockbuster action films contain a mixture of footage of real objects and computer generated imagery (CGI) to create amazing visual effects. The CGI is so good now that Hollywood can make scenes that are entirely computer generated. During the beautifully choreographed introduction to Marvel’s Deadpool (2016), every object in the scene is rendered by a computer instead of filmed. Not just the explosions and bullets, but the buildings, vehicles, and people.

From a technical perspective, the film system for creating these images with high visual fidelity can be described by the following diagram:

The diagram has many parts, from the authoring stages on the left, through the modeling primitives of particles, triangles, and curved subdivision surfaces, to the renderer. The renderer uses an algorithm called ‘path tracing’ that photo-realistically simulates light in the virtual scene.

The rendering is also followed by manual post-processing of the 2D images for color and compositing. The whole process loops, as directors, editors, and artists iterate to modify the content based on visual feedback before it is shown to audiences. The image quality of film is our goal for VR realism.

Game Systems

The film graphics system evolved into a similar system for 3D games. Games represent our target for VR interaction speed and flexibility, even for non-entertainment applications. The game graphics system looks like this diagram:

I’m specifically showing a deferred shading pipeline here. That’s what most PC games use because it delivers the highest image quality and throughput.

Like film, it begins with the authoring process and has the big art direction loop. Games add a crucial interaction loop for the player. When the player sees something on-screen, they react with a button press. That input then feeds into a later frame in the pipeline of graphics processing. This process introduces ‘latency’, which is the time it takes to update frames with new user input taken into account. For an action title to feel responsive, latency needs to be under 150ms in a traditional video game, so keeping it reasonably low is a challenge.

Unfortunately, there are many factors that can increase latency. For instance, games use a ‘rasterization’-based rendering algorithm instead of path tracing. The deferred-shading rasterization pipeline has a lot of stages, and each stage adds some latency. As with film, games also have a large 2D post-processing component, which is labelled ‘PostFX’ in the multi-stage pipeline referenced above. Like an assembly line, that long pipeline increases throughput and allows smooth framerates and high resolutions, but the increased complexity adds latency.

If you only look at the output, pixels are coming out of the assembly line quickly, which is why PC games have high frame rates. The catch is that the pixels spend a long time in the pipeline because it has so many stages. The red vertical lines in the diagram represent barrier synchronization points. They amplify the latency of the stages because at a barrier, the first pixel of the next stage can’t be processed until the last pixel of the previous stage is complete.
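The assembly-line trade-off can be captured in a toy model. It assumes every stage works on a different frame at the same time and that a barrier forces each stage to finish completely before the next one starts; the stage timings are invented for illustration.

```python
def pipeline_stats(stage_times_ms):
    """Return (latency_ms, frames_per_second) for a pipelined chain of stages.

    Simplified model: with a barrier between stages, one frame must pass
    through every stage in turn, so latency is the sum of the stage times.
    A new frame can start as soon as the slowest stage is free, so the
    frame rate is set by the longest single stage.
    """
    latency_ms = sum(stage_times_ms)
    frames_per_second = 1000.0 / max(stage_times_ms)
    return latency_ms, frames_per_second


# Hypothetical timings: the same 40 ms of work as one stage or as four stages.
print(pipeline_stats([40]))              # (40, 25.0)
print(pipeline_stats([10, 10, 10, 10]))  # (40, 100.0)
```

Splitting the work into four stages quadruples the frame rate without shortening the time any one frame spends in flight, which is the catch described above.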

The game pipeline can deliver amazing visual experiences. With careful art direction, they approach film CGI or even live-action film quality on a top of the line GPU. For example, look at the video game Star Wars: Battlefront II (2017).

Still, the best frames from a Star Wars video game will be much more static than those from a Star Wars movie. That’s because game visual effects must be tuned for performance. This means that the lighting and geometry can’t change in the epic ways we see on the big screen. You’re probably familiar with relatively static gameplay environments that only switch to big set-piece explosions during cut scenes.

Modern Virtual Reality Systems

Now let’s see how film and games differ from modern VR. When developers migrate their game engines to VR, the first challenge they hit is the specification increase. There’s a jump in raw graphics power from 60 million pixels per second (MPix/s) in a game to 450 MPix/s for VR. And that’s just the beginning… these demands will quadruple in the next year.

450 MPix/s on an Oculus Rift or HTC Vive today is roughly a seven-fold increase in the number of pixels per second compared to 1080p gaming at 30 FPS. This is a throughput increase because it changes the rate at which pixels move through the graphics system. That’s big, but the performance challenge is even greater. Recall how game interaction latency was around 100-150ms between a player input and pixels changing on the screen for a traditional game. For VR, we need not only a seven-fold throughput increase, but also a seven-fold reduction in latency at the same time. How do today’s VR developers accomplish this? Let’s look at latency first.
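The throughput and latency figures above can be checked with a little arithmetic. The sketch below is illustrative only: the 1512 x 1680 per-eye render target at 90 Hz is an assumption (a commonly quoted figure for first-generation headsets, not stated in this article), and the function name is made up for the example.

```python
def pixels_per_second(width, height, fps, views=1):
    """Raw pixel throughput for `views` images of width x height at `fps`."""
    return width * height * fps * views

# Baseline from the article: 1080p gaming at 30 FPS (about 60 MPix/s).
game = pixels_per_second(1920, 1080, 30)

# ASSUMPTION: 1512 x 1680 render target per eye at 90 Hz, two eyes.
# This resolution is not given in the article.
vr = pixels_per_second(1512, 1680, 90, views=2)

# How many times more pixels per second VR needs than the baseline.
ratio = vr / game

# Dividing the article's 100-150ms game latency by seven gives the
# approximate latency budget VR would have to hit.
latency_budget_ms = [t / 7 for t in (100, 150)]

print(f"game: {game / 1e6:.0f} MPix/s, VR: {vr / 1e6:.0f} MPix/s")
print(f"ratio: {ratio:.2f}x")
print(f"latency budget: {latency_budget_ms[0]:.0f}-{latency_budget_ms[1]:.0f} ms")
```

Under these assumptions the ratio comes out a little above seven, and the latency budget lands at roughly 14 to 21 ms.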

In the diagram below, latency is the time it takes data to move from the left to the right side of the system. More stages in the system give better throughput because they can work in parallel, but they also make the pipeline longer, so latency gets worse. To reduce latency, you need to eliminate boxes and red lines.

As you might expect, to reduce latency developers remove as many stages as they can, as shown in the modified diagram above. That means switching back to a ‘forward’ rendering pipeline where everything is done in one 3D pass over the scene instead of multiple 2D shading and PostFX passes. This reduces throughput, which developers then win back by significantly lowering image quality. Unfortunately, it still doesn’t give quite enough latency reduction.
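The trade-off between throughput and latency can be captured in a toy model. This is a minimal sketch, assuming every stage must finish a frame before the next stage starts on it (the barrier behavior described earlier); the stage timings are hypothetical numbers chosen for illustration, not measurements.

```python
def pipeline_stats(stage_ms):
    """Return (latency_ms, frames_per_second) for a chain of pipelined stages.

    Latency: one frame has to pass through every stage in turn, so the
    stage times add up.
    Throughput: the stages work on different frames in parallel, so a new
    frame leaves the pipeline as often as the slowest stage finishes.
    """
    latency_ms = sum(stage_ms)
    fps = 1000.0 / max(stage_ms)
    return latency_ms, fps

# HYPOTHETICAL timings: a long deferred-style pipeline with six short stages.
deferred = pipeline_stats([8, 8, 8, 8, 8, 8])

# HYPOTHETICAL timings: a short forward-style pipeline. Each of its two
# stages does more work, so each one takes longer.
forward = pipeline_stats([11, 11])

print("deferred: %d ms latency, %.0f fps" % deferred)
print("forward:  %d ms latency, %.0f fps" % forward)
```

With these made-up numbers the short pipeline cuts latency from 48 ms to 22 ms, but its frame rate drops from 125 to about 91 frames per second, which mirrors the throughput loss described above.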

The key technology that helped close the latency gap in modern VR is called Time Warp. Under Time Warp, images shown on screen can be updated without a full trip through the graphics pipeline. Instead, the head tracking data are routed to a GPU stage that appears after rendering is complete. Because this stage is ‘closer’ to the display, it can warp the already-rendered image to match the latest head-tracked data, without taking a trip through the entire rendering pipeline. With some predictive techniques, this brings the perceived latency down from about 50ms to zero in the best case.
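To give a feel for the warp step, here is a heavily simplified sketch. It assumes a pinhole camera model and handles only a yaw (left-right) head rotation, reducing the warp to a single horizontal shift at the image center. A real implementation applies a full 3D rotation to every pixel on the GPU; the function names and sign convention here are assumptions made for the example.

```python
import math

def focal_length_px(width_px, hfov_deg):
    """Pinhole focal length in pixels for a given image width and horizontal FOV."""
    return (width_px / 2) / math.tan(math.radians(hfov_deg) / 2)

def timewarp_shift_px(yaw_render_deg, yaw_latest_deg, width_px, hfov_deg):
    """Horizontal shift (pixels, at the image center) that re-aligns an
    already-rendered frame with the latest head yaw.

    ASSUMED convention: positive yaw means the head turned right, so the
    image content has to slide left (a negative shift).
    """
    delta = math.radians(yaw_latest_deg - yaw_render_deg)
    return -focal_length_px(width_px, hfov_deg) * math.tan(delta)

# The frame was rendered for a yaw of 10 degrees, but by the time it is
# about to be displayed the tracker reports 11 degrees.
shift = timewarp_shift_px(10.0, 11.0, width_px=1080, hfov_deg=90)
print(f"shift the rendered image by {shift:.1f} px")
```

The point of the sketch is that the correction needs only the two head poses and the finished image, so it can run at the very end of the pipeline without re-rendering the scene.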

Another key enabling idea for modern VR hardware is Lens Distortion. A good camera’s optics contain at least five high quality glass lenses. Unfortunately, that’s heavy, large, and expensive, and you can’t strap the equivalent of two SLR cameras to your head.

This is why many head-mounted displays use a single inexpensive plastic lens per eye. These lenses are light and small, but low quality. To correct for the distortion and chromatic aberration from a simple lens, shaders pre-distort the images by the opposite amounts.
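Pre-distortion is often described with a radial polynomial, and the sketch below follows that common model. It is an illustration under stated assumptions, not the shader any particular headset ships: the coefficients are hypothetical, and the per-channel differences simply stand in for chromatic aberration correction.

```python
def radial_scale(r2, k1, k2):
    """Radial polynomial 1 + k1*r^2 + k2*r^4, evaluated from r squared."""
    return 1.0 + k1 * r2 + k2 * r2 * r2

def predistort(x, y, k1, k2):
    """Map a lens-centered, normalized output coordinate to the coordinate
    where the rendered image should be sampled.

    Sampling farther from the center as the radius grows squeezes the
    displayed image into a barrel shape, which the pincushion distortion
    of a simple lens then cancels out.
    """
    s = radial_scale(x * x + y * y, k1, k2)
    return x * s, y * s

# HYPOTHETICAL coefficients. Each color channel gets slightly different
# values because a simple lens bends each wavelength by a different amount.
CHANNEL_COEFFS = {"r": (0.20, 0.05), "g": (0.22, 0.05), "b": (0.24, 0.05)}

def sample_coords(x, y):
    """Per-channel sampling coordinates for one output pixel."""
    return {c: predistort(x, y, k1, k2) for c, (k1, k2) in CHANNEL_COEFFS.items()}

print(sample_coords(0.0, 0.0))  # the center is left untouched
print(sample_coords(0.5, 0.0))  # off-center, the channels sample at different radii
```

In practice a fragment shader would run this mapping for every pixel, and the coefficients would come from calibrating the actual lens.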

NVIDIA GPU hardware and our VRWorks software accelerate the modern VR pipeline. The GeForce GTX 1080 and other Pascal architecture GPUs use a new feature called Simultaneous Multi-Projection to render multiple views with increased throughput and reduced latency. This feature provides single-pass stereo so that both eyes render at the same time, along with lens-matched shading, which renders directly into the predistorted image and gives better performance and more sharpness. The GDDR5X memory in the 1080 provides 1.7x the bandwidth of the previous generation, and hardware audio and physics help create a more accurate virtual world to increase immersion.

Reduced pipeline stages, Time Warp, Lens Distortion, and a powerful PC GPU make up the modern VR system.

– – — – –

Now that we’ve established how film, games, and VR graphics work, stay tuned for Part 2 of this article where we’ll explore the limits of human visual perception and methods we’re exploring to get closer to them in VR systems of the future.

The post Exclusive: How NVIDIA Research is Reinventing the Display Pipeline for the Future of VR, Part 1 appeared first on Road to VR.

January 27, 2017

Researchers Demonstrate 100° Dynamic Focus AR Display With Membrane Mirrors

Achieving a wide field of view in an AR headset is a challenge in itself, but so too is fixing the so-called vergence-accommodation conflict which presently plagues most VR and AR headsets, making them less comfortable and less in sync with the way our vision works in the real world. Researchers have set out to try to tackle both issues using varifocal membrane mirrors.

Researchers from UNC, MPI Informatik, NVIDIA, and MMCI have demonstrated a novel see-through near-eye display aimed at augmented reality. It uses membrane mirrors to achieve varifocal optics while maintaining a wide 100 degree field of view.

Vergence-Accommodation Conflict

Accommodation is the bending of the eye’s lens to focus light from objects at different depths. | Photo courtesy Pearson Scott Foresman

In the real world, to focus on a near object, the lens of your eye bends to focus the light from that object onto your retina, giving you a sharp view of the object. For an object that’s further away, the light is traveling at different angles into your eye and the lens again must bend to ensure the light is focused onto your retina. This is why, if you close one eye and focus on your finger a few inches from your face, the world behind your finger is blurry. Conversely, if you focus on the world behind your finger, your finger becomes blurry. This is called accommodation.

Vergence is the rotation of each eye to overlap each individual view into one aligned image. | Photo courtesy Fred Hsu (CC BY-SA 3.0)

Then there’s vergence, which is when each of your eyes rotates inward to ‘converge’ the separate views from each eye into one overlapping image. For very distant objects, your eyes are nearly parallel, because the distance between them is so small in comparison to the distance of the object (meaning each eye sees a nearly identical portion of the object). For very near objects, your eyes must rotate sharply inward to converge the image. You can see this too with the finger trick from above; this time, using both eyes, hold your finger a few inches from your face and look at it. Notice that you see double images of objects far behind your finger. When you then look at those objects behind your finger, you see a double image of your finger instead.

With precise enough instruments, you could use either vergence or accommodation to know exactly how far away the object a person is looking at is (remember this, it’ll be important later). But the thing is, both accommodation and vergence happen together, automatically. And they don’t just happen at the same time; there’s a direct correlation between vergence and accommodation, such that for any given measurement of vergence, there’s a directly corresponding level of accommodation (and vice versa). Since you were a little baby, your brain and eyes have formed muscle memory to make these two things happen together, without thinking, any time you look at anything.

But when it comes to most of today’s AR and VR headsets, vergence and accommodation are out of sync due to inherent limitations of the optical design.

In a basic AR or VR headset, there’s a display (which is, let’s say, 3″ away from your eye) which makes up the virtual image, and a lens which focuses the light from the display onto your eye (just like the lens in your eye would normally focus the light from the world onto your retina). But since the display is a static distance from your eye, the light coming from all objects shown on that display is coming from the same distance. So even if there’s a virtual mountain five miles away and a coffee cup on a table five inches away, the light from both objects enters the eye at the same angle (which means your accommodation, the bending of the lens in your eye, never changes).

That comes in conflict with vergence, which in such headsets is variable because we can show a different image to each eye. Being able to adjust the image independently for each eye, such that our eyes need to converge on objects at different depths, is essentially what gives today’s AR and VR headsets stereoscopy. But the most realistic (and arguably, most comfortable) display we could create would eliminate the vergence-accommodation issue and let the two work in sync, just like we’re used to in the real world.

Eliminating the Conflict

To make that happen, there needs to be a way to adjust the focal power of the lens in the headset. With traditional glass or plastic optics, the focal power is static and determined by the curvature of the lens. But if you could adjust the curvature of a lens on-demand, you could change the focal power whenever you wanted. That’s where membrane mirrors and eye-tracking come in.

In a soon-to-be-published paper titled Wide Field Of View Varifocal Near-Eye Display Using See-Through Deformable Membrane Mirrors, researchers demonstrated how they could use mirrors made of deformable membranes inside of vacuum chambers to create a pair of varifocal see-through lenses, forming the foundation of an AR display.

The mirrors are able to set the accommodation depth of virtual objects anywhere from 20cm to (optical) infinity. The response time of the lenses between that minimum and maximum focal power is 300ms, according to the paper, with transitions between smaller focal powers happening faster.

But how to know how far to set the accommodation depth so that it’s perfectly in sync with the convergence depth? Thanks to integrated eye-tracking technology, the apparatus is able to rapidly measure the convergence of the user’s eyes, the angle of which can easily be used to determine the depth of anything the user is looking at. With that data in hand, setting the accommodation depth to match is as easy as adjusting the focal power of the lens.
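The geometry behind that step is simple enough to sketch. The example below is not the researchers’ code: it assumes the user looks straight ahead with both eyes converging symmetrically, uses a hypothetical interpupillary distance of 63mm, and clamps the result to the 20cm near limit mentioned above. Focal power is expressed in diopters, the reciprocal of distance in meters.

```python
import math

def fixation_depth_m(vergence_deg, ipd_m=0.063):
    """Distance to the fixation point from the angle between the two lines of sight.

    ASSUMPTIONS: symmetric convergence straight ahead; ipd_m (the distance
    between the pupils) is a hypothetical typical value.
    """
    half_angle = math.radians(vergence_deg) / 2
    if half_angle <= 0:
        return math.inf  # parallel eyes: looking at optical infinity
    return (ipd_m / 2) / math.tan(half_angle)

def focal_power_diopters(depth_m, nearest_m=0.20):
    """Accommodation demand in diopters (1/meters), clamped to the display's
    near limit. Optical infinity corresponds to 0 diopters."""
    if math.isinf(depth_m):
        return 0.0
    return min(1.0 / depth_m, 1.0 / nearest_m)

for angle in (0.0, 1.0, 3.6, 7.2, 18.0):
    depth = fixation_depth_m(angle)
    print(f"vergence {angle:5.1f} deg -> depth {depth:6.2f} m "
          f"-> {focal_power_diopters(depth):.2f} D")
```

Note how nonlinear the mapping is: a few degrees of vergence cover everything from arm’s length to infinity, which is why the eye tracking has to be precise.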

Those of you following along closely will probably see a potential limitation to this approach—the accommodation depth can only be set for one virtual object at a time. The researchers thought about this too, and proposed a solution to be tested at a later date:

Our display is capable of displaying only a single depth at a time, which leads to incorrect views for virtual content [spanning] different depths. A simple solution to this would be to apply a defocus kernel approximating the eye’s point spread function to the virtual image according to the depth of the virtual objects. Due to the potential of rendered blur not being equivalent to optical blur, we have not implemented this solution. Future work must evaluate the effectiveness of using rendered blur in place of optical blur.
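To get a rough sense of how large such a rendered blur would need to be, a common small-angle approximation says the angular diameter of the blur circle is about the pupil diameter multiplied by the defocus in diopters. The sketch below applies that approximation; it is not something the researchers implemented, and the 4mm pupil diameter is an assumption.

```python
import math

def blur_diameter_deg(focus_m, object_m, pupil_m=0.004):
    """Approximate angular diameter (degrees) of the blur circle for an
    object at object_m while the eye is focused at focus_m.

    Small-angle approximation: blur angle in radians is roughly the pupil
    diameter (meters) times the defocus (diopters).
    ASSUMPTION: pupil_m of 4mm is a hypothetical typical value.
    """
    defocus_diopters = abs(1.0 / focus_m - 1.0 / object_m)
    return math.degrees(pupil_m * defocus_diopters)

# The eye is focused on a virtual object 25cm away (4 diopters).
for distance in (0.25, 0.5, 2.0, math.inf):
    blur = blur_diameter_deg(0.25, distance)
    print(f"object at {distance} m -> blur about {blur:.2f} deg")
```

A renderer would convert that angle to pixels using the display’s pixels per degree and use it to size the defocus kernel for each virtual object.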

Other limitations of the system (and possible solutions) are detailed in section 6 of the paper, including varifocal response time, form-factor, latency, consistency of focal profiles, and more.

Retaining a Wide Field of View & High Resolution

But this isn’t the first time someone has demonstrated a varifocal display system. The researchers identified several other varifocal display approaches, including free-form optics, light field displays, pinlight displays, pinhole displays, multi-focal plane displays, and more. But, according to the paper’s authors, all of these approaches make significant tradeoffs in other important areas like field of view and resolution.

And that’s what makes this novel membrane mirror approach so interesting: it not only tackles the vergence-accommodation conflict, but does so in a way that allows a wide 100 degree field of view and retains a relatively high resolution, according to the authors. You’ll notice in the chart above that, of the different varifocal approaches the researchers identified, every large-FOV approach results in a low angular resolution (and vice versa), except for their solution.

– – — – –

This technology is obviously at a very preliminary stage, but its use as a solution for several key challenges facing AR and VR headset designs has been effectively demonstrated. And with that, I’ll leave the parting thoughts to the paper’s authors (D. Dunn, C. Tippets, K. Torell, P. Kellnhofer, K. Akşit, P. Didyk, K. Myszkowski, D. Luebke, and H. Fuchs):

Despite few limitations of our system, we believe that providing correct focus cues as well as wide field of view are most crucial features of head-mounted displays that try to provide seamless integration of the virtual and the real world. Our screen not only provides basis for new, improved designs, but it can be directly used in perceptual experiments that aim at determining requirements for future systems. We, therefore, argue that our work will significantly facilitate the development of augmented reality technology and contribute to our understanding of how it influences user experience.

The post Researchers Demonstrate 100° Dynamic Focus AR Display With Membrane Mirrors appeared first on Road to VR.