‘Neural Supersampling’ Could Give Future Oculus Quests Console-Quality Graphics

A new neural network developed by Facebook’s VR/AR research division could enable console-quality graphics on future standalone headsets.

This ‘Neural Supersampling’ algorithm can take a low-resolution rendered frame and upscale it 16x in pixel count (4x along each axis). That means, for example, a future headset could theoretically drive dual 2K panels while rendering only 540×540 per eye, without even requiring eye tracking.
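
For a quick sanity check of that pixel math (using the article’s own figures; the exact panel resolution of a future headset is purely illustrative):

```python
# Sanity check of the pixel math above; 2160x2160 per eye is an illustrative "2K-class" target.
low_res = (540, 540)                      # rendered resolution per eye
scale = 4                                 # 4x along each axis
high_res = (low_res[0] * scale, low_res[1] * scale)

pixel_ratio = (high_res[0] * high_res[1]) / (low_res[0] * low_res[1])
print(high_res, pixel_ratio)              # (2160, 2160) 16.0
```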

Being able to render at a lower resolution means more GPU power is free to run detailed shaders and advanced effects, which could help close the gap between mobile and console VR. To be clear, this can’t turn a mobile chip into a PlayStation 5, but it should narrow the difference somewhat.

“AI upscaling” algorithms have become popular in the last few years, with some websites even letting users upload any image from their PC or phone to be upscaled. Given enough training data, they can produce significantly more detailed output than traditional upscaling. While just a few years ago “Zoom and Enhance” was a meme used to mock those who falsely believed computers could do this, machine learning has made it a reality. The algorithm is technically only “hallucinating” what it expects the missing detail should look like, but in many cases there is little practical difference.

Facebook claims its neural network is state of the art, outperforming all other similar algorithms, which is how it’s able to achieve 16x upscaling. What makes this possible is knowledge of the depth of each object in the scene; the approach would not be anywhere near as effective on flat images.
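
To make the idea concrete, here is a minimal, hypothetical PyTorch sketch of a super-resolution network that takes the renderer’s depth buffer as an extra input channel alongside color. This is not Facebook’s architecture (the paper’s network is far more sophisticated); the layer sizes, class name, and everything else here are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class DepthAwareUpscaler(nn.Module):
    """Toy 4x super-resolution network: RGB + depth in, 4x RGB out.
    Illustrative only; not the architecture from Facebook's paper."""
    def __init__(self, scale=4, features=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(4, features, 3, padding=1),   # 3 color channels + 1 depth channel
            nn.ReLU(inplace=True),
            nn.Conv2d(features, features, 3, padding=1),
            nn.ReLU(inplace=True),
            # Sub-pixel convolution: predict scale^2 * 3 channels, then rearrange into pixels.
            nn.Conv2d(features, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, color, depth):
        x = torch.cat([color, depth], dim=1)        # fuse color and depth per pixel
        return self.body(x)

# Example: a 540x540 low-resolution frame upscaled to 2160x2160.
net = DepthAwareUpscaler()
color = torch.rand(1, 3, 540, 540)
depth = torch.rand(1, 1, 540, 540)
print(net(color, depth).shape)  # torch.Size([1, 3, 2160, 2160])
```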

In the provided example images, Facebook’s algorithm seems to have reached the point where it can reconstruct even fine details like line or mesh patterns.

Back in March, Facebook published a somewhat similar paper. It also described the idea of freeing up GPU power through neural upsampling, but that wasn’t actually the paper’s focus. The researchers’ direct goal was to figure out a “framework” for running machine learning algorithms in real time within the current rendering pipeline (with low latency), which they achieved. Combining that framework with this neural network could make the technology practical.

“As AR/VR displays reach toward higher resolutions, faster frame rates, and enhanced photorealism, neural supersampling methods may be key for reproducing sharp details by inferring them from scene data, rather than directly rendering them. This work points toward a future for high-resolution VR that isn’t just about the displays, but also the algorithms required to practically drive them,” a blog post by Lei Xiao explains.

For now, this is all just research, and you can read the paper here. What’s stopping this from arriving as a software update for your Oculus Quest tomorrow? The neural network itself takes time to run: the current version manages 40 frames per second at Quest resolution on the $3000 NVIDIA Titan V.
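
For a rough sense of the gap that leaves, here is my own back-of-the-envelope arithmetic, assuming the Quest’s standard 72Hz display refresh:

```python
# Back-of-the-envelope frame budgets; assumes Quest's 72Hz display refresh.
titan_v_fps = 40
network_frame_ms = 1000 / titan_v_fps   # 25 ms just for the upscaling network
quest_budget_ms = 1000 / 72             # ~13.9 ms total per frame at 72Hz

print(f"Network alone: {network_frame_ms:.1f} ms vs a {quest_budget_ms:.1f} ms frame budget")
```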

But in machine learning, optimization tends to come after the research, and it often happens to an extreme degree. Just three years ago, the algorithm Google Assistant uses to speak realistically also required a $3000 GPU. Today it runs locally on several smartphones.

The researchers “believe the method can be significantly faster with further network optimization, hardware acceleration and professional grade engineering”. Hardware acceleration for machine learning tasks is already available on Snapdragon chips; Qualcomm claims its XR2 has 11x the ML performance of the chip in Quest.

If optimization and models built for mobile system-on-a-chip neural processing units don’t work out, another approach is a custom chip designed for the task. Google took this approach with the $50 Nest Mini speaker (the cheapest device with local Google Assistant). Facebook is reportedly working with Samsung on custom chips for AR glasses, but there’s no indication of the same happening for VR, at least not yet.

Facebook classes this kind of approach as ‘neural rendering’. Just as neural photography helped bridge the gap between smartphone and digital single-lens reflex cameras, Facebook hopes neural rendering can one day push a little more power out of mobile chips than anyone might have expected.


Snap’s Lens Studio Now Supports Custom ML-powered Snapchat Lenses

Over the past few years, Snapchat’s growing collection of Lenses has offered some of the best examples of smartphone-powered augmented reality, enabling users to effortlessly add facial modifications, environmental effects, and location-specific filters to their photos. Now parent company Snap is enabling creators to use self-provided machine learning models in Lenses, hoping the initiative will inspire partnerships between ML developers and creatives.

The latest key change is an update to Lens Studio, the free desktop development app used to create most of Snapchat’s AR filters. A new feature called SnapML — unrelated to IBM’s same-named training tool — will let developers import machine learning models to power lenses, expanding the range of real world objects and body parts Snapchat will be able to instantly identify. As an example of the technology, Lens Studio will include a new foot tracking ML model developed by Wannaby, enabling developers to craft Lenses for feet.

Beyond Wannaby, developers including CV2020, visual filter maker Prisma, and several unnamed Lens creators are also working on SnapML-based filters. Lens Studio has also added new hand gesture templates, as well as Face Landmarks and Face Expressions features that should improve facial tracking in specific situations. Additionally, the user-facing Snapchat app will be expanding its Scan feature with the ability to recognize 90% of all known plants and trees, nearly 400 breeds of dogs, packaged food labels, and Louis Vuitton’s logo, plus SoundHound integration to let users find pertinent Lenses using only voice commands.

Snap is also previewing a new feature, Local Lenses, which will “soon” let users share persistent augmented reality content within neighborhoods. Local Lenses promises to create large-scale point clouds to recognize multiple buildings within an area — an expansion of the company’s prior Landmarking feature — to map entire city blocks, the same vision pursued by companies such as Immersal and Scape (now owned by Facebook). Unlike rivals, which have focused primarily on mapping and marketing applications, Snap plans to let users change the look of neighborhoods with digital content.

While the technology behind the feature is fascinating, the way Snap is promoting it today feels somewhat awkward given current social unrest over the Black Lives Matter movement. The company says Snapchat users will be able to “decorate nearby buildings with colorful paint” that will be visible to friends. Though its sample video is far more like Nintendo’s Splatoon than Sega’s Jet Set Radio, using large splashes of color rather than written words, we’ll have to see whether Local Lenses are used solely for positive purposes, or are steps towards AR graffiti similar to what’s currently appearing in real cities as protests continue.

This post by Jeremy Horwitz originally appeared on VentureBeat.


Facebook Researchers Found A Way To Essentially Give Oculus Quest More GPU Power

Facebook researchers seem to have figured out a way to use machine learning to essentially give Oculus Quest developers 67% more GPU power to work with.

The Oculus Quest is a standalone headset, which means the computing hardware is inside the device itself. Because of the size and power constraints this introduces, as well as the desire to sell the device at a relatively affordable price, Quest uses a smartphone chip significantly less powerful than a gaming PC.

“Creating next-gen VR and AR experiences will require finding new, more efficient ways to render high-quality, low-latency graphics.”

Facebook AI Research

The new technique works by rendering at a lower resolution than usual, then upscaling the center of the view using a machine learning “super resolution” algorithm. These algorithms have become popular in the last few years, with some websites even letting users upload any image from their PC or phone to be AI upscaled.

Given enough training data, super resolution algorithms can produce a significantly more detailed output than traditional upscaling. While just a few years ago “Zoom and Enhance” was a meme used to mock those who falsely believed computers could do this, machine learning has made this idea a reality. Of course, the algorithm is technically only “hallucinating” what it expects the missing detail might look like, but in many cases there is no practical difference.
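
As a rough illustration of that idea (not Facebook’s pipeline), the sketch below upscales only a centered crop of a low-resolution frame with an “expensive” upscaler and pastes it over a cheaply upscaled periphery. The crop fraction and the bilinear stand-in for the neural network are assumptions purely for illustration.

```python
import torch
import torch.nn.functional as F

def upscale_center(low_res_frame, out_size, center_frac=0.5, network=None):
    """Upscale a low-res frame to out_size, spending the 'expensive' upscaler
    only on a centered region. Illustrative sketch, not Facebook's method."""
    # Cheap upscale for the whole frame (periphery).
    full = F.interpolate(low_res_frame, size=out_size, mode="nearest")

    # "Expensive" upscale (here just bilinear as a stand-in for a neural network)
    # applied to the centered crop of the low-res frame.
    _, _, h, w = low_res_frame.shape
    ch, cw = int(h * center_frac), int(w * center_frac)
    y0, x0 = (h - ch) // 2, (w - cw) // 2
    crop = low_res_frame[:, :, y0:y0 + ch, x0:x0 + cw]

    scale = out_size[0] // h
    upscaler = network or (lambda t: F.interpolate(t, scale_factor=scale, mode="bilinear",
                                                   align_corners=False))
    center_hr = upscaler(crop)

    # Paste the high-quality center back into the cheaply upscaled frame.
    H, W = out_size
    CH, CW = center_hr.shape[-2:]
    Y0, X0 = (H - CH) // 2, (W - CW) // 2
    full[:, :, Y0:Y0 + CH, X0:X0 + CW] = center_hr
    return full

frame = torch.rand(1, 3, 512, 512)                   # half-resolution render
print(upscale_center(frame, (1024, 1024)).shape)     # torch.Size([1, 3, 1024, 1024])
```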

One of the paper’s authors is Behnam Bastani, Facebook’s Head of Graphics in the Core AR/VR Technologies department. Between 2013 and 2017, Bastani worked for Google, developing “advanced display systems” and then leading development of Daydream’s rendering pipeline.

It’s interesting to note that the paper is not actually primarily about either the super resolution algorithm or the GPU resources it frees up. The researchers’ direct goal was to figure out a “framework” for running machine learning algorithms in real time within the current rendering pipeline (with low latency), which they achieved. Super resolution upscaling is essentially just the first example of what this enables.

Because the framework is the focus of the paper, there isn’t much detail on the exact size of the upscaled region or how perceptible the upscaling is, other than a mention of “temporally coherent and visually pleasing results in VR”.

The researchers claim that when rendering at 70% lower resolution in each direction, the technique can save roughly 40% of GPU time, and developers can “use those resources to generate better content”.

For applications like a media viewer, the saved GPU power could simply be left unused to extend battery life, since on Snapdragon chips (and most others) the DSP (used for machine learning tasks like this) is significantly more power efficient than the GPU.

A demo video was produced using Beat Saber, in which the left image “was generated using a fast super-resolution network applied to 2x low resolution content” and the right image is regular full resolution rendering.

Apparently, using super resolution to save GPU power is just one potential application of this rendering pipeline framework:

“Besides super-resolution application, the framework can also be used to perform compression artifact removal for streaming content, frame prediction, feature analysis and feedback for guided foveated rendering. We believe enabling computational methods and machine learning in mobile graphics pipeline will open the door for a lot of opportunities towards the next generation of mobile graphics.”

Facebook AI Research

There is no indication from this paper that the technology is planned for deployment in the consumer Oculus Quest, although it doesn’t give any reason why it couldn’t be either. There could be technical barriers that aren’t stated here, or it may simply not be considered worth the complexity until a next-generation headset. We’ve reached out to Facebook for answers on this. Regardless, it seems clear that machine learning may play a role in bringing standalone VR closer to PC VR over the next decade.


Snapchat Adds Lava And Water AR Lenses Using Ground Segmentation And Machine Learning

Although it’s still best known as a social media network, Snapchat has rapidly become a leader in real-time augmented reality effects, thanks to Lenses that alter the look of people and landmarks. This week the app is adding two ground replacement Lenses to the mix, enabling users to swap solid pavement, carpeting, or other terrain for bubbling lava or reflective water through a mix of segmentation technology and machine learning.

Both of the new Lenses work the same way, quickly determining which part of the camera’s live video feed is “ground” and swapping it for your preferred form of liquid — plus a little yellow caution sign. The lava version is arguably more convincing, as it uses heat haze, smoke, and particles to mask the edges of the areas it’s replacing, forming little islands of land next to the moving liquid. Snapchat’s water Lens broadly swaps almost all of the ground in front of you for reflective water, which alternates between somewhat believable and obviously artificial, depending on how well the segmentation works.

The real-time ground segmentation system uses machine learning models to understand geometry and semantics, isolating obviously ground-based objects from contrasting backgrounds. While the system does a good job outdoors, lower contrast or more blended indoor environments can lead to segmentation hiccups — generally an over-application of the effect to unwanted areas — suggesting that the machine has some more learning to do. Snapchat says the new Lenses were built using an internal version of Lens Studio and it’s considering bringing the technology to a public version in the future.
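
Once a segmentation model has labeled each pixel, the compositing step itself is conceptually simple. The sketch below is my own generic illustration (not Snap’s implementation), blending a liquid texture into the pixels a model marks as ground; the mask and textures here are made-up stand-ins.

```python
import numpy as np

def replace_ground(frame, ground_mask, liquid_texture):
    """Composite a liquid texture over pixels labeled as ground.

    frame:          HxWx3 uint8 camera image
    ground_mask:    HxW float array in [0, 1] from some segmentation model
                    (1.0 = confidently ground); the model itself is assumed
    liquid_texture: HxWx3 uint8 image of lava/water, animated elsewhere
    """
    alpha = np.clip(ground_mask, 0.0, 1.0)[..., None]   # HxWx1 blend weights
    out = frame.astype(np.float32) * (1 - alpha) + liquid_texture.astype(np.float32) * alpha
    return out.astype(np.uint8)

# Toy usage with random data standing in for a real camera frame and model output.
h, w = 480, 640
frame = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)
mask = np.zeros((h, w), dtype=np.float32)
mask[h // 2:, :] = 1.0                   # pretend the lower half of the frame is "ground"
lava = np.zeros((h, w, 3), dtype=np.uint8)
lava[..., 0], lava[..., 1] = 255, 80     # flat orange stand-in for an animated lava texture
composited = replace_ground(frame, mask, lava)
print(composited.shape, composited.dtype)
```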

Facebook and Google have both open-sourced image segmentation tools that use computer vision to classify whole objects and individual pixels, though in Google’s case the software’s intended to run on dedicated cloud-based hardware with tensor processing units. Snapchat’s recent Lens innovations have been particularly impressive because they run in real time on common mobile devices, enabling at least semi-plausible time-warping of faces and other live augmentations of reality. The company has a deal with Verizon to use 5G for advanced AR features using high-bandwidth, low-latency connections with access to edge processing resources.

This post by Jeremy Horwitz originally appeared on VentureBeat.


Oculus Quest & Rift S Controller Tracking Patched To Work Near Christmas Trees

The controller tracking of the Oculus Quest and Rift S needed to be patched to work properly near Christmas trees and other holiday lights.

Oculus Touch controllers are built with a constellation of infrared LEDs under the plastic of the tracking ring. These lights are tracked by the cameras on the headset in order to determine the position of the controller.

Holiday lights like those on Christmas trees can look a lot like these LEDs to the cameras. That means the algorithm has more light sources in each frame to analyze, and sometimes it can’t tell the difference between the controller LEDs and the irrelevant lights at all. This can make controller tracking misbehave, reporting the wrong position for the controller.

The solution works because the headset tracking algorithm already remembers static landmarks seen by the cameras in the room; that’s how it works without external sensors. By keeping track of these landmarks, the system can reject blobs of light that stay in the same position and don’t move.

This process on its own, however, is not enough to eliminate all the issues. So Facebook also trained a neural network to detect and filter out blobs of light that are too small or too large to be a controller LED given its last known position.
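
Here is a simplified sketch of the two filtering ideas described above: rejecting light blobs that never move, and rejecting blobs whose apparent size is implausible given the controller’s last known position. It is a heuristic illustration only, with a simple size threshold standing in for Facebook’s trained neural network, and is not Oculus Insight’s actual code.

```python
from dataclasses import dataclass, field

@dataclass
class Blob:
    x: float        # image-space position of the detected light blob
    y: float
    radius: float   # apparent size in pixels

@dataclass
class BlobFilter:
    """Illustrative heuristic filter, not Oculus Insight's implementation."""
    static_landmarks: list = field(default_factory=list)  # positions known not to move
    position_tolerance: float = 2.0                       # pixels

    def is_static_landmark(self, blob: Blob) -> bool:
        # Reject blobs that coincide with remembered static landmarks (e.g. Christmas lights).
        return any(abs(blob.x - lx) < self.position_tolerance and
                   abs(blob.y - ly) < self.position_tolerance
                   for lx, ly in self.static_landmarks)

    def plausible_size(self, blob: Blob, expected_radius: float) -> bool:
        # Stand-in for the neural network: reject blobs far smaller or larger
        # than an LED should appear given the controller's last known position.
        return 0.5 * expected_radius <= blob.radius <= 2.0 * expected_radius

    def candidate_led_blobs(self, blobs, expected_radius):
        return [b for b in blobs
                if not self.is_static_landmark(b) and self.plausible_size(b, expected_radius)]

# Toy usage: one static holiday light, one oversized reflection, one plausible LED.
f = BlobFilter(static_landmarks=[(100.0, 40.0)])
blobs = [Blob(100.2, 40.1, 3.0), Blob(250.0, 200.0, 20.0), Blob(180.0, 120.0, 4.0)]
print(f.candidate_led_blobs(blobs, expected_radius=4.0))  # only the last blob survives
```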

You can read a full technical explanation of the solution in Facebook’s blog post.

This isn’t the first time Facebook has improved the controller tracking on Quest and Rift S. Both headsets use the Oculus Insight tracking system and launched on the same day. At launch, the controllers wouldn’t track when brought too close to the headset, and tracking could break when one controller was placed in front of the other. This made games like shooters difficult to play until a patch released one month after launch fixed these issues.


Facebook’s AI Research Chief Talks AR Glasses, AI, And Machine Learning

Facebook AI Research chief AI scientist Yann LeCun believes augmented reality glasses are an ideal challenge for machine learning (ML) practitioners — a “killer app” — because they involve a confluence of unsolved problems.

Perfect AR glasses will require the combination of conversational AI, computer vision, and other complex systems capable of operating within a form factor as small as a pair of spectacles. Low-power AI will be necessary to ensure reasonable battery life so users can wear and use the glasses for long periods of time.

Alongside companies like Apple, Niantic, and Qualcomm, Facebook this fall confirmed plans to make augmented reality glasses by 2025.

“This is a huge challenge for hardware because you might have glasses with cameras that track your vision in real time at variable latency, so when you move … that requires quite a bit of computation. You want to be able to interact with an assistant through voice by talking to it so it listens to you all the time, and it will talk to you as well. You want to have gesture [recognition] so the assistant [can perform] real-time hand tracking,” he said.

Real-time hand tracking works already, LeCun said, but “we just don’t know how to do it in a tiny form factor with power consumption that will be compatible with AR glasses.”

“In terms of the bigger variant[s] of power and power consumption, and performance and form factor, it’s really beyond what we can do today, so you have to use tricks that people never thought were appropriate. One trick for example is neural nets,” he added.


Becoming more efficient

LeCun spoke last Friday at the EMC2 energy-efficient machine learning workshop at NeurIPS, the largest machine learning research conference in the world. He talked about how hardware limitations can restrict what researchers allow themselves to imagine is possible and said that good ideas are sometimes abandoned when hardware is too slow, software isn’t readily available, or experiments are not easily reproducible.

He also talked about specific deep learning approaches, like differentiable associative memory and convolutional neural networks, that pose a challenge and may require new hardware. Differentiable associative memory, or soft RAM, is a kind of computation that’s currently widely used in natural language processing (NLP) and is beginning to appear more often in computer vision applications.
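
“Soft RAM” refers to differentiable lookup: instead of reading a single memory slot by address, the network reads a softmax-weighted mixture of every slot, which is essentially the scaled dot-product attention used in Transformer networks. A minimal sketch of the idea:

```python
import torch
import torch.nn.functional as F

def soft_memory_read(query, keys, values):
    """Differentiable 'soft RAM' read: a softmax-weighted mixture of memory slots.
    query: (d,)  keys: (n, d)  values: (n, v)"""
    scores = keys @ query / keys.shape[-1] ** 0.5   # similarity between query and each address
    weights = F.softmax(scores, dim=0)              # soft, differentiable addressing
    return weights @ values                         # weighted read over all slots

memory_keys = torch.randn(8, 16)     # 8 memory slots with 16-dim addresses
memory_values = torch.randn(8, 32)   # 32-dim content per slot
query = torch.randn(16)
print(soft_memory_read(query, memory_keys, memory_values).shape)  # torch.Size([32])
```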

“Deep learning and machine learning architectures are going to change a lot in the next few years. You can see a lot of this already, where now with NLP, the only game in town basically is Transformer networks,” he said.

He added that more efficient batch processing, along with self-supervised learning techniques that help AI learn more the way humans and animals do, may also lead to more energy-efficient AI.

Following LeCun’s talk, Vivienne Sze, MIT associate professor of electrical engineering and computer science, talked about the need for a systematic way to evaluate deep neural networks. Earlier in the week, Sze’s presentation on efficient deep neural networks garnered some of the most views of any NeurIPS video shared online, according to the SlidesLive website.

“Memories that are larger and farther tend to consume more power,” Sze said. “All weights are not created equal.” Sze also demonstrated Accelergy, a framework for estimating hardware energy consumption developed at MIT.

In addition to the talks, the workshop’s poster session showcased noteworthy low-power AI solutions. They include DistilBERT, a lighter version of Google’s BERT that Hugging Face made especially for fast deployment on edge devices, and a comparison of quantization for deep neural networks by SRI International and Latent AI.
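
For reference, DistilBERT can be loaded through Hugging Face’s transformers library; here is a minimal usage sketch (assuming transformers and PyTorch are installed, and using the standard distilbert-base-uncased checkpoint):

```python
# Minimal DistilBERT usage via Hugging Face's transformers library.
# pip install transformers torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")

inputs = tokenizer("Energy-efficient AI on edge devices.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, tokens, 768)
```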

A number of prominent voices are calling for the machine learning community to confront climate change and saying that such a focus can drive innovation. In a panel conversation at NeurIPS last week, another deep learning pioneer, Yoshua Bengio, called for ML researchers to place more value on machine learning that impacts climate change and less on the number of publications they’re getting.

And in an interview with VentureBeat, Google AI chief Jeff Dean said he supports the idea of creating a compute-per-watt standard as a way to encourage more efficient hardware.

Saving power and the planet

Alongside theoretical work at NeurIPS to explain the workings of deep learning algorithms, a number of works at the conference highlighted the importance of accounting for AI’s contribution to climate change. These include a paper titled “Energy Usage Reports: Environmental awareness as part of algorithmic accountability.”

“The carbon footprint of algorithms must be measured and transparently reported so computer scientists can take an honest and active role in environmental sustainability,” the paper reads.

In line with this assertion, earlier at the conference, organizers suggested that AI researchers who submit work to NeurIPS in 2020 may be required to share the carbon footprint of their submissions.

The recently released 2019 AI Now Institute report included measuring the carbon footprint of algorithms among a dozen recommendations that it says can lead to a more just society.

In other energy-efficient AI news, machine learning practitioners from Element AI and the Mila Quebec AI Institute last week introduced a new tool that calculates the carbon emissions of training AI models on GPUs, predicting energy use based on factors like length of use and cloud region.
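
The arithmetic behind such calculators is straightforward. The sketch below is my own simplification rather than the Element AI/Mila tool, and every default value in it is an illustrative assumption.

```python
def training_emissions_kg(gpu_power_watts, hours, n_gpus=1, pue=1.5,
                          grid_kg_co2_per_kwh=0.4):
    """Rough CO2 estimate for a training run.
    pue: datacenter overhead factor; grid_kg_co2_per_kwh: varies by cloud region.
    All defaults are illustrative assumptions, not measured values."""
    energy_kwh = gpu_power_watts * n_gpus * hours / 1000 * pue
    return energy_kwh * grid_kg_co2_per_kwh

# Example: 8 GPUs at 300W each for 48 hours in a hypothetical cloud region.
print(f"{training_emissions_kg(300, 48, n_gpus=8):.1f} kg CO2e")
```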

This drive toward more efficient machine learning could lead to innovations that change the planet. But big ideas and challenges need a focal point — something to make the theoretical feel more practical, with actual, specific problems that need to be solved. According to LeCun, AR glasses may be that ideal use case for machine learning practitioners.


This post by Khari Johnson originally appeared on VentureBeat.
