Google Releases Real-time Mobile Hand Tracking to R&D Community

Google has released to researchers and developers its machine learning-based hand tracking method for mobile devices, something Google Research calls a “new approach to hand perception.”

First unveiled at CVPR 2019 back in June, Google’s on-device, real-time hand tracking method is now available for developers to explore—implemented in MediaPipe, an open source cross-platform framework for developers looking to build processing pipelines to handle perceptual data, like video and audio.

The approach is said to provide high-fidelity hand and finger tracking via machine learning, which can infer 21 3D ‘keypoints’ of a hand from just a single frame.

“Whereas current state-of-the-art approaches rely primarily on powerful desktop environments for inference, our method achieves real-time performance on a mobile phone, and even scales to multiple hands,” the researchers say in a blog post.

Google Research hopes its hand-tracking methods will spark in the community “creative use cases, stimulating new applications and new research avenues.”

The researchers explain that there are three primary systems at play in their hand tracking method: a palm detector model (called BlazePalm), a ‘hand landmark’ model that returns high-fidelity 3D hand keypoints, and a gesture recognizer that classifies keypoint configurations into a discrete set of gestures.

Here are a few salient bits, boiled down from the full blog post:

  • The BlazePalm palm detection technique achieves an average precision of 95.7%, the researchers claim.
  • The model learns a consistent internal hand pose representation and is robust even to partially visible hands and self-occlusions.
  • The existing pipeline supports counting gestures from multiple cultures, e.g. American, European, and Chinese, and various hand signs including “Thumb up”, closed fist, “OK”, “Rock”, and “Spiderman”.
  • Google is open sourcing its hand tracking and gesture recognition pipeline in the MediaPipe framework, accompanied by the relevant end-to-end usage scenario and source code.
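
For a feel of how the pipeline is used in practice, here is a minimal sketch built on MediaPipe’s Python ‘Hands’ solution, which wraps the same palm detection and hand landmark models (the original release shipped as C++ MediaPipe graphs). The finger-counting check at the end is our own illustration, not Google’s gesture recognizer.

```python
# Minimal sketch: 21-keypoint hand tracking with MediaPipe's Python Hands
# solution (a wrapper around the palm detector + hand landmark models).
# The finger-counting "gesture" check is an illustrative stand-in, not
# Google's gesture recognizer.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def count_extended_fingers(hand_landmarks):
    """Rough heuristic: a fingertip higher in the image (smaller y) than its
    PIP joint counts as extended. The thumb is ignored for simplicity."""
    tips = [8, 12, 16, 20]   # index, middle, ring, pinky fingertip indices
    pips = [6, 10, 14, 18]   # corresponding PIP joint indices
    return sum(
        hand_landmarks.landmark[t].y < hand_landmarks.landmark[p].y
        for t, p in zip(tips, pips)
    )

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=2,
                    min_detection_confidence=0.7,
                    min_tracking_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV captures BGR.
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for hand in results.multi_hand_landmarks:
                print("extended fingers:", count_extended_fingers(hand))
cap.release()
```

Each detected hand comes back as 21 normalized 3D landmarks, which is all a downstream gesture classifier needs to work with.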

Looking ahead, Google Research says it plans to continue its hand tracking work with more robust and stable tracking, and hopes to expand the number of gestures it can reliably detect. Moreover, the team hopes to also support dynamic gestures, which could be a boon for machine learning-based sign language translation and fluid hand gesture controls.

Not only that, but more reliable on-device hand tracking is a necessity for AR headsets moving forward; as long as headsets rely on outward-facing cameras to see the world, understanding what those cameras capture, including the user’s hands, will remain a problem for machine learning to address.

Google Develops AR Microscope That Can Detect Cancer

Immersive technologies have already seen a variety of uses in healthcare to improve patient outcomes. A team of researchers from Google is taking this further with the reveal of a prototype Augmented Reality Microscope (ARM) which can help detect cancer in real time.

The research around the prototype AR microscope was unveiled at the meeting of the American Association for Cancer Research (AACR) in Chicago, Illinois, where Google described the prototype platform as using AR and deep learning tools that could assist pathologists all over the world.

The platform consists of a modified light microscope that enables real-time image analysis and presents results directly in the user’s field of view. The device can be retrofitted into existing light microscopes using low-cost components, without the need for whole-slide digital versions of the analysed tissue.

“In principle, the ARM can provide a wide variety of visual feedback, including text, arrows, contours, heatmaps or animations, and is capable of running many types of machine learning algorithms aimed at solving different problems such as object detection, quantification or classification,” wrote Martin Stumpe (Technical Lead) and Craig Mermel (Product Manager) of the Google Brain Team.
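
As a rough illustration of the kind of feedback loop described above, the sketch below grabs a frame from a camera attached to the microscope, runs a tumour-detection model over it, and pushes a heatmap overlay toward the eyepiece display. The model loading and display functions are hypothetical placeholders; Google has not published the ARM’s actual code.

```python
# Illustrative ARM-style feedback loop (not Google's implementation):
# capture the current field of view, score it with a detection model,
# and overlay a heatmap for the pathologist.
import cv2
import numpy as np

model = load_metastasis_model()     # hypothetical: any per-pixel classifier
camera = cv2.VideoCapture(0)        # camera mounted on the microscope

while True:
    ok, field_of_view = camera.read()
    if not ok:
        break
    # Hypothetical model call: returns per-pixel tumour probabilities in [0, 1].
    probabilities = model.predict(field_of_view)
    # Render the probabilities as a semi-transparent heatmap over the image.
    heatmap = cv2.applyColorMap((probabilities * 255).astype(np.uint8),
                                cv2.COLORMAP_JET)
    overlay = cv2.addWeighted(field_of_view, 0.7, heatmap, 0.3, 0)
    push_to_eyepiece_display(overlay)  # hypothetical: AR micro-display output
```

The same loop could just as easily draw contours, arrows or text, which is the flexibility the Google team describes.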

Google has tested the AR microscope with two different cancer detection algorithms: one for breast cancer metastases in lymph node specimens, and another for prostate cancer in prostatectomy specimens. The results were said to be impressive, though Google noted that further studies and assessments need to be conducted.

“At Google, we have also published results showing that a convolutional neural network is able to detect breast cancer metastases in lymph nodes at a level of accuracy comparable to a trained pathologist,” the Google team said in its blog post. “We believe that the ARM has potential for a large impact on global health, particularly for the diagnosis of infectious diseases, including tuberculosis and malaria, in the developing countries.”

It’s just the latest in a number of developments using immersive technologies in the medtech space, and further information can be found on the Google Research blog. For continued coverage of immersive technology use in healthcare, keep watching VRFocus.

Mixed Reality VR Videos Become More Expressive Thanks to Google Research and Daydream Labs

Conveying what it’s actually like inside virtual reality (VR) has been one of the biggest hurdles to adoption, with the most common video technique being a combination of green screen and mixed reality (MR) technology. While this helps viewers see what VR players are engaged in, there was one other barrier left: the headset itself. Now Google Research and Daydream Labs have unveiled a new digital technique that allows a user’s face to be seen whilst wearing a head-mounted display (HMD).

What Google has come up with is a way to make a headset seem transparent so that viewers watching an MR video can see the range of emotions being portrayed by the player. To do this, the development team uses a combination of 3D vision, machine learning and graphics techniques to build a model of the person’s face, capturing various facial variations. Then a modified HTC Vive containing SMI eye tracking tech is used to record gaze-related data. This is all blended together to give the illusion of seeing a user’s face whilst they play in a virtual world.

The Google Research Blog goes into much greater detail, with Research Scientist Vivek Kwatra and Software Engineers Christian Frueh and Avneesh Sud explaining the future applications of the technology: “Headset removal is poised to enhance communication and social interaction in VR itself with diverse applications like VR video conference meetings, multiplayer VR gaming, and exploration with friends and family. Going from an utterly blank headset to being able to see, with photographic realism, the faces of fellow VR users promises to be a significant transition in the VR world.”

The project will be an ongoing collaboration between Google Research, Daydream Labs and the YouTube team, with the technology set to become available across select YouTube Spaces for creators in the future.

For the latest updates from Google Research and Daydream Labs, keep reading VRFocus.

Google Can Recreate Your Face For Better Mixed Reality Footage

We are big fans of mixed reality here at UploadVR. It is a great way of showing what people in VR are doing.

Startups like Owlchemy Labs and LIV are attempting to make the capture process easier while pushing for higher quality, but current approaches are all limited by one major roadblock. The most expressive part of the human body, the face, is mostly blocked during capture. You largely have to imagine the expressions of people as they interact with a virtual world.

Google, however, showed off some impressive research that takes the technology to the next level. Using a collection of techniques, including a modified HTC Vive with SMI eye tracking, Google digitally recreates your face in place of the VR headset that is blocking it.

This work is the result of an “ongoing collaboration” between the Research, Daydream Labs and YouTube teams at Google. According to a blog post diving into the research, here is how it works:

The core idea behind our technique is to use a 3D model of the user’s face as a proxy for the hidden face. This proxy is used to synthesize the face in the MR video, thereby creating an impression of the headset being removed. First, we capture a personalized 3D face model for the user with what we call gaze-dependent dynamic appearance. This initial calibration step requires the user to sit in front of a color+depth camera and a monitor, and then track a marker on the monitor with their eyes. We use this one-time calibration procedure — which typically takes less than a minute — to acquire a 3D face model of the user, and learn a database that maps appearance images (or textures) to different eye-gaze directions and blinks. This gaze database (i.e. the face model with textures indexed by eye-gaze) allows us to dynamically change the appearance of the face during synthesis and generate any desired eye-gaze, thus making the synthesized face look natural and alive.
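
As a rough illustration of the gaze database idea (our own sketch, not Google’s code), the snippet below stores face textures keyed by eye-gaze direction during calibration and, at synthesis time, returns the texture whose gaze is closest to the reading from the headset’s eye tracker.

```python
# Sketch of a gaze-indexed texture database: calibration fills it with
# (gaze direction, face texture) pairs; synthesis looks up the nearest
# stored texture for the current eye-tracker reading.
import numpy as np

class GazeDatabase:
    def __init__(self):
        self.gazes = []     # (yaw, pitch) gaze directions, in degrees
        self.textures = []  # matching face textures (H x W x 3 arrays)

    def add_sample(self, gaze, texture):
        """Called during the one-time calibration, once per marker position."""
        self.gazes.append(np.asarray(gaze, dtype=np.float32))
        self.textures.append(texture)

    def lookup(self, gaze):
        """Return the texture whose stored gaze direction is closest to the
        eye tracker's current reading."""
        gazes = np.stack(self.gazes)
        distances = np.linalg.norm(gazes - np.asarray(gaze, np.float32), axis=1)
        return self.textures[int(np.argmin(distances))]

# Per frame, the looked-up texture would be applied to the 3D face proxy and
# composited into the MR video where the headset occludes the face.
```

In Google’s system the database also covers blinks alongside eye-gaze directions, which is what keeps the synthesized face looking natural and alive.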

Team members behind the project, including research scientist Vivek Kwatra and software engineers Christian Frueh and Avneesh Sud, believe the approach holds enormous promise to “enhance communication and social interaction in VR itself with diverse applications like VR video conference meetings, multiplayer VR gaming, and exploration with friends and family.”
