Talks Tech #34: AI in Mixed Reality Series – Introduction to Extended Reality
Written by Manorama Jha
Women Who Code Talks Tech 34
Manorama Jha, Software Development Engineer, AI, and Computer Vision, 3DAI at GridRaster Inc., shares her talk entitled “Introduction to Extended Reality.” In it, she discusses the challenges of developing immersive extended-reality experiences while explaining the fundamentals of building the Metaverse.
Definitions
Environment: This is everything that is being experienced. It is divided into two parts.
Synthetic (Virtual Reality): The user is placed in a virtual environment that is generated synthetically using digital assets.
Real (Augmented Reality): The user has an augmented view of the real environment where digital objects coexist seamlessly with real things.
An example of augmented reality would be shopping online and using a digital asset of a piece of furniture like a chair. You can place it digitally into the real-world environment of your home to see how it looks and fits. It reduces travel, saves time, and lets you move the chair around to different parts of the room before buying it.
Applications
Education: In India, many places have VR headsets set up that let students look at labeled 3D models of the body and see where the different organs and body parts are. This makes the learning process much faster and more interactive. It also makes it a more appealing learning experience, which is important since our audience will mostly be kids. However, it can be fun and engaging for adults as well.
Manufacturing: Many people use VR and AR devices to reduce costs. You can have a physical object in the real world, like a motorcycle, and a digital asset of the same motorcycle, and compare the two to check for defects. This is being used across industries, including aerospace, to reduce time and work and to evaluate projects from anywhere in the world.
Healthcare: AR is being used to train students in surgery before they become doctors. It allows them to see where issues are, in 3D, rotate them and analyze them from every angle. It can be opened from a laptop from anywhere. This even works with pregnant women, allowing doctors to see how the baby is growing each month.
Mixed Reality Hardware Devices
The most used mixed reality device is the HoloLens 2. Its first capability is spatial mapping, localization, and gesture recognition. This uses an RGB (color) camera, a depth camera, and IMU sensors.
Second is the display. Two laser projector beams send images directly into the user’s eyes for an immersive experience. The display system tracks the user’s eye and head movements and adapts the images in real time for persistence of vision.
Third is hands-free communication using natural language. Array microphones and stereo speakers support a natural language control interface. This helps create an immersive user experience and lets you implement different use cases, sometimes with just a microphone. You can ask the device to move an object or to start recording, and you don’t even have to interact with the hologram.
Analysis of a Typical Mixed Reality Pipeline
1. 3D scene and spatial mapping
The first feature is ground plane detection, which detects and segments out the horizontal plane where objects can be anchored. This is important for setting up the global coordinate system.
Second is structure from motion. This constructs the 3D point cloud from a sequence of posed images and scans of an object or environment.
Third is 3D point cloud segmentation, which detects and segments out different categories of objects in a 3D point cloud.
Fourth is tracking. This detects the same object across multiple frames and estimates how the object moves through the scene.
Fifth is localization. This is estimating your current position and orientation on the map using odometry and tracking.
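To make the ground plane detection step in this section more concrete, here is a minimal sketch using RANSAC on a small synthetic point cloud. RANSAC is one common way to fit a plane and is an assumption here, not necessarily the method used on any particular headset. The function names, the thresholds, and the choice of y as the "up" axis are all hypothetical.

```python
import random

def fit_plane(p1, p2, p3):
    """Return (unit normal, d) for the plane through three points, or None if collinear."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    # The cross product of two edge vectors is perpendicular to the plane.
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = sum(c * c for c in n) ** 0.5
    if length < 1e-9:
        return None
    normal = tuple(c / length for c in n)
    d = -sum(normal[i] * p1[i] for i in range(3))
    return normal, d

def detect_ground_plane(points, iterations=200, threshold=0.02, min_up=0.9, seed=0):
    """Return the near-horizontal plane with the most inliers (assumes y is up)."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        normal, d = plane
        if abs(normal[1]) < min_up:
            continue  # reject planes that are not close to horizontal
        # Inliers are the points within `threshold` of the candidate plane.
        inliers = [p for p in points
                   if abs(sum(normal[i] * p[i] for i in range(3)) + d) < threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers

# Synthetic scene: a flat 10 x 10 grid of floor points plus some clutter above it.
floor = [(x * 0.1, 0.0, z * 0.1) for x in range(10) for z in range(10)]
clutter = [(0.5, 0.3 + 0.05 * k, 0.5 + 0.01 * k) for k in range(20)]
plane, inliers = detect_ground_plane(floor + clutter)
print("ground plane:", plane, "with", len(inliers), "inliers")
```

The inlier set is the segmented floor, and the fitted plane can serve as the reference for a global coordinate system in which objects are later anchored.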
2. 3D rendering
This is how you create digital objects in an AR or VR experience.
First is anchoring, which attaches digital models to fixed points on real-world surfaces, and makes sure that the digital model does not shift its position or orientation as the user viewpoint changes.
The second is alignment. Alignment overlays a virtual model or an object on top of the real-world instance of the same thing. This ensures the position, orientation, and scale of the virtual world match perfectly with the pose and dimension of the real world.
The third is occlusion. This involves detecting a 3D object correctly, even when part of it is occluded behind an obstacle. This is particularly important for tracking.
The fourth is remote rendering. Rendering high-polygon photorealistic 3D objects is crucial for an immersive MR experience, and it’s difficult to achieve with the limited compute budget of the headset. We must render the 3D scenes in the cloud and stream the user’s current view to the headset.
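The anchoring idea from this section can be illustrated with a little coordinate math. The sketch below is a simplification under stated assumptions: the camera pose is reduced to a position plus a yaw angle about the vertical y axis, and the function names are hypothetical rather than taken from any MR SDK. The anchor is stored once in world coordinates and only its camera-frame coordinates are recomputed as the viewpoint changes.

```python
import math

def camera_to_world(point, cam_position, cam_yaw):
    """Convert a camera-frame point to world coordinates (used once, when placing the anchor)."""
    c, s = math.cos(cam_yaw), math.sin(cam_yaw)
    x, y, z = point
    # Rotate by the camera yaw, then translate by the camera position.
    return (c * x + s * z + cam_position[0],
            y + cam_position[1],
            -s * x + c * z + cam_position[2])

def world_to_camera(point, cam_position, cam_yaw):
    """Convert a world-space point to the camera frame (used every frame)."""
    dx = point[0] - cam_position[0]
    dy = point[1] - cam_position[1]
    dz = point[2] - cam_position[2]
    # Undo the translation first, then rotate by the inverse yaw.
    c, s = math.cos(-cam_yaw), math.sin(-cam_yaw)
    return (c * dx + s * dz, dy, -s * dx + c * dz)

# Place an anchor 2 m in front of the user while the camera is at the origin.
anchor_world = camera_to_world((0.0, 0.0, 2.0), (0.0, 0.0, 0.0), 0.0)

# The user steps 1 m to the side, then turns by 90 degrees at the origin.
view_after_step = world_to_camera(anchor_world, (1.0, 0.0, 0.0), 0.0)
view_after_turn = world_to_camera(anchor_world, (0.0, 0.0, 0.0), math.pi / 2)
print(anchor_world, view_after_step, view_after_turn)
```

The world-space anchor never changes. What changes is where it appears relative to the camera, which is why the hologram looks fixed in the room as the user moves.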
3. Manipulation and Interaction
First is numerical integration. Efficient numerical integration is critical to accurate physics simulations for modeling interactions.
Second is collision detection and contact modeling. Modeling collision and contact is a challenging task that is crucial in many MR tasks.
Third is the material properties. Material properties must be appropriately modeled to represent surfaces of virtual objects correctly and to model interactions between real and virtual objects.
Fourth is constraint implementation. The physics simulation must account for physical constraints, and these must be detected and enforced rigorously.
Fifth is multimodal gesture recognition, including speech recognition, natural language understanding, and detection of hand gestures, which are all crucial components of an MR system.
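As a rough illustration of how numerical integration and collision handling fit together, here is a minimal sketch of a virtual ball dropped onto a floor. It assumes semi-implicit Euler integration and a very simple clamp-and-reflect contact model. The function name, the restitution value, and the time step are hypothetical choices for illustration, and a real physics engine would use more careful contact modeling.

```python
def step(y, vy, dt, gravity=-9.81, restitution=0.5, ground=0.0):
    """Advance a falling ball by one time step and resolve contact with the ground."""
    # Semi-implicit Euler: update velocity first, then position with the new velocity.
    vy += gravity * dt
    y += vy * dt
    # Collision detection against the ground plane.
    if y < ground:
        y = ground               # push the ball back out of the floor
        vy = -vy * restitution   # reflect the velocity and lose some energy
    return y, vy

# Drop the ball from 1 m and simulate 3 seconds at 60 steps per second.
y, vy = 1.0, 0.0
dt = 1.0 / 60.0
heights = []
for _ in range(180):
    y, vy = step(y, vy, dt)
    heights.append(y)

print("highest point:", max(heights), "final height:", heights[-1])
```

The restitution coefficient stands in for a material property: a value closer to 1 gives a bouncier surface, and 0 gives one that absorbs the impact entirely.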
This technology is growing, and MR is often named an emerging technology at conferences and forums. It will change the world over the next ten years, not only in the tech industry but in education, healthcare, and more.