Understanding AR, VR, and MR Technologies
Posted by Anonymous and classified in Design and Engineering
Augmented Reality (AR) Fundamentals
AR is a technology that integrates digital information with the user’s real-world environment in real time, where virtual objects are spatially and temporally registered with physical objects and are interactive. Unlike Virtual Reality, which creates a fully artificial environment, AR enhances the real world by overlaying computer-generated content on it.
Characteristics of AR
AR is defined by the following three main characteristics:
- Combines Real & Virtual World: AR blends digital elements such as images, text, and 3D models with the real physical environment instead of replacing it.
- Interactive in Real Time: The AR system responds instantly to the user’s movement and actions, allowing users to interact with both real and virtual objects.
- Registered in 3D: Virtual objects are placed accurately in the real environment. They stay fixed in position and change correctly when the user moves, ensuring proper alignment with the real world.
Applications of AR
- Retail: Customers can try products virtually before purchasing, such as placing furniture in their homes using AR apps like IKEA Place.
- Entertainment & Gaming: AR is used in games and social media filters, such as Pokémon Go and Snapchat effects.
- Architecture & Construction: AR helps architects visualize buildings and designs before construction.
- Navigation: AR shows directions on live camera views of roads to assist users.
AR System Architecture
An AR system architecture describes how different components work together to combine the real world with virtual (digital) content in real time.
Main Components of an AR System
- User: The person who interacts with the AR system. The system is designed to help the user by providing enhanced visual or informational support (e.g., a doctor using AR glasses during surgery).
- Device: The hardware used to run AR, such as a smartphone, tablet, AR glasses, or head-mounted display (HMD). The device contains a camera, sensors, display, and processor.
- Real Content: Real-world information captured by the device, such as physical objects, location, environment, and live camera feeds.
- Tracking: Identifies the position and orientation of the user and real-world objects. It ensures virtual objects are placed in the correct location using cameras, GPS, sensors, and image recognition.
- Virtual Content: Digital information added to the real world, such as 3D models, text, images, videos, and audio.
Mixed Reality (MR)
MR is a technology that blends the real and virtual worlds so that physical and digital objects can exist together and interact with each other in real time. In MR, virtual objects are not just placed on top of the real world; they are aware of the real environment and can sit on real tables, hide behind real walls, or be blocked by real objects.
How MR Works
MR uses cameras, depth sensors, environment scanning, and spatial mapping to build a 3D model of the surrounding surfaces and objects. With that model, the system can anchor virtual objects to real surfaces, hide them when a real object is in front of them (occlusion), and let them collide with real geometry. This awareness of the environment is what distinguishes MR from a simple AR overlay.
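The occlusion behavior described above can be sketched as a per-pixel depth comparison. This is a minimal illustration, not how any particular headset implements it: the function name `composite_with_occlusion`, the list-of-rows image layout, and the use of `None` for "no virtual content here" are all assumptions made for the example.

```python
def composite_with_occlusion(camera_rgb, real_depth, virtual_rgb, virtual_depth):
    """Show a virtual pixel only where it is nearer to the viewer than the real surface.

    All four inputs are equally sized 2D lists (rows of pixels). Depths are in
    meters. A virtual_depth entry of None means no virtual content at that pixel.
    """
    output = []
    for cam_row, rd_row, virt_row, vd_row in zip(
        camera_rgb, real_depth, virtual_rgb, virtual_depth
    ):
        out_row = []
        for cam_px, rd, virt_px, vd in zip(cam_row, rd_row, virt_row, vd_row):
            # Depth test: the virtual pixel wins only if it is in front of the real one.
            visible = vd is not None and vd < rd
            out_row.append(virt_px if visible else cam_px)
        output.append(out_row)
    return output


# Hypothetical one-row scene: a real wall 2.0 m away, a virtual cube partly behind it.
camera = [["wall", "wall", "wall"]]
wall_depth = [[2.0, 2.0, 2.0]]
cube = [["cube", "cube", None]]
cube_depth = [[1.5, 3.0, None]]
result = composite_with_occlusion(camera, wall_depth, cube, cube_depth)
```

In this toy scene only the first pixel shows the cube; the second is hidden because the wall is closer, and the third has no virtual content.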
Comparison: AR, VR, and MR
| Feature | Augmented Reality (AR) | Virtual Reality (VR) | Mixed Reality (MR) |
|---|---|---|---|
| Real world visible | Yes | No | Yes |
| Virtual world visible | Yes | Yes (only virtual) | Yes |
| Virtual objects respond to real objects (occlusion, collision) | Limited | No | Yes |
| Interaction between real & virtual | Limited (overlay only) | No | Yes |
| User environment | Real + digital overlay | Fully virtual | Real + interactive digital |
| Example | Snapchat filters | VR gaming headset | Microsoft HoloLens |
AR/MR Algorithm Steps
- Capture Real World: The device uses cameras and sensors (GPS, gyroscope, accelerometer) to capture the live environment.
- Detect & Track Environment: The system identifies user position, device orientation, and surfaces. In MR, this includes spatial mapping.
- Recognize Targets: The system looks for images, markers, or objects to determine where virtual content should be placed.
- Generate Virtual Content: Loads 3D models, text, or animations based on the target.
- Registration: Aligns virtual objects with the real world to ensure they appear fixed in position.
- Rendering: Combines the real camera view with virtual objects.
- Interaction: The user interacts via touch, gesture, voice, or movement.
- Display: The final scene is shown on the device screen or headset.
Input Modalities in AR
- Touch Input: Tapping, swiping, and dragging on mobile screens.
- Gesture Input: Hand and body movements tracked by cameras and depth sensors.
- Voice Input: Spoken commands for hands-free interaction.
- Sensor-based Input: Using accelerometers, gyroscopes, and GPS to detect motion and orientation.
- Camera Input: Capturing real-world images to identify markers or objects.
- Eye-tracking Input: Selecting objects by looking at them.
- Tangible Input: Using physical objects as controllers.
Output Modalities
- Visual Output: Overlaying virtual objects, text, and animations on the real world.
- Audio Output: Providing instructions, alerts, or spatial audio.
- Haptic (Touch) Output: Providing physical sensations like vibration or force feedback.
- Tangible Output: Using physical objects to provide touch feedback for virtual content.
Multimodal Displays
Multimodal displays combine multiple sensory channels (vision, hearing, touch) to provide a more immersive experience. By distributing information across different senses instead of crowding it all into the visual channel, the system reduces the risk of information overload and makes interaction more natural and effective.
Visual Perception in AR
AR systems must match human visual perception to ensure virtual objects look realistic. This includes correct depth perception, proper size, brightness, contrast, and accurate alignment. If these factors are not matched, virtual objects may appear to float or drift out of alignment with the real scene, causing eye strain or dizziness.
Tracking Techniques
- Marker-based Tracking: Uses pre-defined visual patterns (like QR codes) for high accuracy in small-scale environments.
- Marker-less Tracking: Uses natural features (edges, textures) and SLAM (Simultaneous Localization and Mapping) to track the environment without pre-placed markers.
- Sensor-based Tracking: Uses hardware sensors (accelerometers, gyroscopes) to measure device orientation and movement.
- Vision-based Pose Tracking: Tracks rigid objects using cameras to detect their 3D pose.
- Body & Skeleton Tracking: Tracks human body movements and gestures using depth sensors.
- Hybrid Tracking: Combines multiple methods to improve accuracy and robustness.
Registration and Calibration
- Registration: The process of accurately aligning virtual objects with real-world objects.
- Calibration: Measuring and adjusting system parameters (camera, display, sensors) to ensure tracking and geometric relationships are accurate.
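To make the link between the two concrete: registration ultimately comes down to projecting a 3D anchor point into the camera image, and calibration supplies the numbers that projection needs. The sketch below uses the standard pinhole camera model; the function name and parameter names (`fx`, `fy` for focal lengths in pixels, `cx`, `cy` for the principal point) are illustrative assumptions, and lens distortion is ignored.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole camera model.

    rotation is a 3x3 row-major matrix and translation a 3-vector that map world
    coordinates into the camera frame: p_cam = R * p_world + t.
    Returns None if the point lies behind the camera.
    """
    x_cam, y_cam, z_cam = (
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z_cam <= 0:
        return None
    # Perspective divide, then scale by focal length and shift to the principal point.
    u = fx * x_cam / z_cam + cx
    v = fy * y_cam / z_cam + cy
    return (u, v)


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# A point 2 m straight ahead lands on the principal point of a 640x480 image.
center = project_point((0.0, 0.0, 2.0), identity, (0, 0, 0), 500, 500, 320, 240)
```

If the tracked pose (`rotation`, `translation`) or the calibrated intrinsics are wrong, the projected pixel shifts, which is exactly the misalignment a user perceives as a virtual object not staying fixed in place.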
Homogeneous Coordinate System
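The homogeneous coordinate system extends Cartesian coordinates (x, y, z) by adding an extra coordinate (w), so that a point is written as (x, y, z, w) and converted back by dividing by w. In plain Cartesian coordinates, rotation and scaling are matrix multiplications but translation is an addition. With the extra coordinate, translation, rotation, and scaling can all be expressed as matrix multiplications and chained by multiplying their matrices together. This is essential for computer graphics, animation, and perspective projection.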
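As a worked example, a translation by (t_x, t_y, t_z) cannot be written as a 3×3 matrix acting on (x, y, z), but it becomes a single 4×4 matrix acting on the homogeneous point (x, y, z, 1):

```latex
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & t_x \\
0 & 1 & 0 & t_y \\
0 & 0 & 1 & t_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} x + t_x \\ y + t_y \\ z + t_z \\ 1 \end{bmatrix}
```

A general homogeneous point (x, y, z, w) with w ≠ 0 corresponds to the Cartesian point (x/w, y/w, z/w); that division is what implements perspective foreshortening after a projection matrix is applied.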
2D Transformations
- Translation: Moving an object from one position to another.
- Rotation: Turning an object about a fixed point (pivot point) by an angle.
- Scaling: Changing the size of an object by enlarging or shrinking it.
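The three transformations above can be sketched as 3×3 homogeneous matrices in plain Python. The helper names (`translation`, `rotation`, `scaling`, `matmul`, `apply`, `rotation_about`) are chosen for this example only; angles are in radians and rotation is counter-clockwise.

```python
import math


def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


def matmul(a, b):
    """Product of two 3x3 matrices; the result applies b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, point):
    """Apply a 3x3 homogeneous matrix to a 2D point (x, y)."""
    x, y = point
    xh, yh, w = (m[i][0] * x + m[i][1] * y + m[i][2] for i in range(3))
    return (xh / w, yh / w)


def rotation_about(pivot, theta):
    """Rotate about an arbitrary pivot: move pivot to origin, rotate, move back."""
    px, py = pivot
    return matmul(translation(px, py),
                  matmul(rotation(theta), translation(-px, -py)))


moved = apply(translation(3, 4), (1, 1))
turned = apply(rotation_about((1, 1), math.pi / 2), (2, 1))
```

Here `moved` is the point (1, 1) shifted by (3, 4), and `turned` is the point (2, 1) rotated a quarter turn about the pivot (1, 1), which lands at approximately (1, 2). Composing the pivot rotation from three simpler matrices is the main practical payoff of the homogeneous form.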
Geometric Modeling
Geometric modeling is the process of creating a mathematical representation of an object's shape and structure. It involves creating basic primitives (points, lines, polygons), applying transformations, and combining them into a complete model for rendering or simulation.
Window to Viewport Transformation
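This process maps a selected area of the world-coordinate scene (the window) onto a specified area of the display device (the viewport). Each coordinate is scaled and offset so that objects keep their relative positions on the screen. The result is free of distortion only when the window and viewport have the same aspect ratio; otherwise objects are stretched along one axis.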
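A minimal sketch of the mapping, assuming both rectangles are given as `(xmin, ymin, xmax, ymax)` tuples (a convention chosen for this example) and ignoring the y-axis flip that many screen coordinate systems need:

```python
def window_to_viewport(xw, yw, window, viewport):
    """Map a world-coordinate point inside `window` to viewport coordinates."""
    wx_min, wy_min, wx_max, wy_max = window
    vx_min, vy_min, vx_max, vy_max = viewport
    # Independent scale factors for each axis.
    sx = (vx_max - vx_min) / (wx_max - wx_min)
    sy = (vy_max - vy_min) / (wy_max - wy_min)
    xv = vx_min + (xw - wx_min) * sx
    yv = vy_min + (yw - wy_min) * sy
    return (xv, yv)


# The center of a 10x10 world window maps to the center of an 800x600 viewport.
center = window_to_viewport(5, 5, (0, 0, 10, 10), (0, 0, 800, 600))
```

In this usage the window is square but the viewport is 4:3, so `sx` (80) and `sy` (60) differ and a world-space circle would be drawn as an ellipse. Matching the aspect ratios makes the two scale factors equal.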
Virtual Reality (VR) Concepts
VR is a 3D technology that simulates sensory experiences to give the user a feeling of presence. It is based on the 3 I’s:
- Immersion: The feeling of being completely involved in the virtual environment.
- Interaction: The ability to control and manipulate virtual objects.
- Imagination: The potential to design environments beyond real-world limitations.
VR System Components
- Hardware: VR engine (computer), input devices (controllers, gloves), output devices (HMDs, audio), and tracking systems.
- Software: Application software, databases for 3D assets, and development tools (e.g., Unity, Unreal, Maya).
Display Technology: LCD vs. OLED
| Point | LCD | OLED |
|---|---|---|
| Light Source | Uses backlight | Self-emissive pixels |
| Thickness | Thicker display | Thinner and lighter |
| Contrast | Lower contrast | Very high contrast |
| Viewing Angle | Limited viewing angle | Wide viewing angle |
| Power Consumption | Higher and roughly constant due to backlight | Often lower, but content-dependent (dark scenes use less power) |
| Response Time | Slower response | Faster response |
Eye Movements and VR
Human eye movements (saccades, smooth pursuit, vergence) are essential for stable vision. Mismatches between these natural movements and VR display behavior (such as the Vergence-Accommodation Conflict) can lead to eye strain, headaches, and motion sickness.
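The vergence-accommodation conflict can be put into numbers. In a typical headset the eyes converge on the apparent distance of a virtual object while focusing on the fixed focal plane of the optics, and the mismatch is usually expressed in diopters (1 / distance in meters). The sketch below is a simple calculation under that definition; the function name and the 2.0 m focal distance in the usage are assumptions for illustration, not values from a specific device.

```python
def vergence_accommodation_mismatch(virtual_distance_m, focal_distance_m):
    """Mismatch in diopters between where the eyes converge and where they focus."""
    vergence_demand = 1.0 / virtual_distance_m      # set by the rendered object
    accommodation_demand = 1.0 / focal_distance_m   # set by the headset optics
    return abs(vergence_demand - accommodation_demand)


# Hypothetical headset with optics focused at 2.0 m, object rendered at 0.5 m.
near_object = vergence_accommodation_mismatch(0.5, 2.0)
```

The mismatch grows quickly for objects rendered close to the face and vanishes when the object sits at the focal distance, which is one reason near-field content tends to be the most tiring to look at.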
Frame Rate and Latency
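VR headsets commonly target 90 FPS or higher for a smooth experience, although some run at lower refresh rates. Low frame rates cause judder and visible flicker, and high motion-to-photon latency (the delay between a head movement and the corresponding image update) makes the world appear to lag behind the head, which can cause nausea. A motion-to-photon latency of roughly 20 ms or less is a commonly cited target. High-quality VR displays therefore prioritize high refresh rates and low latency to preserve immersion.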
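The frame-rate requirement translates directly into a time budget per frame. The small calculation below shows the arithmetic; the function names are illustrative.

```python
def frame_budget_ms(refresh_hz):
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / refresh_hz


def dropped_frame(render_time_ms, refresh_hz):
    """True if rendering takes longer than one refresh interval."""
    return render_time_ms > frame_budget_ms(refresh_hz)


budget_90 = frame_budget_ms(90)
```

At 90 FPS the budget is about 11.1 ms, compared with about 16.7 ms at 60 FPS, so a frame that takes 12 ms to render is fine on a 60 Hz display but misses the deadline on a 90 Hz headset.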
Depth Perception in VR
Depth perception is achieved through monocular cues (size, height, motion parallax) and binocular cues (stereopsis). In VR, these cues must be accurately rendered to provide a realistic sense of distance and spatial relationships.
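For the binocular cue, a renderer typically draws the scene twice from two virtual cameras separated by the interpupillary distance (IPD). The sketch below computes those camera offsets and the vergence angle the eyes need to fixate a point straight ahead. The 0.064 m default IPD is an assumed typical value, and the function names are illustrative.

```python
import math


def eye_positions(ipd_m):
    """Horizontal offsets of the left and right virtual cameras from the head center."""
    half = ipd_m / 2.0
    return (-half, half)


def vergence_angle_deg(distance_m, ipd_m=0.064):
    """Angle between the two eyes' lines of sight when fixating a point straight ahead."""
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / distance_m))


left_x, right_x = eye_positions(0.064)
angle_at_1m = vergence_angle_deg(1.0)
```

The angle is a few degrees at 1 m and falls toward zero as distance increases, which is why stereopsis is a strong cue for nearby objects and a weak one for distant scenery, where monocular cues dominate.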
Resolution in VR
High resolution is critical in VR because displays are placed very close to the eyes and magnified by lenses across a wide field of view. Low resolution leads to the "screen door effect," where the gaps between individual pixels become visible as a fine grid, reducing immersion and causing eye strain. High-resolution displays in turn require powerful GPUs and high memory bandwidth.
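A useful way to compare headsets is angular resolution in pixels per degree (PPD) instead of raw pixel counts. The estimate below simply divides per-eye horizontal pixels by the horizontal field of view; it assumes pixels are spread evenly across the view and ignores lens distortion, so treat it as a rough figure. The function name and the example numbers are illustrative.

```python
def pixels_per_degree(horizontal_pixels_per_eye, horizontal_fov_deg):
    """Rough angular resolution: pixels available for each degree of view."""
    return horizontal_pixels_per_eye / horizontal_fov_deg


# Hypothetical headset: 2000 horizontal pixels per eye over a 100 degree field of view.
ppd = pixels_per_degree(2000, 100)
```

The same panel viewed on a desk monitor covers far fewer degrees of the visual field, which is why a resolution that looks sharp on a monitor can look coarse inside a headset.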
Orientation Tracking
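Orientation tracking measures rotational motion (yaw, pitch, roll) using sensors such as gyroscopes, accelerometers, and magnetometers. Gyroscopes respond quickly but drift as their error accumulates over time; accelerometers (via gravity) and magnetometers (via the Earth’s magnetic field) provide absolute but noisy references. Sensor fusion algorithms (e.g., Kalman filters) combine these sources to provide stable, low-drift tracking, which is essential for preventing motion sickness in VR.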
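A full Kalman filter is too long for a short example, so the sketch below uses a complementary filter, a simpler fusion technique that illustrates the same idea: trust the gyroscope over short time scales and the accelerometer over long ones. It estimates pitch only. The axis convention in `accel_pitch`, the blend factor `alpha=0.98`, and the simulated gyro bias are assumptions for the example.

```python
import math


def accel_pitch(ax, ay, az):
    """Pitch (radians) implied by the gravity direction seen by the accelerometer."""
    return math.atan2(-ax, math.sqrt(ay * ay + az * az))


def complementary_filter(pitch_prev, gyro_rate, accel, dt, alpha=0.98):
    """Blend integrated gyro rate (fast, drifts) with accelerometer pitch (noisy, absolute)."""
    gyro_estimate = pitch_prev + gyro_rate * dt
    return alpha * gyro_estimate + (1.0 - alpha) * accel_pitch(*accel)


# Simulate a stationary, level device whose gyro has a constant 0.01 rad/s bias.
dt = 0.01
gyro_bias = 0.01
level_accel = (0.0, 0.0, 9.81)

integrated = 0.0  # gyro-only estimate
fused = 0.0       # complementary-filter estimate
for _ in range(10000):  # 100 seconds of samples
    integrated += gyro_bias * dt
    fused = complementary_filter(fused, gyro_bias, level_accel, dt)
```

After 100 simulated seconds the gyro-only estimate has drifted to about 1 radian, while the fused estimate settles near 0.005 radians because the accelerometer term keeps pulling it back toward level. A small residual error remains, which is why fusion reduces drift but does not eliminate it.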