Understanding AR, VR, and MR Technologies

Posted by Anonymous and classified in Design and Engineering

Augmented Reality (AR) Fundamentals

AR is a technology that integrates digital information with the user’s real-world environment in real time, where virtual objects are spatially and temporally registered with physical objects and are interactive. Unlike Virtual Reality, which creates a fully artificial environment, AR enhances the real world by overlaying computer-generated content on it.

Characteristics of AR

AR is defined by the following three main characteristics:

  • Combines Real & Virtual World: AR blends digital elements such as images, text, and 3D models with the real physical environment instead of replacing it.
  • Interactive in Real Time: The AR system responds instantly to the user’s movement and actions, allowing users to interact with both real and virtual objects.
  • Registered in 3D: Virtual objects are placed accurately in the real environment. They stay fixed in position and change correctly when the user moves, ensuring proper alignment with the real world.

Applications of AR

  • Retail: Customers can try products virtually before purchasing, such as placing furniture in their homes using AR apps like IKEA Place.
  • Entertainment & Gaming: AR is used in games and social media filters, such as Pokémon Go and Snapchat effects.
  • Architecture & Construction: AR helps architects visualize buildings and designs before construction.
  • Navigation: AR shows directions on live camera views of roads to assist users.

AR System Architecture

An AR system architecture describes how different components work together to combine the real world with virtual (digital) content in real time.

Main Components of an AR System

  1. User: The person who interacts with the AR system. The system is designed to help the user by providing enhanced visual or informational support (e.g., a doctor using AR glasses during surgery).
  2. Device: The hardware used to run AR, such as a smartphone, tablet, AR glasses, or head-mounted display (HMD). The device contains a camera, sensors, display, and processor.
  3. Real Content: Real-world information captured by the device, such as physical objects, location, environment, and live camera feeds.
  4. Tracking: Identifies the position and orientation of the user and real-world objects. It ensures virtual objects are placed in the correct location using cameras, GPS, sensors, and image recognition.
  5. Virtual Content: Digital information added to the real world, such as 3D models, text, images, videos, and audio.

Mixed Reality (MR)

MR is a technology that blends the real and virtual worlds so that physical and digital objects can exist together and interact with each other in real time. In MR, virtual objects are not just placed on top of the real world; they are aware of the real environment and can sit on real tables, hide behind real walls, or be blocked by real objects.

How MR Works

MR uses cameras, depth sensors, environment scanning, and spatial mapping to understand the shape and position of real-world objects, allowing it to correctly place and control virtual objects within that space. This awareness of the environment, rather than a simple overlay, is what distinguishes MR from basic AR.
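
The occlusion behaviour described above (virtual objects hidden behind real ones) can be sketched as a per-pixel depth comparison. This is a minimal illustration, assuming the device already supplies a depth map of the real scene. The function name composite_with_occlusion and the list-of-lists image format are hypothetical, chosen only for this example.

```python
def composite_with_occlusion(real_rgb, real_depth, virt_rgb, virt_depth):
    """Show a virtual pixel only where it is closer to the camera than the real surface.

    All arguments are 2D lists of identical shape. Depths are in metres.
    virt_depth holds None where no virtual object covers the pixel.
    """
    out = []
    for y, row in enumerate(real_rgb):
        out_row = []
        for x, real_pixel in enumerate(row):
            vd = virt_depth[y][x]
            if vd is not None and vd < real_depth[y][x]:
                out_row.append(virt_rgb[y][x])   # virtual object is in front
            else:
                out_row.append(real_pixel)       # real surface occludes it (or nothing to draw)
        out.append(out_row)
    return out


# One-row example: a virtual object at 1.5 m, a real wall at 2.0 m with a real box at 1.0 m.
real_rgb = [["R", "R", "R"]]
real_depth = [[2.0, 1.0, 2.0]]
virt_rgb = [["V", "V", "V"]]
virt_depth = [[1.5, 1.5, None]]
result = composite_with_occlusion(real_rgb, real_depth, virt_rgb, virt_depth)
```

In the example, the virtual pixel is drawn in front of the wall, hidden behind the box, and absent where the virtual object does not extend.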

Comparison: AR, VR, and MR

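| Feature                            | Augmented Reality (AR) | Virtual Reality (VR) | Mixed Reality (MR)         |
|------------------------------------|------------------------|----------------------|----------------------------|
| Real world visible                 | Yes                    | No                   | Yes                        |
| Virtual world visible              | Yes                    | Yes (only virtual)   | Yes                        |
| Interaction with real objects      | No                     | No                   | Yes                        |
| Interaction between real & virtual | No                     | No                   | Yes                        |
| User environment                   | Real + digital overlay | Fully virtual        | Real + interactive digital |
| Example                            | Snapchat filters       | VR gaming headset    | Microsoft HoloLens         |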

AR/MR Algorithm Steps

  1. Capture Real World: The device uses cameras and sensors (GPS, gyroscope, accelerometer) to capture the live environment.
  2. Detect & Track Environment: The system identifies user position, device orientation, and surfaces. In MR, this includes spatial mapping.
  3. Recognize Targets: The system looks for images, markers, or objects to determine where virtual content should be placed.
  4. Generate Virtual Content: Loads 3D models, text, or animations based on the target.
  5. Registration: Aligns virtual objects with the real world to ensure they appear fixed in position.
  6. Rendering: Combines the real camera view with virtual objects.
  7. Interaction: The user interacts via touch, gesture, voice, or movement.
  8. Display: The final scene is shown on the device screen or headset.
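
The eight steps above run once per frame, in order. The sketch below shows only that control flow. It assumes each step is supplied as a callable, and every name in it (STAGES, FrameState, run_frame) is hypothetical rather than taken from any AR framework.

```python
from dataclasses import dataclass, field

# Stage names mirror the eight steps listed above; the names are hypothetical.
STAGES = [
    "capture", "track", "recognize", "generate",
    "register", "render", "interact", "display",
]


@dataclass
class FrameState:
    """Data handed from stage to stage while processing one frame."""
    data: dict = field(default_factory=dict)
    completed: list = field(default_factory=list)


def run_frame(handlers, state=None):
    """Run every stage once, in order, on a shared FrameState."""
    if state is None:
        state = FrameState()
    for name in STAGES:
        handlers[name](state)            # each handler reads and writes state.data
        state.completed.append(name)
    return state


# Usage with placeholder handlers that only record that they ran.
handlers = {name: (lambda s, n=name: s.data.setdefault(n, "done")) for name in STAGES}
result = run_frame(handlers)
```

A real system would replace the placeholder handlers with camera capture, tracking, rendering, and so on, and would call run_frame in a loop.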

Input Modalities in AR

  • Touch Input: Tapping, swiping, and dragging on mobile screens.
  • Gesture Input: Hand and body movements tracked by cameras and depth sensors.
  • Voice Input: Spoken commands for hands-free interaction.
  • Sensor-based Input: Using accelerometers, gyroscopes, and GPS to detect motion and orientation.
  • Camera Input: Capturing real-world images to identify markers or objects.
  • Eye-tracking Input: Selecting objects by looking at them.
  • Tangible Input: Using physical objects as controllers.

Output Modalities

  • Visual Output: Overlaying virtual objects, text, and animations on the real world.
  • Audio Output: Providing instructions, alerts, or spatial audio.
  • Haptic (Touch) Output: Providing physical sensations like vibration or force feedback.
  • Tangible Output: Using physical objects to provide touch feedback for virtual content.

Multimodal Displays

Multimodal displays combine multiple sensory channels (vision, hearing, touch) to provide a more immersive experience. Distributing information across different senses reduces the risk of overloading any single channel and makes interaction more natural and effective.

Visual Perception in AR

AR systems must match human visual perception to ensure virtual objects look realistic. This includes correct depth perception, proper size, brightness, contrast, and accurate alignment. If these factors are not matched, virtual objects may appear to float or misalign, causing eye strain or dizziness.

Tracking Techniques

  • Marker-based Tracking: Uses pre-defined visual patterns (like QR codes) for high accuracy in small-scale environments.
  • Marker-less Tracking: Uses natural features (edges, textures) and SLAM (Simultaneous Localization and Mapping) to track the environment without pre-placed markers.
  • Sensor-based Tracking: Uses hardware sensors (accelerometers, gyroscopes) to measure device orientation and movement.
  • Vision-based Pose Tracking: Tracks rigid objects using cameras to detect their 3D pose.
  • Body & Skeleton Tracking: Tracks human body movements and gestures using depth sensors.
  • Hybrid Tracking: Combines multiple methods to improve accuracy and robustness.

Registration and Calibration

  • Registration: The process of accurately aligning virtual objects with real-world objects.
  • Calibration: Measuring and adjusting system parameters (camera, display, sensors) to ensure tracking and geometric relationships are accurate.
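
Registration and calibration meet in the camera projection: calibration provides the camera parameters, and registration uses them to decide where a virtual point lands on screen. The sketch below assumes a simple pinhole camera model without lens distortion. The function name project_point and the parameter values in the example are illustrative assumptions.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole camera model.

    rotation (3x3 nested list) and translation (3 values) form the camera pose
    supplied by tracking; fx, fy, cx, cy are intrinsics obtained by calibration.
    """
    # World -> camera coordinates: p_cam = R * p_world + t
    x, y, z = (
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z <= 0:
        return None                      # point is behind the camera
    # Perspective divide, then scale and shift into pixel coordinates.
    return (fx * x / z + cx, fy * y / z + cy)


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
u, v = project_point((0.1, -0.05, 2.0), identity, (0, 0, 0),
                     fx=800, fy=800, cx=320, cy=240)
```

If the calibrated intrinsics or the tracked pose are wrong, the projected pixel is wrong too, which is the visible misalignment that registration tries to avoid.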

Homogeneous Coordinate System

The homogeneous coordinate system extends Cartesian coordinates (x, y, z) by adding an extra coordinate (w), allowing all geometric transformations (translation, rotation, scaling) to be expressed as matrix multiplications. This is essential for computer graphics, animation, and perspective projection.
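
As a worked illustration, translation cannot be written as a 3x3 matrix acting on (x, y, z), but it becomes a matrix product once w = 1 is appended. The symbols t_x, t_y, t_z are notation introduced here for the translation offsets.

```latex
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & t_x \\
0 & 1 & 0 & t_y \\
0 & 0 & 1 & t_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} x + t_x \\ y + t_y \\ z + t_z \\ 1 \end{bmatrix},
\qquad
(x, y, z, w) \;\mapsto\; \left( \frac{x}{w}, \frac{y}{w}, \frac{z}{w} \right) \quad (w \neq 0)
```

The mapping on the right converts a homogeneous point back to Cartesian coordinates. In perspective projection, w carries the depth, so this division produces the foreshortening.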

2D Transformations

  • Translation: Moving an object from one position to another.
  • Rotation: Turning an object about a fixed point (pivot point) by an angle.
  • Scaling: Changing the size of an object by enlarging or shrinking it.
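
The three transformations above can be written as 3x3 homogeneous matrices and chained by multiplication. The sketch below is a minimal pure-Python version with helper names chosen for this example. It shows rotation about a pivot as "move the pivot to the origin, rotate, move back".

```python
import math


def translate(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def rotate(theta):
    """Counter-clockwise rotation about the origin; theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def scale(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


def matmul(a, b):
    """3x3 matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, point):
    """Apply a homogeneous 3x3 matrix to a 2D point."""
    x, y = point
    xh = m[0][0] * x + m[0][1] * y + m[0][2]
    yh = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (xh / w, yh / w)


def rotate_about(theta, pivot):
    """Rotation about an arbitrary pivot: T(pivot) * R(theta) * T(-pivot)."""
    px, py = pivot
    return matmul(translate(px, py), matmul(rotate(theta), translate(-px, -py)))


moved = apply(translate(3, 4), (1, 1))
turned = apply(rotate_about(math.pi / 2, (1, 1)), (2, 1))
```

Here moved is the point (1, 1) shifted by (3, 4), and turned is the point (2, 1) rotated a quarter turn about the pivot (1, 1), which lands at approximately (1, 2).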

Geometric Modeling

Geometric modeling is the process of creating a mathematical representation of an object's shape and structure. It involves creating basic primitives (points, lines, polygons), applying transformations, and combining them into a complete model for rendering or simulation.

Window to Viewport Transformation

This process maps a selected area of the world-coordinate scene (the window) onto a specified area of the display device (the viewport). It scales and positions objects so that their relative placement inside the window is preserved on screen. Shapes remain undistorted only when the window and the viewport have the same aspect ratio.
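
The mapping is a scale followed by an offset on each axis. The sketch below assumes both window and viewport are given as (x_min, y_min, x_max, y_max) tuples, a convention chosen for this example.

```python
def window_to_viewport(xw, yw, window, viewport):
    """Map a world-coordinate point inside the window to viewport coordinates."""
    xw_min, yw_min, xw_max, yw_max = window
    xv_min, yv_min, xv_max, yv_max = viewport
    # Scale factors: viewport extent divided by window extent, per axis.
    sx = (xv_max - xv_min) / (xw_max - xw_min)
    sy = (yv_max - yv_min) / (yw_max - yw_min)
    # Offset from the window corner, scaled, then shifted to the viewport corner.
    return (xv_min + (xw - xw_min) * sx, yv_min + (yw - yw_min) * sy)


# The centre of a 10 x 10 window maps to the centre of an 800 x 600 viewport.
centre = window_to_viewport(5, 5, window=(0, 0, 10, 10), viewport=(0, 0, 800, 600))
```

In this example sx is 80 and sy is 60, so a square in the window would appear stretched horizontally. Equal scale factors keep proportions intact.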

Virtual Reality (VR) Concepts

VR is a 3D technology that simulates sensory experiences to give the user a feeling of presence. It is based on the 3 I’s:

  • Immersion: The feeling of being completely involved in the virtual environment.
  • Interaction: The ability to control and manipulate virtual objects.
  • Imagination: The potential to design environments beyond real-world limitations.

VR System Components

  • Hardware: VR engine (computer), input devices (controllers, gloves), output devices (HMDs, audio), and tracking systems.
  • Software: Application software, databases for 3D assets, and development tools (e.g., Unity, Unreal, Maya).

Display Technology: LCD vs. OLED

| Property          | LCD                     | OLED                   |
|-------------------|-------------------------|------------------------|
| Light Source      | Uses backlight          | Self-emissive pixels   |
| Thickness         | Thicker display         | Thinner and lighter    |
| Contrast          | Lower contrast          | Very high contrast     |
| Viewing Angle     | Limited viewing angle   | Wide viewing angle     |
| Power Consumption | Higher due to backlight | Lower, power-efficient |
| Response Time     | Slower response         | Faster response        |

Eye Movements and VR

Human eye movements (saccades, smooth pursuit, vergence) are essential for stable vision. Mismatches between these natural movements and VR display behavior (such as the Vergence-Accommodation Conflict) can lead to eye strain, headaches, and motion sickness.
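
The Vergence-Accommodation Conflict can be put in numbers. The eyes rotate inward by an angle that depends on the distance of the virtual object, while the lenses stay focused at the fixed optical distance of the display. The sketch below assumes an interpupillary distance of 63 mm and a display focal distance of 2 m. Both values are illustrative assumptions, not properties of any particular headset.

```python
import math


def vergence_angle_deg(ipd_m, distance_m):
    """Angle between the two lines of sight when fixating a point straight ahead."""
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))


def va_conflict_diopters(vergence_distance_m, focal_distance_m):
    """Mismatch between where the eyes converge and where they focus, in diopters."""
    return abs(1.0 / vergence_distance_m - 1.0 / focal_distance_m)


IPD = 0.063          # assumed interpupillary distance in metres
FOCAL = 2.0          # assumed fixed focal distance of the display optics in metres

near_angle = vergence_angle_deg(IPD, 0.5)
far_angle = vergence_angle_deg(IPD, 2.0)
near_conflict = va_conflict_diopters(0.5, FOCAL)
far_conflict = va_conflict_diopters(2.0, FOCAL)
```

Under these assumptions, a virtual object at 0.5 m demands a noticeably larger vergence angle than one at 2 m and produces a 1.5 diopter mismatch. An object placed at the focal distance produces none.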

Frame Rate and Latency

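VR headsets commonly target 90 FPS or higher to maintain a smooth experience. Low frame rates cause judder and visible stutter, and high motion-to-photon latency (the delay between a head movement and the matching image update, for which roughly 20 ms is an often-cited upper target) makes the scene lag behind the head and can cause nausea. High-quality VR displays must prioritize high refresh rates and low latency to ensure immersion.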
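
The frame rate sets a time budget per frame, and the stages between head movement and displayed image add up to the motion-to-photon latency. The arithmetic is sketched below. The per-stage figures are hypothetical values for illustration only, not measurements.

```python
def frame_budget_ms(fps):
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def motion_to_photon_ms(stage_latencies_ms):
    """Sum the per-stage delays between a head movement and the updated image."""
    return sum(stage_latencies_ms.values())


# Hypothetical per-stage delays in milliseconds (illustrative assumptions).
example_stages = {
    "sensor_read": 1.0,
    "tracking": 2.0,
    "render": 11.0,
    "display_scanout": 5.0,
}

for fps in (60, 90, 120):
    print(f"{fps} FPS -> {frame_budget_ms(fps):.2f} ms per frame")
print(f"motion-to-photon: {motion_to_photon_ms(example_stages):.1f} ms")
```

At 90 FPS the whole frame must be finished in about 11.1 ms, so rendering alone already consumes nearly the full budget in this example.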

Depth Perception in VR

Depth perception is achieved through monocular cues (size, height, motion parallax) and binocular cues (stereopsis). In VR, these cues must be accurately rendered to provide a realistic sense of distance and spatial relationships.

Resolution in VR

High resolution is critical in VR because displays are placed very close to the eyes and magnified by lenses. Low resolution leads to the "screen door effect," where individual pixels and the gaps between them become visible, reducing immersion and causing eye strain. High-resolution displays require powerful GPUs and high memory bandwidth, because every pixel must be rendered for both eyes at the full frame rate.
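
Two quick calculations show why resolution is demanding in VR: angular pixel density (pixels per degree of field of view) and the number of pixels the GPU must shade each second. The headset figures below are hypothetical, and the pixels-per-degree value is an average that ignores lens distortion.

```python
def pixels_per_degree(horizontal_pixels_per_eye, horizontal_fov_deg):
    """Average angular pixel density across the horizontal field of view."""
    return horizontal_pixels_per_eye / horizontal_fov_deg


def pixels_per_second(width_per_eye, height_per_eye, fps, eyes=2):
    """Pixels that must be rendered each second for all eyes."""
    return width_per_eye * height_per_eye * eyes * fps


# Hypothetical headset: 2000 x 2000 pixels per eye, 100 degree FOV, 90 FPS.
ppd = pixels_per_degree(2000, 100)
throughput = pixels_per_second(2000, 2000, 90)
```

For this hypothetical headset the density is 20 pixels per degree, well below the roughly 60 pixels per degree often cited as matching 20/20 visual acuity, while the GPU already has to fill 720 million pixels per second.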

Orientation Tracking

Orientation tracking measures rotational motion (yaw, pitch, roll) using sensors such as gyroscopes, accelerometers, and magnetometers. Sensor fusion algorithms (e.g., Kalman filters) combine this data so that gyroscope drift is corrected by the accelerometer and magnetometer, giving the stable, low-drift tracking that is essential for preventing motion sickness in VR.
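
The idea behind sensor fusion can be shown with a complementary filter, which is a simpler technique than the Kalman filter mentioned above and is used here only because it fits in a few lines. The sketch tracks a single angle (pitch) for a stationary device whose gyroscope has a constant bias. The bias, time step, and blend factor alpha are illustrative assumptions, and the accelerometer reading is idealised as noise-free.

```python
def complementary_filter(angle_prev, gyro_rate, accel_angle, dt, alpha=0.98):
    """Blend the integrated gyroscope angle with the accelerometer angle."""
    gyro_angle = angle_prev + gyro_rate * dt          # responsive, but drifts over time
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle  # accelerometer pulls drift back


def simulate(steps=1000, dt=0.01, true_pitch=0.1, gyro_bias=0.01):
    """Stationary device: compare raw gyro integration with the fused estimate."""
    gyro_only = true_pitch
    fused = true_pitch
    for _ in range(steps):
        gyro_rate = 0.0 + gyro_bias      # no real rotation, so only the bias is measured
        accel_angle = true_pitch         # idealised, noise-free gravity-based angle
        gyro_only += gyro_rate * dt
        fused = complementary_filter(fused, gyro_rate, accel_angle, dt)
    return gyro_only, fused


gyro_only, fused = simulate()
print(f"true pitch 0.100 rad, gyro only {gyro_only:.3f} rad, fused {fused:.3f} rad")
```

After ten simulated seconds the gyro-only estimate has drifted by 0.1 rad and keeps growing, while the fused estimate settles at a small bounded error. A Kalman filter pursues the same goal but weights the sensors according to their estimated uncertainty.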
