
Wes Kim (Unyoung)


Motion Detection in VR using Machine Learning & Real-Time Inference

VR · Machine Learning · Real-Time Inference

Project Overview

This project explores machine learning for activity detection using VR tracking data. We collected sensor data from a Meta Quest 2 headset and its controllers, and developed classification algorithms to detect six activities: standing, sitting, jogging, arms chopping, arms stretching, and twisting.

Technical Implementation

We worked with a dataset of VR tracking data containing 36 sensor fields from the headset and left/right controllers, including velocity, angular velocity, position, and rotation measurements across x, y, and z dimensions. The dataset consisted of approximately 696 labeled samples (116 per activity) and 174 test samples.
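As a minimal sketch of how such a window of tracking data can be summarized, the snippet below computes the per-channel mean and variance over a window (the window shape, sample rate, and random data here are illustrative, not the actual dataset):

```python
import numpy as np

# Hypothetical window of tracking data: rows are time steps, columns are
# the 36 sensor fields (velocity, angular velocity, position, rotation
# across x/y/z for the headset and both controllers).
rng = np.random.default_rng(0)
window = rng.normal(size=(90, 36))  # e.g. 3 seconds at an assumed 30 Hz

def extract_features(window):
    """Summarize each sensor channel with its mean and variance."""
    means = window.mean(axis=0)
    variances = window.var(axis=0)
    return np.concatenate([means, variances])  # 72-dim feature vector

features = extract_features(window)
print(features.shape)  # (72,)
```

Per-channel means and variances are a common starting point for both rule-based thresholds and shallow-learning features.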

We implemented two classification approaches:

  • Statistical Threshold Classifier: A rule-based algorithm using means and variances of significant sensor attributes to classify activities without machine learning models
  • Shallow Learning Classifier: A machine learning approach using non-deep learning models (e.g., SVM, decision tree, random forest) with carefully engineered features extracted from the sensor data
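A minimal sketch of the shallow-learning approach, using scikit-learn's random forest on synthetic stand-in features (the real features came from the engineered Quest 2 statistics; the data below is fabricated purely to make the example runnable):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 696 samples (116 per activity), 72 engineered
# features, 6 activity labels.
rng = np.random.default_rng(42)
X = rng.normal(size=(696, 72))
y = np.repeat(np.arange(6), 116)
X[np.arange(696), y] += 3.0  # shift one feature per class so the demo is separable

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"validation accuracy: {clf.score(X_te, y_te):.2f}")
```

Swapping in an SVM or decision tree is a one-line change, which is what makes shallow models convenient for comparing feature sets.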

We extended the shallow learning classifier to support continuous activity detection using sliding window evaluation, enabling real-time detection of activity transitions as users switch between different movements.
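The sliding-window evaluation can be sketched as follows; the window length, stride, and placeholder classifier are assumptions for illustration:

```python
import numpy as np

def sliding_windows(trace, window_len, stride):
    """Yield (start_index, window) pairs over a continuous trace."""
    for start in range(0, len(trace) - window_len + 1, stride):
        yield start, trace[start:start + window_len]

# Hypothetical continuous trace: 300 time steps, 36 sensor channels.
trace = np.zeros((300, 36))

def classify(window):
    # Placeholder for the trained shallow classifier.
    return "standing"

predictions = [(start, classify(w))
               for start, w in sliding_windows(trace, window_len=90, stride=30)]
print(len(predictions))  # number of evaluated windows
```

Overlapping windows (stride smaller than the window length) let the system report a transition soon after it begins rather than waiting for a full non-overlapping window to elapse.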

Finally, we implemented real-time inference in Unity/C# using the statistical threshold approach, processing sensor data over a 3-second sliding window to detect and display the user's current activity within the VR environment.
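The actual implementation is in Unity/C#, but the threshold logic can be sketched language-agnostically; the rule ordering, channel names, and cutoff values below are illustrative, not the tuned values:

```python
def classify_window(stats):
    """Classify a 3-second window from its per-channel variance statistics.

    `stats` maps hypothetical channel names to variances over the window;
    the thresholds here are placeholders for the tuned values.
    """
    if stats["head_vel_y_var"] > 0.5:        # strong vertical head motion
        return "jogging"
    if stats["head_angvel_y_var"] > 1.0:     # torso/head rotation
        return "twisting"
    if stats["hand_vel_var"] > 0.8:          # vigorous hand movement
        return "arms chopping"
    return "standing"                        # low-motion fallback

print(classify_window({"head_vel_y_var": 0.7,
                       "head_angvel_y_var": 0.1,
                       "hand_vel_var": 0.2}))  # → jogging
```

Because each rule is a cheap comparison against precomputed window statistics, this approach adds negligible per-frame cost inside Unity's update loop.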

Results

The project involved comprehensive evaluation on validation sets, measuring precision and accuracy for each activity type, as well as latency performance. Both classification methods were tested on continuous sensing scenarios where activities transitioned within single data traces, demonstrating the system's ability to handle real-world usage patterns where users naturally switch between activities.

The real-time Unity implementation successfully detected activities with a maximum lag time of 3 seconds, providing immediate visual feedback to users in the VR environment. This project provided hands-on experience with machine learning for mobile sensing systems, feature engineering, model evaluation, and real-time inference optimization.

2025 — Wes Kim's Personal Website