Aria Pilot Dataset Overview

Writing Sample

This page is an archive of technical writing I did for Project Aria. For the most up-to-date documentation, go to Project Aria Docs.

The Aria Pilot Dataset is the first open dataset captured using Project Aria, Meta’s research device for accelerating machine perception and AI research, developed at Reality Labs Research.

The dataset provides sequences collected with Project Aria devices from a variety of egocentric scenarios, including cooking, exercising, playing games, and spending time with friends, for researchers to engage with the challenges of always-on egocentric vision.

In addition to providing raw sensor data from Project Aria devices, the Aria Pilot Dataset contains derived results from machine perception services. These results provide additional context for the spatial-temporal reference frames, such as:

  • Per-frame eye tracking
  • Accurate 3D trajectories of users across multiple everyday activities in the same location
  • Shared space-time information between multiple wearers
  • Speech-to-text annotation
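As a rough illustration, per-frame annotations of this kind can be consumed as simple tables alongside the recordings. The sketch below reads one with pandas; the file name and column names are illustrative assumptions, not the dataset's actual schema.

    # Minimal sketch: reading a per-frame eye-tracking table with pandas.
    # "eye_gaze.csv", "timestamp_ns", "gaze_x", and "gaze_y" are hypothetical
    # names used for illustration; check the dataset documentation for the
    # real file layout and schema.
    import pandas as pd

    gaze = pd.read_csv("eye_gaze.csv")
    # Assume one row per frame: a device timestamp plus a 2D gaze point.
    for _, row in gaze.head(5).iterrows():
        print(f"t={row['timestamp_ns']} ns  gaze=({row['gaze_x']}, {row['gaze_y']})")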

The dataset is extensive, providing:

  • 143 recordings for Everyday Activities
  • 16 recordings for Desktop Activities
  • Over 2.1 million images
  • Over 7.5 accumulated hours of recording

The dataset is split into two subsets:

  • Everyday Activities: Multiple activity sequences in which one or two users wearing Project Aria devices participate in scenarios to capture time-synchronized data in a shared world location.
  • Desktop Activities: Multiple object tracking sequences using one Project Aria device synchronized with a multi-view motion capture system.

A further subset, covering outdoor activities, is planned for release in the near future. This subset will also include data recorded using Sensor Profile 10, which adds GPS, Wi-Fi, and Bluetooth data.

Go to the Project Aria website to access the Aria Pilot Dataset.

Everyday Activities

Figure 1: Shared 3D Global Trajectories for Multi-User Activities in the Same Location

The main dataset contains multiple activity sequences for one to two Project Aria device wearers. Each wearer followed scripts that represented a typical scenario people might encounter throughout their day. The scripts provided the wearers with prompts they used while collecting data.

In addition to the raw sensor data, we’ve provided derived metadata for each sequence: per-frame eye tracking, 3D trajectories, shared space-time information between wearers, and speech-to-text annotation.

The data has been gathered across five indoor locations. Data for each location is stored in its own folder.

Desktop Activities

For this subset of the dataset, a Project Aria wearer manipulated a set of objects on a desktop while being recorded by a multi-view motion capture system. Data from the Project Aria device is synchronized with the motion capture system to provide additional viewpoints and ground-truth motion. Most objects were selected from the YCB Object Benchmark.

Figure 2: Object Sorting & Tidying Multi-View
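As a rough illustration of what this synchronization enables, the sketch below matches each Aria frame timestamp to the nearest motion-capture sample with NumPy. The timestamps are synthetic and assume both devices already share a clock domain; the dataset's actual alignment mechanism may differ.

    import numpy as np

    # Synthetic timestamps (nanoseconds) standing in for real data:
    # ~30 Hz Aria RGB frames and a ~120 Hz motion-capture stream.
    aria_ts = np.arange(0, 1_000_000_000, 33_333_333)
    mocap_ts = np.arange(0, 1_000_000_000, 8_333_333)

    # For each Aria frame, find the index of the nearest mocap sample in time.
    pos = np.clip(np.searchsorted(mocap_ts, aria_ts), 1, len(mocap_ts) - 1)
    left, right = mocap_ts[pos - 1], mocap_ts[pos]
    nearest = np.where(aria_ts - left <= right - aria_ts, pos - 1, pos)
    print(nearest[:5])  # mocap indices matched to the first five Aria frames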

How to Use the Dataset

The Aria Pilot Dataset has been optimized to work with Aria Research Kit: Aria Data Tools.

You can also work with this data using standard VRS commands.
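For instance, the open-source VRS project ships Python bindings (pyvrs, installed with pip install vrs) that can open a recording and iterate its records directly. This is a minimal sketch; the reader API shown follows the pyvrs documentation, but verify the attribute names against the release you install.

    # Minimal sketch using pyvrs to inspect an Aria .vrs recording.
    from pyvrs import SyncVRSReader

    reader = SyncVRSReader("recording.vrs")  # path to a dataset .vrs file
    print(reader.stream_ids)                 # sensor streams in the file
    # Records (configuration, state, data) come back in timestamp order.
    for record in reader:
        print(record.stream_id, record.record_type, record.timestamp)
        break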

Privacy

All sequences within the Aria Pilot Dataset were captured using fully consented actors in controlled environments. Bystander and bystander-vehicle data were strictly avoided during collection. For Desktop Activities recordings, the actor wore a mask. For Everyday Activities, faces were blurred prior to public release.

View Meta's principles of responsible innovation

License

The Aria Pilot Dataset is released by Meta under the Dataset License Agreement.