Buckets:

manughelfi
/

PhysicalAI-Autonomous-Vehicles-bucket

9.41 TB

25,004 files

Updated 6 days ago

Name	Size	Uploaded	Xet hash
calibration		6 days ago	+10k items
camera		6 days ago	6,124 items
.gitattributes	2.46 kB xet	6 days ago	19463de8
LICENSE.pdf	89.6 kB xet	6 days ago	c2276cfa
README.md	32.2 kB xet	6 days ago	6ce5d7db
features.csv	7.08 kB xet	6 days ago	7f888651

PHYSICAL AI AUTONOMOUS VEHICLES

The PhysicalAI-Autonomous-Vehicles dataset provides one of the largest and most geographically diverse collections of multi-sensor data, empowering AV researchers to build the next generation of Physical AI-based end-to-end driving systems. This dataset is ready for commercial/non-commercial AV use per the license agreement.

Data Collection Method
- Automatic/Sensor
Labeling Method
- Automatic/Sensor

This dataset has a total of 1700 hours of driving recorded from planned data-collection drives in 25 countries and 2500+ cities. The data captures diverse traffic, weather conditions, obstacles, and pedestrians in the environment. It consists of 306,152 clips that are each 20 seconds long. The sensor data includes multi-camera coverage for all 306,152 clips, LiDAR coverage for 298,326 clips, and radar coverage for 160,761 clips.

A subset (~41K) of this dataset is immediately explorable through a CDS Preview Experience (requires registration).

Intended Usage

This dataset may be used only for autonomous-vehicle-related use cases, commercial or non-commercial, provided the license terms are followed. The size and diversity of this multi-sensor dataset make it well suited to research on end-to-end driving, neural reconstruction, synthetic data generation, scenario mining, and many other autonomous vehicle applications.

Developer Tooling

A Python developer kit that supports application workflows, along with additional data format documentation, is available at https://github.com/NVlabs/physical_ai_av. On systems with Python >= 3.11, the package can be installed directly using:

pip install physical_ai_av

This package provides direct download capabilities from Hugging Face. To authenticate the system for direct download access:

1. Create a Hugging Face account (if you don't have one already).
2. Log in and agree to the NVIDIA Autonomous Vehicle Dataset License Agreement visible at the top of the PhysicalAI AV dataset card.
3. Create a User Access Token (if you don't have one already) and choose a method for authentication.

For data mining and curation, NVIDIA also provides tools like Cosmos Dataset Search (CDS) for multimodal semantic search with text and video queries. A subset of this dataset is explorable through a CDS Preview Experience.

Applications

Developing a model end to end for safe use in the real world requires a combination of massive domain-specific data operations as well as training and testing activities. This dataset enables a number of workflows to bootstrap or expand such development, including but not limited to:

Scenario mining and analysis
Neural reconstruction
Synthetic data generation
Supervised fine tuning
Reinforcement learning

NVIDIA is providing a set of workflows for these activities built upon NVIDIA’s Physical AI AV Dataset for NuRec-based reconstruction, supervised fine tuning, and reinforcement learning. They can be found here:

Alpamayo NuRec development workflows

Note: We currently support NuRec-based reconstruction only for the Hyperion 8.1 sensor rig. Compatible clips can be identified by filtering the data collection parquet (metadata/data_collection.parquet) for the value hyperion_8.1 in the platform_class field.

{
    # Other Metadata Entries
    # Vehicle Platform
    'platform_class': str, # platform (hyperion_8/8.1)
}
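
As a rough sketch of the filtering described in the note above, the snippet below keeps only the rows whose platform_class equals hyperion_8.1. It works on plain dictionaries so that it runs without extra dependencies. The `clip_id` key and the `load_records` helper are assumptions made for illustration; the actual schema of the data collection parquet is documented in the developer kit wiki and may differ.

```python
def load_records(parquet_path):
    """Load a metadata parquet as a list of dicts (assumes pandas and pyarrow are installed)."""
    import pandas as pd  # imported lazily so the filter below has no hard dependency

    return pd.read_parquet(parquet_path).reset_index().to_dict("records")


def nurec_compatible(records, platform="hyperion_8.1"):
    """Keep the rows whose platform_class matches the supported sensor rig."""
    return [row for row in records if row.get("platform_class") == platform]


# Tiny in-memory stand-in for metadata/data_collection.parquet.
# "clip_id" is a hypothetical column name used only for this example.
sample = [
    {"clip_id": "clip-a", "platform_class": "hyperion_8"},
    {"clip_id": "clip-b", "platform_class": "hyperion_8.1"},
]
selected = nurec_compatible(sample)
print([row["clip_id"] for row in selected])
```

On real data, the list returned by `load_records` could be passed to `nurec_compatible` in place of `sample`.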

Additionally, we do not include open maps data. For now, scenes are not compatible with CARLA unless users generate their own XODR data. However, we are looking to add XODR in the future to enable simulation in CARLA, AlpaSim, and others.

Version History

Version Notes

26.03 Additional signal processing to enable the NCore-based NuRec reconstruction workflow and the SFT and RL workflows for the majority (97%) of clips. Specific changes:
- Renamed/updated metadata/sensor_presence.parquet -> metadata/feature_presence.parquet to match features.csv; radar_config column now moved to metadata/data_collection.parquet.
- Added offline-optimized features egomotion.offline, obstacle.offline, camera_intrinsics.offline, lidar_intrinsics.offline, sensor_extrinsics.offline for 97% of clips.
- Updated lidar_top_360fov for 97% of clips; per-clip lidar presence is now recorded in metadata/feature_presence.parquet.

25.10 Initial release of dataset


Previous versions are tagged and associated data remains accessible through the HuggingFace repo history.

License/Terms of Use

NVIDIA Autonomous Vehicle Dataset License Agreement

Dataset Owner(s)

NVIDIA Corporation

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns here.

DATASET

Dataset Quantification

This dataset has a total of 1700 hours of driving recorded from planned data-collection drives in 25 countries and 2500+ cities. The data captures diverse traffic, weather conditions, obstacles, and pedestrians in the environment. It consists of 306,152 clips that are each 20 seconds long. The sensor data includes multi-camera (7) coverage for all clips, LiDAR (1) coverage for 298,326 clips, and radar (up to 10) coverage for 160,761 clips. Ego motion, calibration, and machine labels are also part of the set.

The total size of the dataset is 133 TB.
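
The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the clip counts and clip length stated in this section and derives the total driving time and the share of clips with LiDAR and radar coverage.

```python
CLIPS = 306_152        # total clips, all with camera coverage
CLIP_SECONDS = 20      # length of each clip
LIDAR_CLIPS = 298_326  # clips with LiDAR coverage
RADAR_CLIPS = 160_761  # clips with radar coverage

total_hours = CLIPS * CLIP_SECONDS / 3600
lidar_share = LIDAR_CLIPS / CLIPS
radar_share = RADAR_CLIPS / CLIPS

print(f"{total_hours:.1f} h")      # 1700.8 h
print(f"{lidar_share:.1%} LiDAR")  # 97.4% LiDAR
print(f"{radar_share:.1%} radar")  # 52.5% radar
```

The result agrees with the stated total of 1700 hours of driving.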

Dataset Diversity: Regional

country	count	country	count	country	count
United States	155360	Croatia	4961	Romania	2719
Germany	43900	Netherlands	4932	Luxembourg	2620
France	10364	Denmark	4581	Latvia	2173
Italy	8658	Slovenia	4301	Hungary	1960
Sweden	7330	Estonia	4128	Bulgaria	932
Spain	6459	Slovakia	4122
Portugal	6101	Belgium	3753
Greece	5885	Czechia	3662
Austria	5451	Lithuania	3392
Finland	5176	Poland	3232

Dataset Diversity: Environmental and Traffic

Traffic density patterns: no traffic, light traffic, medium traffic, and heavy traffic
Road types: highways, urban, residential, and rural roads
Weather: clear, rain, snow, fog
Surface conditions: dry, wet, snow/ice
Time-of-day: daytime, nighttime
Infrastructure elements such as tunnels, bridges, roundabouts, railway crossings, toll booths, inclines, and more

Sensor Data Structure

Data Format

We store the data separately for each sensor (camera, LiDAR and radar). Besides these sensors we also provide ego motion, calibration data, autogenerated (non-GT) machine labels, and other metadata.

Because of the significant size of this dataset, we provide all features (sensor data and autolabels) in chunks of up to 100 clips each. The exception is clip-level metadata, which is not chunked so that researchers can use it to identify the subset of chunks worth downloading for their target application. Significant storage space and bandwidth savings may be achieved by downloading only the chunks corresponding to a subset of sensors, country of collection, dataset split, etc.
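
A minimal sketch of this selective-download idea follows. It assumes that the clip-level metadata can be turned into a mapping from clip to chunk index; that mapping and the clip names here are hypothetical. The file name pattern with a four-digit chunk index follows the directory listings in this README.

```python
import math


def chunk_filename(feature, chunk_id, ext="zip"):
    """Build a chunk file name such as camera_front_wide_120fov.chunk_0000.zip."""
    return f"{feature}.chunk_{chunk_id:04d}.{ext}"


def chunks_to_download(clip_to_chunk, wanted_clips, feature, ext="zip"):
    """Return the sorted, de-duplicated chunk files that cover the wanted clips."""
    chunk_ids = {clip_to_chunk[clip] for clip in wanted_clips}
    return [chunk_filename(feature, cid, ext) for cid in sorted(chunk_ids)]


# With at most 100 clips per chunk, 306,152 clips need at least this many chunks per feature.
min_chunks = math.ceil(306_152 / 100)

# Hypothetical clip -> chunk index mapping, standing in for the clip-level metadata.
clip_to_chunk = {"clip-a": 0, "clip-b": 0, "clip-c": 17}
print(chunks_to_download(clip_to_chunk, ["clip-a", "clip-c"], "camera_front_wide_120fov"))
```

Clips that share a chunk are fetched once, which is where the bandwidth savings come from when the selection is concentrated in few chunks.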

A Python developer kit to support such workflows and additional data format documentation is available at https://github.com/NVlabs/physical_ai_av.

Sensor Data Organization

We detail the general data file organization below. Specific schema descriptions for the parquet files are maintained in the Physical AI Python library wiki:

https://github.com/NVlabs/physical_ai_av/wiki

Camera Data

This sensor captures visual RGB data (i.e., videos) from multiple viewpoints around the vehicle. In our dataset the following seven cameras are included:

Cross left 120 fov
Cross right 120 fov
Front wide 120 fov
Front tele 30 fov
Rear left 70 fov
Rear right 70 fov
Rear tele 30 fov

Directory structure

camera/
├─ camera_front_wide_120fov/
│  ├─ camera_front_wide_120fov.chunk_0000.zip
│  └─ ...
└─ camera_cross_left_120fov/
   └─ ...

Each chunk_xxxx.zip contains approximately 100 1080p mp4 files recorded at 30 fps. Each mp4 is named <clip_uuid>.camera_<field_of_view>.mp4. Users can use this UUID to match corresponding views and sensors (provided there is coverage) under the designated sensor directories. Each chunk also contains frame-timestamp parquets for the camera mp4 files, with the clip UUID in their names.
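
As an illustration of using the UUID prefix to match files, the sketch below groups the members of a chunk archive by clip UUID using only the standard library. The archive is built in memory with dummy contents, and the timestamp parquet name is a hypothetical placeholder, since the README only states that such files carry the UUID in their name.

```python
import io
import zipfile
from collections import defaultdict


def group_by_clip(zip_file):
    """Map clip UUID -> list of member names inside a chunk archive."""
    groups = defaultdict(list)
    with zipfile.ZipFile(zip_file) as archive:
        for name in archive.namelist():
            base = name.rsplit("/", 1)[-1]     # ignore any folder prefix
            if not base:                       # skip directory entries
                continue
            clip_uuid = base.split(".", 1)[0]  # text before the first dot
            groups[clip_uuid].append(base)
    return dict(groups)


# Build a tiny in-memory archive that mimics the naming scheme (contents are dummies).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("1111.camera_front_wide_120fov.mp4", b"")
    archive.writestr("1111.camera_front_wide_120fov.timestamps.parquet", b"")  # hypothetical name
    archive.writestr("2222.camera_front_wide_120fov.mp4", b"")

buffer.seek(0)
print(group_by_clip(buffer))
```

The same function accepts a path to a downloaded chunk file instead of the in-memory buffer.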

LiDAR Data

This directory contains 3D point cloud data recorded using a top 360 degree rotating LiDAR.

├─ lidar/
     └─ lidar_top_360fov/
            ├─ lidar_top_360fov.chunk_0000.zip
            ├─ ...
            └─ lidar_top_360fov.chunk_XXXX.zip

Inside lidar_top_360fov.chunk_0000.zip, there are approximately 100 LiDAR parquet files. Each parquet is named <clip_uuid>.lidar_top_360fov.parquet and contains approximately 200 LiDAR spins (i.e., a 10 Hz capture rate over a 20-second clip).

The point cloud can be decoded, e.g., by using the DracoPy library.
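
The sketch below shows the nominal sample counts implied by the capture rates and a thin wrapper around DracoPy for decoding one spin. It assumes a recent DracoPy release that exposes `DracoPy.decode`, and it assumes the caller has already extracted the Draco-encoded bytes for a spin from the parquet; which column holds those bytes is not stated here, so see the wiki for the schema.

```python
def expected_samples(rate_hz, clip_seconds=20):
    """Nominal number of samples in one clip at a fixed capture rate."""
    return round(rate_hz * clip_seconds)


def decode_spin(draco_bytes):
    """Decode one Draco-compressed spin into an array of points (requires DracoPy)."""
    import DracoPy  # imported lazily: only needed when actually decoding

    return DracoPy.decode(draco_bytes).points


print(expected_samples(10))  # LiDAR spins per clip -> 200
print(expected_samples(30))  # camera frames per clip -> 600
```

Actual counts per clip are approximate, as noted above, so these numbers are best treated as a sanity check rather than an exact expectation.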

Radar Data

This folder contains 3D radar point cloud data recorded using up to 10 radars located at the front bumper center, front left corner, front right corner, left side, right side, rear left corner, rear right corner, rear left, and rear right.

 radar/
   ├─ radar_corner_front_left_srr_0/
   │  ├─ radar_corner_front_left_srr_0.chunk_0000.zip  
   │  ├─ ...
   │  └─ radar_corner_front_left_srr_0.chunk_xxxx.zip
   ├─ radar_corner_front_right_srr_0/   
   └─ ...

Inside chunk_XXXX.zip, there are approximately 100 radar parquet files. Each parquet will be named <clip_uuid>.radar_<field_of_view>_<configuration>.parquet. The letters srr stand for short range radar, mrr for medium range radar, and lrr for long range radar.

Unlike other sensors, the radar at a given field of view can be a different model from clip to clip. Therefore, the zip file names end with a numerical suffix such as srr_0 or srr_3 that denotes the radar model.
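
The naming scheme above can be unpacked with a regular expression. This is a sketch under the assumption that every radar file follows the <clip_uuid>.radar_<field_of_view>_<configuration>.parquet pattern, with the configuration made of a range class (srr, mrr, or lrr) and a model index. The example clip UUID is a made-up placeholder.

```python
import re

RADAR_NAME = re.compile(
    r"^(?P<clip_uuid>[^.]+)\."
    r"(?P<sensor>radar_(?P<position>.+)_(?P<range_class>srr|mrr|lrr)_(?P<model>\d+))"
    r"\.parquet$"
)

RANGE_LABELS = {"srr": "short range", "mrr": "medium range", "lrr": "long range"}


def parse_radar_name(filename):
    """Split a radar parquet name into clip UUID, sensor, position, range class, model index."""
    match = RADAR_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected radar file name: {filename}")
    info = match.groupdict()
    info["model"] = int(info["model"])
    info["range_label"] = RANGE_LABELS[info["range_class"]]
    return info


info = parse_radar_name("1111.radar_corner_front_left_srr_0.parquet")
print(info["position"], info["range_label"], info["model"])
```

Grouping files by `position` while keeping `model` separate makes it easy to see when the same mounting location uses different radar models across clips.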

Calibration Data

Contains [sensor]_intrinsics, sensor_extrinsics, and vehicle_dimensions directories. Offline versions of this data are marked with “.offline”. The directories contain chunked parquets.

calibration/
     └─ camera_intrinsics.offline/
            ├─ camera_intrinsics.offline.chunk_0000.parquet
            ├─ ...
            └─ camera_intrinsics.offline.chunk_xxxx.parquet
     └─ camera_intrinsics/
            ├─ camera_intrinsics.chunk_0000.parquet
            ├─ ...
            └─ camera_intrinsics.chunk_xxxx.parquet
     └─ lidar_intrinsics.offline/
            ├─ lidar_intrinsics.offline.chunk_0000.parquet
            ├─ ...
            └─ lidar_intrinsics.offline.chunk_xxxx.parquet
     └─ sensor_extrinsics.offline/
            ├─ sensor_extrinsics.offline.chunk_0000.parquet
            ├─ ...
            └─ sensor_extrinsics.offline.chunk_xxxx.parquet
     └─ sensor_extrinsics/
            ├─ sensor_extrinsics.chunk_0000.parquet
            ├─ ...
            └─ sensor_extrinsics.chunk_xxxx.parquet
     └─ vehicle_dimensions/
            ├─ vehicle_dimensions.chunk_0000.parquet
            ├─ ...
            └─ vehicle_dimensions.chunk_xxxx.parquet

Labels

Contains egomotion, egomotion.offline, and obstacle.offline directories. This data is expressed in a local coordinate frame that is consistent across all timestamps. Its origin is the ego vehicle's position at timestamp 0, and it is oriented so that yaw is 0 at timestamp 0, while attitude (pitch and roll) is estimated with respect to gravity.

├─ labels/
     └─ egomotion.offline/
            ├─ egomotion.offline.chunk_0000.parquet
            ├─ ...
            └─ egomotion.offline.chunk_xxxx.parquet
     └─ egomotion/
            ├─ egomotion.chunk_0000.parquet
            ├─ ...
            └─ egomotion.chunk_xxxx.parquet
     └─ obstacle.offline/
            ├─ obstacle.offline.chunk_0000.parquet
            ├─ ...
            └─ obstacle.offline.chunk_xxxx.parquet
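
To make the frame convention described at the top of this section concrete, here is a simplified planar sketch that re-expresses a list of poses relative to a chosen reference pose, so that the reference sits at the origin with zero yaw. It handles only x, y, and yaw and ignores pitch and roll, so it is a 2D illustration, not the dataset's processing. The tuple layout is an assumption made for this example.

```python
import math


def to_local_frame(poses, ref_index=0):
    """Re-express (x, y, yaw) poses relative to poses[ref_index] (planar simplification)."""
    x0, y0, yaw0 = poses[ref_index]
    cos_r, sin_r = math.cos(-yaw0), math.sin(-yaw0)
    local = []
    for x, y, yaw in poses:
        dx, dy = x - x0, y - y0
        # Rotate the offset by -yaw0 so the reference heading becomes the +x axis.
        lx = cos_r * dx - sin_r * dy
        ly = sin_r * dx + cos_r * dy
        # Wrap the relative yaw into [-pi, pi).
        lyaw = (yaw - yaw0 + math.pi) % (2 * math.pi) - math.pi
        local.append((lx, ly, lyaw))
    return local


# A vehicle facing +y that moves 1 m along +y has moved 1 m "forward" in its own frame.
poses = [(1.0, 1.0, math.pi / 2), (1.0, 2.0, math.pi / 2)]
for lx, ly, lyaw in to_local_frame(poses):
    print(round(lx, 3), round(ly, 3), round(lyaw, 3))
```

The same function can re-anchor a trajectory at a later timestamp by passing a different `ref_index`, which is a common step when expressing future motion relative to the current ego pose.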

Metadata

Contains feature_presence.parquet and data_collection.parquet files.

feature_presence.parquet: captures the sensor availability per clip.

data_collection.parquet: contains fields to filter clips by, e.g., the country where the clip was recorded, the month of the year, and the time of day.
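
The two metadata files are typically used together: one to pick clips by collection attributes, the other to check that the needed features exist for those clips. The sketch below shows that combination on plain dictionaries. The clip ids, the `country` key, and the per-feature boolean flags are assumptions for illustration and may not match the real parquet schemas.

```python
def select_clips(collection, presence, country, required_features):
    """Return clip ids recorded in `country` that have every required feature."""
    wanted = []
    for clip_id, info in collection.items():
        if info.get("country") != country:
            continue
        flags = presence.get(clip_id, {})
        if all(flags.get(feature, False) for feature in required_features):
            wanted.append(clip_id)
    return sorted(wanted)


# Hypothetical per-clip rows, keyed by clip id, standing in for the two parquet files.
collection = {
    "clip-a": {"country": "Germany"},
    "clip-b": {"country": "Germany"},
    "clip-c": {"country": "France"},
}
presence = {
    "clip-a": {"lidar_top_360fov": True, "camera_front_wide_120fov": True},
    "clip-b": {"lidar_top_360fov": False, "camera_front_wide_120fov": True},
    "clip-c": {"lidar_top_360fov": True, "camera_front_wide_120fov": True},
}
print(select_clips(collection, presence, "Germany", ["lidar_top_360fov"]))
```

Treating a missing flag as "not present" keeps the selection conservative, so a clip is only chosen when every requested feature is explicitly available.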

Reasoning Labels

This directory contains human-verified reasoning labels for a curated subset of Out-of-Distribution (OOD) driving scenarios. This release focuses on a highly specific, high-fidelity subset purely based on OOD driving scenarios. These annotations complement the physical trajectory data described in the Labels section by providing action-grounded, temporally aligned Chain of Causation (CoC) data intended to support Vision-Language-Action (VLA) model evaluation. The labels were processed through a rigorous human-in-the-loop curation and quality-control pipeline to verify OOD validity, temporal alignment, and reasoning quality. More details on this process are documented in the dataset wiki.

The CoC reasoning labels are provided as a single Parquet file for efficient processing.

├─ reasoning/
     └─ ood_reasoning.parquet

Dataset Statistics

Split	Clips	Reasoning Labels	Status	Included Annotations
Train	1450	1728	Released	Clip UUID, Event Cluster, Human-Refined CoC, Keyframes
Val	290	349	Released	Clip UUID, Event Cluster, Human-Refined CoC, Keyframes
Test	-	-	Held Out	Reserved for OOD Benchmark Challenge


Last updated: May 21
