Publication:

Exploring the Benefits of Multimodal Sensor Fusion in Autonomous Driving: A Comparative Study of Camera and LiDAR Using Transformer Architectures for Object Detection

dc.contributor.advisor: Kornhauser, Alain Lucien
dc.contributor.author: Doniger, Sammy
dc.date.accessioned: 2025-08-06T15:55:31Z
dc.date.available: 2025-08-06T15:55:31Z
dc.date.issued: 2025-04-09
dc.description.abstract: Accurate and robust object detection is critical for advancing autonomous driving systems. In recent years, transformer-based architectures have shown significant promise in this domain, offering improved performance over previous state-of-the-art technologies, largely due to their ability to handle long-range dependencies. This thesis explores the potential benefits of multimodal sensor fusion in autonomous driving by evaluating three transformer-based architectures for object detection tasks, each trained on the nuScenes dataset. The first model, TransFusion, integrates camera and LiDAR data within a unified transformer framework. The second model is a LiDAR-only variant, adapted from the TransFusion implementation to isolate the contribution from the LiDAR sensors. The third model, FCOS3D, is a camera-only model that isolates the contribution from the camera sensors. The primary goal of this research is to identify scenarios in which single-modality models (camera-only or LiDAR-only) produce conflicting detections and to analyze how the fusion-based approach handles these discrepancies. By closely examining these instances, the study evaluates whether LiDAR offers critical advantages over camera-only systems in consumer vehicles. Given the higher cost and complexity associated with LiDAR sensors, understanding whether these advantages justify the integration of LiDAR is vital for automotive manufacturers and researchers seeking to optimize safety, reliability, and system efficiency under cost constraints. Through extensive experimental evaluations, this thesis contributes insights into how multimodal fusion impacts object detection, revealing that while the LiDAR-only variant yields higher overall detection metrics in limited training environments, the camera-only approach excels at identifying near-range objects, and the fusion model effectively refines extraneous predictions. This synergy underscores trade-offs between cost and detection coverage, providing guidance for future sensor design and deployment strategies in the pursuit of a fully autonomous driving system.
dc.identifier.uri: https://theses-dissertations.princeton.edu/handle/88435/dsp01w9505392z
dc.language.iso: en_US
dc.title: Exploring the Benefits of Multimodal Sensor Fusion in Autonomous Driving: A Comparative Study of Camera and LiDAR Using Transformer Architectures for Object Detection
dc.type: Princeton University Senior Theses
dspace.entity.type: Publication
dspace.workflow.startDateTime: 2025-04-10T01:58:05.876Z
pu.contributor.authorid: 920269205
pu.date.classyear: 2025
pu.department: Ops Research & Financial Engr
pu.minor: Finance
pu.minor: Computer Science

Files

Original bundle
Name: SeniorThesis.pdf
Size: 10.98 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 100 B
Description: Item-specific license agreed to upon submission