PAPNet: Point-enhanced Attention-aware Pillar Network for 3D Object Detection in Autonomous Driving

Anonymous TASE submission

Abstract

Converting raw point clouds into pillar representations has been widely adopted for 3D object detection: discretizing a point cloud into structured grids enables more efficient spatial representation and faster processing in real-time autonomous driving systems. In existing pillar-based methods, however, this discretization often leads to the misdetection of small objects such as pedestrians and cyclists, because it inevitably discards the contextual and multi-resolution information contained in the raw points. To address this issue, we propose PAPNet, a point-enhanced attention-aware pillar network composed mainly of a point-pillar cross-attention module (PCM), a pillar-wise dual attention module (PDAM), and a multi-resolution set abstraction module (MSAM). PCM integrates raw point-cloud features with pillar features across different dimensions, while PDAM further guides the network to focus on the intrinsic characteristics of the pillars. Additionally, MSAM retains both high-resolution and low-resolution features while integrating multi-scale information. We conduct extensive experiments on three public datasets and in real-world scenarios to demonstrate the effectiveness and efficiency of PAPNet. Code Available Here.
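The abstract describes PCM as fusing raw point features with pillar features via cross-attention but gives no implementation details. As a rough illustration only, a point-to-pillar cross-attention fusion might look like the minimal PyTorch sketch below; every class name, dimension, and the residual-style fusion strategy here is an assumption for illustration, not the authors' code.

```python
# Hypothetical sketch of point-pillar cross-attention (not the official PCM):
# per-point features act as queries that attend to per-pillar features,
# and the attended pillar context is fused back into the point features.
import torch
import torch.nn as nn


class PointPillarCrossAttention(nn.Module):
    """Illustrative cross-attention fusion of point and pillar features."""

    def __init__(self, point_dim: int, pillar_dim: int, embed_dim: int = 64):
        super().__init__()
        self.q = nn.Linear(point_dim, embed_dim)    # queries from raw points
        self.k = nn.Linear(pillar_dim, embed_dim)   # keys from pillars
        self.v = nn.Linear(pillar_dim, embed_dim)   # values from pillars
        self.scale = embed_dim ** -0.5
        self.out = nn.Linear(embed_dim + point_dim, point_dim)

    def forward(self, point_feats: torch.Tensor, pillar_feats: torch.Tensor) -> torch.Tensor:
        # point_feats:  (B, N, point_dim)  features of N raw points
        # pillar_feats: (B, P, pillar_dim) features of P non-empty pillars
        q = self.q(point_feats)                     # (B, N, E)
        k = self.k(pillar_feats)                    # (B, P, E)
        v = self.v(pillar_feats)                    # (B, P, E)
        # Scaled dot-product attention: each point attends over all pillars.
        attn = torch.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)  # (B, N, P)
        fused = attn @ v                            # (B, N, E)
        # Concatenate attended pillar context with the original point
        # features, then project back to the point feature dimension.
        return self.out(torch.cat([fused, point_feats], dim=-1))


if __name__ == "__main__":
    pcm = PointPillarCrossAttention(point_dim=32, pillar_dim=64)
    pts = torch.randn(2, 1024, 32)      # 1024 points per sample
    pillars = torch.randn(2, 256, 64)   # 256 non-empty pillars per sample
    print(pcm(pts, pillars).shape)      # torch.Size([2, 1024, 32])
```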



Pipeline of PAPNet



Architecture of PCM



PAPNet Supplemental Video