ECCV 2016
Is Faster R-CNN Doing Well for Pedestrian Detection?
Liliang Zhang, Liang Lin*, Xiaodan Liang, Kaiming He

Abstract


Pedestrian detection has arguably been addressed as a special topic beyond general object detection. Although recent deep learning object detectors such as Fast/Faster R-CNN [1,2] have shown excellent performance for general object detection, they have had limited success in detecting pedestrians, and previous leading pedestrian detectors were generally hybrid methods combining hand-crafted and deep convolutional features. In this paper, we investigate issues involving Faster R-CNN [2] for pedestrian detection. We discover that the Region Proposal Network (RPN) in Faster R-CNN indeed performs well as a stand-alone pedestrian detector, but surprisingly, the downstream classifier degrades the results. We argue that two reasons account for the unsatisfactory accuracy: (i) insufficient resolution of feature maps for handling small instances, and (ii) lack of any bootstrapping strategy for mining hard negative examples. Driven by these observations, we propose a very simple but effective baseline for pedestrian detection, using an RPN followed by boosted forests on shared, high-resolution convolutional feature maps. We comprehensively evaluate this method on several benchmarks (Caltech, INRIA, ETH, and KITTI), presenting competitive accuracy and good speed. Code will be made publicly available.

Pipeline

 

 

 

Experiments



Fig.1: Comparisons on the Caltech set (legends indicate MR).


Fig.2: Comparisons on the Caltech set using IoU > 0.7 to determine True Positives (legends indicate MR).
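The stricter matching criterion in Fig.2 can be illustrated with a minimal sketch. It assumes boxes are given as (x1, y1, x2, y2) corner coordinates in continuous pixel units; the function names `iou` and `is_true_positive` are hypothetical and are not taken from the authors' code or from the benchmark toolbox.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(detection, ground_truth, threshold=0.7):
    """A detection matches a ground-truth box if their IoU exceeds the threshold."""
    return iou(detection, ground_truth) > threshold
```

For example, a detection that covers only the lower 60% of a ground-truth box has an IoU of 0.6: it would count as a match under a 0.5 threshold but is rejected at 0.7, which is why this setting rewards better localization.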


Fig.3: Comparisons on the Caltech-New set (legends indicate MR−2 (MR−4)).
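The legends report log-average miss rates: MR−2 averages the miss rate over false positives per image (FPPI) in the range [10^-2, 10^0], and MR−4 over [10^-4, 10^0]. The sketch below shows one way to compute such a number from a miss-rate curve. It is a minimal sketch under several assumptions: nine reference points evenly spaced in log space, a step-function lookup of the curve, and a miss rate of 1.0 wherever the curve has not yet started. The function name and parameters are hypothetical and do not come from the authors' code.

```python
import math


def log_average_miss_rate(fppi, miss_rate, low_exp=-2, num_points=9):
    """Geometric mean of miss rates sampled at FPPI points in [10**low_exp, 1].

    fppi must be sorted in ascending order; miss_rate[i] is the miss rate
    reached at fppi[i]. Use low_exp=-2 for MR-2 and low_exp=-4 for MR-4.
    """
    # Reference FPPI values, evenly spaced in log space (assumed: 9 points).
    refs = [10 ** (low_exp + i * (0 - low_exp) / (num_points - 1))
            for i in range(num_points)]

    sampled = []
    for ref in refs:
        # Assumption: if no curve point lies at or below ref, the miss rate is 1.0.
        mr = 1.0
        for f, m in zip(fppi, miss_rate):
            if f <= ref:
                mr = m  # keep the last point at or below the reference FPPI
            else:
                break
        # Clamp to avoid log(0) for a perfect detector.
        sampled.append(max(mr, 1e-10))

    return math.exp(sum(math.log(m) for m in sampled) / len(sampled))
```

Averaging in log space means that a detector is rewarded for low miss rates across the whole FPPI range rather than at a single operating point. Lower values are better for both MR−2 and MR−4.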


Fig.4: Comparisons on the INRIA dataset (legends indicate MR).


Fig.5: Comparisons on the ETH dataset (legends indicate MR).

KITTI

Table 1: Comparisons on the KITTI dataset collected at the time of submission (Feb 2016). The timing records are collected from the KITTI leaderboard. †: region proposal running time ignored (estimated 2s).

 

 

 

 

References


  1.  Ross Girshick. Fast R-CNN. In IEEE International Conference on Computer Vision (ICCV), 2015.
  2.  Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Neural Information Processing Systems (NIPS), 2015.