Abstract
Detecting objects in surveillance videos is an important problem due to its wide applications in traffic control and public security. Existing methods tend to face performance degradation because of false positive or misalignment problems. We propose a novel framework, namely, Foreground Gating and Background Refining Network (FG-BR Net), for surveillance object detection (SOD). To reduce false positives in background regions, which is a critical problem in SOD, we introduce a new module that first subtracts the background of a video sequence and then generates high-quality region proposals. Unlike previous background subtraction methods that may wrongly remove the static foreground objects in a frame, a feedback connection from detection results to background subtraction process is proposed in our model to distill both static and moving objects in surveillance videos. Furthermore, we introduce another module, namely, the background refining stage, to refine the detection results with more accurate localizations. Pairwise non-local operations are adopted to cope with the misalignments between the features of original and background frames. Extensive experiments on real-world traffic surveillance benchmarks demonstrate the competitive performance of the proposed FG-BR Net. In particular, FG-BR Net ranks on the top among all the methods on hard and sunny subsets of the UA-DETRAC detection dataset, without any bells and whistles.
Original language | English |
---|---|
Article number | 8737851 |
Pages (from-to) | 6077-6090 |
Number of pages | 14 |
Journal | IEEE Transactions on Image Processing |
Volume | 28 |
Issue number | 12 |
DOIs | |
Publication status | Published - Dec 2019 |
Externally published | Yes |
Keywords
- background subtraction
- misalignment
- Object detection
- pairwise non-local operation
- surveillance video