The tracking-by-detection framework usually consists of two stages: drawing samples around the target object and classifying each sample as either the target object or background. Popular trackers under this framework typically draw many samples from the raw image and feed each of them into a deep neural network, resulting in a high computational burden and low tracking speed. In this article, we propose an adversarial feature sampling learning (AFSL) method to address this problem. We design a convolutional neural network that takes only one cropped image around the target object as input, and samples are collected from the feature maps with spatial bilinear resampling. To enrich the appearance variations of positive samples in the feature space, which has limited spatial resolution, we fuse high-level and low-level features to better describe the target by using a generative adversarial network. Extensive experiments on benchmark data sets demonstrate that the proposed AFSL achieves leading tracking accuracy while significantly accelerating the speed of tracking-by-detection trackers.
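The idea of collecting samples from a feature map by spatial bilinear resampling can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `bilinear_sample` and `crop_feature`, the single-channel list-of-rows feature map, the bin-center sampling rule, and the border clamping are all assumptions made for illustration only.

```python
import math


def bilinear_sample(fmap, y, x):
    """Bilinearly interpolate a 2-D feature map (list of rows) at real-valued (y, x)."""
    h, w = len(fmap), len(fmap[0])
    # Assumption: clamp coordinates so boxes that overhang the map stay defined.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def crop_feature(fmap, box, out_h, out_w):
    """Resample box = (y_min, x_min, y_max, x_max), given in feature-map
    coordinates, to a fixed out_h x out_w grid (hypothetical helper)."""
    y_min, x_min, y_max, x_max = box
    out = []
    for i in range(out_h):
        # Sample at the centre of each of out_h evenly spaced bins.
        y = y_min + (i + 0.5) * (y_max - y_min) / out_h
        row = []
        for j in range(out_w):
            x = x_min + (j + 0.5) * (x_max - x_min) / out_w
            row.append(bilinear_sample(fmap, y, x))
        out.append(row)
    return out
```

Under these assumptions, the backbone network would run once per frame on the cropped image, and each candidate box would then cost only a few interpolations on the shared feature map, for example `crop_feature(fmap, (0.0, 0.0, 2.0, 2.0), 2, 2)`, instead of a full forward pass per sample.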
|Journal||IEEE Transactions on Automation Science and Engineering|
|Publication status||Accepted/In press - 1 Jan 2019|
- Adversarial learning
- deep convolution neural network
- feature sampling
- visual tracking
ASJC Scopus subject areas
- Control and Systems Engineering
- Electrical and Electronic Engineering