TY - GEN
T1 - A New Training Model for Object Detection in Aerial Images
AU - Yang, Geng
AU - Geng, Yu
AU - Li, Qin
AU - You, Jia
AU - Cai, Mingpeng
N1 - Publisher Copyright:
© 2020, Society for Imaging Science and Technology.
PY - 2020/1/26
Y1 - 2020/1/26
N2 - This paper presents a new training model for orientation-invariant object detection in aerial images by extending the deep-learning-based RetinaNet, a single-stage detector based on feature pyramid networks and focal loss for dense object detection. Unlike R3Det, which applies feature refinement to handle rotated objects, we propose further improvements to cope with the densely arranged objects and class imbalance problems in aerial imaging, on three aspects: 1) all training images are traversed in each iteration, instead of only one image per iteration, in order to cover all possibilities; 2) the learning rate is reduced if the losses do not decrease; and 3) the learning rate is reduced if the losses remain unchanged. The proposed method was calibrated and validated by comprehensive experiments for performance evaluation and benchmarking. The experimental results demonstrate a significant improvement over the R3Det approach on the same data set. In addition to the well-known public data set DOTA for benchmarking, a new data set is also established by considering the balance between the training set and the testing set. The plot of losses, which dropped smoothly without jitter or overfitting, also illustrates the advantages of the proposed new model.
AB - This paper presents a new training model for orientation-invariant object detection in aerial images by extending the deep-learning-based RetinaNet, a single-stage detector based on feature pyramid networks and focal loss for dense object detection. Unlike R3Det, which applies feature refinement to handle rotated objects, we propose further improvements to cope with the densely arranged objects and class imbalance problems in aerial imaging, on three aspects: 1) all training images are traversed in each iteration, instead of only one image per iteration, in order to cover all possibilities; 2) the learning rate is reduced if the losses do not decrease; and 3) the learning rate is reduced if the losses remain unchanged. The proposed method was calibrated and validated by comprehensive experiments for performance evaluation and benchmarking. The experimental results demonstrate a significant improvement over the R3Det approach on the same data set. In addition to the well-known public data set DOTA for benchmarking, a new data set is also established by considering the balance between the training set and the testing set. The plot of losses, which dropped smoothly without jitter or overfitting, also illustrates the advantages of the proposed new model.
UR - http://www.scopus.com/inward/record.url?scp=85095111444&partnerID=8YFLogxK
U2 - 10.2352/ISSN.2470-1173.2020.8.IMAWM-084
DO - 10.2352/ISSN.2470-1173.2020.8.IMAWM-084
M3 - Conference article published in proceeding or book
VL - 2020
T3 - IS&T International Symposium on Electronic Imaging Science and Technology
SP - 1
EP - 5
BT - Proc. IS&T Conf. Imaging and Multimedia Analytics in a Web and Mobile World (IMAWM), San Francisco, USA, 26-30 Jan. 2020
PB - Society for Imaging Science and Technology
T2 - Imaging and Multimedia Analytics in a Web and Mobile World 2020
Y2 - 26 January 2020 through 30 January 2020
ER -