Angle Tokenization Guided Multi-Scale Vision Transformer for Oriented Object Detection in Remote Sensing Imagery

Cong Zhang, Tianshan Liu, Kin Man Lam

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

7 Citations (Scopus)

Abstract

In this paper, an angle tokenization guided multi-scale Transformer framework is proposed for oriented object detection in remote sensing images. Different from existing detectors that are based on convolutional neural networks (CNNs), our proposed method is based on a pyramid Transformer architecture with a compact and flexible angle tokenization module (ATM) to efficiently learn the orientation knowledge for rotated geospatial objects. The Transformer structure can progressively render long-range dependencies and multi-scale spatial details required for accurate localization, while the ATM provides robust guidance on feature refinement for angle prediction, jointly achieving end-to-end orientation detection. To the best of our knowledge, this is the first work to adapt Vision Transformers to remote sensing oriented object detection. Experimental results demonstrate the effectiveness and superiority of our method.

Original language: English
Title of host publication: IGARSS 2022 - 2022 IEEE International Geoscience and Remote Sensing Symposium
Publisher: Institute of Electrical and Electronics Engineers Inc.
Chapter: 9883662
Pages: 3063-3066
Number of pages: 4
ISBN (Electronic): 9781665427920
Publication status: Published - Jul 2022
Event: 2022 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2022 - Kuala Lumpur, Malaysia
Duration: 17 Jul 2022 to 22 Jul 2022

Publication series

Name: International Geoscience and Remote Sensing Symposium (IGARSS)
Volume: 2022-July

Conference

Conference: 2022 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2022
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 17/07/22 to 22/07/22

Keywords

  • angle tokenization
  • multi-head attention
  • multi-scale vision Transformers
  • oriented object detection
  • Remote sensing images

ASJC Scopus subject areas

  • Computer Science Applications
  • General Earth and Planetary Sciences
