Abstract
Scene text detection has become an important research topic. It can be broadly applied in industrial equipment such as smartphones, intelligent scanners, and IoT devices. Many existing scene text detection methods have achieved advanced performance. However, text in scene images appears with differing orientations and varying shapes, making scene text detection a challenging task. This paper proposes a method for detecting text in scene images. First, four stages of low-level features are extracted using DenseNet121; these features are then merged by transposed convolution and skip connections. Second, the merged feature map is used to generate a score map, a box map, and an angle map. Finally, Locality-Aware Non-Maximum Suppression (LANMS) is applied as post-processing to generate the final bounding boxes. The proposed method achieves F-measures of 0.826 on ICDAR 2015 and 0.761 on MSRA-TD500.
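The LANMS post-processing step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the original EAST-style LANMS operates on rotated quadrilaterals, while the toy version below uses axis-aligned boxes `[x1, y1, x2, y2, score]` for clarity, and all function names are illustrative. The key idea survives, though: a cheap first pass merges geometrically adjacent detections by score-weighted averaging, so the second, quadratic-cost standard NMS pass runs on a much smaller set.

```python
import numpy as np

def iou(a, b):
    # intersection-over-union of two axis-aligned boxes [x1, y1, x2, y2, ...]
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def weighted_merge(a, b):
    # average the coordinates weighted by detection score; scores accumulate
    merged = np.empty(5)
    merged[:4] = (a[4] * a[:4] + b[4] * b[:4]) / (a[4] + b[4])
    merged[4] = a[4] + b[4]
    return merged

def standard_nms(boxes, thresh):
    # ordinary greedy NMS: keep the best-scoring box, drop overlapping ones
    order = np.argsort(-boxes[:, 4])
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        order = order[1:][[iou(boxes[i], boxes[j]) <= thresh
                           for j in order[1:]]]
    return boxes[keep]

def locality_aware_nms(boxes, thresh=0.3):
    # pass 1: scan boxes in order (row by row in the score map) and merge
    # each box into its predecessor when they overlap strongly
    merged, prev = [], None
    for box in boxes:
        if prev is not None and iou(box, prev) > thresh:
            prev = weighted_merge(box, prev)
        else:
            if prev is not None:
                merged.append(prev)
            prev = box
    if prev is not None:
        merged.append(prev)
    # pass 2: standard NMS on the (much smaller) merged set
    return standard_nms(np.array(merged), thresh)

# two heavily overlapping detections collapse into one merged box,
# while the distant third detection survives on its own
dets = np.array([[0., 0., 10., 10., 0.9],
                 [1., 1., 11., 11., 0.8],
                 [50., 50., 60., 60., 0.7]])
final = locality_aware_nms(dets)  # 2 boxes remain
```

Because dense per-pixel detectors emit thousands of near-duplicate boxes along each text line, the linear merging pass is what keeps post-processing tractable; the weighted average also tends to produce a more stable final box than simply keeping the single highest-scoring candidate.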
| Original language | English |
|---|---|
| Pages (from-to) | 29005-29016 |
| Number of pages | 12 |
| Journal | Multimedia Tools and Applications |
| Volume | 80 |
| Issue number | 19 |
| DOIs | |
| Publication status | Published - Jun 2021 |
Keywords
- Convolutional neural network
- Deep feature merging
- DenseNet121
- Scene text detector
ASJC Scopus subject areas
- Software
- Media Technology
- Hardware and Architecture
- Computer Networks and Communications