TY - GEN
T1 - Word encoding for word-looking DGA-based Botnet classification
AU - Liew, Sea Ran Cleon
AU - Law, Ngai Fong
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023/11
Y1 - 2023/11
N2 - Domain name generation algorithms (DGAs) fall into two main types: random-looking and word-looking. Existing methods distinguish between the two types with high accuracy, but classifying the different types of word-looking DGAs remains challenging because their domains are often mistaken for legitimate ones. Previous methods address this by using character encoding with long short-term memory (LSTM) networks or convolutional neural networks (CNNs) to model the character distribution of each word-looking DGA. Since most word-looking DGAs are constructed from various dictionaries, we propose using word encoding instead of character encoding. Word encoding can provide a better characterization because it is based on how the words in those dictionaries are used and associated with one another. Experimental results show that word encoding raises the classification accuracy for word-looking DGAs by more than 7 percentage points (from 87% to 94%) compared with character encoding.
UR - http://www.scopus.com/inward/record.url?scp=85180007088&partnerID=8YFLogxK
U2 - 10.1109/APSIPAASC58517.2023.10317505
DO - 10.1109/APSIPAASC58517.2023.10317505
M3 - Conference article published in proceeding or book
AN - SCOPUS:85180007088
T3 - 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
SP - 1816
EP - 1821
BT - 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023
Y2 - 31 October 2023 through 3 November 2023
ER -