Lexical Data Augmentation for Text Classification in Deep Learning

Rong Xiang, Emmanuele Chersoni, Yunfei Long, Qin Lu, Chu Ren Huang

Research output: Chapter in book / Conference proceeding, Conference article published in proceeding or book, Academic research, peer-review

3 Citations (Scopus)


This paper presents our work on part-of-speech focused lexical substitution for data augmentation (PLSDA), a method for improving the prediction performance of deep learning models. PLSDA uses part-of-speech information to identify candidate words, then applies different augmentation strategies to find semantically related substitutes and generate new training instances. PLSDA is evaluated on a variety of datasets across different text classification tasks. When PLSDA is applied to four deep learning models, the results show that classifiers trained with PLSDA achieve a 1.3% accuracy improvement on average.
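The general idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny part-of-speech lexicon `TOY_POS`, the substitution table `TOY_SYNONYMS`, and the function `augment` are all hypothetical stand-ins, and a real system would presumably use a trained tagger and a lexical resource or embedding model to find substitutes.

```python
import random

# Hypothetical toy resources (assumptions for illustration only).
# A real pipeline would use a POS tagger and a lexical resource instead.
TOY_POS = {
    "the": "DET", "movie": "NOUN", "was": "VERB",
    "great": "ADJ", "and": "CONJ", "funny": "ADJ",
}
TOY_SYNONYMS = {
    ("great", "ADJ"): ["excellent", "superb"],
    ("funny", "ADJ"): ["amusing", "hilarious"],
    ("movie", "NOUN"): ["film"],
}


def augment(tokens, target_pos=frozenset({"ADJ", "NOUN"}), n_new=2, seed=0):
    """Return n_new variants, each with one POS-targeted word substituted.

    Variants may repeat when few substitution slots are available.
    """
    rng = random.Random(seed)
    # Positions whose tag is targeted and that have at least one substitute.
    slots = [
        i for i, tok in enumerate(tokens)
        if TOY_POS.get(tok) in target_pos
        and (tok, TOY_POS[tok]) in TOY_SYNONYMS
    ]
    variants = []
    if not slots:
        return variants  # nothing to substitute in this sentence
    for _ in range(n_new):
        new_tokens = list(tokens)
        i = rng.choice(slots)
        key = (tokens[i], TOY_POS[tokens[i]])
        new_tokens[i] = rng.choice(TOY_SYNONYMS[key])
        variants.append(new_tokens)
    return variants


sentence = "the movie was great and funny".split()
for variant in augment(sentence):
    print(" ".join(variant))
```

In a classification setting, each generated variant would keep the label of the sentence it was derived from and be added to the training set alongside the original.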

Original language: English
Title of host publication: Advances in Artificial Intelligence - 33rd Canadian Conference on Artificial Intelligence, Canadian AI 2020, Proceedings
Editors: Cyril Goutte, Xiaodan Zhu
Number of pages: 7
ISBN (Print): 9783030473570
Publication status: Published - 1 Jan 2020
Event: 33rd Canadian Conference on Artificial Intelligence, Canadian AI 2020 - Ottawa, Canada
Duration: 13 May 2020 to 15 May 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12109 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 33rd Canadian Conference on Artificial Intelligence, Canadian AI 2020


Keywords

  • Data augmentation
  • Deep learning
  • Lexical data augmentation
  • Text classification

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
