Comparing Gender Bias in Lexical Semantics and World Knowledge: Deep-learning Models Pre-trained on Historical Corpus

Yingqiu Ge, Jinghang Gu (Corresponding Author), Chu-Ren Huang, Lifu Li

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

Abstract

This study investigates the impact of continued pre-training of transformer-based deep learning models on a historical corpus, focusing on BERT, RoBERTa, XLNet, and GPT-2. By extracting word representations from different layers, we compute gender bias embedding scores and analyze their correlation with human bias scores and real-world differences in occupational participation. Our results show that BERT, an encoder-only model, achieves the most substantial improvement in capturing human-like lexical semantics and world knowledge, outperforming traditional static word vectors such as Word2Vec. Continued pre-training on historical data significantly enhances BERT's performance, especially in the lower-middle layers. When historical human biases are difficult to quantify due to data scarcity, continued pre-training of BERT on historical corpora and averaging lexical representations up to the 6th layer provides an accurate reflection of gender-related historical biases and world knowledge.
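
As a rough illustration of the layer-averaging procedure described in the abstract, the sketch below averages BERT hidden states up to the 6th layer and scores a target word's gender association. The attribute word lists and the cosine-similarity-difference metric are illustrative assumptions, not the paper's exact bias measure or released code.

```python
# Minimal sketch: average BERT hidden states over layers 1-6 and compute a
# simple gender-association score for a word. Attribute lists and the scoring
# formula are assumptions for illustration only.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def word_vector(word: str, max_layer: int = 6) -> torch.Tensor:
    """Average hidden states of layers 1..max_layer, then mean-pool over subword tokens."""
    inputs = tokenizer(word, return_tensors="pt")
    with torch.no_grad():
        hidden_states = model(**inputs).hidden_states  # tuple: embedding layer + 12 layers
    layers = torch.stack(hidden_states[1:max_layer + 1]).mean(dim=0)  # (1, seq_len, hidden)
    return layers[0, 1:-1].mean(dim=0)  # drop [CLS]/[SEP], pool subwords

# Assumed gendered attribute words (hypothetical lists for this sketch).
MALE_ATTRS = ["he", "man", "male"]
FEMALE_ATTRS = ["she", "woman", "female"]

def gender_bias_score(word: str, max_layer: int = 6) -> float:
    """Difference in mean cosine similarity to male vs. female attribute words."""
    w = word_vector(word, max_layer)
    cos = torch.nn.functional.cosine_similarity
    male = torch.stack([cos(w, word_vector(a, max_layer), dim=0) for a in MALE_ATTRS]).mean()
    female = torch.stack([cos(w, word_vector(a, max_layer), dim=0) for a in FEMALE_ATTRS]).mean()
    return (male - female).item()  # > 0 leans male, < 0 leans female

print(gender_bias_score("engineer"))
```

Such per-word scores can then be correlated (e.g., with Pearson or Spearman coefficients) against human bias ratings or occupational participation statistics, which is the kind of analysis the abstract describes.
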
Original language: English
Title of host publication: Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation
Editors: Nathaniel Oco, Shirley N. Dita, Ariane Macalinga Borlongan, Jong-Bok Kim
Publisher: Tokyo University of Foreign Studies
Pages: 1316-1331
Publication status: Published - Dec 2024
Event: The 38th Pacific Asia Conference on Language, Information and Computation [PACLIC-38] - Tokyo University of Foreign Studies, Tokyo, Japan
Duration: 7 Dec 2024 - 9 Dec 2024

Conference

Conference: The 38th Pacific Asia Conference on Language, Information and Computation [PACLIC-38]
Country/Territory: Japan
City: Tokyo
Period: 7/12/24 - 9/12/24
