Multiview Identifiers Enhanced Generative Retrieval

Yongqi Li, Nan Yang, Liang Wang, Furu Wei, Wenjie Li

Research output: Chapter in book / Conference proceedingConference article published in proceeding or bookAcademic researchpeer-review

14 Citations (Scopus)

Abstract

Instead of simply matching a query to preexisting passages, generative retrieval generates identifier strings of passages as the retrieval target. At a cost, the identifier must be distinctive enough to represent a passage. Current approaches use either a numeric ID or a text piece (such as a title or substrings) as the identifier. However, these identifiers cannot cover a passage's content well. As such, we are motivated to propose a new type of identifier, synthetic identifiers, that are generated based on the content of a passage and could integrate contextualized information that text pieces lack. Furthermore, we simultaneously consider multiview identifiers, including synthetic identifiers, titles, and substrings. These views of identifiers complement each other and facilitate the holistic ranking of passages from multiple perspectives. We conduct a series of experiments on three public datasets, and the results indicate that our proposed approach performs the best in generative retrieval, demonstrating its effectiveness and robustness. The code is released at https://github.com/liyongqi67/MINDER.

Original languageEnglish
Title of host publicationLong Papers
PublisherAssociation for Computational Linguistics (ACL)
Pages6636-6648
Number of pages13
ISBN (Electronic)9781959429722
Publication statusPublished - Jul 2023
Event61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 - Toronto, Canada
Duration: 9 Jul 202314 Jul 2023

Publication series

NameProceedings of the Annual Meeting of the Association for Computational Linguistics
Volume1
ISSN (Print)0736-587X

Conference

Conference61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Country/TerritoryCanada
CityToronto
Period9/07/2314/07/23

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics

Fingerprint

Dive into the research topics of 'Multiview Identifiers Enhanced Generative Retrieval'. Together they form a unique fingerprint.

Cite this