Template-based processing of incomplete Pinyin codes: with reference to deficiencies in the Microsoft Pinyin input methods

Xiaoheng Zhang

Research output: Journal article publication > Journal article > Academic research


Incomplete Pinyin codes are Pinyin input codes for Chinese characters in which syllable initials, finals, or tones are partly omitted. An input method that handles such codes should make full use of the information they carry and retrieve every Chinese character or word that satisfies the conditions they set. This paper identifies and analyzes shortcomings in two influential Microsoft input methods, the then-latest MSPY2003 and the New Phonetic input method v6.5, in their handling of tone omission, final omission, and ambiguous syllable segmentation. It then proposes a remedy based on Pinyin code templates that draws on knowledge from linguistics and dictionaries. Experimental results show that the approach is quite effective.
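As an illustration only, the idea of matching an incomplete code against candidate words can be sketched in Python. This is not the paper's algorithm: the toy lexicon, the function names (`split_syllable`, `segmentations`, `matches`, `lookup`), and the matching rules are all assumptions made for this sketch. Each typed unit is treated as either a full toneless syllable or a bare initial, optionally followed by a tone digit, and every possible segmentation of the input is tried, so an ambiguous code such as "xian" is read both as one syllable and as "xi" + "an".

```python
# Hypothetical toy lexicon: word -> toned Pinyin syllables (tone digit 1-5).
LEXICON = {
    "中国": ["zhong1", "guo2"],
    "整个": ["zheng3", "ge4"],
    "这个": ["zhe4", "ge4"],
    "西安": ["xi1", "an1"],
    "先": ["xian1"],
    "现": ["xian4"],
}

# Two-letter initials come first so that "zh" is tried before "z".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

# Units a user may type: a bare initial or a full toneless syllable
# (the syllable inventory here is just what the toy lexicon contains).
UNITS = sorted(set(INITIALS) |
               {s[:-1] for syls in LEXICON.values() for s in syls})


def split_syllable(syl):
    """Return (initial, final, tone) for a toned syllable such as 'zhong1'."""
    base, tone = syl[:-1], syl[-1]
    for ini in INITIALS:
        if base.startswith(ini):
            return ini, base[len(ini):], tone
    return "", base, tone  # zero-initial syllable such as 'an'


def segmentations(code, start=0):
    """Yield every way to cut `code` into (unit, tone-or-None) pairs."""
    if start == len(code):
        yield []
        return
    for unit in UNITS:
        if code.startswith(unit, start):
            end = start + len(unit)
            tone = None
            if end < len(code) and code[end] in "12345":
                tone = code[end]
                end += 1
            for rest in segmentations(code, end):
                yield [(unit, tone)] + rest


def matches(template, syllables):
    """True if each unit is compatible with the corresponding full syllable."""
    if len(template) != len(syllables):
        return False
    for (unit, tone), syl in zip(template, syllables):
        ini, fin, syl_tone = split_syllable(syl)
        if tone is not None and tone != syl_tone:
            return False  # a tone was typed and it disagrees
        full_match = unit == ini + fin
        initial_only = bool(ini) and unit == ini  # final omitted
        if not (full_match or initial_only):
            return False
    return True


def lookup(code):
    """Return every lexicon word matching some segmentation of `code`."""
    templates = list(segmentations(code))
    return [word for word, syls in LEXICON.items()
            if any(matches(t, syls) for t in templates)]
```

With this toy lexicon, `lookup("zhg")` returns all three words whose syllables begin with zh and g, `lookup("zhg4")` narrows them to those whose second syllable carries tone 4, and `lookup("xian")` returns 西安 as well as 先 and 现 because both segmentations are kept. Returning every match rather than pruning early follows the abstract's requirement that all qualifying candidates be retrieved.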
Original language: Chinese (Simplified)
Pages (from-to): 74-76 + 101
Number of pages: 3
Journal: 计算机工程与应用 (Computer engineering & application)
Issue number: 20
Publication status: Published - 2005


  • Chinese character input
  • Incomplete Pinyin code
  • Template
