Mining concepts from Wikipedia for ontology construction

Gaoying Cui, Qin Lu, Wenjie Li, Yirong Chen

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

15 Citations (Scopus)

Abstract

An ontology is a structured knowledge base of concepts organized by the relations among them. However, in the corpora used for knowledge extraction, concepts are usually mixed with their instances. Concepts and their corresponding instances share similar features and are difficult to distinguish. In this paper, a novel approach is proposed to comprehensively obtain concepts with the help of definition sentences and Category Labels in Wikipedia pages. N-gram statistics and other NLP knowledge are used to help extract appropriate concepts. The proposed method identified nearly 50,000 concepts from about 700,000 Wiki pages. Its precision of 78.5% makes it an effective approach for mining concepts from Wikipedia for ontology construction.
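The abstract names two evidence sources, definition sentences and Category Labels, plus N-gram statistics. The sketch below illustrates the general idea only and is not the authors' implementation: the regular expression, the stop-word list, and the function names `definition_head` and `label_ngrams` are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumed pattern for a definition sentence: "<subject> is|are|was|were a|an|the <rest>".
_DEFINITION = re.compile(
    r"^(?P<subject>.+?)\s+(?:is|are|was|were)\s+(?:a|an|the)\s+(?P<rest>.+)$",
    re.IGNORECASE,
)

# Illustrative words that usually end the head phrase of a definition (an assumption).
_STOP = {"that", "which", "who", "of", "in", "for", "from", "with", "by", "and", "or", "used"}


def definition_head(sentence):
    """Return the phrase following 'is a/an/the', cut at the first stop word, or None."""
    match = _DEFINITION.match(sentence.strip().rstrip("."))
    if match is None:
        return None
    words = []
    for word in match.group("rest").split():
        cleaned = word.strip(",;").lower()
        if cleaned in _STOP:
            break
        words.append(cleaned)
    return " ".join(words) or None


def label_ngrams(labels, n):
    """Count word n-grams over a collection of category labels."""
    counts = Counter()
    for label in labels:
        tokens = label.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    print(definition_head("A sparrow is a small passerine bird of the family Passeridae."))
    labels = ["Birds of Europe", "Birds of Asia", "Passerine birds"]
    print(label_ngrams(labels, 1).most_common(2))
```

In a sketch like this, a term that recurs across many category labels and also appears as the head of definition sentences would be a stronger concept candidate than one seen only once. How the paper actually combines these signals is not stated in the abstract.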
Original language: English
Title of host publication: Proceedings - 2009 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT Workshops 2009
Pages: 287-290
Number of pages: 4
Volume: 3
Publication status: Published - 1 Dec 2009
Event: 2009 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT Workshops 2009 - Milano, Italy
Duration: 15 Sep 2009 to 18 Sep 2009

Conference

Conference: 2009 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT Workshops 2009
Country/Territory: Italy
City: Milano
Period: 15/09/09 to 18/09/09

Keywords

  • Concept
  • Ontology construction
  • Wikipedia

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software
