OPPCAT: Ontology population from tabular data


ÖZTÜRK Ö.

Journal of Information Science, vol.46, no.2, pp.161-175, 2020 (SCI-Expanded, SSCI, Scopus)

  • Publication Type: Article
  • Volume: 46 Issue: 2
  • Publication Date: 2020
  • Doi Number: 10.1177/0165551519827892
  • Journal Name: Journal of Information Science
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Scopus, Academic Search Premier, FRANCIS, IBZ Online, ABI/INFORM, Aerospace Database, Analytical Abstracts, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, EBSCO Education Source, Education Abstracts, Index Islamicus, Information Science and Technology Abstracts, INSPEC, Library and Information Science Abstracts, Library Literature and Information Science, Metadex, Civil Engineering Abstracts, Library, Information Science & Technology Abstracts (LISTA)
  • Page Numbers: pp.161-175
  • Keywords: e-commerce, ontology, ontology population, Semantic Web, tabular data
  • Manisa Celal Bayar University Affiliated: Yes

Abstract

To present the large amount of information on the Web to both users and machines, Web data urgently needs to be structured. E-commerce is one of the areas where growing data bottlenecks on the Web inhibit data access. Ontological display of product information enables better product comparison and search applications that use the semantics of product specifications and their corresponding values. In this article, we present a framework called OPPCAT, which is used for semi-automatic ontology population from tabular data in e-commerce stores and product catalogues. OPPCAT thereby allows tabular data to be used for mass production of ontology content. First, we present the common patterns in tabular data that obstruct semi-automatic production of ontologies. Then, we suggest solutions that automatically fix these problematic patterns. Finally, we define an algorithm to build ontology content semi-automatically.
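
As a rough illustration of what ontology population from tabular data can look like, the sketch below maps each row of a small product table to an individual and each non-empty cell to a property assertion. It is not OPPCAT's algorithm, which the abstract does not detail. The namespace `EX`, the helper names `slug` and `populate`, the `Name` key column, and the sample table are all assumptions made for this example.

```python
import csv
import io
import re

# Hypothetical namespace; not taken from the paper.
EX = "http://example.org/product#"


def slug(text):
    """Turn a header or product name into a URI-safe local name."""
    return re.sub(r"[^A-Za-z0-9]+", "_", text.strip()).strip("_")


def populate(table_text):
    """Map each row to an individual and each non-empty cell to a triple."""
    reader = csv.DictReader(io.StringIO(table_text))
    triples = []
    for row in reader:
        # Assumption: the "Name" column identifies the product.
        subject = EX + slug(row["Name"])
        triples.append((subject, "rdf:type", EX + "Product"))
        for header, value in row.items():
            if header == "Name" or not value or not value.strip():
                continue  # skip the key column and empty cells
            triples.append((subject, EX + slug(header), value.strip()))
    return triples


# Made-up sample table with one empty cell.
SAMPLE = "Name,Screen Size,Weight (kg)\nPhone A,6.1 in,0.17\nPhone B,,0.20\n"

if __name__ == "__main__":
    for triple in populate(SAMPLE):
        print(triple)
```

Skipping empty cells and normalising headers such as "Weight (kg)" stand in for the kind of automatic clean-up the abstract mentions. A semi-automatic workflow would additionally let a person confirm or correct the mapping from headers to ontology properties before the content is generated.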