Please use this identifier to cite or link to this item: http://theses.ncl.ac.uk/jspui/handle/10443/5546
Title: Scaling the development of large ontologies : identitas and hypernormalization
Authors: Alshammry, Nizal Khalf
Issue Date: 2021
Publisher: Newcastle University
Abstract: During the last decade, ontologies have become a fundamental part of the life sciences for building organised computational knowledge. Currently, there are more than 800 biomedical ontologies hosted by the NCBO BioPortal repository. However, the proliferation of ontologies in the biomedical and biological domains has highlighted a number of problems. As ontologies become large, their development and maintenance become more challenging and time-consuming. Therefore, the scalability of ontology development has become problematic. In this thesis, we examine two new approaches that can help address this challenge. First, we consider a new approach to identifiers that could significantly facilitate the scalability of ontologies and overcome some related issues with monotonic, numeric identifiers while remaining semantics-free. Our solutions are described, along with the Identitas library, which allows concurrent development, pronounceability and error checking. The library has been integrated into two ontology development environments, Protégé and Tawny-OWL. This thesis also discusses the ways in which current ontological practices could be migrated towards the use of this scheme. Second, we investigate the usage of the hypernormalisation, patternisation and programmatic approaches by asking how we could use them to rebuild the Gene Ontology (GO). The aim of the hypernormalisation and patternisation techniques is to allow the ontology developer to manage the ontology's maintainability and evolution. To apply this approach, we had to analyse the ontology structure, starting with the Molecular Function Ontology (MFO). The MFO is formed from several large and tangled hierarchies of classes, each of which describes a broad molecular activity. Applying the hypernormalisation approach resulted in a hypernormalised form of the Transporter Activity (TA) and Catalytic Activity (CA) hierarchies, which together constitute 78% of all classes in MFO.
The hypernormalised structures of the TA and CA hierarchies are generated from the higher-level patterns and novel content-specific patterns that were developed, and exploit ontology logical reasoners. The generated ontologies are robust and easy to maintain, and can be developed and extended freely. Although there is a variety of ontology development tools, Tawny-OWL is a programmatic, interactive tool for ontology creation and management, and it provides a set of patterns that explicitly support the creation of a hypernormalised ontology. Finally, the investigation of hypernormalisation highlighted inconsistent classifications and identified a significant semantic mismatch between GO and the Chemical Entities of Biological Interest (ChEBI). Although both ontologies describe the same real entities, GO often refers to the form most common in biology, while ChEBI is more specific and precise. The use of hypernormalisation forces us to deal with this mismatch; to do so, we used the equivalence axioms created by the GO-Plus ontology. To sum up, to address scalability and ease the development of ontologies, we propose a new identifier scheme and investigate the use of the hypernormalisation methodology. Together, Identitas and the hypernormalisation technique should enable the construction of large-scale ontologies in the future.
Description: PhD Thesis
URI: http://hdl.handle.net/10443/5546
Appears in Collections: School of Computing

Files in This Item:
File                 Description   Size       Format
Alshammry N 2021                   4.45 MB    Adobe PDF
dspacelicence.pdf                  43.82 kB   Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.