In a previous post, Mark Saccomano described how ontologies can be used to enrich database entries with semantic information to help users find relevant records quickly with the help of a search interface. In this blog post, I describe how ontologies are represented to the computer, and how the use of an ontology differs from a traditional database model.
Ontologies are intended to capture facts about the objects of a particular area of knowledge, called a domain. We might be used to recording facts about an object as a list of properties. For example, in a traditional, row-per-record database, every record would be associated with a number of fields:
Each record captures facts about the instrument shown in the image: the name of the instrument, the family to which that instrument belongs, and the number of strings of the particular instrument shown in the image.
A human might recognize that these fields record information at two levels of abstraction: facts about instruments in general (their names, and their membership in the Hornbostel-Sachs classification) and facts about the specific instrument depicted in the image. Yet the “flat” structure of the database representation, which does not explicitly model this hierarchy, fails to capture such a distinction: from the perspective of a row-per-record database, each field is assumed to be informative only about the record to which it is attached.
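To make the problem concrete, here is a minimal Python sketch of such a flat table. The field names follow the description above, but the second record and all of the values are invented for illustration. The small helper shows that the difference between "general" and "specific" fields can only be guessed from the data, because nothing in the rows themselves records it.

```python
# Hypothetical rows: field names follow the post, the values are invented.
records = [
    {"image": "angel_playing_harp.png", "name": "harp",
     "family": "chordophone", "strings": 7},
    {"image": "king_playing_harp.png", "name": "harp",
     "family": "chordophone", "strings": 10},
]


def fields_constant_per_name(rows, fields=("family", "strings")):
    """Return the fields whose value never varies among rows sharing a name."""
    by_name = {}
    for row in rows:
        by_name.setdefault(row["name"], []).append(row)

    constant, varying = set(), set()
    for group in by_name.values():
        for field in fields:
            values = {r[field] for r in group}
            (constant if len(values) == 1 else varying).add(field)
    return constant - varying


# "family" looks like a fact about harps in general, "strings" like a fact
# about one depicted harp, but the table never says so explicitly.
general_fields = fields_constant_per_name(records)
```

The guess is only as good as the sample: with a single harp record, both fields would appear to be general facts.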
Ontologies correct this failure by making such hierarchies explicit in the domain. To achieve this goal, an ontology consists of the following components:
Individuals are basic objects of the domain: specific places, references to specific images in manuscripts, specific persons (real and imaginary), musical instruments, etc. Classes are simply collections of individuals, with the additional rider that individuals may be members of several classes. Finally, properties are the component of an ontology that gives it its hierarchical richness: properties represent relations between individuals and/or classes. Often, the first step in designing an ontology is to figure out the individuals and the classes that best capture the structure of the domain. This process is closely related to the preparation of a controlled vocabulary.
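The three components can be sketched as plain Python data structures. This is only an illustration under my own assumptions: the names harp1, performance1, Instrument, Chordophone and hasInstrument are hypothetical and are not taken from the MUSICONIS ontology, and real ontology tooling models these notions in far more detail.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Individual:
    """A basic object of the domain, identified by its name."""
    name: str


@dataclass
class OntologyClass:
    """A named collection of individuals."""
    name: str
    members: set = field(default_factory=set)


@dataclass
class Property:
    """A named relation, stored as (subject, object) pairs."""
    name: str
    pairs: set = field(default_factory=set)


# Hypothetical individuals and classes.
harp1 = Individual("harp1")
performance1 = Individual("performance1")
instrument = OntologyClass("Instrument", {harp1})
chordophone = OntologyClass("Chordophone", {harp1})

# A hypothetical property relating a performance to an instrument.
has_instrument = Property("hasInstrument", {(performance1, harp1)})


def classes_of(individual, classes):
    """List the names of all classes the individual belongs to."""
    return sorted(c.name for c in classes if individual in c.members)


harp_classes = classes_of(harp1, [instrument, chordophone])
```

Here harp1 belongs to two classes at once, which is the "additional rider" mentioned above.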
By way of example, consider the first record in the table above. The underlying image, angel_playing_harp.png, is an individual: only one such image exists in the domain. The fields in the table suggest that the image depicts a performance taking place; our ontology must model this explicitly. The aPerformance (hasPerformance) property captures “depicts a performance” as a relation between a specific image (an individual) and a specific, depicted performance (another individual). This relationship can be expressed in the following form:
(subject, predicate, object)
where either individuals or classes may act as subject or object, and the predicate is taken from the properties defined by the ontology. This representation is called a triple, a three-place sequence. Concretely,
(angel_playing_harp, aPerformance, performance1)
might represent one of the many relations that are implicit in the first record of the traditional database example listed above. The object of this relation, the individual performance1, might subsequently be used in a number of other triples, now in the subject position. These triples characterize the performance in more detail, describing, for example, the instruments used (if any), the function of the performance within the image, and the genre of the performance. In turn, these objects may appear as the subject of further relations, capturing organological properties of the instrument, the basis for any iconographical or musicological interpretation, and so on.
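The chaining of triples described here can be sketched in a few lines of Python. Only the first triple comes from the example above. The properties hasInstrument, hasGenre and hasStringCount, and the individuals harp1 and sacred_music, are hypothetical names that I introduce purely for illustration.

```python
# Each triple is (subject, predicate, object).
triples = [
    ("angel_playing_harp", "aPerformance", "performance1"),
    ("performance1", "hasInstrument", "harp1"),    # hypothetical
    ("performance1", "hasGenre", "sacred_music"),  # hypothetical
    ("harp1", "hasStringCount", "7"),              # hypothetical
]


def describe(subject, triples):
    """Return every (predicate, object) pair stated about a subject."""
    return [(p, o) for s, p, o in triples if s == subject]


def reachable(start, triples):
    """Follow objects into the subject position, breadth first."""
    seen = []
    frontier = [start]
    while frontier:
        node = frontier.pop(0)
        for _, obj in describe(node, triples):
            if obj not in seen:
                seen.append(obj)
                frontier.append(obj)
    return seen


performance_facts = describe("performance1", triples)
everything_about_image = reachable("angel_playing_harp", triples)
```

Starting from the image alone, the traversal reaches the performance, then its instrument and genre, and finally the properties of the instrument, without any of these facts having been stored on the image itself.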
The MUSICONIS ontology is designed in accordance with the principles of the Semantic Web, a cross-domain, international effort to define a set of best practices for the sharing of knowledge online. The online publication of freely available ontologies, sometimes also called Linked Data, allows libraries, researchers and web application designers to enrich their exhibits and research software with the expert domain knowledge represented by the ontology. Despite the diversity of syntaxes that are available to define and distribute ontologies online (among them OWL, OWL2, and Turtle), all express this key principle of knowledge modeling: relations between individuals, classes, and properties more accurately capture the hierarchical richness of domain knowledge than a traditional, row-per-record database.
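As a rough illustration of what publishing triples online involves, the sketch below writes a triple in the line-based N-Triples form, which Turtle parsers also accept. The namespace http://example.org/musiconis/ is a placeholder that I made up for this example. It is not the address of the actual MUSICONIS ontology, and a real project would normally rely on an established RDF library rather than on string formatting.

```python
# Hypothetical namespace, used only for illustration.
BASE = "http://example.org/musiconis/"


def to_ntriples(triples, base=BASE):
    """Serialize (subject, predicate, object) names as N-Triples lines.

    Every name is turned into a full identifier by prefixing the base.
    """
    lines = []
    for s, p, o in triples:
        lines.append(f"<{base}{s}> <{base}{p}> <{base}{o}> .")
    return "\n".join(lines)


document = to_ntriples([
    ("angel_playing_harp", "aPerformance", "performance1"),
])
```

Giving every individual and property a globally unique identifier is what allows other collections to refer to the same image or the same performance in their own data.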