
Ontologies in Model-Prime

What is an ontology?

An ontology is a classification of data into categories, known as classes. For example, one might classify Living things into the classes Animal, Plant, Fungus, and so on. Animal might be further classified as Vertebrate, Arthropod, Mollusc, and so on. Every Vertebrate is an Animal, and every Animal is Living. This is called a subclass relationship, also known as an 'is-a' relationship.

The entire classification, consisting of any number of classes and subclass relationships, is an ontology.
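The subclass relationship above can be sketched in a few lines of Python. This is only an illustration, not how Model-Prime stores ontologies: the `PARENT` map and the `ancestors` and `is_a` helpers are hypothetical names, and the sketch assumes each class has at most one direct superclass, which real ontologies do not always require.

```python
# Hypothetical parent map: each class points to its direct superclass.
# Assumption: a single parent per class, for simplicity.
PARENT = {
    "Animal": "Living",
    "Plant": "Living",
    "Fungus": "Living",
    "Vertebrate": "Animal",
    "Arthropod": "Animal",
    "Mollusc": "Animal",
}


def ancestors(cls):
    """Return the superclasses of cls, nearest first."""
    chain = []
    while cls in PARENT:
        cls = PARENT[cls]
        chain.append(cls)
    return chain


def is_a(cls, other):
    """True if cls is other, or a direct or indirect subclass of it."""
    return cls == other or other in ancestors(cls)


print(ancestors("Vertebrate"))         # ['Animal', 'Living']
print(is_a("Vertebrate", "Living"))    # True
print(is_a("Plant", "Animal"))         # False
```

Because `is_a` walks the whole chain of superclasses, "every Vertebrate is Living" follows from the two direct relationships without being stated separately.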

Figure: Class hierarchy (screenshot of the ontology editor WebProtégé)

Why are ontologies important for robotics data?

Model-Prime has an ONTOLOGY data type, which constrains the values of a field to the classes of an ontology that you define. This is useful for several reasons:

  • Data values are kept consistent. You never have to worry about data being entered as "Bicyclist" in some places and "Cyclist" in others (if those are indeed the same thing). Instead, you can define a single class with multiple labels, so that those entries are treated as equivalent.

  • Data values are searchable according to the ontology hierarchy. For instance, you can search for all instances of Fungus including those that were labeled as Mushroom or any other subclass.

  • Ontologies can be maintained over time to add new classes or subclasses, to change their labels, or even to alter the subclass relationships between existing classes. This naturally allows your data models to increase in detail and specificity without having to re-ingest or backfill the data every time a change is made.
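To make the first two points concrete, here is a minimal Python sketch of label resolution and hierarchy-aware search. It does not use the Model-Prime API; the `PARENT` and `LABELS` tables, the `resolve`, `is_a`, and `search` helpers, the `Actor` class, and the sample records are all assumptions made for illustration.

```python
# Hypothetical ontology data (not Model-Prime's storage format).
# PARENT maps a class to its direct superclass.
PARENT = {"Fungus": "Living", "Mushroom": "Fungus", "Cyclist": "Actor"}

# LABELS maps every accepted label to the single class it denotes.
LABELS = {
    "Cyclist": "Cyclist",
    "Bicyclist": "Cyclist",  # second label for the same class
    "Fungus": "Fungus",
    "Mushroom": "Mushroom",
}


def resolve(label):
    """Map a label to its class, rejecting labels outside the ontology."""
    try:
        return LABELS[label]
    except KeyError:
        raise ValueError(f"{label!r} is not a label in the ontology") from None


def is_a(cls, other):
    """True if cls is other, or a direct or indirect subclass of it."""
    while cls is not None:
        if cls == other:
            return True
        cls = PARENT.get(cls)
    return False


def search(records, target):
    """Return the records whose label resolves to target or a subclass."""
    return [r for r in records if is_a(resolve(r["label"]), target)]


# Made-up sample records for the example.
records = [
    {"id": 1, "label": "Mushroom"},
    {"id": 2, "label": "Fungus"},
    {"id": 3, "label": "Bicyclist"},
]

print(resolve("Bicyclist") == resolve("Cyclist"))      # True
print([r["id"] for r in search(records, "Fungus")])    # [1, 2]
```

In this sketch the hierarchy is consulted at query time rather than copied into each record, so adding a class or moving it under a different parent changes search results without touching the stored records. That is one plausible way to get the third benefit; the source does not describe how Model-Prime implements it.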

Ontology fields can be used for anything from a simple enumeration (e.g. days of the week or states in a finite state machine) to a detailed classification of actors recognized by an AI model, or a classification of anomaly or error categories on the robot itself. Be inventive! We encourage the use of richly annotated ontology descriptors instead of "dumb" text labels when possible, so that you can get the most out of your data.