Avoid mixing parthood with cardinality constraints

We frequently have situations where we want to make an assertion combining parthood and cardinality. Here, cardinality pertains to the number of instances.

Section 5.3 of the OWL2 primer has an example of a cardinality restriction,

Individual: John
   Types: hasChild max 4 Parent

This is saying that John has at most 4 children that are parents.

A more natural biological example of a cardinality restriction is stating something like:

Class: Hand
   SubClassOf: hasPart exactly 5 Finger

i.e. every hand has exactly 5 fingers. Remember we are following the open world assumption here – we are not saying that our ontology has to list all 5 fingers. We are stating that in our model of the world, any instance of a hand h entails the existence of 5 distinct finger instances f1…f5, each of which is related to that hand, i.e. fn part-of h. Furthermore there are no other finger instances not in the set f1..5 that are part of h.

The precise set-theoretic semantics and provided in the OWL2 Direct Semantics specification.

This cardinality axiom seems like a perfectly natural, reasonable, and useful thing to add to an ontology (avoiding for now discussions about “canonical hands” and “variant hands”, for example in cases like polydactyly; let’s just assume we are describing “canonical anatomy”, whatever that is).

5 Kids Hand Showing The Number Five Hand Sign Stock Illustration - Download  Image Now - iStock
A canonical hand, with 5 fingers

And in fact there are various ontologies in OBO that do this kind of thing.

However, there is a trap here. If you try and run a reasoner such as HermiT you will get a frankly perplexing and unhelpful error such as this.

An error occurred during reasoning: Non-simple property 'BFO_0000051' or its inverse appears in the cardinality restriction 'BFO_0000051 exactly 5 UBERON_1234567

If you have a background in computer science and you have some time to spare you can go and look at section 11 of the OWL specification (global restrictions on axioms in OWL2 DL) to see what the magical special laws you must adhere to when writing OWL ontologies that conform to the DL profile, but it’s not particularly helpful to understand:

  • why you aren’t allowed to write this kind of thing, and
  • what the solution is.

Why can’t I say that?

A full explanation is outside the scope of this article, but the basic problem arises when combining transitive properties such as part-of with cardinality constraints. It makes the ontology fall outside the “DL” profile, which means that reasoners such as HermiT can’t use it, so rather ignore it HermiT will complain and refuse to reason.

Well I want to say that anyway, what happens if I do?

You may choose to assert the axiom anyway – after all, it may feel useful for documentation purposes, and people can ignore it if they don’t want it, right? That seems OK to me, but I don’t make the rules.

Even if you don’t intend to stray outside DL, an insidious problem arises here: many OBO ontologies use Elk as their reasoner, and Elk will happily ignore these DL violations (as well as anything it can’t reason over, outside it’s variant of the EL++ profile). This in itself is totally fine – its inferences are sound/correct, they just might not be complete. However, we have a problem if an ontology includes these DL violations, and the relevant portion of that ontology is extracted and then used as part of another ontology with a DL reasoner such as HermiT that fails fast when presented with these axioms. In most pipelines, if an ontology can’t be reasoned, it can’t be released, and everything is gummed up until an OWL expert can be summoned to diagnose and fix the problem. Things get worse if an ontology that is an N-th order import has a DL violation, as it may require waiting for all imports in the chain to be re-released. Not good!

Every now and then this happens with an OBO ontology and things get gummed up, and people naturally ask the usual questions, why did this happen, why can’t I say this perfectly reasonable thing, how do I fix this, hence this article.

How do we stop people from saying that?

Previously we didn’t have a good system for stopping people from making these assertions in their ontologies, and the assertions would leak via imports and imports of imports, and gum things up.

Now we have the wonderful robot tool and the relatively new validate-profile command, which can be run like this:

robot validate-profile --profile DL \
  --input my-ontology.owl \
  --output validation.txt

This will ensure that the ontology is in the DL profile. If it isn’t, this will fail, so you can add this to your Makefile in order to fail fast and avoid poisoning downstream ontologies.

This check will soon be integrated into the standard ODK setup.

OK, so how do I fix these?

So you have inadvertently strayed outside the DL profile and your ontology is full of has-parts with cardinality constraints – you didn’t mean it! You were only alerted when a poor downstream ontology imported your axioms and tried to use a DL reasoner. So what do you do?

In all honesty, my advice to you is ditch that axiom. Toss it in the bin. Can it. Flush that axiom down the toilet. Go on. It won’t be missed. It was already being ignored. I guarantee it wasn’t doing any real work for you (here work == entailments). And I guarantee your users won’t miss it.

A piece of advice often given to aspiring writers is to kill your darlings, i.e. get rid of your most precious and especially self-indulgent passages for the greater good of your literary work. The same applies here. Most complex OWL axioms are self-indulgences.

Even if you think you have a really good use case for having these axioms, such as representing stoichiometry of protein complexes or reaction participants, the chances are that OWL is actually a bad framework for the kind of inference you need, and you should be using some kind of closed-world reasoning system, e.g. one based on datalog.

OK, so maybe you don’t believe me, and you really want to say something that approximates your parts-with-numbers. Well, you can certainly weaken your cardinality restriction to an existential restriction (provided the minimum cardinality is above zero; for maxCardinality of zero you can use a ComplementOf). So in the anatomical example we could say

Class: Hand
   SubClassOf: hasPart some Finger

This is still perfectly sound – it is not as complete as your previous statement, but does that matter? What entailments were you expecting from the cardinality axiom. If this is intended more for humans, you can simply annotate your axiom with a comment indicating that humans typically have 5 fingers.

OK, so you still find this unsatisfactory. You really want to include a cardinality assertion, dammit! Fine, you can have one, but you won’t like it. We reluctantly added a sub-property of has-part to RO called ‘has component’:

In all honesty the definition for this relation is not so great. Picking holes in it is not so hard. It exists purely to get around this somewhat ridiculous constraint, and for you to be able to express your precious cardinality constraint, like this:

Class: Hand
   SubClassOf: hasComponent exactly 5 Finger

So how does this get around the constraint? Simple: hasComponent is not declared transitive. (recall that transitivity is not inferred down a property hierarchy). Also it is a different relation (a subproperty) from has-part, so you might not get the inferences you expect. For example, this does NOT prevent me from making an instance of hand that has as parts 6 distinct fingers – it only applies to the more specific relation, which we have not asserted, and is not inferred.

You can make an argument that this is worse than useless – it gives no useful entailments, and it confuses people to boot. I am responsible for creating this relation, and I have used it in ontologies like Uberon, but I now think this was a mistake.

Other approaches

There are other approaches. For a fixed set of cardinalities you could create subproperties, e.g. has-1-part-of, has-2-parts-of, etc. But these would still be less expressive than you would like, and would confuse people.

A pattern that does work in certain cases such as providing logical definitions for things like cells by number of nuclei is to use the EL-shunt pattern (to be covered in a future article) and make characteristic classes in an ontology like PATO.

While this still isn’t as expressive, it allows you to use proxies for cardinality in logical definitions (which do actual work for you), and shunt off the cardinality reasoning to a smaller ontology — where really it’s actually fine to just assert the hierarchy.

But this doesn’t work in all cases. There is no point making characteristics/qualities if they are not reused. It would be silly to do this with the hand example (e.g. making a has5fingers quality).

Isn’t this all a bit annoying?

Yes, it is. In my mind we should be free to state whatever axioms we need for our own use cases, expressivity be damned. I should be able to go outside DL, in fact to use anything from FOL or even beyond. We should not be condemned to live in fear of stepping outside of decidability (which sounds like something you want, but in practice is not useful). There are plenty of good strategies for employing a hybrid reasoning strategy, and in any case, we never use all of DL for most large OBO ontologies anyway.

But we have the technology we do, and we have to live with it for now.

TL;DR

  • don’t mix parthood and cardinality
  • you probably won’t miss that cardinality restriction anyway
  • no really, you won’t
  • use robot to check your profiles

About Chris Mungall
Computer Research Scientist at Berkeley Lab. Interests: AI / Ontologies / Bioinformatics. Projects: GO, Monarch, Alliance, OBOFoundry, NMDC

5 Responses to Avoid mixing parthood with cardinality constraints

  1. Ralf Stephan says:

    What is the evidence for such statements, anyway? While there may be polydactyly statistics I’m skeptical anyone ever counted lost fingers. With chemical reactions it seems a clean case only until you get to polymers, or try to annotate glycosylations of the SARS-Cov-2 Spike protein based on mass spectroscopy evidence (which is inherently statistical). Finally protein complexes can only be supported for specific species, and even then the evidence for homodimer vs homotetramer can be purely statistical.

    On another note, a pragmatic solution for ontology creators might have been to provide different versions of the ontology. Filtering out the respective statements should be no problem, right?

  2. Hi Ralf – what statements are you asking for evidence for? For my general statements about these kinds of complex DL axioms doing no real work, this could potentially be quantified, e.g. by looking at entailment diffs. This might make a nice paper. For now, I am confident making these statements scoped to bio-ontologies based on what I have observed in practice, which is extensive.

    For chemical reactions we are exploring an axiomatization of Rhea that needs to deal with stoichiometry, stay tuned… but I still hold the cardinality inferences don’t buy you so much, and these may be better done outside OWL.

    I am not sure I totally follow the glycosylation example, but I’m interested in it. Ultimately the question is: would being able to represent this faithfully in OWL (if possible) buy you that much? What can you do with that OWL representation. Contrast with all the work that needs to be done with much more basic axiomatization across all bio-ontologies.

    I like your dual version proposal, this would take a bit of work and coordination to implement eg eeveryone would have to adopt naming conventions such as X-full, X-dl, X-el, we would need to implement pipelines that checked against the relevant profiles.. all trivial but so much plumbing, so much documentation to alleviate potential confusion and mis-use, and what would it buy you at the end of the day?

    Thanks for your comments!
    Chris

  3. Damion Dooley says:

    It is great to have you write up the distinction between ‘has part’ and ‘has component’. Where I have found cardinality useful in OWL via ‘has component’ is in the metadata level of expressing data standards – where a data standard indicates a minimum or maximum number of components.

    Maybe it is useful too in logic about the periodic table, keeping track of subatomic particle quantities?

    • Thanks Damion. For representing data standards, is OWL even appropriate for reasoning? Beware the OWA assumption, see https://douroucouli.wordpress.com/2020/09/04/the-open-world-assumption-considered-harmful/. I will do a future post on LinkML (http://linkml.github.io/) which may be more appropriate for linking data standards and ontologies. I may be missing some of the meta-level reasoning use cases though.

      Periodic tables: it would certainly be ontologically correct (if outside DL) to say

      uranium EquivalentTo atom and has_part exactly 92 proton

      But is it useful (even if you could reason over it with a DL reasoner)? Will you have an abox with subatomic particle individuals instantiated?

      Consider an alternative representation with a data property of `atomic_number` (or annotation property, to avoid punning). This is perhaps less aesthetically satisfying, and there are some use cases that won’t be covered (e.g. the abox with enumerated subatomic particles use case). But I am fairly sure this will be more pragmatic. And this can be made semantically sound by treating the data property as a shortcut for a full FOL expression (also outside DL) that is there for ontological completeness and allow us to be very precise, but not intended to be reasoned over in scale.

      As an aside, the main chemical ontology we use, CHEBI, is missing a lot of really easy axiomatization that would be really beneficial, before we even get to more complex things like cardinality.

      You may be interested in this schema for chemistry, which allows for both direct data property axiomatization of chemical elements, as well as expansions to full OWL or in future FOL axioms:

      https://github.com/cmungall/chem-schema

    • actually, it’s not totally clear to me that you need a transitive property for representing the ‘components’ of a standard, it’s not clear that saying (for example) a field F is “part of” a checklist C, you need to use the RO part of relation, in fact a more specific non-transitive relation may be entirely appropriate, regardless of DL restrictions.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: