The size of Richard Nixon’s nose, part II
February 18, 2012
In part 1 we saw how to encode a “big nose phenotype” in such a way that it was neutral with respect to the path the class expression takes through the object graph, subsuming all of:
- any entity with a nose that has the characteristic of being big
- anything that exhibits a bigness that is a characteristic of a nose
This masks the distinctions inherent in a formal ontological representation.
We can take this one step further and make our big nose phenotype encompass the nose itself, and its own bigness characteristic. The simplest way to do this would be to make the relation exhibits reflexive – either with a direct reflexivity characteristic, or a local reflexivity general axiom:
Thing SubClassOf exhibits some Self
Unfortunately this runs afoul of DL expressivity constraints. Fortunately, there is a trick at hand. A really gnarly one, but it works.
First of all we have to declare a “fake” relation – let’s append SELF onto the end of the name, giving exhibitsSELF.
Now we make this reflexive:
Thing SubClassOf :exhibitsSELF some Self
This is legal, as exhibitsSELF is a “simple” object property. Finally, we add the following:
We have sneaked our reflexivity constraint in via a fake relation. It’s a shame that all this obfuscating machinery is required to do this; it would be nice if there were some OWL syntactic sugar.
We can do the same thing for has_part, which is traditionally reflexive:
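As a rough sketch only, the fake-relation pattern for both exhibits and has_part might be written along these lines in Manchester syntax. The linking step is assumed here to be a plain sub-property axiom, and has_partSELF is a hypothetical name chosen by analogy with exhibitsSELF; neither should be read as the exact axioms used in this post.

```
# Step 1: the "fake" helper relations. They have no property chains of
# their own, so they remain "simple" object properties.
ObjectProperty: :exhibitsSELF
    # Assumed linking step: every exhibitsSELF pair is also an exhibits pair,
    # so exhibits picks up the reflexive pairs.
    SubPropertyOf: :exhibits

# Hypothetical name, by analogy with exhibitsSELF.
ObjectProperty: :has_partSELF
    SubPropertyOf: :has_part

# Step 2: local reflexivity, stated only on the simple helper relations
# (using the "some Self" notation from this post).
Class: owl:Thing
    SubClassOf: :exhibitsSELF some Self, :has_partSELF some Self
```

The point of the indirection is that the self restriction never mentions exhibits or has_part directly, so the restriction to simple properties is respected, while a reasoner can still conclude that everything exhibits itself and has itself as a part.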
With that in place we can revisit our test probe classes from last time:
EquivalentTo: :exhibits some (:big and :characteristic_of some :nose)
EquivalentTo: :exhibits some (:has_part some (:nose and :has_characteristic some :big))
EquivalentTo: :exhibits some (:nose and :has_characteristic some :big)
EquivalentTo: :has_part some (:nose and :has_characteristic some :big)
Now the inferred hierarchy looks like this:
And if we examine our 3 individuals, we see they classify as follows:
- nixon : test1, test2, test3
- nixons_nose : test1, test2, test3
- nixons_nose_size : test1, test2, test3, test4
So using the exhibits relation we can encode a very general notion of phenotype, that of exhibiting some characteristic, which classifies either the organism, the affected part, or the characteristic itself.
The machinery is rather arcane though, and does require stepping outside the EL subset of OWL. In general, it is of course better to decide on a single form. Unfortunately, no one form satisfies all purposes.
An organism-centric representation is intuitive and simple. If the instances you’re classifying are organisms (e.g. humans with disorders, mutant fruitflies, rare butterfly specimens) then this works very well. It also makes it easy to represent “composite” phenotypes such as “organism with big nose and sweaty palms”. However, if we go as far as equating the phenotype with this representation, then we have the curious situation where the organism is the phenotype, rather than the organism having a phenotype or phenotypes. If we conceive of phenotype as entirely a class-level thing, then we have one organism instantiating multiple phenotypes, but we should be clear that in this model the relationship between the phenotype instances and the organism instance is identity.
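To make the composite case concrete, here is a minimal sketch of what “organism with big nose and sweaty palms” could look like as an organism-centric class expression. The class name and the :palm and :sweaty terms are hypothetical, introduced only for illustration; :nose, :big, :has_part and :has_characteristic are the terms used elsewhere in this post.

```
# Hypothetical composite phenotype, organism-centric form:
# the class contains the organisms themselves.
Class: :organism_with_big_nose_and_sweaty_palms
    EquivalentTo:
        (:has_part some (:nose and :has_characteristic some :big))
        and (:has_part some (:palm and :has_characteristic some :sweaty))
```

Because each conjunct is itself a class of organisms, composing phenotypes is just intersection, which is what makes this form convenient for composites.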
An organism-part centric view is also intuitive and simple. For example “nose and has_characteristic some big”. But note the entailments we get from this – an abnormally big nose is part of an abnormal head, but it’s not a subclass of an abnormal head. This is in contrast to the relation we expect between the corresponding phenotypes, which is a subclass relationship (on the evidence of all pre-coordinated phenotype ontologies). So this representation is absolutely fine, but we should be clear that we are representing anatomy (perhaps variant anatomy in particular) rather than phenotypes – the relationship between the two may be trivial, and can be glossed over using the exhibits pattern above. But for modeling phenotypes ontologically we have to be clear about the distinction.
A characteristic-centric view is perhaps the most unintuitive – it asks us to believe in characteristics/qualities as individuals in the world, which is perfectly fine in the BFO ontology, but people may still have a hard time conceiving of this, in contrast to the more “physical” class expressions above. However, it offers distinct advantages. It allows us to talk directly about the characteristic itself – e.g. the dysplastic characteristic of John’s heart was due to the presence of a particular sequence in his Shh genes. If we try to switch this around we get into trouble; e.g. if we equate the “dysplastic heart” phenotype with a class expression “heart and has_characteristic some dysplastic”, and then say that this phenotype arises from a Shh mutation, we lose the fact that the dysplasticity is the characteristic we care about, rather than any of the other characteristics of John’s heart.
One other advantage of the characteristic-centric view is that it corresponds to a more traditional view of phenotypes as the characteristics of an organism.
We adopted the quality/character-centric view for defining the phenotypes in the MP ontology (see our Genome Biology paper) – this worked fairly well when we tested it by recapitulating asserted subclasses via reasoning. However, it worked less well when we used it for HP, which includes many composite phenotypes – e.g. “large flat nose”. These cannot be equated to any single characteristic; “large flat nose” is in fact two characteristics. We can get around this by equating a phenotype with either an individual characteristic, or a collection of (presumably related) characteristics. More on this in the next post on this matter….