OntoTip: Lift/Borrow/Steal Software Engineering Principles

This is one post in a series of tips on ontology development; see the parent post for more details.

The main premise of this piece is that ontology developers can learn from the experience of software engineers. Ontologists are fond of deriving principles from abstract concepts or philosophical traditions, whereas more engineering-oriented principles, such as those found in software engineering, have been neglected, to our detriment.

Figure: An appreciation of engineering practice and in particular software development principles is often overlooked by ontologists.


In its decades-long history, software development has matured, encompassing practices such as modular design, version control, design patterns, unit testing, continuous integration, and a variety of methodologies, from waterfall top-down design through to extreme and agile development. Many of these are more relevant to ontology development than you might think. Even if a particular practice is not directly applicable, knowledge of it can help; thinking like a software engineer can be useful.

For example, most good software engineers have internalized the DRY principle (Don’t Repeat Yourself), and will internally curse themselves if they end up duplicating chunks of code or logic as expedient hacks. They know that they are accumulating technical debt (for example, necessitating parallel updates in multiple places). The DRY principle and the DRY way of thinking should also permeate ontology development. Similarly, software developers cultivate a sense of ‘code smell’, and will tell you if a piece of code has a ‘bad smell’ (naturally, this is always code written by someone else).
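To make the DRY point a little more concrete, here is a minimal sketch of what it might look like for ontology terms. Everything in it is an assumption made up for illustration: the `Term` structure, the template wording, and the function names are hypothetical and do not come from any particular ontology or tool. The idea is only that the wording of a family of similar terms lives in one place, so a change to the pattern is made once rather than in parallel across many hand-written terms.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """A minimal stand-in for an ontology class (hypothetical structure)."""
    label: str
    definition: str
    parent: str


# Hypothetical pattern: the wording for labels and definitions is written once.
LABEL_TEMPLATE = "{structure} development"
DEFINITION_TEMPLATE = (
    "A developmental process whose outcome is the progression of "
    "the {structure} over time, from its formation to the mature structure."
)


def make_development_term(structure: str) -> Term:
    """Build one term from the shared pattern instead of hand-writing it."""
    return Term(
        label=LABEL_TEMPLATE.format(structure=structure),
        definition=DEFINITION_TEMPLATE.format(structure=structure),
        parent="developmental process",
    )


def make_terms(structures: list[str]) -> dict[str, Term]:
    """Expand the pattern for each structure, keyed by the generated label."""
    return {t.label: t for t in map(make_development_term, structures)}


if __name__ == "__main__":
    for term in make_terms(["heart", "kidney", "lung"]).values():
        print(f"{term.label}: {term.definition}")
```

If the definition wording needs to change, only `DEFINITION_TEMPLATE` is edited and every generated term stays consistent; the hand-duplicated alternative would require the same edit in each term, which is exactly the kind of technical debt described above.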

Don’t worry if you don’t have any experience programming, as the equivalent ontology engineering practices and intuitions can be learned through training, experience, the use of appropriate tools, and sharing experience with others. Unfortunately, tools are not yet as mature for ontology engineering as they are for software, but we are trying to address this with the Ontology Development Kit, an ongoing project to provide a framework for ordinary ontology developers to apply standard engineering principles and practice.

Computer scientists and software engineers are also fortunate in having a large body of literature covering their discipline in a holistic fashion; this includes classics such as The Mythical Man-Month, the “Gang of Four” Design Patterns book, Martin Fowler’s blog, and his book Refactoring. While there are good textbooks on ontologies, these tend to be less engineering-focused, at best analogs of the (excellent, but sometimes theoretical) Structure and Interpretation of Computer Programs. Exceptions include the excellent, practical, engineering-oriented ontogenesis blog.

An incomplete list of transferrable software concepts, principles, and practices includes:


I hold that all of the above are either directly transferrable or have strong analogies with ontology development. I hope to expand on many of these on this blog and other forums, and encourage others to do so.

Also, I can’t emphasize strongly enough that I’m not saying that engineering principles are more important than other attributes such as an understanding of a domain. Obviously an ontology constructed with either insufficient knowledge of the domain or inattention to users of the ontology will be rubbish. My point is just that a little time spent honing the skills and sensibilities described here can potentially go a very long way to improving sustainability and maintenance of ontologies.

About Chris Mungall
Computer Research Scientist at Berkeley Lab. Interests: AI / Ontologies / Bioinformatics. Projects: GO, Monarch, Alliance, OBOFoundry, NMDC

10 Responses to OntoTip: Lift/Borrow/Steal Software Engineering Principles

  1. Pingback: OntoTips: A series of assorted ontology development guidelines | Monkeying around with OWL

  2. This is a great collection Chris, super job! I have just one question about your list: “Separate source from compiled product (see ROBOT workflows).” The idea of a ‘compiled product’ is a little tricky to relate to the ontology development life cycle. Is it the ontology that gets produced by ROBOT workflow, or the inferred triples that get produced by an inference engine, or something else?

    Looking forward to the coming posts!

  3. mikeleganaaranguren says:

    Very important point. I would also add to the list, or at least think about: static code analysis, test coverage analysis

  4. Phillip Lord says:

    Good post! The two that I could question, I think are “modular” — clearly, yes, but the notion of modularity or at least the OWL support for it, is much weaker than modularity that we have in programming languages. In OWL it just means “stick your stuff in different files”.

    The other one is DRY. This was a mantra in software development for a long time, but I think it has weakened in the last few years, following things like the “leftpad” incident. In software development, I would now say, balance the risk of repeating yourself, against the risk of a tangled dependency graph. Of course, I will be first to admit that BTRRYARTDG is less snappy than DRY, but you can see the point I am sure.

  5. cmungall says:

    Thanks! I think all could be questioned, it’s more about building up a shared vocabulary so we can discuss these things clearly and constructively as a community. I will have a lot to say about modularity in future posts…

    The leftpad/DRY example is great. I have seen a few leftpad-analog examples in OBO, where someone imports a single class X from ontology O, when said ontology O was abandonware, and X was completely out of scope for O anyway. Furthermore, importing the O module had the side-effect of injecting a bunch of poisonous axioms. Here the ontology developer would have been better off rolling their own X.

    In this case the only side-effect was an annoying delayed release of the ontology while an incoherency spanning an import chain was debugged. But when I look at some dependency chains I worry we’re leaving ourselves open to leftpad-type incidents in the future.

    Commentary on leftpad: https://www.davidhaney.io/npm-left-pad-have-we-forgotten-how-to-program/

    Of course, the analogy isn’t perfect: for ontologies, reusing IDs avoids any ID mapping which users will thank you for, which doesn’t really have a coding analogy AFAIK.

  6. Pingback: OntoTip: Clearly document your design decisions | Monkeying around with OWL

  7. Pingback: OntoTip: Learn the Rector Normalization technique | Monkeying around with OWL

  8. Pingback: OntoTip: Write simple, concise, clear, operational textual definitions | Monkeying around with OWL

  9. Pingback: OntoTip: Don’t over-specify OWL definitions | Monkeying around with OWL

  10. Pingback: A simple standard for sharing mappings | Monkeying around with OWL
