New version of Ontology Development Kit – now with Docker support

This is an update to a previous post, creating an ontology project.

Version 1.1.2 of the ODK is available on GitHub.

The Ontology Development Kit (ODK; formerly ontology-starter-kit) provides a way of creating an ontology project ready for pushing to GitHub, with a number of features in place:

A Makefile that specifies your release workflow, including building imports, creating reports and running tests
Continuous integration: A .travis.yml file that configures Travis-CI to check any Pull Requests using ROBOT
A standard directory layout that makes working with different projects easier and more predictable
Standardized documentation and additional file artifacts
A procedure for releasing your ontologies using the GitHub release mechanism

The overall aim is to borrow as much as possible from modern software engineering practice and apply it to the ontology development lifecycle.

The basic idea is fairly simple: a template folder contains a canonical repository layout, which is copied into a target area, with user-supplied values substituted for the template variables.
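To make the copy-and-substitute idea concrete, here is a minimal sketch in shell. It is not the ODK's actual seed script: the {{project}} placeholder syntax, the file contents, and the directory names are all hypothetical, chosen only to illustrate the mechanism.

```shell
#!/bin/sh
# Illustrative only: the {{project}} placeholder syntax and all names here
# are hypothetical, not the ODK's actual template format.

workdir=$(mktemp -d)
template="$workdir/template"
target="$workdir/target/myont"

# 1. A tiny "template folder" holding one file with a placeholder in it.
mkdir -p "$template/src/ontology"
printf 'ONT = {{project}}\ntest: $(ONT)-edit.owl\n' > "$template/src/ontology/Makefile"

# 2. Copy the canonical layout into the target area.
mkdir -p "$target"
cp -R "$template/." "$target/"

# 3. Substitute the user-supplied value for the placeholder in every file.
#    The file list is written outside the target so the loop never sees
#    its own temporary files.
project="myont"
find "$target" -type f > "$workdir/files.txt"
while IFS= read -r f; do
    sed "s/{{project}}/$project/g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done < "$workdir/files.txt"

cat "$target/src/ontology/Makefile"
```

A real template has many more files and variables, but the shape is the same: the layout is fixed, and only the substituted values differ between projects.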

Some recent improvements include:

Upgrade to the new ROBOT v1.1.0
Use of Docker
Inclusion of Design Pattern templates
Standard set of SPARQL queries
Minor feature improvements such as interactive mode

I will focus here on the adoption of Docker within the ODK. Most users of the ODK don’t really need to know much about Docker – just that they have to install it, and it runs their ontology workflow inside a container. This has multiple advantages – ontology developers don’t need to install a suite of semi-independent tools, and execution of workflows becomes more predictable and easier to debug, since the environment is standardized. I will provide a bit more detail here for people who are interested.

What is Docker?

From Wikipedia: Docker is a program that performs operating-system-level virtualization, also known as containerization. Docker can run containers on your machine, where each container bundles its own tools and environment.
[Figure: Docker architecture. Docker containers, from Docker 101]

A common use case for Docker is deploying services. In this particular case we’re not deploying a service but are instead using Docker as a means of providing and controlling a standard environment with various command line tools.

The ODK Docker container

The ODK Docker container, known as odkfull, is available from the obolibrary organization on Docker Hub. As of the latest release, it comes bundled with a number of tools:

A standard Unix environment, including GNU Make
ROBOT v1.1.0 (Java)
Dead Simple OWL Design Patterns (DOSDP) tools v0.9.0 (Scala)
Associated Python tooling for DOSDPs (Python 3)
OWLTools (for older workflows) (Java)
The ODK seed script (Perl)

There are a few different scenarios in which an odkfull container is executed:

As a one-time run when setting up a repository using seed-via-docker.sh (which wraps a script that does the actual work)
After initial setup and pushing to GitHub, ontology developers may wish to execute parts of the workflow locally – for example, extracting an import module after adding new seeds for external ontology classes
Travis-CI uses the same container used by ontology developers
Embedding within a larger pipeline

Setting up a repo

Typing

./seed-via-docker.sh

will initiate the process of making a new repo, depositing the results in the target/ folder. This is all done within a container. The seed process generates a workflow in the form of a Makefile and then runs that workflow, all in the container. The final step of pushing the repo to GitHub is currently done by the user directly in their own environment, rather than from within the container.

Running parts of the workflow

Repos built with the ODK include a one-line script in the src/ontology folder* called “run.sh”. This is a simple wrapper for running the Docker container. (If you built your repo with an earlier version of the ODK, you can simply copy this script.)

Now, instead of typing

make test

the ontology developer can type

./run.sh make test

The former requires that the user has all the relevant tooling installed (which requires at least Xcode on OS X, which not all curators have). The latter requires only Docker.
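For readers curious about what a wrapper of this kind might do, here is a rough sketch written as a shell function. It is not necessarily the exact contents of run.sh. The obolibrary/odkfull image name comes from the section above, and mounting the directory two levels up follows the footnote; the /work mount point, the odk_run function name, and the DOCKER override variable (used only to allow a dry run) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a run.sh-style wrapper. The /work mount point,
# the function name, and the DOCKER override are assumptions.

# Mount the repository root (two levels above src/ontology) into the
# container, start in the matching working directory, and run whatever
# command was passed in, e.g. "make test".
odk_run() {
    ${DOCKER:-docker} run --rm \
        -v "$PWD/../..:/work" \
        -w /work/src/ontology \
        obolibrary/odkfull "$@"
}

# Usage, from within src/ontology:
#   odk_run make test
# Dry run that only prints the docker arguments:
#   DOCKER=echo odk_run make test
```

The point of the design is that the command after the wrapper stays the same as it would be without Docker; only the environment in which it runs changes.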

Travis execution

Note that the generated .travis.yml file is configured to run the Travis job in an odkfull container. If you generated your repo using an earlier version of the ODK, you can manually adapt your existing Travis file.

Is this the right approach?

Docker may seem quite heavyweight for something like running an ontology pipeline. Before deciding on this path, we ran some tests with volunteers in GO who were not familiar with Docker. These editors needed to rebuild import files frequently, and having everyone install their own tools has not worked out well in the past. Preliminary results indicate that the editors are happy with this approach.

It may be the case that in future more can be triggered directly from within Protege, or that some ontology environments such as Tawny-OWL are powerful enough to do everything from one tool chain. But for now the reality is that many ontology workflows require a fairly heterogeneous set of tools, often written in multiple languages, which complicates the install environment. Docker provides a nice way to unify this.

We’ll put this into practice at ICBO this week, in the Phenotype Ontology and OBO workshops.

Acknowledgments

Thanks to the many developers and testers: David Osumi-Sutherland, Nico Matentzoglu, Jim Balhoff, Eric Douglass, Marie-Angelique Laporte, Rebecca Tauber, James Overton, Nicole Vasilevsky, Pier Luigi Buttigieg, Kim Rutherford, Sofia Robb, Damion Dooley, Citlalli Mejía Almonte, Melissa Haendel, David Hill, Matthew Lange.

More help and testers wanted! See: https://github.com/INCATools/ontology-development-kit/issues

Footnotes

* Footnote: the Makefile has always lived in the src/ontology folder, but the build process requires the whole repo, so the run.sh wrapper maps two levels up. It looks a little odd, but it works. In future, if there is demand, we may move the Makefile to the root folder.


Published by Chris Mungall
