An Expedition in Semantic Publishing

May 19, 2012

Overview

To explore what “semantic publishing” means I pushed at the boundaries by submitting an ontology of amino acids in the RDF/XML representation of OWL to the Sepublica semantic publishing workshop. The ontology captures the semantics of a domain, it is represented in a Semantic Web language, and the ontology is published on the Web. So, is it a semantic publication? Can a workshop on semantic publication deal with a semantic publication? The upshot is that my provocative submission does seem to count as a semantic publication, but we do need some words around the published semantics to help us out – that is, some narrative. Ultimately, we want semantic literature and literature of semantics.


The Sepublica Narrative

This blog reports on an expedition I made into semantic publishing with my friend Phil Lord. This was all done in the context of the Sepublica 2012 semantic publishing workshop at ESWC in Crete. It all started as a bit of fun testing the boundaries of what I could get away with, but, by provoking discussion, it’s also had some very interesting effects and reactions. All in all it’s led to something rather good and fun.

What I do on this blog is to report on motivation, what I actually did, the responses to it and what’s come out in the end. First of all, however, I thank the reviewers and the Sepublica organisers for joining in and letting me publish reviews for the “semantic publication” and the email dialogues I had with the Sepublica people (this has made the blog rather long, but I think it supports this narrative).

So, the back story: Phil and I did a “proper” article for Sepublica about light-weight semantic publishing in the knowledge blog platform. On submission, I noticed the following on the Sepublica instructions to authors web page:

…We also invite submissions in XHTML+RDFa or in the format or YOUR semantic publishing tool. However, to ensure a fair review procedure, authors must additionally export them to PDF.

— Sepublica Organisers

My first reaction was “I wonder what will happen if I submit just an RDF document?”; that is, an OWL ontology in its RDF/XML syntax. This is where the “trying it on” bit comes in; can I take it literally and “publish” a document in RDF as a contribution to this workshop? My reasoning went like this:

  • An OWL ontology captures the semantics of a field of interest;
  • It is a document;
  • It has an RDF serialisation;
  • It has a URI, so it can be on the Web and found.

So, an OWL ontology is a semantic document, in RDF and published on the Web – anything on the Web is published… So, that is indeed what I did. The longest bit of the process was choosing the ontology that I had lying around that could work for the “expedition” into semantic publishing; this must be one of the cheapest publications I’ve ever done. I chose the Amino Acids Ontology, which is a small ontology that captures the basic biochemistry of amino acids and does so in a way that exploits automated reasoning.

Here’s the next bit of the story:

  • I chose the amino acids ontology. Phil and I originally made this to show off the wizards in the OWL plugin for Protege 3 and how we could use them to very rapidly create this ontology of amino acids;
  • I added a Dublin Core “title” annotation to give my document a title;
  • I added myself and Phil as authors (though other people have contributed to the ontology over time as the annotations on the ontology describe);
  • I made my own “abstract” annotation property to give the document a “narrative abstract”;
  • and that was my semantic publication finished.

Here is a fragment of the ontology’s “title page”:

Annotations:
    title "Semantic Publishing of Knowledge about Amino Acids"@en,
    author "Robert Stevens and Phillip Lord",
    abstract "We semantically publish knowledge about the amino acids commonly
    described within biochemistry. The classification of amino acids is based
    on Taylor's article (PMID:3461222) from 1986 published in the Journal of
    Theoretical Biology. The ontology goes further than the static paper
    version; it combines many aspects of the physicochemical properties Taylor
    uses to classify amino acids to give a rich, multi axial classification of
    amino acids. Taylor's original description of the amino acid's
    physicochemical properties are captured with value partitions and
    restrictions on the amino acid classes themselves. A series of defined
    classes then establishes the multi-axial classification. By publishing
    this knowledge about amino acids as a semantic document in the form of an
    ontology we persue an agenda of disruptive technology in publishing. Blogs
    about the published semantics of amino acids may be found at
    http://robertdavidstevens.wordpress.com/2010/12/18/an-update-to-the-amino-acids-ontology/
    and links following."@en,

So, it has some minimal trappings of a traditional publication. This also gives an outline of our attitude to the ontology as a semantic publication; the ontology is a semantic artefact, but we do link out to some blogs that give some narrative on the ontology…
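For readers who want to try the same trick, here is a minimal sketch of how such a “title page” could be written into the RDF/XML header of an ontology using only Python’s standard library. It is an illustration, not how the Amino Acids Ontology was actually built: the ontology IRI and the `ex:` namespace for the home-made “abstract” property are placeholders, and the use of `dc:creator` for the authors is an assumption (the fragment above only shows an “author” annotation).

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
DC = "http://purl.org/dc/elements/1.1/"
# Hypothetical namespace for the home-made "abstract" annotation property.
EX = "http://example.org/semantic-publication#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def title_page(ontology_iri, title, authors, abstract):
    """Return RDF/XML for an ontology header carrying publication-style annotations."""
    for prefix, uri in (("rdf", RDF), ("owl", OWL), ("dc", DC), ("ex", EX)):
        ET.register_namespace(prefix, uri)

    root = ET.Element(f"{{{RDF}}}RDF")

    # Declare the custom property so OWL tools treat it as an annotation property.
    ET.SubElement(root, f"{{{OWL}}}AnnotationProperty",
                  {f"{{{RDF}}}about": EX + "abstract"})

    onto = ET.SubElement(root, f"{{{OWL}}}Ontology",
                         {f"{{{RDF}}}about": ontology_iri})
    ET.SubElement(onto, f"{{{DC}}}title", {XML_LANG: "en"}).text = title
    # dc:creator is an assumption; the source only shows an "author" annotation.
    ET.SubElement(onto, f"{{{DC}}}creator").text = authors
    ET.SubElement(onto, f"{{{EX}}}abstract", {XML_LANG: "en"}).text = abstract
    return ET.tostring(root, encoding="unicode")


print(title_page(
    "http://example.org/amino-acid.owl",  # placeholder IRI, not the real one
    "Semantic Publishing of Knowledge about Amino Acids",
    "Robert Stevens and Phillip Lord",
    "(narrative abstract goes here)",
))
```

The point of the sketch is only that the “trappings of a traditional publication” are a handful of annotation triples on the ontology itself; everything else in the file is the published semantics.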


Submitting a Semantic Publication to Sepublica

Next came submitting the publication to EasyChair. The instructions above said we had to give a PDF version of the submission to ease the review process. As pointed out elsewhere, this is a sad irony of fora on alternative or next generation publishing – they use “lumpen PDF”… So, I saved my ontology as Manchester OWL Syntax and turned it into PDF. This had two motivations – one was “that will show them….” (is there anything as useless as a Manchester syntax dump of an ontology converted to PDF?) and the second was to submit this ludicrous document alongside the more sensible RDF version of the ontology. Unfortunately, the EasyChair Sepublica site wasn’t set up to take anything other than PDF; the organisers changed it for me, but the only way to get RDF in was to zip it up and submit one file. So, I zipped up the RDF and submitted the ontology, but without the silly PDF version.

This is where the dialogue started.

Dear both, I was trying to take a look at the paper you submitted “Semantic Publishing of Knowledge about Amino Acids” the problem I had was that the uncompressed zip file generates an OWL file (nothing wrong with the OWL file, I opened it protege) but there is not an actual paper -as in a PDF file. Could u resubmit and make sure to include the actual paper.

— Sepublica Organisers

and

The ontology is our submission. the workshop pages said that RDF submissions were acceptable and that’s what we submitted. The ontology is a document that captures, in a computational form, the semantics of amino acids. The URI resolves to a web address from which the semantic publication can be read, so I think this counts. As the instructions for authors said, I did produce a PDF version of our publication, but the RDF one works much better. I think we’ve fulfilled the instructions to authors — is there anything else we need to do?

— Robert Stevens

and the reply was:

you are right, no problem

— Sepublica Organisers

Which was the right answer – good for them. At this point the submission was sent to the reviewers.


The Reviews

The instructions sent to the reviewers were:

please note: This is a true semantic publication.

It does not quite stick to the rules (as the authors didn’t submit a PDF export limited to 12 pages), but nevertheless we (= Alex and me) decided not to reject it before reviews.

We recommend that you treat this submission as if it were a paper describing an ontology.

You can read the ontology with your favorite ontology editor (e.g. Protégé), but we also recommend that you open it in a text editor to see the publication-style “header”. The blog post linked from the “abstract” should also be considered part of this submission.

— Sepublica organisers

Below is what we got back.

Note: This submission has been evaluated as an ontology rather than as a paper. The blog has also been read in order to better evaluate this work.

What is the target research problem? The ontology represents the 20 amino acids used in biology as well as their characteristics such as polarity, size, etc.

What are the strong points and weak points of the paper? The ontology proposed is well documented and highly relevant for bioinformatics and related domains.

Does the paper evaluate its contribution? Is it aware of related work?

Further comments (if applicable) There should be a formal submission of the corresponding paper. I would like to see an evaluation against competency questions. I would also like to know how this ontology has been used. Do the authors have a particular project in mind? How can it be used in conjunction with protein ontologies or others? Why it is better to represent this information in an ontology rather than other formats? I think the ontology itself is interesting but even more would be its use.

Minor issues (if applicable) d be nice to know how that domain can benefit from this ontology.

— Reviewer One

The research problem that the authors are somewhat sarcastically addressing is the question of how to publish machine readable (“semantic”) documents. Though they don’t really spell this out in text, they are proposing that the important knowledge in a publication should be be represented and distributed as an OWL ontology that is completely distinct from a traditional body of text that would be distributed as a PDF for human readers… As they state: “publishing this knowledge about amino acids as a semantic document in the form of an ontology we persue an agenda of disruptive technology in publishing”

One interesting effect of their disruptive submission is that, as a reviewer, I am forced to attempt to examine the knowledge content directly without recourse to complain about grammar, document structure or image quality – which I think is a positive. This raises the problem that, not being a biochemist, I need some way to tell whether their ontology is correct. Sadly, the only real way that I, as a lowly human reader, can evaluate the knowledge content of the ontology is to go back to read the papers from which this knowledge was extracted in the first place…

I like the authors main point if I may guess it as something like: “We have a great knowledge representation language ready for use in publishing called OWL and we should use it directly in the publishing process”. But I don’t think that we can escape from also publishing knowledge in a form that human readers can easily consume.

So, what to do with this submission? I think it would be most useful for the meeting if the authors would make their proposal for OWL-based semantic publishing explicit by writing an editorial style article (in English) that states their case. They should, of course, include an OWL version of this editorial so that we can verify that their reasoning is sound.

— Reviewer Two

This is all very interesting, but before we unpack the reviews, I should come clean about the disappointment of having the “paper” accepted for presentation at Sepublica; Phil and I really wanted it to be rejected on the grounds that there was no publication. This would have enabled us to say that the whole thing is completely ridiculous… However, our bluff was called and the reviewers and the Sepublica organisers have gone with it and good things look like they’re coming out of the whole expedition.

Reviewer One says he/she is reviewing it as an ontology not a paper… even though the ontology (in our view) is a semantic publication. Reviewer One just plays it with a straight bat – let’s just treat it as an ontology. It may be that the two should be indistinguishable – should an ontology be treated any differently from a paper? The axioms of the ontology are a theory about the domain; it doesn’t have the form of the traditional scientific paper, but can it be treated as such – is this comment about it not being a paper eventually going to be “old fashioned”? The interesting point is that he/she wants descriptions of the ontology’s use; something that is not in the ontology, it is in the narrative surrounding the ontology (or would be if we had a use other than teaching for this ontology). Phil Lord has talked about literate ontology as an analogy to literate programming; we should be able to have narrative for the ontology surrounding that ontology. There is, however, something to distinguish between what the ontology says about its field of interest and what we want to say about the ontology as an artefact.

Reviewer One also says he/she read the blogs to get the narrative, which sort of plays to this need for a narrative. However, I still think that the basic point that the ontology is a semantic publication holds; it may need more narrative, but I remain to be convinced that “There should be a formal submission of the corresponding paper.” Finally, the comment “Why it is better to represent this information in an ontology rather than other formats?” is fun for a workshop on semantic publishing – is this (our ontology) a good way to publish semantics for a field? I claim that this ontology captures a lot of basic biochemistry of amino acids; the background chemistry belongs elsewhere, but this ontology captures an early lecture in biochemistry. The biological and chemical implications of the amino acids’ characteristics are beyond what we’ve done, but I’m happy to argue that the ontology as it stands is a good way of publishing basic knowledge about the semantics of amino acids. If it isn’t, then we’ve been wasting an awful lot of time on ontologies.

The point about narrative comes out even more in Reviewer Two’s review. I’ve never had a paper of mine described as having an element of sarcasm – I’m very proud of this achievement. Reviewer Two said:

One interesting effect of their disruptive submission is that, as a reviewer, I am forced to attempt to examine the knowledge content directly without recourse to complain about grammar, document structure or image quality – which I think is a positive.

— Reviewer two

which I do understand, but a good part of this tedious element of reviewing is to make sure the publication can be understood as a publication. Can we do this with an ontology? An ontology should have a tutorial or reference aspect, but we don’t really know how best to present them for many applications. I’m prepared to state, however, that OWL isn’t the way to present an ontology to users (except, perhaps, the authors) and various graph visualisations are only part of the solution, but all of this is another story.

This raises the problem that, not being a biochemist, I need some way to tell whether their ontology is correct. Sadly, the only real way that I, as a lowly human reader, can evaluate the knowledge content of the ontology is to go back to read the papers from which this knowledge was extracted in the first place…

— Reviewer Two

Perhaps one day “papers” will be assessed against the ontologies that capture background knowledge. However, this reviewer is right to point out that it is difficult to review an ontology (too much ontology evaluation/review is based on “would I have done it this way….”). A wider question is how does one evaluate/review any semantic publication?

Finally, we have:

“We have a great knowledge representation language ready for use in publishing called OWL and we should use it directly in the publishing process”. But I don’t think that we can escape from also publishing knowledge in a form that human readers can easily consume.

— Reviewer Two

which is and isn’t what we’re saying. I couldn’t write my whole publication in OWL or even all of FOL, nor would I wish to. This all gets to the heart of it; we want semantic publishing, but we also want narrative. What counts as a semantic publication – a lump of RDF; a trad paper with a bit of RDF or OWL or FOL in it; or a trad paper with some typed links? Whatever the nature of a semantic publication or a semantic scientific publication, we do need narrative.

Reviewer Two ended up saying “…make their proposal for OWL-based semantic publishing explicit by writing an editorial style article (in English) that states their case. They should, of course, include an OWL version of this editorial so that we can verify that their reasoning is sound.”, which is exactly in the right vein; I take my hat off to him/her.

What we did was write a little position paper outlining what we did – this makes the point that semantic publishing needs narrative. This goes for semantic scientific publishing, but for data as well. I like the comment about representing the position paper as an OWL ontology to check reasoning. I actually thought about this and it would be a good exercise, but not one I felt I could turn around in the few days available for making our proceedings version – perhaps it’s worth pointing out that writing a trad paper is actually easy compared to writing the ontology version – especially if you take away all the poncing around one has to do when publishing in a trad forum. Getting narrative structure into a semantic document would be fun, or do we want a proper hypertext document where whatever route you take through the structure you get the same message?


The final bit

The general instructions to authors for Sepublica’s final, camera-ready version were:

Dear Robert,

You have already received the comments by the reviewers in a previous email. Please take them carefully into account when preparing your camera-ready paper for the proceedings.

The final paper and the signed copyright form are due on

FRIDAY APRIL 13 23:59 (Hawaii time)

This is a firm deadline for the production of the proceedings.

  1. FINAL PAPER: Please submit the files belonging to your camera-ready paper using your EasyChair author account. Follow the instructions after the login for uploading two files:
    (a) either a zipped file containing all your LaTeX sources
        or a Word file in the RTF format, and
    (b) PDF version of your camera-ready paper.

The final submission must be in LNCS format (instructions: http://www.springer.de/comp/lncs/authors.html). Research papers are strictly limited to 12 pages, position papers to 5 pages, and system/demo descriptions must be between 2 and 5 pages.

  2. COPYRIGHT: The copyright form can be found below. It is sufficient for one of the authors to sign the copyright form. You can scan the form into PDF or any other standard image format, but even a text file with your name entered is sufficient.

— Sepublica Organisers

and this was my supplementary message from Sepublica:

Dear authors,

of course the “must be LNCS, N pages” etc. restriction is not applicable in your case.

However, I need something for the old-fashioned PDF version of the proceedings. Please find some suggestions below. If you should just upload the ontology file, I’m going to print it to a PDF from a text editor – but there should be nicer ways.

Maybe a title page in LNCS style, up to and including the abstract.

Then, a pretty-print of your ontology might follow. We are not going to print the proceedings on paper, so we do not really have physical page limits.

Please be innovative! :-)

— Sepublica Organisers

One thing this strongly suggests is that we don’t really know what to do with a semantic publication. I don’t think my PDF of the Manchester Syntax for the ontology either counts as a pretty print or is really the way to do a semantic publication anyway. This was, however, when the Sepublica organisers turned the tables on me, effectively saying ‘OK, so you’re publishing semantically – let’s get on with it’. Even though I think I’ve published semantically, I sort of gave in at this point and did the aforementioned position paper. I do, however, believe that my ontology is a semantic publication; we just don’t yet know how to handle semantic publications.

We have a feeling that semantic publishing must be a good thing, but it’s all rather uncharted territory at the moment. We want material in our scientific publications to be more computationally accessible. We want semantically described data. But what is a semantic publication? How much semantic content does a publication have to have to be a semantic publication? Perhaps the goal for the next Sepublica is to not have PDF as the output, but to challenge the community to do some semantic form of publication to test some boundaries.

The First UK Ontology Network Workshop

April 17, 2012

We have just had what I thought was a successful first full meeting of the UK Ontology Network in Manchester. The idea was put together at a meeting organised by Pierre Grenon and Anthony Galton at the Open University on 30 April 2010. At the 12 April 2012 meeting, organised by James Malone and me, we had 100 people registered and about 80 people attending. This itself was a good thing to see – gathering 100 people shows a vibrant ontology community in the UK. One of the best things about the meeting was that we had people in the audience from many communities that use ontologies, not just my home community of biomedicine. We had people from biomedicine, music, geography, government, the national archives, the BBC and the NHS (and probably some I’ve left out). We also just had a lot of people doing stuff and, I’m pleased to say, talking about putting applications together using OWL and automated reasoners, particularly ELK. We had clusters of 3 people giving 5 minute talks with a little bit of discussion and we then had some software demos after lunch. We then just had a long session of hanging around talking – which was good.

We used the hashtag UKON2012 on twitter (which was active) and our UKON GooglePlus page. We’ll put the presentations up on the UKON site and prepare some other materials from the Tweets, GooglePlus pages and so on, such as issues, capabilities, and themes of work.

Here I’ll just talk a bit about a few of the talks:

Tom Graham’s (BBC) talk about using linked data to generate the BBC’s Olympics pages showed an impressive process and its result. A light-weight publishing process lets the journalists write their piece, tag it and push it through a pipeline that allows aggregation and rich inter-linking of the BBC’s Olympic content. The take home message was that Tom reckons that the Web site couldn’t have come into being in the timely fashion it has without the use of Semantic Web technologies – a sign of increasing maturity.

Phil Lord’s (Newcastle) presentation on the Knowledge Blog’s publication process generated lots of comments during the breaks – using another light-weight publication process to gather light metadata and semantics about the page, its contents and the references. It was a nice show and Phil’s video should be looked at…

Barry Smith (NCBO) said that most of the content on BioPortal was “crap” and that there were only four good ontologies in the world. All of this was in support of the proposal that lots of training is needed – a reasonable point, though one that is just true. One of the “good” ontologies he named as coming from Aberystwyth (Larisa Soldatova was in the audience and asked the question to identify the four). This leaves three ontologies and there was some speculation about their identity – we know Barry likes the FMA – so let’s count it as one. This leaves only two other good ontologies in the world. This means that at least all but two of the OBO ontologies are “crap” and presumably contribute some of the “crap” to BioPortal. Presumably, then, almost all the OBO Foundry ontologies are “crap” too.

Dave De Roure (Oxford) introduced the audience to semantic music. A lot of music data is getting out there as linked data, but with some semantics. Dave told of a music and linked data workshop he set up; expecting 20 participants, he got 200. I’d interpret this as an appetite for getting stuff out there and exposed for use. One of the jobs of this UKON community is to get it out there in a form that optimises its usefulness and semantic content. Dave also mentioned work that Sean Bechhofer, Kevin Page, he and I had recently started on an OWL knowledgebase of the outputs of digital analysis of all the songs on Sergeant Pepper’s to give lists of the segments of the songs for query and exploration. Dave ended by pointing out that the music sector is far along the digitisation and tagging route and that other disciplines could well look to it for lessons.

Ian Horrocks (Oxford) gave a good overview of work at Oxford that included a bit of retrospective. One of the good things that Ian ended on was a lot of collaboration and interest from industry – this is good to see and is an indication of maturity. One of the winners of the day was ELK – the fast OWL EL reasoner – that was mentioned several times as enabling work, and we’re seeing on-line applications using OWL reasoners – which is a good thing and more indications of maturity.

Jeremy Rogers (NHS) gave an entertaining talk about the use of SNOMED in the NHS. He mentioned 30 million annotations of patient records with SNOMED terms as a result of visits to family doctors by people in the UK. He also mentioned the worrying aspect of annotation quality and quality assurance in general – another theme of the day. The under-annotation and mis-annotation was a bit frightening and plays to the need to develop tools and techniques (as well as the ontological/terminological underpinnings) that will give better annotations/codings, not only from SNOMED by NHS people, but by all users of ontologies.

Throughout the day there was a call for tooling to support the use of ontologies in the community. There’s a need to enable the development and use of OWL ontologies with the same level of sophistication as we have for handling the programme code for software applications. Though this wasn’t explicitly mentioned, we are not replete with OWL tools. We have Protege as (probably) the most widely used OWL environment – many people depend on it – and its funding hangs by a slender thread. The community, of which the UKON meeting is evidence, needs to come together to make sure that there is not only a good tool chain, but that the vital elements of that chain are both secure and have safety in numbers. As a community we should stop thinking that the tools are the responsibility of others and help, by whatever means, to make the tools happen. That this UKON meeting can gather 100 registrants with relative ease from within the UK (and a couple from the US) shows that there is a vibrant community, from the fundamentals of representation languages and automated reasoning to a wide range of application domains.

There will be more UKON meetings…

Unlocking OWL Ontologies

March 16, 2012

Ontologies, even when presented in the more user-friendly Manchester OWL syntax (the one used in Protégé), can be impenetrable to the uninitiated. To address this problem, the SWAT project have developed a system (named OntoVerbal) that automatically translates an OWL ontology into natural language text. We would be most grateful if you would help us test the system by participating in a short (20 min) experiment that involves reading 10 such texts and translating them back into the corresponding OWL.

Please click on the link below for further details and instructions:

https://www.surveymonkey.com/s/G7HZ25P

All participants will be entered into a prize draw for Amazon vouchers:

  • 1st Prize: £50 ($80)
  • 2nd Prize: £30 ($50)
  • 3rd Prize: £20 ($30)

Please circulate this request to anyone you think may be interested

Making sure my brother and I have the same grandparents

March 16, 2012

I recently closed off another hole in my Family History Knowledgebase (FHKB). OWL’s open world assumption means that to make many of the desired inferences in an ontology or knowledgebase, one has to make sure the reasoner has no “possibilities for doubt”. I’ve written before about closing down areas of the FHKB with respect to how many children people have. I’ve also closed parts of the amino acid ontology and written in general about closure. The example of grandparents (and parents etc.) in the FHKB is just another example of having to be really tight.

My brother Richard and I have the same parents and grandparents (the latter being William and Iris on my Dad’s side and Charles and Violet on my Mum’s side). If I write two defined classes as follows:

Class: GrandparentOfRobert
        EquivalentTo: Person
                that isParentOf some (Person that isParentOf value Robert)

Class: GrandparentOfRichard
        EquivalentTo: Person
                that isParentOf some (Person that isParentOf value Richard)

they both give the same answer of William, Iris, Charles and Violet. The two defined classes are, however, not themselves inferred to be equivalent, even though they appear to have the same extents. This last point is the crucial one – they only appear to have the same extents; we just have to ignore our domain knowledge. In my description of Person I’ve left it open that there may be more ways of having a parent than having a mother and a father… So it is possible that I have other parents than my Mum and Dad, and thus any old number of grandparents can exist; the FHKB implies that I have at least two parents and at least four grandparents – I could have more.

In the FHKB I have the central class of:

Class: Person
        SubClassOf: hasMother some Woman,
                hasFather some Man

hasMother and hasFather are functional, so each person can be related by hasMother to at most one individual (a Woman) and by hasFather to at most one individual (a Man). hasMother and hasFather are sub-properties of hasParent. In this little property hierarchy I’ve said there are two known ways to have a parent – by having a mother and by having a father. I haven’t said there are no other ways and to close things tightly I need to do so. In OWL, I can’t put a closure axiom on the hasParent property. I would like to say hasParent EquivalentTo: hasMother or hasFather, just like a closure axiom on a class. I can, however, close down the possibilities of how many ways a Person can have a parent by doing the following:

Class: Person
        SubClassOf: hasParent exactly 2 Person,
                hasMother some Woman,
                hasFather some Man

and then everything works. I’ve said a person can have only one mother and one father and now I’ve said a person can have just two parents; so I’ve closed off possibilities of having other kinds of parents – a person must have two parents and one must be a father and one must be a mother (making two). The exactly makes the reasoner run like a pig on stilts, but replacing the exactly with a max makes it run sensibly. With this addition to the FHKB, the classes for the grandparents of Richard and Robert are inferred to be equivalent. An example FHKB fragment with this closure is available. You can remove the closure axiom on Person to see it run rather slowly on classification.
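To see why the closure matters, it can help to think in terms of models rather than answers. The toy Python sketch below is not an OWL reasoner and is not part of the FHKB; it simply treats a “model” as a map from each person to their set of parents. The names `Mum`, `Dad`, `X` and `Y` are placeholders invented for the illustration. Two classes are only equivalent if they have the same members in every model the axioms allow, so a single permitted counter-model is enough to block the inference.

```python
def grandparents_of(model, person):
    """Everyone who is a parent of some parent of `person` in this model."""
    return set().union(*(model.get(p, set()) for p in model.get(person, set())))


def same_grandparents(model):
    """Do Robert and Richard have the same grandparents in this model?"""
    return grandparents_of(model, "Robert") == grandparents_of(model, "Richard")


def respects_max_two_parents(model):
    """The closure: nobody may have more than two parents."""
    return all(len(parents) <= 2 for parents in model.values())


# The asserted facts: child -> set of known parents.
ASSERTED = {
    "Robert": {"Mum", "Dad"},
    "Richard": {"Mum", "Dad"},
    "Dad": {"William", "Iris"},
    "Mum": {"Charles", "Violet"},
}

# A counter-model the open world allows when there is no closure:
# Robert has a third (hypothetical) parent X, whose own parent is Y.
COUNTER_MODEL = {**ASSERTED, "Robert": {"Mum", "Dad", "X"}, "X": {"Y"}}

for name, model in (("asserted", ASSERTED), ("counter-model", COUNTER_MODEL)):
    print(name,
          "| same grandparents:", same_grandparents(model),
          "| allowed by closure:", respects_max_two_parents(model))
```

In the asserted model the two sets of grandparents coincide, which is why the queries give the same answer. In the counter-model they differ, and without the closure nothing rules that model out. The cardinality restriction on hasParent excludes it, and, together with the functional hasMother and hasFather, that is what lets a real reasoner conclude the two classes are equivalent. Ruling out one counter-model here is only an illustration of the idea, not a proof.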

iKUP wins the Ontologies Come of Age in the Semantic Web grand challenge

February 12, 2012

The Kidney and Urinary Pathway Knowledgebase (KUPKB) (http://www.kupkb.org) and its web front-end, the iKUP browser, have won first prize at the Ontologies Come of Age in the Semantic Web grand challenge held at the International Semantic Web Conference in 2011 at Bonn. The KUPKB uses an application ontology made from some OBO ontologies together with extensions and bespoke fragments of ontology to form a schema into which we’ve put annotations on genes, proteins and experiments across various ‘omic levels. The iKUP browser is a GWT built front-end that exposes the KUPKB through a faceted browser that allows users to browse and search the KUPKB’s contents via the various aspects captured in the KUP Ontology.

There are a couple of noteworthy things about our approach:

  1. Julie Klein, one of our collaborating biologists in Toulouse, added most of the data from various investigations. We didn’t have her directly adding axioms by hand, but instead used RightField spreadsheets to help her do this task. RightField is a semantic spreadsheet in which menus tied to portions of ontologies can be embedded. This means a standard Excel spreadsheet can be used to add ontology terms to data and have only appropriate terms made available to the user; these marked-up spreadsheets are then transformed to KUPKB content by scripts. Julie also added a lot of content to the KUPKB’s ontology (KUPO) with an extension of RightField called Populous, a semantic spreadsheet type application for describing entities according to various ontologies and then having the spreadsheet’s contents transformed into axioms via the Ontology Preprocessor Language (OPPL) and put in the KUPKB.
  2. Simon Jupp, who built the KUPKB, also made the iKUP browser. This is a GWT front-end to the KUPKB that allows searching and faceted browsing (based on the KUPO) so that biologists can find genes and proteins together with associated experimental data. The iKUP browser helps construct SPARQL queries against the KUPKB as well as using a bit of OWL reasoning (with HermiT). The iKUP is the key; it lets biologists use the KUPKB without necessarily knowing they are using Semantic Web technologies – it is just a Web page.
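
To give a flavour of what "helping construct SPARQL queries" from facet choices can look like, here is a minimal sketch. It is not the iKUP code: the `ex:` namespace, the `Gene` class and the facet names are hypothetical placeholders, and the function only builds a query string.

```python
def build_facet_query(facets, limit=50):
    """Build a SPARQL SELECT string from {property: class} facet choices.

    All vocabulary below (the ex: namespace, ex:Gene, the property names)
    is hypothetical and stands in for whatever the real ontology provides.
    """
    lines = [
        "PREFIX ex: <http://example.org/kup#>",
        "SELECT DISTINCT ?gene WHERE {",
        "  ?gene a ex:Gene .",
    ]
    for i, (prop, value_class) in enumerate(sorted(facets.items())):
        # Each chosen facet adds one triple pattern plus a type constraint.
        lines.append(f"  ?gene ex:{prop} ?v{i} .")
        lines.append(f"  ?v{i} a ex:{value_class} .")
    lines.append("}")
    lines.append(f"LIMIT {limit}")
    return "\n".join(lines)
```

A call such as `build_facet_query({"expressedIn": "Glomerulus"}, limit=10)` yields a query the user never needs to see; they only tick a facet on a web page.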

Our OCAS submission was accompanied by a short paper:

Simon Jupp, Julie Klein, Panagiotis Moulos, Joost Schanstra and Robert Stevens. Ontologies Come of Age with the iKUP Browser. In Alexander Garcia Castro, Ken Baclawski, John Bateman, Kim Viljanen, and Christoph Lange, editors, Proceedings of the Workshop Ontologies Come of Age in the Semantic Web, International Semantic Web Conference, number 809 in CEUR Workshop Proceedings, pages 25-28, Aachen, 2011.

In this paper we outlined some criteria for “coming of age” and how we think KUPKB and iKUP meet them:

  1. When Ontologies / Semantic Web technologies are used outside of their community
  2. When the technology becomes transparent to the user experience
  3. Tools and APIs are mature enough for developers to simply bolt applications together
  4. Questions over performance and scalability go away

The first criterion is analogous to that of the Web; it came of age once it moved beyond those who know how it works and how to write HTML pages by hand (and once there was something useful on the Web). When my Mum started using the Web to plan days out for herself and my Dad, then the Web had come of age. Similarly, our biology colleagues can come to iKUP, search and browse without knowing that it is an OWL ontology organising some RDF and using SPARQL to get back tables of data. Fulfilling criterion two is helped by criterion one. The iKUP browser makes use of the Semantic Web technologies transparently – it is just a web page in a familiar environment (just as RightField is delivered via an Excel spreadsheet) where keywords are typed in and a faceted browser allows users both to go directly to entities of interest and to “just look around”. That the facets are provided by an ontology need not be known (and should not be known) by the user. We can only make the iKUP browser if criterion three is met – we used a series of standard APIs and GWT to make the KUPKB available via the iKUP. This really is a sign of “coming of age” – we can “bolt together” applications in fairly short order. Having met our last criterion is harder to justify – KUPKB is relatively small scale, so we don’t have too many performance problems. As RDF gets very big it all gets a bit clunky, but that should change. Anyway, by and large, we feel that the Semantic Web technologies really are coming of age and can do the job (if one’s careful) that they are supposed to do.

As we say in our OCAS paper: “The KUPKB and iKUP show ontologies coming of age by fulfilling some of their promise. The KUPKB has used ontologies to provide a common semantic framework for a broad range of previously semantically heterogeneous data. The use of Semantic Web technologies provides the means to integrate and query these data. The key to the coming of age is the iKUP user interface; without a simple means to access these integrated data, our biologist users would not and could not use the KUPKB; ontologies come of age when they deliver meaningful use to their intended users. This has now happened with the KUPKB, with biologists testing hypotheses generated via the iKUP in laboratories. Though the KUPKB is relatively small, it does what semantic technologies are supposed to do and show what is possible with biology’s rich resource of data once issues of heterogeneity are taken away and the means of delivery to its users is taken into account.”

Call for Submissions – International Conference on Biomedical Ontologies (ICBO 2012)

October 5, 2011

co-located with the 7th International Conference on Formal Ontologies in Information Systems (FOIS 2012)
22-25 July 2012; Graz, Austria.
Conference Web site: http://purl.org/icbofois2012/


Relevant dates

31 December 2011: Workshop and tutorial proposal submission deadline
25 January 2012: Notification of acceptance of workshops and tutorials
31 January 2012: Paper submission deadline
28 February 2012: Notification of paper acceptance
15 March 2012: Poster, early career symposium, software demonstrations and workshop papers submission deadline
15 April 2012: Notification of poster, early career symposium, software demonstrations and workshop paper acceptance
30 June 2012: Deadline for all camera-ready copies for the proceedings


Topics

Ontologies are increasingly used in the semantic description of biological and medical data from creation to publication and consumption by semantically enabled applications. To be effective, such ontologies must be of high quality and work well together. Therefore, issues like techniques for ontology use, good ontology design, ontology maintenance, and ontology coordination need to be addressed. The International Conference on Biomedical Ontology (ICBO) series is designed to meet this need. ICBO 2012, the third event in the highly successful series, will bring together representatives of all major communities involved in ontology use and development in biomedical research, health care, and related areas.


Call for Papers

ICBO 2012 is soliciting submissions of novel (not previously published nor concurrently submitted) research papers in the areas of the application of ontologies to biomedical problems, ontology design and ontology interoperability, at any stage in the process of data creation to its use in applications. Submissions will be welcome from a broad range of approaches to ontology building and use. In particular, we would like to invite contributions on these aspects of biomedical ontologies, as well as on the use of such ontologies in knowledge management, knowledge discovery and next-generation publishing. We will accept full-length papers (5 pages), poster submissions (1 page abstracts), software demonstrations (2 page abstracts or 5 page full application papers), and early career researcher symposium submissions (2 page abstracts). We are also looking for submissions of half-day or full-day workshops and tutorials. Workshops should be working sessions with a specific outcome that can be subsequently presented at the main conference. Tutorials are educational events and may involve hands-on practicals.

Submission is via EasyChair (https://www.easychair.org/conferences/?conf=icbo-2012), according to the templates provided on the conference web site.


Contact

For more information or to offer sponsorship, please send us a note at contact.icbo2012@gmail.com


Organizing Committee

General Chairs: Ronald Cornet (Amsterdam, The Netherlands) and Robert Stevens (Manchester, UK)
Workshops and Tutorials Chair: Melanie Courtot (Vancouver, Canada)
Early Career Consortium Chair: Ludger Jansen (Rostock, Germany)
Software Demonstration Chair: Trish Whetzel (Stanford, USA)
Proceedings Chair: Janna Hastings (Geneva, Switzerland)
Local Chair: Stefan Schulz (Graz, Austria)

A Unicorn escutcheon

September 9, 2011

I realised that, having made the Heraldry Ontology (HO) in order to have an ontology containing unicorns, I’d not actually put in an escutcheon using a unicorn. Said beast is there in the HO as a mythical creature that can be used as a charge, but I really needed an actual escutcheon using a unicorn, rather than just having it implied by the HO. The following blazon is, I think, such an escutcheon:

Vert with unicorn argent attired gules with eroteme argent

This has a green background with a white unicorn, with a red horn, with a white eroteme (question mark) underneath.

I’ve extended the HO in a couple of ways to do this a little more properly:

  1. The ordering of the charges in the blazon indicates where they are on the escutcheon. Exactly how they are placed depends, at least in part, on how many charges are on the escutcheon. So, rather than describing the exact spatial region on the shield where a charge is placed, I’ve simply given a data property with the “charge priority”. This just says where in the blazon the charge occurs; then the rest is up to the designer – as actually happens in reality. The first charge is the “chief” charge and takes centre stage; secondary charges are arranged appropriately, though, as far as I can tell, with no strict rules.
  2. I’ve also made the sinister/dexter aspect of the beast explicit. We need to know which way the beast faces. There is a default in the interpretation of blazon, but this needs to be captured in the HO explicitly. Charges are dexter by default; that is, facing to the bearer’s right (the viewer’s left). So, this unicorn is dexter. In blazon, only sinister is explicitly mentioned, as dexter is the default.
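
The two extensions can be pictured with a small sketch in Python. This is an illustration rather than anything in the HO: the `Charge` class and `blazon` function are invented names, parts such as the horn are left out, and where "sinister" appears in the rendered text is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Charge:
    name: str
    tincture: str
    priority: int            # position of the charge in the blazon (1 = first)
    profile: str = "dexter"  # dexter is the default and is left unsaid


def blazon(field_tincture, charges):
    """Render a simple blazon with charges ordered by charge priority."""
    parts = [field_tincture.capitalize()]
    for charge in sorted(charges, key=lambda c: c.priority):
        text = f"{charge.name} {charge.tincture}"
        if charge.profile == "sinister":
            # Only the non-default profile is mentioned explicitly.
            text += " sinister"
        parts.append(text)
    return " with ".join(parts)
```

Calling `blazon("vert", [Charge("eroteme", "argent", 2), Charge("unicorn", "argent", 1)])` puts the unicorn first because its priority is 1, whatever order the charges were supplied in.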

This is how I build up the complex class expressions with everything closed off so my implementation of the rule of tincture would work:

  1. Do the existential graph.
  2. Where closure is needed (for properties that are not functional), I work from the inside out. For each property I change the “some” in the axiom to “onlysome” – a bit of syntactic sugar in Kudu’s implementation of Manchester OWL syntax.
  3. Each time I add an “onlysome”, I allow Kudu to reformat the OWL to put in the closure axioms.
  4. Once I’ve got to the outside of the axioms, everything is closed.
  5. The only exception is that I had to fiddle around with the is charged with property and do some horrid cutting and pasting (as there are two of these properties used). The “add closure axiom” function is not working properly in Kudu, so I had to do it all this way. Below are the various bits of OWL as I build things up.

The existential of the escutcheon:

Individual: 'Vert with unicorn argent attired gules with  eroteme argent'

    Annotations: [in heraldry]
        label "Vert with unicorn argent attired gules with  eroteme argent"^^string

    Types: [in heraldry]
        hasPart some
            (Field
             and ('is charged with' some
                (Eroteme
                 and ('has tincture' some Argent)
                 and ('has charge priority' value 2)))
             and ('is charged with' some
                (Unicorn
                 and ('has creature profile' some Dexter)
                 and ('has tincture' some Argent)
                 and (hasPart some
                    (Horn
                     and ('has tincture' some Gules)))
                 and ('has charge priority' value 1)))
             and ('has tincture' some Vert)),
        Escutcheon

Note the unicorn has as a part a horn that is red; that is, “attired gules”. I suppose I should say that it has a part “exactly 1 horn” as it is a unicorn… or maybe that goes on the unicorn class. I should also say how many unicorns and so on, but I haven’t.

Putting an “onlysome” on the has tincture some Gules, we get

Individual: 'Vert with unicorn argent attired gules with  eroteme argent'

    Annotations: [in heraldry]
        label "Vert with unicorn argent attired gules with  eroteme argent"^^string

    Types: [in heraldry]
        hasPart some
            (Field
             and ('is charged with' some
                (Eroteme
                 and ('has tincture' some Argent)
                 and ('has charge priority' value 2)))
             and ('is charged with' some
                (Unicorn
                 and ('has creature profile' some Dexter)
                 and ('has tincture' some Argent)
                 and (hasPart some
                    (Horn
                     and ('has tincture' onlysome Gules)))
                 and ('has charge priority' value 1)))
             and ('has tincture' some Vert)),
        Escutcheon

After some automatic re-formatting we get:

                    (Horn
                     and (('has tincture' some Gules)
                     and ('has tincture' only Gules))))
                 and ('has charge priority' value 1)))
             and ('has tincture' some Vert))
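
The expansion that the re-formatting performs is mechanical, and can be sketched as a small string-rewriting function. This is a sketch in Python, not Kudu's implementation; the function name is made up, and the multi-filler case follows the usual reading of “onlysome” (one existential per filler plus a universal over their union).

```python
def expand_onlysome(prop, fillers):
    """Expand "prop onlysome fillers" into existential plus closure restrictions."""
    if not fillers:
        raise ValueError("onlysome needs at least one filler")
    # One existential restriction per filler ...
    somes = [f"({prop} some {filler})" for filler in fillers]
    # ... and one universal restriction over the union of the fillers.
    union = fillers[0] if len(fillers) == 1 else "(" + " or ".join(fillers) + ")"
    closure = f"({prop} only {union})"
    return "(" + " and ".join(somes + [closure]) + ")"
```

For the horn, `expand_onlysome("'has tincture'", ["Gules"])` gives `(('has tincture' some Gules) and ('has tincture' only Gules))`, the same shape as the fragment above.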

Going all the way through, we get:

Individual: 'Vert with unicorn argent attired gules with  eroteme argent'

    Annotations: [in heraldry]
        label "Vert with unicorn argent attired gules with  eroteme argent"^^string

    Types: [in heraldry]
        * 'Coloured Escutcheon and metal charge',
        (hasPart some
            (Field
             and (('has tincture' some Vert)
             and ('has tincture' only Vert))
             and ('is charged with' some
                (Eroteme
                 and (('has tincture' some Argent)
                 and ('has tincture' only Argent))
                 and ('has charge priority' value 2)))
             and ('is charged with' some
                (Unicorn
                 and (('has tincture' some Argent)
                 and ('has tincture' only Argent))
                 and ('has creature profile' some Dexter)
                 and (hasPart some
                    (Horn
                     and (('has tincture' some Gules)
                     and ('has tincture' only Gules))))
                 and ('has charge priority' value 1)))
             and ('is charged with' only
                ((Eroteme
                 and (('has tincture' some Argent)
                 and ('has tincture' only Argent))
                 and ('has charge priority' value 2))
                 or (Unicorn
                 and (('has tincture' some Argent)
                 and ('has tincture' only Argent))
                 and ('has creature profile' some Dexter)
                 and (hasPart some
                    (Horn
                     and (('has tincture' some Gules)
                     and ('has tincture' only Gules))))
                 and ('has charge priority' value 1))))))
         and (hasPart only
            (Field
             and (('has tincture' some Vert)
             and ('has tincture' only Vert))
             and ('is charged with' some
                (Eroteme
                 and (('has tincture' some Argent)
                 and ('has tincture' only Argent))
                 and ('has charge priority' value 2)))
             and ('is charged with' some
                (Unicorn
                 and (('has tincture' some Argent)
                 and ('has tincture' only Argent))
                 and ('has creature profile' some Dexter)
                 and (hasPart some
                    (Horn
                     and (('has tincture' some Gules)
                     and ('has tincture' only Gules))))
                 and ('has charge priority' value 1)))
             and ('is charged with' only
                ((Eroteme
                 and (('has tincture' some Argent)
                 and ('has tincture' only Argent))
                 and ('has charge priority' value 2))
                 or (Unicorn
                 and (('has tincture' some Argent)
                 and ('has tincture' only Argent))
                 and ('has creature profile' some Dexter)
                 and (hasPart some
                    (Horn
                     and (('has tincture' some Gules)
                     and ('has tincture' only Gules))))
                 and ('has charge priority' value 1)))))),
        Escutcheon

Which, I think, is particularly “manly OWL” (or MOWL). It also, perhaps, shows how writing this sort of thing by hand is so beastly or just shouldn’t be done at all… Putting the charge priorities and creature profiles into the already expanded escutchia will be a real pain, but will just be a matter of being systematic.

Anyway, it classifies properly in the HO as a coloured escutcheon with metal charge. This version of the HO is available.

An Ontology of Heraldry

August 22, 2011

BFOites often use unicorns as a stick with which to beat those that question the particular sect of realism portrayed in BFO. The argument goes something along the line of “well if you don’t wish to model only universals then you’re prepared to model ‘unicorn’”.

Just so that I can say I’ve built an ontology that has the notion of “Unicorn” I’ve built an ontology of heraldry. As an ontology it is not really up to much, except (as with any ontology) building it has thrown up some modelling issues. Of course, I have to say now, that I suppose everything in this ontology is some kind of information, so I’m not really claiming I have an ontology with a “real” unicorn in it; nevertheless, I have an ontology that describes unicorns and other mythical creatures… Also, I’m not describing anything that wouldn’t be captured somewhere in a BFO compliant ontology.

In heraldry, almost anything (a charge) can be placed upon a field. These can be animals (real and mythical), animal parts, man-made objects, geometrical symbols, and so on. A field may also be uncharged, that is, plain. With some twiddly bits, the field plus the optional charge is an escutcheon.

The field and charge have a tincture. The tinctures (colours) are traditionally more restricted and divided into “colours” and “metals”. The rule of tincture stipulates that metal should not be placed on metal, nor colour on colour. This stops an argent (white) horse being charged on a field of Or (yellow). There is an exception where the natural tincture of the charge is placed on a field of the same tincture – “a horse argent proper”; I’ve not attempted this at the moment.

We can capture the rule of tincture in OWL, but it is a bit convoluted. There are three options:

  1. Disjointness between two classes: one for a field with metal tincture and one for a charge with metal tincture.
  2. Making owl:Nothing equivalent to field hasTincture metal and charge hasTincture metal.
  3. Adding some restrictions such as hasField hasTincture Metal hasCharge some (charge that hasTincture only Colour) – getting the some and only right!

My Heraldry Ontology (The HO) captures the rule of tincture. I’ve done it using option three in the following way:

  • Added defined classes for plain escutcheon, coloured escutcheon with metal charge and metal escutcheon with coloured charge.
  • I then make the class escutcheon be covered by these three defined classes.
  • Any of my individuals that are actual escutchia should be one of these three classes.
  • Breaking the rule of tincture will cause an inconsistency.
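
The effect of the covering can be mimicked in a few lines of Python, with the large caveat that this sketch is closed-world from the start (every tincture is taken as fully known), which is exactly what has to be worked for in OWL. The tincture sets are the usual heraldic ones rather than a listing taken from the HO, and the function name is made up.

```python
METALS = {"or", "argent"}
COLOURS = {"gules", "azure", "vert", "purpure", "sable"}


def classify_escutcheon(field_tincture, charge_tinctures):
    """Return the covering class an escutcheon falls under (closed world).

    Raises ValueError when none of the three covering classes fits, which
    plays the role of the inconsistency a reasoner would report.
    """
    if not charge_tinctures:
        return "Plain escutcheon"
    charges = set(charge_tinctures)
    if field_tincture in COLOURS and charges <= METALS:
        return "Coloured Escutcheon and metal charge"
    if field_tincture in METALS and charges <= COLOURS:
        return "Metal escutcheon and coloured charge"
    raise ValueError("breaks the rule of tincture")
```

A vert field with argent charges lands in “Coloured Escutcheon and metal charge”, while argent on argent fits none of the three classes and raises the error.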

The four classes involved are:

1.

Class: Escutcheon

    Annotations: [in heraldry]
        comment "A shield; can have a shield with a field but no charge."^^string,
        label "Escutcheon"^^string

    EquivalentTo: [in heraldry]
        * Escutcheon,
        'Plain escutcheon'
         or 'Coloured Escutcheon and metal charge'
         or 'Metal escutcheon and coloured charge'

    SubClassOf: [in heraldry]
        hasPart only Field

2.

Class: 'Plain escutcheon'

    Annotations: [in heraldry]
        label "Plain escutcheon"^^string,
        comment "A tinctured field with no charges."^^string

    EquivalentTo: [in heraldry]
        * 'Plain escutcheon',
        Escutcheon
         and (hasPart some
            (Field
             and (not ('is charged with' some Charge))
             and ('has tincture' some Tincture)))

Here we need an individual representing a plain escutcheon to have a tincture, and only that tincture, and to be known to have no charge. With the open world assumption just not saying there is a charge doesn’t mean there isn’t one. Our individual will have to say there isn’t a charge on this escutcheon.

3.

 Class: 'Metal escutcheon and coloured charge'

    Annotations: [in heraldry]
        label "Metal escutcheon and coloured charge"^^string

    EquivalentTo: [in heraldry]
        * 'Metal escutcheon and coloured charge',
        Escutcheon
         and (hasPart some
            (Field
             and ('is charged with' some
                (Charge
                 and ('has tincture' some Colour)
                 and ('has tincture' only Colour)))
             and ('has tincture' some Metal)
             and ('is charged with' only
                (Charge
                 and ('has tincture' some Colour)
                 and ('has tincture' only Colour)))
             and ('has tincture' only Metal)))

Here we say that the field must have a metal tincture and can only have a metal tincture, and likewise that each charge must have a colour tincture and can only have a colour tincture. Note the care with which everything is closed off.

4.

Class: 'Coloured Escutcheon and metal charge'

    Annotations: [in heraldry]
        label "Coloured Escutcheon and metal charge"^^string

    EquivalentTo: [in heraldry]
        * 'Coloured Escutcheon and metal charge',
        Escutcheon
         and (hasPart some
            (Field
             and ('is charged with' some
                (Charge
                 and ('has tincture' some Metal)
                 and ('has tincture' only Metal)))
             and ('has tincture' some Colour)
             and ('is charged with' only
                (Charge
                 and ('has tincture' some Metal)
                 and ('has tincture' only Metal)))
             and ('has tincture' only Colour)))

We can add various individuals with fields and charges and the HO captures them well enough (it is a limited part of blazon that I’ve implemented). If we add an individual for “Argent with latin cross Argent” like this:

Individual: 'Argent with latin cross Argent'

    Annotations: [in heraldry]
        comment "Should break rule of tincture"^^string,
        label "Argent with latin cross Argent"^^string

    Types: [in heraldry]
        (hasPart some
            (Field
             and (('is charged with' some
                ('Latin cross'
                 and ('has tincture' some Argent)
                 and ('has tincture' only Argent)))
             and ('is charged with' only
                ('Latin cross'
                 and ('has tincture' some Argent)
                 and ('has tincture' only Argent))))
             and ('has tincture' some Argent)
             and ('has tincture' only Argent)))
         and (hasPart only
            (Field
             and (('is charged with' some
                ('Latin cross'
                 and ('has tincture' some Argent)
                 and ('has tincture' only Argent)))
             and ('is charged with' only
                ('Latin cross'
                 and ('has tincture' some Argent)
                 and ('has tincture' only Argent))))
             and ('has tincture' some Argent)
             and ('has tincture' only Argent))),
        Escutcheon

Note that everything on this individual is a type assertion to anonymous individuals for charges and their tinctures. Note that this lets me be very closed about what I say; I’ve used universal quantification to help me close this down. I’ve used some functional properties for other qualities of things. Being this closed is really hard, largely from a syntactic point of view. It is just hard plugging all the gaps. Uli Sattler had to help me out after I’d stared at a mass of parentheses for too long and not seen the problem of an “openness gap” that I knew was there.

With this individual, the HO becomes inconsistent. That is, this instance cannot exist in this ontology. This is because it breaks the rule of tincture by having a white cross on a white background; though the reasoner doesn’t say this. I have to use Matt Horridge’s wonderful explanation gadgets. The explanation for this inconsistency is (as a not very beautiful table):

No. | Axiom | Explanations
1 | Escutcheon EquivalentTo Plain escutcheon or Coloured Escutcheon and metal charge or Metal escutcheon and coloured charge | 2
2 | Plain escutcheon EquivalentTo Escutcheon and (hasPart some (Field and (not (is charged with some Charge)) and (has tincture some Tincture))) | 2
3 | Argent with latin cross argent Type (hasPart some (Field and ((is charged with some (Latin cross and (has tincture some Argent) and (has tincture only Argent))) and (is charged with only (Latin cross and (has tincture some Argent) and (has tincture only Argent)))) and (has tincture some Argent) and (has tincture only Argent))) and (hasPart only (Field and ((is charged with some (Latin cross and (has tincture some Argent) and (has tincture only Argent))) and (is charged with only (Latin cross and (has tincture some Argent) and (has tincture only Argent)))) and (has tincture some Argent) and (has tincture only Argent))) | 2
4 | Argent with latin cross argent Type Escutcheon | 2
5 | Argent SubClassOf Metal | 2
6 | Coloured Escutcheon and metal charge EquivalentTo Escutcheon and (hasPart some (Field and (is charged with some (Charge and (has tincture some Metal) and (has tincture only Metal))) and (has tincture some Colour) and (is charged with only (Charge and (has tincture some Metal) and (has tincture only Metal))) and (has tincture only Colour))) | 2
7 | Metal escutcheon and coloured charge EquivalentTo Escutcheon and (hasPart some (Field and (is charged with some (Charge and (has tincture some Colour) and (has tincture only Colour))) and (has tincture some Metal) and (is charged with only (Charge and (has tincture some Colour) and (has tincture only Colour))) and (has tincture only Metal))) | 2
8 | Light tincture EquivalentTo Metal | 2
9 | is charged with Range Charge | 1
10 | Light tincture DisjointWith Dark tincture | 2
11 | Dark tincture EquivalentTo Colour | 2

The table has the axiom number, the axiom itself, and the final column indicates in how many explanations the axiom is involved. Basically, what this says is that this individual has broken the rule of tincture. The axioms say that there is a field that has a metal tincture and only has a metal tincture. There is a charge that has a metal tincture and only a metal tincture. Escutcheon is covered by 3 escutcheon types, and due to the tinctures of this individual’s charge and field, there is an inconsistency; the two types of tincture are disjoint, so Argent must come from one and not the other (as Argent is a kind of “Metal”).

Getting all of this to work is conceptually easy enough, but actually hard to implement. It is all about closing down the open worldness in both the TBox and ABox.

Note that the escutcheon Or with lion regardant passant coward vert has classified just under Escutcheon and not under one of the three escutcheon types that cover “Escutcheon”. This is because the individual for this escutcheon is under-constrained. I’ve left it open as to whether either the field or charge can have another tincture in addition to that already specified. As we know it is an escutcheon, but not which of the three covering types it is, this individual doesn’t break the constraint of the covering axiom. Only once we’ve ruled out every one of the three covering types will the covering axiom come into effect and the ontology become inconsistent.
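
The difference between “not yet classified” and “inconsistent” can be sketched as a three-way outcome. This is a rough Python approximation of the open-world behaviour, not a reasoner: it assumes one field with one charge, assumes the parts and charges themselves are already closed, uses the usual heraldic tincture sets rather than the HO's own, and the function name is invented.

```python
METALS = {"or", "argent"}
COLOURS = {"gules", "azure", "vert", "purpure", "sable"}


def status(field, field_closed, charge, charge_closed):
    """Approximate outcome for a charged escutcheon.

    `field` and `charge` are the sets of tinctures asserted so far; the
    `*_closed` flags say whether an "only" axiom rules out further ones.
    """
    for known in (field, charge):
        if not known or not known <= METALS | COLOURS:
            raise ValueError("expected a non-empty set of known tinctures")
    # A covering class stays possible while nothing asserted contradicts it.
    candidates = []
    if not field & METALS and not charge & COLOURS:
        candidates.append("Coloured Escutcheon and metal charge")
    if not field & COLOURS and not charge & METALS:
        candidates.append("Metal escutcheon and coloured charge")
    if not candidates:
        return "inconsistent"      # every covering class has been ruled out
    if field_closed and charge_closed:
        return candidates[0]       # closure lets the class be inferred
    return "Escutcheon"            # under-constrained: stays at the top
```

With nothing closed, `status({"or"}, False, {"vert"}, False)` stays at “Escutcheon”; closing both tinctures gives “Metal escutcheon and coloured charge”.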

The things left to do on the HO (if I choose to) include:

  1. Sorting out positions on the escutcheon. There are implicit rules within blazon for where things are positioned – “Azure with Gauntlet argent and sword or” implies positions for the gauntlet and sword. Without the ordering of blazon, this has to be made explicit.
  2. Marshalling of several fields and charges on to one escutcheon and how these are arranged.
  3. Indicating whether charges are arranged sinister or dexter.
  4. Dealing with “proper” tinctures.
  5. Lots of other heraldic nonsense.

Capturing Variation in Plant form

August 21, 2011

Returning to my flower anatomy ontology: the variation, even within one kind of plant, is large. The variations cause problems for making descriptions of plants. This becomes a particular problem when making the kind of precise descriptions I’ve tried to do in the flower anatomy. For example, when trying to describe a particular plant I’ve found, with the aim of classifying the thing against my ontology, I find different kinds of leaves (some ovate, some obovate, etc.); I find different numbers of branches and leaf segments, etc. Of course, one usually takes some normative view of a plant (a conceptualisation of some canonical plant), but it would be good to be able to capture the range and variety within a kind of plant in the TBox description.

We can exemplify this with a fairly randomly picked entry from Stace’s New Flora of the British Isles. The entry is:

“COSMOS Cav. Mexican Aster

Annuals; leaves all opposite, 2-3-pinnate with linear to filiform segments; phyllaries in 2 dissimilar rows, the outer narrower, herbaceous with membranous border, the inner membranous; capitula radiate; receptacle flat, with scales; pappus of (0)2(-3) bristles with usually backward-directed barbs; ligules numerous, pinkish-purple, rarely white; disc flowers yellow.”

A New flora of the British Isles 1997 page: 755. New York
— Clive Stace

We can dissect such a description, bearing in mind the universality of OWL’s class level restrictions; that is, each and every instance of class x holds some relationship to at least one instance of some other class.

  • “Annuals”; this is OK; all the plants grow and die within one year.
  • “leaves all opposite”; Again this is OK; all the leaves are “opposite”, that is, at each node there are leaves that are opposite each other – as opposed to alternating from side to side of a twig at each node.
  • “2-3-pinnate with linear to filiform segments”; Here it gets interesting. The leaves are “2 to 3 pinnate”; that is, a single leaf is made of leaflets and these leaflets are themselves made of leaflets (or with another level of nesting). A classic example in Britain would be the mountain ash or rowan. Also, that the leaves are in “linear to filiform segments” is difficult. There is a space of leaf shapes that botanists have partitioned into discrete named shapes. Sometimes a leaf shape will be in between two or more named shapes. In this case, the leaf shape is in between linear and filiform, that is, thinner than linear but longer than hair-like. A similar case is “obovate to ovate”, which implies “elliptical”.
  • “phyllaries in 2 dissimilar rows, the outer narrower, herbaceous with membranous border, the inner membranous”; Not much variation here, but we would need to capture that the inner and outer rows are dissimilar.
  • “capitula radiate; receptacle flat, with scales”; again, a fairly straight-forward “all” description.
  • “pappus of (0)2(-3) bristles with usually backward-directed barbs”; a “pappus” is a modified calyx that is bristly or feathery. The notation (0)2(-3) is standard in floras and means “Usually 2, exceptionally 0 and up to 3”. The notions of “usually”, “rarely” and “exceptionally” are just hard. “Up to” could be done with a “max” cardinality constraint…
  • “ligules numerous, pinkish-purple, rarely white”; this statement contains a lot. We need to describe “pinkish-purple”, and we have to capture the notions of “numerous” and “rarely”. Exact numbers are rather easy in OWL with qualified cardinality constraints, but vague numbers present more of a problem. Just creating some class of “numerous” seems like a real cop-out, but in a sense it is just another compromise like value partitions – we have a continuous spectrum that we just partition into convenient chunks. This would be just like dividing up a number line into convenient chunks: 1, 2, 3, 4, …, numerous. And then there are “rarely” and “often”. This is so vile…
  • “disc flowers yellow”; This is OK, except for the usual turmoil of describing colour.
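As a rough illustration of the “(0)2(-3)” notation discussed in the list above, here is a minimal Python sketch that reads such a count into a usual range plus exceptional bounds, and then writes only the hard bounds as Manchester-syntax-style cardinality restrictions. The grammar accepted here, the names `parse_flora_count` and `hard_bounds`, and the property and class names `hasPart` and `Bristle` are all assumptions made for illustration; real floras vary in how they write these ranges. Note that the “usually” part is kept in the data structure but is deliberately not turned into OWL, which is exactly the hard bit.

```python
import re
from dataclasses import dataclass

# Assumed grammar: optional "(a)" or "(a-)", then "b" or "b-c", then optional "(-d)".
_COUNT = re.compile(
    r"^(?:\((\d+)-?\))?"   # exceptional lower bound, e.g. "(0)"
    r"(\d+)(?:-(\d+))?"    # usual value or usual range, e.g. "2" or "2-4"
    r"(?:\(-?(\d+)\))?$"   # exceptional upper bound, e.g. "(-3)"
)


@dataclass(frozen=True)
class FloraCount:
    usual_min: int
    usual_max: int
    absolute_min: int
    absolute_max: int


def parse_flora_count(text):
    """Parse a count such as "(0)2(-3)" into usual and exceptional bounds."""
    match = _COUNT.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised count: {text!r}")
    low, usual_lo, usual_hi, high = match.groups()
    usual_min = int(usual_lo)
    usual_max = int(usual_hi) if usual_hi is not None else usual_min
    absolute_min = int(low) if low is not None else usual_min
    absolute_max = int(high) if high is not None else usual_max
    return FloraCount(usual_min, usual_max, absolute_min, absolute_max)


def hard_bounds(prop, filler, count):
    """Write only the exceptional (hard) bounds as min/max cardinalities."""
    return (
        f"({prop} min {count.absolute_min} {filler}) and "
        f"({prop} max {count.absolute_max} {filler})"
    )


if __name__ == "__main__":
    pappus = parse_flora_count("(0)2(-3)")
    print(pappus)
    print(hard_bounds("hasPart", "Bristle", pappus))
```

The hard bounds are easy; the “usually 2” still has nowhere to go in the class expression, so it survives only in the `usual_min` and `usual_max` fields.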

Lots to explore in all this; for now, I’ll just choose “2-3-pinnate with linear to filiform segments”.

  1. The leaf as a whole is divided into parts that are “leaflets”. We could either have two classes “Leaf” and “Leaflet”, with the restriction that all leaflets are part of leaves (but not vice versa), or we just stick with the class of “Leaf” and have a subclass of “Divided leaf” that has parts that are “Leaf”. If we establish the “isPartOf” going in the other direction, we could have a defined class of “Leaflet” that is equivalent to any leaf that is part of a leaf.
  2. A Pinnate leaf can then be defined as either a Leaf that hasPart some Leaflet or as Leaf that hasPart some Leaf. The latter is more economical, but the former more in-line with domain vocabulary.
  3. Sticking with the former, then my 2-pinnate leaf might be Leaf that hasPart some (Leaflet that hasPart some Leaflet).
  4. My 3-pinnate leaf might be Leaf that hasPart some (Leaflet that hasPart some (Leaflet that hasPart some Leaflet)).
  5. Being 2 – 3 pinnate then becomes a disjunction of these two class expressions. Clumsy, but not totally horrid.
  6. The problem, of course, comes with larger amounts of variability. If I had 2 – 4 pinnate, my class expression becomes increasingly clumsy.
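The nesting pattern in the steps above is mechanical, so the clumsy expressions can at least be generated rather than typed. The following is a minimal Python sketch that builds the class expressions as strings in the same informal “that … some …” style used in the list; the function names `n_pinnate` and `pinnate_range` are hypothetical, and nothing here talks to a reasoner or to the OWL API.

```python
def n_pinnate(n, whole="Leaf", part="Leaflet", prop="hasPart"):
    """Build the class expression for an n-pinnate leaf as a string.

    n = 1 gives the plain pinnate leaf; each extra level wraps the
    expression in one more "Leaflet that hasPart some (...)".
    """
    if n < 1:
        raise ValueError("a pinnate leaf needs at least one level of division")
    expr = part  # innermost leaflet, not divided any further
    for _ in range(n - 1):
        expr = f"({part} that {prop} some {expr})"
    return f"{whole} that {prop} some {expr}"


def pinnate_range(low, high, **names):
    """Disjunction of the n-pinnate expressions for low <= n <= high."""
    if low > high:
        raise ValueError("low must not exceed high")
    return " or ".join(f"({n_pinnate(n, **names)})" for n in range(low, high + 1))


if __name__ == "__main__":
    print(n_pinnate(2))
    print(n_pinnate(3))
    print(pinnate_range(2, 3))
```

Generating the text does not make the pattern any less clumsy for a reader of the ontology, but it does keep the brackets balanced as the range grows.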

One could model the “pinnateness” of a leaf and model a range of pinnateness with max and min cardinalities. This, however, doesn’t capture the physical nature of the sub-divided leaves, especially if I wished to say something about the sub-divisions. For instance, in the mountain ash, I want to say that the final division is arranged “herring-bone” fashion.

At the moment, the clumsy pattern I’ve outlined is the best I’ve come up with. At some point I’ll actually try and do it, then put the ontology up for inspection. Meanwhile I will think more and write more on the other aspects of describing the variation in plant descriptions.

A Simple Knowledge Organisation tool (SKOT)

July 18, 2011

Protege is a complex tool. By default it offers a user all that it is possible to do with OWL (relationships, quantifiers, class expressions, and so on) and there can be choices at any point. Often Protege is too complex, especially at an early stage of authoring an ontology where one might simply wish to “sketch” something out, perhaps for migrating to a more sophisticated form later. Andrew Gibson, when he worked in our group, wanted a simple tool for “sketching” an ontology. Such a tool would be based on a simple “blob and line” model, corresponding to classes, subclass axioms and existential restrictions. Matt Horridge developed a tool called Montage from Andrew’s specification; it was only ever a prototype and never saw the light of day outside that particular office.

Inspired by this, I offered a third year undergraduate project to develop a “simple knowledge organisation tool” — SKOT. A student called Mark Jordan took this project on and has done a good job, given the restrictions of time involved in a University of Manchester Computer Science third year undergraduate project.

The tool Mark developed, SKOT, is an open source project under an LGPL licence and the code is available on SourceForge.

The picture shows SKOT at a point just before the user is about to export the sketch into an OWL file. There is a “term list”, where words or terms that might form the blobs or classes in the ontology are “stored”. The example is the traditional “university” modelling example, with the terms “University”, “Person”, “Student”, “Undergraduate”, “Mark”, “Postgraduate”, “Teaching Assistant”, “Lecturer”, “Lecture” and “TA Lecture”. These terms can be selected and dragged on to the canvas, where they become blobs that represent classes or individuals (as for the term “Mark” in the list above). Relationships are created by selecting a blob, choosing to create a relationship, moving to the “target” and then finishing the interaction. Relationships can be sub-/super-class relationships or of other types as specified by the user. To do this, SKOT takes the following approach:

  1. There is a canvas on which blob and line pictures can be drawn; blobs are classes and lines are relationships.
  2. New blobs and lines can be created on the canvas.
  3. Words or terms can be dragged from a word list onto the canvas, where they form new blobs.
  4. The diagram can be exported as OWL through the OWL API.
  5. SKOT projects can be saved and re-loaded.
  6. It is possible to load in the existential graph portion of an existing OWL ontology, extend it in SKOT and re-save it in the original file.
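To make the “blob and line” model in the list above concrete, here is a small Python sketch of a sketch: blobs are classes or individuals, lines are either “is a” links or named relationships, and the export writes Manchester-syntax-style frames in which a named relationship becomes an existential restriction. This is not SKOT’s code (SKOT exports through the OWL API); the `Sketch` class, the `IS_A` label, the `identifier` helper and the `taughtBy` property are all hypothetical, and the output omits the prefixes, ontology header and property declarations a complete file would need.

```python
from dataclasses import dataclass, field

IS_A = "is a"  # assumed label for the subclass (or instance-of) line


@dataclass
class Sketch:
    classes: set = field(default_factory=set)
    individuals: set = field(default_factory=set)
    lines: list = field(default_factory=list)  # (source, relation, target)

    def add_class(self, term):
        self.classes.add(term)

    def add_individual(self, term):
        self.individuals.add(term)

    def connect(self, source, relation, target):
        """Draw a line from one blob to a class blob."""
        if source not in self.classes | self.individuals:
            raise ValueError(f"unknown blob: {source!r}")
        if target not in self.classes:
            raise ValueError(f"target must be a class blob: {target!r}")
        if source in self.individuals and relation != IS_A:
            raise ValueError("this sketch only allows 'is a' lines from individuals")
        self.lines.append((source, relation, target))


def identifier(term):
    """Turn a term such as "Teaching Assistant" into "TeachingAssistant"."""
    return "".join(word[:1].upper() + word[1:] for word in term.split())


def to_manchester(sketch):
    """Export the sketch as Manchester-syntax-style frames (a fragment only)."""
    frames = []
    for term in sorted(sketch.classes):
        frame = [f"Class: {identifier(term)}"]
        for source, relation, target in sketch.lines:
            if source != term:
                continue
            if relation == IS_A:
                frame.append(f"    SubClassOf: {identifier(target)}")
            else:  # any other line becomes an existential restriction
                frame.append(f"    SubClassOf: {relation} some {identifier(target)}")
        frames.append("\n".join(frame))
    for term in sorted(sketch.individuals):
        frame = [f"Individual: {identifier(term)}"]
        for source, _relation, target in sketch.lines:
            if source == term:
                frame.append(f"    Types: {identifier(target)}")
        frames.append("\n".join(frame))
    return "\n\n".join(frames)


if __name__ == "__main__":
    uni = Sketch()
    for term in ("Person", "Student", "Undergraduate", "Lecturer", "Lecture"):
        uni.add_class(term)
    uni.add_individual("Mark")
    uni.connect("Student", IS_A, "Person")
    uni.connect("Undergraduate", IS_A, "Student")
    uni.connect("Lecture", "taughtBy", "Lecturer")
    uni.connect("Mark", IS_A, "Undergraduate")
    print(to_manchester(uni))
```

Keeping the model to these three kinds of line is what makes such a tool a sketching tool: anything richer than a subclass axiom or an existential restriction is left for a fuller editor such as Protege.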

There’s a lot of user interface work involved in SKOT. Groups of blobs can be selected and each member of the group forms a subclass relationship to a selected superclass. The layout is in the hands of the user.

James Eales used SKOT to make a toy ontology of fish, starting by just typing in a load of words about fish. This is fine, but hooking SKOT up to an automatic term recognition tool, as well as hand-typing, would be good. Once in the word list, the terms are ready to drag into the window where the “ontology” can be sketched.

SKOT with a list of words about fish.

Next, James moved terms from the list on to SKOT’s sketch canvas and made a basic hierarchy of terms. Note that classes and instances are differentiated. Other types of relationship than the subclass are possible, but not used here.

The words now arranged in a simple tree.

This was then saved into an OWL file, imported into Protege and shown using OWLViz.

Note that the export from SKOT to OWL appears to have gone wrong – Cod is now a warm water fish, where in SKOT it was a cold water fish; I’m sure this is easily remedied.

Mark did a basic evaluation, getting some people to install and use SKOT to draft some ontologies. Two of these made ontologies: one an ontology of guitars and one an ontology of fish. All the users were basically impressed, but also gave long lists of things to do; one user, for example, just found it difficult to work out what to do on start up, although once he got going, all was basically OK.

SKOT is currently a stand-alone application and it really should be a Protege plugin. There’s also a lot more to do on SKOT, both little things and big things. The little things include fixing various labels on the UI so that they make better sense. On the larger side of things, we need:

  1. SKOT connected to an automatic term recognition tool, especially with a PDF to text converter;
  2. Regular expression searches over the term lists and editing of the list;
  3. The ability to save the list and import into the list from various sources;
  4. Better scalability in the drawing of the blobs and lines, which is one of the main issues with SKOT. Some zooming would probably be useful. Montage had a facility to “fold away” portions of the sketch that weren’t currently the focus of attention. Andrew Gibson had a nice design for how to deal with many of these issues; those ideas are not in SKOT, though some are in the Montage prototype and may see the light of day eventually. There are lots of UI tricks to be played here, but I also suspect that the utility of such a tool lies in the small scale aspect and that such things are inherently very difficult to scale.

Mark’s report on SKOT and how it was built is available.

