<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Robert Stevens&#039; Blog</title>
	<atom:link href="http://robertdavidstevens.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://robertdavidstevens.wordpress.com</link>
	<description>Feed your head</description>
	<lastBuildDate>Thu, 16 May 2013 12:59:04 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='robertdavidstevens.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Robert Stevens&#039; Blog</title>
		<link>http://robertdavidstevens.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://robertdavidstevens.wordpress.com/osd.xml" title="Robert Stevens&#039; Blog" />
	<atom:link rel='hub' href='http://robertdavidstevens.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Which is used most for biomedical ontologies: OBO Format or OWL?</title>
		<link>http://robertdavidstevens.wordpress.com/2013/05/16/which-is-used-most-for-biomedical-ontologies-obo-format-or-owl/</link>
		<comments>http://robertdavidstevens.wordpress.com/2013/05/16/which-is-used-most-for-biomedical-ontologies-obo-format-or-owl/#comments</comments>
		<pubDate>Thu, 16 May 2013 10:21:07 +0000</pubDate>
		<dc:creator>Robert Stevens</dc:creator>
				<category><![CDATA[Ontologies]]></category>

		<guid isPermaLink="false">http://robertdavidstevens.wordpress.com/?p=204</guid>
		<description><![CDATA[I was reading Robert Hoehndorf et al&#8217;s paper Relations as patterns: bridging the gap between OBO and OWL and was rather struck by the opening sentence: &#8220;The OBO Flatfile Format [1] is used to represent most biomedical ontologies, among them the Gene Ontology (GO) [2] and most of the OBO Foundry ontologies [3].&#8221; the bit [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><a name="preamble"></a>
<p>I was reading Robert Hoehndorf <em>et al</em>&#8217;s paper <a href="http://www.biomedcentral.com/1471-2105/11/441">Relations as patterns: bridging the gap between OBO and OWL</a> and was rather struck by the opening sentence:</p>
<blockquote><p>&#8220;The OBO Flatfile Format [1] is used to represent most biomedical ontologies, among them the Gene Ontology (GO) [2] and most of the OBO Foundry ontologies [3].&#8221;</p>
<p align="right">
</blockquote>
<p>The bit &#8220;The OBO Flatfile Format [1] is used to represent most biomedical ontologies,&#8230;&#8221; struck me as unlikely (at least at face value). So, I had a look. Using the RESTful API to BioPortal, Nico Matentzoglu (one of our group&#8217;s Ph.D. students) downloaded all the publicly available ontologies (the API lets you get both public and private, but we didn&#8217;t use the private ones). We got a total of 347 ontologies that used the representations as follows:</p>
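<table border="0">
<tr><th align="left">Representation</th><th align="right">Ontologies</th></tr>
<tr><td>OBO</td><td align="right">114</td></tr>
<tr><td>OWL</td><td align="right">161</td></tr>
<tr><td>OWL-DL</td><td align="right">32</td></tr>
<tr><td>OWL-FULL</td><td align="right">9</td></tr>
<tr><td>PROTEGE frames</td><td align="right">2</td></tr>
<tr><td>RRF</td><td align="right">26</td></tr>
<tr><td>UMLS-RELA</td><td align="right">3</td></tr>
</table>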
<p>So, OBO Format has 114 and OWL (the different flavours of OWL are apparently different ontologies, so 161 + 32 + 9) has 202. I don&#8217;t need to do the stats &#8211; there are more OWL ontologies than OBO Format ontologies. I&#8217;m assuming that BioPortal is a representative sample of biomedical ontologies. With this assumption, Rob&#8217;s statement is wrong.</p>
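<p>For anyone who wants to redo the arithmetic, here is a minimal sketch of the tally. The counts are the ones listed above; the <tt>share</tt> helper and the choice to treat every representation whose name starts with &#8220;OWL&#8221; as OWL are my own assumptions for illustration.</p>

```python
# Counts of BioPortal ontologies per representation, as listed in the post.
counts = {
    "OBO": 114,
    "OWL": 161,
    "OWL-DL": 32,
    "OWL-FULL": 9,
    "PROTEGE frames": 2,
    "RRF": 26,
    "UMLS-RELA": 3,
}


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


total = sum(counts.values())
# Treat every OWL flavour as OWL, as the post does (illustrative assumption).
owl_total = sum(n for name, n in counts.items() if name.startswith("OWL"))
obo_total = counts["OBO"]

print("total:", total)
print("OWL:", owl_total, "share %:", share(owl_total, total))
print("OBO:", obo_total, "share %:", share(obo_total, total))
```

<p>This only counts files by declared representation; it says nothing about how widely each ontology is used.</p>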
<p>Can we change the statement to &#8220;the OBO Format is the representation of the most widely used biomedical ontologies&#8221;? The Gene Ontology (and other OBO Format ontologies) have a large corpus of annotations; I have no numbers across the board, but GO has 3,898,904 annotations (number of filtered annotations from the <a href="http://www.geneontology.org/GO.downloads.annotations.shtml">Gene Ontology Annotations page</a> on 15 May 2013) and is also widely used in gene over-expression analysis etc. This is a big number &#8211; and other OBO Format ontologies are used for annotations too, though to what extent I don&#8217;t yet know.</p>
<p>If we look at some OWL ontologies like SNOMED and NCIT (we can probably argue about whether SNOMED is natively OWL, but we&#8217;ll go with it for now), we also probably have some big numbers. The nature of SNOMED annotations of health records means it may be difficult to get the numbers and, even though the &#8220;mandate&#8221; for use and actual use may be different, I suspect the numbers will still be quite big. Anyway, let&#8217;s make something up &#8211; UK health records are annotated (I think with Read codes, which are now part of SNOMED) and there are 60 million UK people and, assuming 1 code per person&#8217;s record, we&#8217;ve got 60 million annotations &#8211; quite big. The Experimental Factor Ontology (EFO) is more bio and is used for some 636k annotations in the Gene Expression Atlas (thanks to James Malone for the numbers) &#8211; not GO sized, but getting on for a biggish number.</p>
<p>So, in terms of numbers, OWL ontologies are widely used.</p>
<p>What happens if we take the medical ones out? Then the numbers will start to look much less healthy for OWL ontologies. Nevertheless, we&#8217;ve got a lot of OWL ontologies and fewer OBO ontologies and, even if we have fewer OWL ontologies actually used, we&#8217;ve got a lot of use of biomedical OWL ontologies. Taking the medical ones out of the OWL set, I suspect we&#8217;ve still got more OWL bio-ontologies, but the OBO ones are used more widely in bio (and the &#8220;important&#8221; ones are OBO). Taking a look at BioPortal&#8217;s OWL ontologies, one gets the suspicion that a lot of them are &#8220;toy&#8221; ontologies (I&#8217;m sure some OBO Format ontologies come into this category too). This will reduce the number of OWL ontologies, but I don&#8217;t want to do the categorisation.</p>
<p>Despite this blog having deteriorated from firm numbers to speculation, I think we could go with an opening sentence for Rob&#8217;s paper of</p>
<blockquote><p>&#8220;At the time of writing, most of the widely used bio-ontologies use the OBO Format&#8230;.&#8221;</p>
<p align="right">
</blockquote>
<p>or</p>
<blockquote><p>At the time of writing, most of the important bio-ontologies that are extensively used for description of data use the OBO format representation&#8230;.</p>
<p align="right">
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://robertdavidstevens.wordpress.com/2013/05/16/which-is-used-most-for-biomedical-ontologies-obo-format-or-owl/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/0ba008d946882c2cc919d7869864d16a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robertdavidstevens</media:title>
		</media:content>
	</item>
		<item>
		<title>Easing the pain of ontology building with Populous</title>
		<link>http://robertdavidstevens.wordpress.com/2012/11/13/easing-the-pain-of-ontology-building-with-populous/</link>
		<comments>http://robertdavidstevens.wordpress.com/2012/11/13/easing-the-pain-of-ontology-building-with-populous/#comments</comments>
		<pubDate>Tue, 13 Nov 2012 20:46:09 +0000</pubDate>
		<dc:creator>Robert Stevens</dc:creator>
				<category><![CDATA[Ontologies]]></category>

		<guid isPermaLink="false">http://robertdavidstevens.wordpress.com/?p=201</guid>
		<description><![CDATA[Our BMC Bioinformatics paper on Populous has been published. This first had an outing in the 2010 SWAT4LS workshop. Populous is Simon Jupp&#8217;s spreadsheet based tool for &#8220;populating&#8221; or adding content to ontologies. It builds upon the use of spreadsheets as templates for filling out what we know about entities and then transforming that information [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><a name="preamble"></a>
<p>Our <a href="http://www.biomedcentral.com/1471-2105/13/S1/S5/">BMC Bioinformatics paper on Populous</a> has been published. This first had an outing in the 2010 <a href="http://">SWAT4LS workshop</a>. Populous is Simon Jupp&#8217;s spreadsheet-based tool for &#8220;populating&#8221; or adding content to ontologies. It builds upon the use of spreadsheets as templates for <em>filling out</em> what we know about entities and then transforming that information into axioms within the ontology &#8211; we just think that Populous does it rather well. The thing about Populous is that it isn&#8217;t just filling in bits of OWL in a spreadsheet; it is set up to hide as much of that as possible, as a way of getting non-ontologists close to ontology building without &#8220;frightening the horses&#8221;.</p>
<p>We used Populous in the construction of the <a href="http://robertdavidstevens.wordpress.com/2011/07/17/a-kidney-and-urinary-pathway-knowledgebase/">KUPKB</a>. We needed to extend the <a href="http://www.e-lico.eu/public/kupo/">Kidney and Urinary Pathway Ontology (KUPO)</a> to include more kidney cells that were not in the <a href="http://www.obofoundry.org/cgi-bin/detail.cgi?id=cell">Cell Type Ontology</a>, but Julie Klein, one of our collaborating kidney specialists, was not and would not be an OWL or Prot&eacute;g&eacute; user &#8211; even if we supplied the template of axioms for her to fill in. This is why we went for spreadsheets as a way of collecting the classes that go into the predefined framework of properties. For the <a href="http://www.kupkb.org">KUPKB</a>, we wanted to describe the anatomical location of the cells and the biological processes that they fostered. This is a simple template to put in a spreadsheet, but leaves the problem of which classes to put into the spreadsheet&#8217;s cells &#8211; we don&#8217;t want users having to go back and forth to the GO (or whatever) to choose terms, transcribe or cut-and-paste the ID, or just make up terms, as all this is too much like hard work.</p>
<p>This is where Populous comes in; it supports the scenario above, but puts menus in place that offer the appropriate terms from the &#8220;supporting&#8221; ontologies. Populous also has support for OPPL, the <a href="http://oppl2.sourceforge.net/">Ontology Pre-Processor Language</a>, that takes rows from the spreadsheet, creates the axioms and squirts them into the growing ontology via the <a href="http://owlapi.sf.net">OWL API</a>.</p>
<hr />
<h2><a name="_how_populous_works"></a>How Populous works</h2>
<p>Populous is an extension of <a href="http://www.sysmo-db.org/RightField">RightField</a>. This is a Manchester product from Carole Goble&#8217;s group based on spreadsheets. RightField is a spreadsheet generator. RightField is used to choose portions of vocabulary to go into a column; these are presented as menus to a user in the Excel spreadsheet that RightField produces. RightField manipulates Excel&#8217;s ability to create such menus. The generated spreadsheets are then handed out to biologists to use. The important point is that it is just using an Excel spreadsheet &#8211; no other software need be installed (a barrier to use and, of course, another maintenance issue). RightField was developed for systems biologists to <em>standardise</em> their annotations of data. It is &#8220;semantics by stealth&#8221; as the whole idea is to make it easier to do the right thing (use terms from a vocabulary), rather than just making terms up (because that is easier than looking for the terms in a vocabulary). In RightField, the spreadsheet designer has already chosen the right area of vocabulary to be used. It doesn&#8217;t stop wrong or new terms being used (after all, vocabularies are rarely complete). However, it does afford a mechanism for spotting non-vocabulary terms and dealing with them &#8211; either changing the spreadsheet or fixing the ontology itself. RightField helps data to be marked up; it can then be transformed to whatever one wants.</p>
<div> <img src="http://robertdavidstevens.files.wordpress.com/2012/11/dataspreadsheet_metadata.png?w=450" style="border-width:0;">
<p><b>Figure 1. </b>A RightField spreadsheet showing one of the &#8220;ontology driven&#8221; menus</p>
</div>
<p>The picture in Figure 1 is of RightField showing one of the &#8220;ontology menus&#8221; for describing some data that went into the <a href="http://www.kupkb.org">KUPKB</a>. In the first column, we describe what type of metadata we want; e.g. compound list, experiment condition, species. In the second column, the yellow cells are the actual RightField constraint cells. For example, for the metadata about maturity we can see that the cell expands into a menu where the user can choose amongst the appropriate ontological terms: adolescent, adult, fetal.</p>
<div> <img src="http://robertdavidstevens.files.wordpress.com/2012/11/dataspreadsheet_data.png?w=450" style="border-width:0;" alt="A RightField spreadsheet showing some  annotated data">
<p><b>Figure 2. </b>A RightField spreadsheet showing some  annotated data</p>
</div>
<p>The picture in Figure 2 shows some data inside the RightField spreadsheet; it is some experimental KUP data that went into the KUPKB. We can just copy/paste lots of tabulated data into the spreadsheet cells. Here, for example, we have a list of genes, identified with their EntrezGeneId (so the list of identifiers has been pasted into the EntrezGeneId column). But if you have other identifiers, such as GeneSymbol, UniprotId, HMDBId, or microRNAId, you can just copy/paste them into the appropriate column. Then you have a column to describe how each gene is modified in the disease; this is the Differential expression column and for each gene you can say if it is Up or Down regulated &#8211; offered via ontology terms from an Excel menu. Finally, there are 3 columns to enter numerical values such as Ratio, pValue and FDR (False Discovery Rate).</p>
<p>Populous re-purposes RightField to make it do ontologies. The mechanism is basically the same, but Populous adds a pane for the <a href="http://oppl2.sf.net">OPPL</a> scripts to be made to transform the spreadsheet&#8217;s contents to OWL axioms and put them in the target ontology (there is a wizard to take one through the process of mapping columns to variables and so on). Instead of annotating data, we&#8217;re annotating the entities to appear in an ontology. For the <a href="http://www.e-lico.eu/kupo">Kidney and Urinary Pathway Ontology</a> (KUPO) we made a Populous sheet for cells (we needed to augment the Cell Type Ontology with kidney cells). We wanted a pattern something like:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:.2em 0;">
<tr>
<td style="padding:.5em;">
<pre style="margin:0;padding:0;">Class: KUPOKidneyCell
        SubClassOf: Cell,
        isCapableOf some biologicalProcess,
        isPartOf some GrossAnatomicalEntity</pre>
</td>
</tr>
</table>
<p>(though that isn&#8217;t exactly what we ended up with.) So, the left-hand column of the sheet is the <tt>biological cell</tt>; the next two columns are the <tt>isPartOf</tt> and the <tt>isCapableOf</tt> property and the entries in these columns are the existential fillers for these properties.</p>
<p><img src="populous.png" style="border-width:0;" alt="A picture of Populous showing cells being marked up with their anatomical location and GO biological process"></p>
<p>The KUPO Excel spreadsheets can be taken away and filled in (for KUPO Julie took them away and added information about kidney cells). The picture above shows some kidney cells being ontologically described in Populous. Each column is filled with terms from one of the ontologies used for KUPO. The first column holds terms about cell types from the Cell Type Ontology (CTO). Most of the terms are red because we had to create them; they were not present before in CTO. Only podocyte and juxtaglomerular cell are in green because they existed in CTO. The third column holds terms about anatomy from the Mouse Anatomy Ontology (MAO). Here it is the opposite: most of the terms are green (i.e. existed in MAO) except for renal corpuscule and bowmans capsule. The last column holds terms about biological processes from GO &#8211; and there all are validated &#8220;green&#8221;. An export of a populated KUPO spreadsheet can be viewed <a href="https://docs.google.com/spreadsheet/ccc?key=0AtLCO-XI1IQEdER2emZsVWV1Z3JoVlpYRmxla2gtMkE">here</a>.</p>
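<p>The red/green marking described above amounts to a membership check of each spreadsheet cell against the chosen vocabulary. A minimal sketch of that idea follows; it is not Populous code. The function name, the sample column contents and the case-insensitive matching are all assumptions made for illustration.</p>

```python
def classify_cells(column_values, vocabulary_labels):
    """Mark each spreadsheet cell 'green' if its term is in the vocabulary, else 'red'."""
    # Case-insensitive, whitespace-trimmed matching is an assumption of this sketch.
    known = {label.strip().lower() for label in vocabulary_labels}
    return [
        (value, "green" if value.strip().lower() in known else "red")
        for value in column_values
    ]


# Hypothetical stand-in for labels loaded from the Cell Type Ontology.
cto_labels = ["podocyte", "juxtaglomerular cell"]
# Hypothetical column contents; only the first two exist in the vocabulary.
cell_column = ["podocyte", "juxtaglomerular cell", "some new kidney cell"]

for term, colour in classify_cells(cell_column, cto_labels):
    print(colour, term)
```

<p>A &#8220;red&#8221; result is not necessarily an error: it may be a typo to fix or a genuinely new term to add to the ontology.</p>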
<p>On loading into Populous, items in spreadsheet cells not in the chosen vocabulary are highlighted in red and OPPL has a way of dealing with such &#8220;new terms&#8221; &#8211; after all, we do want to collect new vocabulary, though this highlighting does allow some errors to be fixed. The code below shows the OPPL pattern that actually takes items from the spreadsheet, puts them into the variables, fills in the pattern template and puts them, via OPPL and the OWL API, into the ontology (in this case KUPO).</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:.2em 0;">
<tr>
<td style="padding:.5em;">
<pre style="margin:0;padding:0;">?cell:CLASS,
?anatomyPart:CLASS,
?partOfRestriction:CLASS = part_of some ?anatomyPart,
?anatomyIntersection:CLASS = createIntersection(?partOfRestriction.VALUES)
BEGIN
ADD ?cell equivalentTo CL_0000000 and ?anatomyIntersection
END;</pre>
</td>
</tr>
</table>
<p>OPPL (at its most basic) adds keywords like <tt>add</tt>, <tt>remove</tt> and <tt>select</tt> to the Manchester OWL Syntax; it has its own interpreter that runs an OPPL script, via the OWL API, on a given ontology. In Populous, we have a series of variables mapped to columns; each row is taken in turn and its values are bound to the script&#8217;s variables, and the script then adds the appropriate axioms to the ontology.</p>
<p>(Populous also does lots of sensible things about identifiers, labels, multiple values in cells and so on.) A slightly out-of-date <a href="http://www.youtube.com/watch?v=MQ_roJ7n2pc">Populous video</a> is also available.</p>
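<p>To make the row-by-row idea concrete, here is a rough sketch that fills the pattern above for each spreadsheet row and prints Manchester-syntax-like text. It only builds strings; it does not use OPPL or the OWL API. The sheet contents, the <tt>|</tt> separator for multiple values in one cell and the <tt>ident</tt> helper are hypothetical.</p>

```python
import csv
import io

# Hypothetical sheet contents: one row per cell type, with '|' separating
# multiple anatomy terms in one spreadsheet cell (an assumed convention).
SHEET = """cell,anatomyPart
podocyte,renal corpuscule|kidney
juxtaglomerular cell,kidney
"""


def ident(label):
    """Turn a label into a simple identifier (illustrative only)."""
    return "_".join(label.split())


def row_to_axiom(row):
    """Fill the pattern: ?cell equivalentTo CL_0000000 and (part_of some ?anatomyPart) ..."""
    restrictions = [
        "(part_of some {})".format(ident(part))
        for part in row["anatomyPart"].split("|")
        if part.strip()
    ]
    # Intersect the root cell class with every part_of restriction for this row.
    filler = " and ".join(["CL_0000000"] + restrictions)
    return "Class: {}\n    EquivalentTo: {}".format(ident(row["cell"]), filler)


# Each spreadsheet row is taken in turn and bound to the pattern's variables.
axioms = [row_to_axiom(row) for row in csv.DictReader(io.StringIO(SHEET))]
print("\n\n".join(axioms))
```

<p>Changing the pattern means changing one function, which mirrors the point that the OPPL script, not the spreadsheet, carries the modelling decision.</p>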
<hr />
<h2><a name="_why_spreadsheets_are_good"></a>Why spreadsheets are good</h2>
<p>By using Populous we had Julie Klein, our collaborating biologist, ontologically describing cells just by choosing items from a menu to fill a pattern implicit within the spreadsheet she was using. Julie added some 183 kidney cell descriptions to KUPO without pressing a single button in Prot&eacute;g&eacute;.</p>
<p>Why is this technique a good thing? First of all, it uses spreadsheets. The Excel spreadsheet is almost ubiquitous in science; so just join in. This is what RightField has done &#8211; trying to get one&#8217;s user base to install another bit of software, and one with which they are unfamiliar, is putting oneself onto the back foot from the start. By using the Excel spreadsheet, we&#8217;re adopting a familiar environment. We just dish out the spreadsheet and that&#8217;s it. The Populous and OPPL bits can be done off-line and by someone else.</p>
<p>Second, we can change the patterns by which we add the descriptions to an ontology by changing the OPPL script. OPPL is a means by which the ontology can be transformed programmatically without resorting to the depths of the OWL API &#8211; and this is also a good thing. We can also automatically add all that tedious but important stuff like author tags (I suspect we didn&#8217;t, but we should have done).</p>
<p>Third, we have a record of what was added; we can expand the spreadsheet with relative ease and just add more stuff to the ontology.</p>
<p>Most of the KUPO was built with few button presses in Prot&eacute;g&eacute;. Simon Jupp built the framework from OBO ontologies, plus a few bits of our own. KUPO extensions and the experimental data that form the knowledgebase are added using RightField and Populous. There were few button presses and this is a good thing. It reduces the possibility for error and it makes the use of patterns ruthlessly consistent. However, the principal thing is that we had Julie really contributing to the ontology, but she didn&#8217;t have to do OWL, Prot&eacute;g&eacute; or &#8220;ontologising&#8221;.</p>
<p>One thing missing from the paper is an evaluation. The problem is we didn&#8217;t really know what to evaluate. We could have sat down some biologists (who didn&#8217;t know how to use OWL or Prot&eacute;g&eacute;) in front of an Excel spreadsheet and Prot&eacute;g&eacute;, asked them to describe cells and measured the result. I suspect Julie&#8217;s reaction (and others&#8217;) would have been &#8220;bof&#8221;. This would have been a pointless thing to do. We could have tested the usability of Populous/RightField, but that&#8217;s not the claim we&#8217;re making &#8211; we&#8217;re talking about making building ontologies &#8220;easier&#8221; &#8211; always a bad word to use, but it&#8217;s what we&#8217;re doing. We know that it works from a technical point of view, but that&#8217;s a bit ordinary. We could compare manual building against auto-building to look at error or slip rates. This is hard to measure (except with Eleni Mikroyannidi&#8217;s Regularity Inspector for Ontologies); this would be useful, but again isn&#8217;t really testing the claim that it is easier for non-ontology builders to use something like a spreadsheet.</p>
<p>We&#8217;re continuing to use Populous. We recently ran an event expanding the <a href="http://theswo.sf.net">Software Ontology</a>; Duncan Hull made a RightField spreadsheet with which attenders could use the SWO to describe software, then he wrote OPPL to add the stuff to the SWO. Our attenders were not ontologists and so, again, we had domain experts directly contributing to axiomatic descriptions of classes in an ontology without the trauma of &#8220;doing ontology&#8221; &#8211; which must be a good thing.</p>
<p>(This blog had lots of input from Simon Jupp and Julie Klein.)</p>
]]></content:encoded>
			<wfw:commentRss>http://robertdavidstevens.wordpress.com/2012/11/13/easing-the-pain-of-ontology-building-with-populous/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/0ba008d946882c2cc919d7869864d16a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robertdavidstevens</media:title>
		</media:content>

		<media:content url="http://robertdavidstevens.files.wordpress.com/2012/11/dataspreadsheet_metadata.png" medium="image" />

		<media:content url="http://robertdavidstevens.files.wordpress.com/2012/11/dataspreadsheet_data.png" medium="image">
			<media:title type="html">A RightField spreadsheet showing some  annotated data</media:title>
		</media:content>
	</item>
		<item>
		<title>Ontologies are not the only fruit</title>
		<link>http://robertdavidstevens.wordpress.com/2012/10/13/ontologies-are-not-the-only-fruit/</link>
		<comments>http://robertdavidstevens.wordpress.com/2012/10/13/ontologies-are-not-the-only-fruit/#comments</comments>
		<pubDate>Sat, 13 Oct 2012 14:41:01 +0000</pubDate>
		<dc:creator>Robert Stevens</dc:creator>
				<category><![CDATA[Ontologies]]></category>

		<guid isPermaLink="false">http://robertdavidstevens.wordpress.com/?p=196</guid>
		<description><![CDATA[With apologies to Jeanette Winterson. My fantasy funding would be to be given a large wad of cash to go and build an ontology for the sake of building an ontology; just as a record of what we know about a domain &#8211; modelling for fun. This might display a belief of an ontology being [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><a name="preamble"></a>
<p>With apologies to <a href="http://en.wikipedia.org/wiki/Oranges_Are_Not_the_Only_Fruit">Jeanette Winterson</a>.</p>
<p>My fantasy funding would be to be given a large wad of cash to go and build an ontology for the sake of building an ontology; just as a record of what we know about a domain &#8211; modelling for fun. This might display a belief of an ontology being an &#8220;end in itself&#8221;; the truth is that I just like building ontologies. I do, however, believe strongly that this &#8220;ontology as an end in itself&#8221; is a bad thing and should be left as a hobby.</p>
<p>The authoring of an ontology is not an end in itself; it is a means to an end. In biology the &#8220;end&#8221; is to help us manage and analyse biology&#8217;s data with greater ease, reliability, replicability and so on. To this end the ontology is the servant and not the master; too much &#8220;truth and beauty&#8221; instead of &#8220;making things work&#8221;, whilst it can be fun, can result in <a href="http://en.wikipedia.org/wiki/How_many_angels_can_dance_on_the_head_of_a_pin%3F">debating how many angels can dance on the head of a pin</a> &#8211; that is, largely fruitless explorations of how to resolve issues that appear so marginal that I could already be using something that did the job (but was full of ontological compromises) while still waiting for the perfect solution.</p>
<p>In a similar vein, for the <em>ontology evangelist</em>, there is an ontological hammer for every representational nail. There&#8217;s so much more to knowledge representation than what Alan Rector and I have called <a href="http://ontogenesis.knowledgeblog.org/1074">&#8220;universal knowledge&#8221;</a> in an Ontogenesis blog. This universal knowledge is the static things that are true for all instances of a class. There&#8217;s so much more that one can say about stuff though: things true for some instances of a class; stuff that is true contingent on some other knowledge; rules; higher-order statements; probabilistic knowledge and so on. All this stuff is not really &#8220;ontological&#8221;, but needs to be in semantic applications and is part of KR. Alan Rector expanded on all of this in his <a href="http://www.cs.man.ac.uk/%7Erector/presentations/ICBO-2012-rector.pptx">keynote talk at the 2012 International Conference on Biomedical Ontology</a>. (It is useful to note that one of Alan&#8217;s &#8220;definitions&#8221; of an ontology was along the lines of &#8220;whatever an ontology is, it isn&#8217;t that <em>anything</em> written in OWL is an ontology&#8221; &#8211; and this is certainly true and plays to this notion that there&#8217;s more to knowledge representation than ontology and OWL.)</p>
<p>As a side-note, if we restrict the use of &#8220;ontology&#8221; to that which is about statements that are universally true about entities in a domain, then we change the acceptance criteria for what is an ontology. This then gets us out of the hole where all too often we&#8217;ll accept anything called an ontology as an ontology, then apply very strict evaluation  criteria for what is a good ontology. If we can get to the state where  <a href="http://en.wikipedia.org/wiki/Medical_Subject_Headings">MeSH</a> is no longer criticised for being a bad ontology we&#8217;ll have got somewhere. MeSH is not an ontology; it&#8217;s a thesaurus &#8211; criticise it for being a bad thesaurus if it is a bad one, but don&#8217;t criticise it for being a bad ontology. This plea for a narrow interpretation of &#8220;ontology&#8221; is, however, not the same as saying all ontologies (of universal knowledge) should follow a particular ontological dogma.</p>
<p>Layered on top of these many kinds of statements in the KR world are views of knowledge for certain communities and purposes. Simon Jupp did a paper on <a href="http://www.cs.man.ac.uk/~stevensr/papers/icbo-views.pdf">view management in ontology</a>. Here the aim was to separate out the ontological component of an OWL document from that which is used just to create effects in the application; typically this might be classes inserted to &#8220;gather&#8221; terms together for presentation purposes; hiding detail or abstraction that detracts from the users&#8217; goals; use of relationships to provide appropriate navigation for application users. Both are necessary components of a KR system, but they shouldn&#8217;t all live in the OWL, ontological world. Simon separated them out into a SKOS layer for the view and navigation stuff, leaving the ontology component a bit cleaner. Other view mechanisms usually leave everything in the ontology&#8230;.</p>
<p>So, rather than just the ontological hammer, we need a knowledge representation toolkit, of which the OntoHammer is just one piece (we also need the KR saw, plane, gimlet and sprocket wrench). We should keep our ontologies as clean descriptions of our static knowledge of biology and use them as a framework upon which to:</p>
<ul>
<li> generate views for different communities and usage profiles; </li>
<li> hang rules and probabilistic knowledge; </li>
<li> build knowledgebases by combining them with annotated data (and other types of knowledge); </li>
<li> do all sorts of other things&#8230; </li>
</ul>
<p>Pragmatism and breadth of view are the order of the day. Ontology is necessary for KR but not sufficient.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertdavidstevens.wordpress.com/2012/10/13/ontologies-are-not-the-only-fruit/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/0ba008d946882c2cc919d7869864d16a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robertdavidstevens</media:title>
		</media:content>
	</item>
		<item>
		<title>Querying the Gene, Mammalian Phenotype and Human Disease Ontologies with GOAL</title>
		<link>http://robertdavidstevens.wordpress.com/2012/09/14/querying-the-gene-mammalian-phenotype-and-human-disease-ontologies-with-goal/</link>
		<comments>http://robertdavidstevens.wordpress.com/2012/09/14/querying-the-gene-mammalian-phenotype-and-human-disease-ontologies-with-goal/#comments</comments>
		<pubDate>Fri, 14 Sep 2012 17:19:16 +0000</pubDate>
		<dc:creator>Robert Stevens</dc:creator>
				<category><![CDATA[Ontologies]]></category>

		<guid isPermaLink="false">http://robertdavidstevens.wordpress.com/?p=193</guid>
		<description><![CDATA[Simon Jupp, Robert Hoehndorf and I have realised something that we&#8217;ve been wanting to do for years and finally made the time to do (this has been written up in a JBMS paper on the Logical Gene Ontology). We&#8217;ve made annotations on mouse gene products for the Gene Ontology, Mammalian Phenotype Ontology and the Human [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=robertdavidstevens.wordpress.com&#038;blog=11592931&#038;post=193&#038;subd=robertdavidstevens&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a name="preamble"></a>
<p>Simon Jupp, Robert Hoehndorf and I have realised something that we&#8217;ve been wanting to do for years and have finally made the time to do (this has been written up in a <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3337258/">JBMS paper on the Logical Gene Ontology</a>). We&#8217;ve made annotations on mouse gene products for the <a href="http://www.geneontology.org">Gene Ontology</a>, <a href="http://obofoundry.org/cgi-bin/detail.cgi?id=mammalian_phenotype">Mammalian Phenotype Ontology</a> and the <a href="http://www.obofoundry.org/cgi-bin/detail.cgi?id=disease_ontology">Human Disease Ontology</a> available as a large OWL ontology that can be queried interactively on-line. (We used mouse as it is well annotated with concepts from more than one ontology.) This is the <a href="http://owl.cs.manchester.ac.uk/goal">Logical Gene Ontology Annotations</a>, or GOAL. We think that it shows the utility of delivering ontologies and their queries via OWL and automated reasoning but, perhaps more importantly, it shows that we can do this kind of thing interactively on-line; OWL tools are now sufficiently mature that this is possible.</p>
<p>The idea for GOAL is simple:</p>
<ol type="1">
<li> We create a class of <tt>Gene product</tt>; </li>
<li> For each mouse protein we create a primitive subclass of <tt>Gene product</tt> with the MGI id as URI fragment and name as label; </li>
<li> Each gene product is connected to its relevant GO aspects by extracting data from the <a href="http://www.geneontology.org/goa">Gene Ontology Annotations</a>; </li>
<li> We also do this for the annotations from MGI for the Mammalian Phenotype Ontology and Human Disease Ontology; </li>
<li> This results in a load of gene products with lots of restrictions to things we know about those proteins; </li>
<li> For each of the classes in the &#8220;supporting ontologies&#8221;, we create a defined <tt>Gene product</tt> class along the lines of <tt>Class: <em>X gene product</em> EquivalentTo: <em>Gene product</em> that hasRelationshipWith some X</tt>, where &#8220;X&#8221; is the class from the supporting ontology and <tt>hasRelationshipWith</tt> is some suitable relationship; </li>
<li> We classify the GOAL ontology, which is in the OWL 2 EL profile, so we can use the <a href="http://code.google.com/p/elk-reasoner/">elk-reasoner</a>, which reasons over the OWL super-fast; </li>
<li> We provide a <a href="http://owl.manchester.ac.uk/goal">GWT based user interface</a> that uses the GOAL ontology to query these various annotations. We already have a defined gene product class for each class in the supporting ontologies. Each of these defined classes <em>recognises</em> the various mouse gene products as appropriate. This re-builds each of the &#8220;supporting&#8221; ontologies underneath the <tt>Gene product</tt> class. We can build more complex DL queries by creating intersections of two or more of these classes. So, we can ask for gene products that are associated with an intracellular membrane-bounded organelle, are involved in abnormal cytokine secretion, are ion binding and participate in an inflammatory response with a DL query of the form <tt>x and y and z</tt>. The <a href="http://owl.manchester.ac.uk/goal">GOAL web page</a> is set up to allow this: gene product classes can be browsed to and each one &#8220;added&#8221; to the query. When the query is complete, the &#8220;Go&#8221; button is pressed. </li>
<li> The DL-query that&#8217;s been constructed is added to the ontology using the <a href="http://owlapi.sourceforge.net/">OWL API</a>, then the ontology is re-classified by the elk-reasoner to compute all subclasses of the query, and a table of results is displayed. </li>
<li> For each DL-query that returns gene products, we add a defined class to the GOAL ontology. So, the query above becomes the class below as part of a &#8220;query&#8221; module that, at some point, can be added to the ontology via an import statement. So, the GOAL ontology grows lazily. </li>
</ol>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:.2em 0;">
<tr>
<td style="padding:.5em;">
<pre style="margin:0;padding:0;">Class: 'xyz gene product'
EquivalentTo: x and y and z</pre>
</td>
</tr>
</table>
<p>or, for the example DL query above:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:.2em 0;">
<tr>
<td style="padding:.5em;">
<pre style="margin:0;padding:0;">'immune system disease gene product' and
'abnormal cytokine secretion gene product' and
'ion binding gene product' and
'inflammatory response gene product' and
'intracellular membrane-bounded organelle gene product'</pre>
</td>
</tr>
</table>
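<p>As a rough illustration of how the defined classes and the intersection queries fit together, the sketch below renders both as Manchester syntax strings. It is only a sketch under assumptions: the helper names (<tt>quote</tt>, <tt>defined_class_frame</tt>, <tt>intersection_query</tt>) are made up for this post, and GOAL itself builds its axioms with the OWL API rather than by pasting strings together.</p>

```python
# Hypothetical helpers for illustration only; GOAL uses the OWL API for this.

def quote(label):
    """Wrap a label in single quotes, as Manchester syntax does for names with spaces."""
    return "'" + label + "'"


def defined_class_frame(supporting_label, prop="hasRelationshipWith"):
    """Render the frame for 'X gene product', given a supporting ontology class X."""
    name = quote(supporting_label + " gene product")
    return (
        "Class: " + name + "\n"
        + "    EquivalentTo: " + quote("Gene product")
        + " that " + prop + " some " + quote(supporting_label)
    )


def intersection_query(supporting_labels):
    """Join the defined gene product classes into one DL query string."""
    return " and ".join(
        quote(label + " gene product") for label in supporting_labels
    )


if __name__ == "__main__":
    print(defined_class_frame("ion binding"))
    print(intersection_query(["ion binding", "inflammatory response"]))
```

<p>The string produced by <tt>intersection_query</tt> has the same shape as the DL query shown above. Labels that themselves contain a single quote would need escaping, which this sketch does not attempt.</p>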
<p>This is a simple and straightforward use of OWL to gain access to the rich resource of annotations of gene products with various ontologies, especially the GO. One of the tricks, just as it is with the <a href="http://www.kupkb.org">iKUP interface to the KUPKB</a>, is to provide some kind of reasonable user interface to the knowledge resource. I make no great claims for this user interface, but it does hide the OWL and the need to write potentially complex queries. No one should really see OWL. The <a href="http://owlapi.sourceforge.net/">OWL API</a> gives us a good platform upon which to build applications and we have reasoners that are fast enough to do the job. The GOAL ontology has just under 150,000 classes; it is in the OWL 2 EL profile, which is really why we can handle an ontology of this size with interactive, dynamic reasoning.</p>
<p>We&#8217;ve also been straightforward in the ontological aspects of GOAL. We&#8217;ve said that these gene products have these functions, participate in these processes and display or are involved in these phenotypes etc. Presumably we&#8217;d have more properly said that information about these gene products has been annotated with these descriptions of functions, activities and so on. This would, I think, have added nothing to the questions being asked (apart from making them more clumsy). I should, however, give some thought to drawing some of the <a href="http://www.geneontology.org/GO.evidence.shtml">evidence codes</a> into this set-up, so queries can take advantage of this information. There are also things like the <tt>part_of</tt> relationships in GO that could be used with sub-property chains to say that a protein capable of a process that is part of another process is capable of the second process by implication.</p>
<p>The future for this work could be interesting. As well as exploring GOAL for interesting biology, we&#8217;d also like to exploit the resource to look for inconsistencies and redundancy in the annotations. This will mean adding more information to the ontology, for example, disjoints between various components. There is an increasing degree of axiomatisation in GO (and others) and it would be good to exploit this in queries. As we say in <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3337258/">the GOAL paper</a>, what we&#8217;ve currently done cannot detect or stop nonsense questions about gene products that cannot have such and such a location etc. Using these <a href="http://wiki.geneontology.org/index.php/Ontology_extensions">Gene Ontology Extensions</a> would be a good thing.</p>
<p>With GOAL you can do a query like:</p>
<ol type="1">
<li> Navigate down to <em>obesity gene product</em> (<em>DiseaseGeneProduct</em> &gt; <em>disease of metabolism gene product</em> &gt; <em>acquired metabolic disease gene product</em> &gt; <em>nutrition disease gene product</em> &gt; <em>overnutrition gene product</em> &gt; <em>obesity gene product</em>) or simply enter <em>obesity gene product</em> into the search box. Press &#8220;Search&#8221; to show the results table. </li>
<li> Under <em>PhenotypeGeneProduct</em> navigate down <em>mammalian phenotype gene product</em> &gt; <em>digestive/alimentary phenotype gene product</em> &gt; <em>abnormal digestive system physiology gene product</em>, and add this to the previous query. Press &#8220;Search&#8221; and view the results (see fig1). </li>
<li> Select the CPE gene and select the view superclass hierarchy button. When looking at the superclasses of CPE we see it is annotated with the phenotype <em>decreased circulating adrenocorticotropin level gene product</em> (see fig2). Add this to the query (deleting that from step 2) to query for <em>obesity gene product</em> and <em>decreased circulating adrenocorticotropin level gene product</em>. Inspect the results. </li>
</ol>
<div> <img src="http://robertdavidstevens.files.wordpress.com/2012/09/fig1.png?w=450" style="border-width:0;" alt="fig1.png">
<p><b>Figure 1. </b>GOAL user interface showing query results for the DL query (<em>obesity gene product</em> and <em>digestive/alimentary phenotype gene product</em>)</p>
</p></div>
<div> <img src="http://robertdavidstevens.files.wordpress.com/2012/09/fig2.png?w=450" style="border-width:0;" alt="Fig two">
<p><b>Figure 2. </b>GOAL user interface highlighting <em>decreased circulating adrenocorticotropin level gene product</em> as a superclass for the DL query (<em>obesity gene product</em> and <em>digestive/alimentary phenotype gene product</em>)</p>
</p></div>
<p>The GOAL UI relies on browsing and this makes it rather clumsy. We need to add the ability to search and do things like term completion to get around the need to start from the top of each ontology and find what is wanted. Nevertheless, it starts to show what can be done by bringing together the various ontological annotations the community has made with the OBO ontologies to explore what could be complex biological interactions. On the technical side, it shows we can actually deliver OWL-based solutions to application needs.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertdavidstevens.wordpress.com/2012/09/14/querying-the-gene-mammalian-phenotype-and-human-disease-ontologies-with-goal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/0ba008d946882c2cc919d7869864d16a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robertdavidstevens</media:title>
		</media:content>

		<media:content url="http://robertdavidstevens.files.wordpress.com/2012/09/fig1.png" medium="image">
			<media:title type="html">fig1.png</media:title>
		</media:content>

		<media:content url="http://robertdavidstevens.files.wordpress.com/2012/09/fig2.png" medium="image">
			<media:title type="html">Fig two</media:title>
		</media:content>
	</item>
		<item>
		<title>The Gene Ontology as a language: Investigating Gene Ontology annotations with Zipf&#8217;s law</title>
		<link>http://robertdavidstevens.wordpress.com/2012/07/28/doctype-html-public-w3cdtd-xhtml-1-1en/</link>
		<comments>http://robertdavidstevens.wordpress.com/2012/07/28/doctype-html-public-w3cdtd-xhtml-1-1en/#comments</comments>
		<pubDate>Sat, 28 Jul 2012 08:13:46 +0000</pubDate>
		<dc:creator>Robert Stevens</dc:creator>
				<category><![CDATA[Ontologies]]></category>

		<guid isPermaLink="false">http://robertdavidstevens.wordpress.com/?p=183</guid>
		<description><![CDATA[Ontologies are key for communication, but are they fit for purpose? Do they allow us to communicate? The main driver for ontology authoring remains the need to communicate what we know about an entity, often a gene product. In this communication the speakers are usually annotators and the listeners are usually (or are the intended [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=robertdavidstevens.wordpress.com&#038;blog=11592931&#038;post=183&#038;subd=robertdavidstevens&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a name="preamble"></a>
<p>Ontologies are key for communication, but are they fit for purpose? Do they allow us to communicate? The main driver for ontology authoring remains the need to communicate what we know about an entity, often a gene product. In this communication the <em>speakers</em> are usually annotators and the <em>listeners</em> (or at least the intended audience) are usually biologists. We can then think of something like the <a href="http://www.geneontology.org">Gene Ontology</a> as a means of communication between an annotator speaker and a biologist listener, and so the Gene Ontology should act like a language and annotations should be utterances in that language.</p>
<p>We&#8217;ve recently had a paper exploring these questions with the <a href="http://www.geneontology.org">Gene Ontology</a>. That is, does the GO act like a language and do <a href="http://www.geneontology.org/GO.downloads.annotations.shtml">GO annotations</a> behave like utterances in that language? The paper is</p>
<blockquote><p>Leila Kalankesh, Robert Stevens, and Andy Brass. <a href="http://www.biomedcentral.com/1471-2105/13/127/abstract">The language of gene ontology: a zipf&#8217;s law analysis</a>. BMC Bioinformatics, 13(1):127, 2012. <a href="http://www.biomedcentral.com/1471-2105/13/127/abstract">http://www.biomedcentral.com/1471-2105/13/127/abstract</a></p>
<p align="right">
</blockquote>
<p>and is the work of Leila Kalankesh, one of Andy Brass&#8217; Ph.D. students. In this work we used <a href="http://en.wikipedia.org/wiki/Zipf%27s_law">Zipf&#8217;s law</a> to explore the language-like characteristics of GO; this distribution (a kind of power law) is a characteristic of human languages as well as many other phenomena. In a corpus of text, if all the words are ordered by their frequency, a Zipfian distribution is seen: the most highly ranked word occurs twice as often as the second and three times as often as the third, and so on (frequency is inversely proportional to rank). This gives one of those curves that decline very steeply and have a long, long tail.</p>
<p>In a log-log plot of rank and frequency the gradient can be revealing about the &#8220;effort&#8221; used in encoding and decoding utterances in that language. A gradient of 1.6 is suggestive of a child-like language and a gradient of 2.4 perhaps that of a sophisticated reader.</p>
<p>So, as GO terms are used to describe some biological phenomena and are a form of communication (where the annotator is the speaker and the user of those annotations the &#8220;listener&#8221;), do the ranked frequencies of GO terms in annotation corpora follow a Zipfian distribution; that is, do they behave like utterances in a language? If so, do the gradients of the curves tell us anything about this communication process? If GO annotations behave like statements in a language then we could start to think of applying all the tools of computational linguistics to ontology annotations. If GO annotations do not behave like a language we might wish to ask ourselves why. Finally, the gradient of the plots of rank vs frequency might be able to tell us something about the quality of the communication between annotator and user.</p>
<p>So, this is what we did:</p>
<ol type="1">
<li> Downloaded GO annotations for a range of model organisms. </li>
<li> Plotted the curve for rank vs frequency for each GO sub-ontology. </li>
<li> Separated out the GO evidence code subsets that indicated the highest confidence&#8230; </li>
</ol>
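<p>To make the rank versus frequency idea concrete, here is a minimal sketch in Python. It assumes the annotations have already been reduced to a flat list of term identifiers (the ids below are invented), and it fits a straight line to the log-log points by ordinary least squares, which may not be the fitting procedure used in the paper. The function names are hypothetical.</p>

```python
import math
from collections import Counter


def rank_frequency(term_ids):
    """Return term frequencies sorted from most to least frequent."""
    return sorted(Counter(term_ids).values(), reverse=True)


def loglog_gradient(frequencies):
    """Least-squares slope of log(frequency) against log(rank), sign flipped.

    Needs at least two ranks, otherwise there is no line to fit.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


# Toy corpus of made-up annotation ids, not real GO data.
annotations = ["T1"] * 12 + ["T2"] * 6 + ["T3"] * 4 + ["T4"] * 3
freqs = rank_frequency(annotations)

if __name__ == "__main__":
    print(freqs)
    print(round(loglog_gradient(freqs), 2))
```

<p>The toy corpus has frequencies 12, 6, 4 and 3, which are exactly inversely proportional to rank, so the fitted gradient comes out at 1.0. Real annotation corpora are noisier, and how the tail is treated can move the gradient noticeably.</p>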
<p>In overview, this is what we found (look at the paper for details):</p>
<ol type="1">
<li> Most of the species annotations with GO look Zipfian; </li>
<li> Most molecular function and cellular component annotations have a mean slope of around 1.8 and those with biological process one of 2.1; </li>
<li> Things look more Zipfian, with a steeper slope, for the annotations that have a higher confidence; </li>
<li> The gradient is not a function of ontology or genome size. </li>
</ol>
<p>So, what does all of this tell us? Well, in general, we know that GO annotations behave like statements in a language and, by extension, GO is a language. We also see that annotations in the BP ontology appear to be more &#8220;sophisticated&#8221; utterances than those for MF and CC annotations. We can speculate why this might be: there is less to say about function and location; they are smaller sub-languages. For BP there is much more to say &#8211; a gene product may be involved in a large number of processes and there are many more processes than there are functions and locations. We might also see this happening in, for instance, the <a href="http://obofoundry.org/wiki/index.php/PATO:Main_Page">Phenotypic Quality Ontology</a>, where there is (probably) a lot to say about the phenotype of entities.</p>
<p>This work established that we can view GO annotations (and probably other annotations) as communications in a language. We&#8217;d like to explore whether we can use this kind of approach as a means of investigating the <em>quality</em> of an ontology and/or statements made using that ontology. We saw that the <em>D. rerio</em> genome BP annotations had a significantly lower gradient, at 1.8, than the mean of 2.2; why? It may be that the communication between annotator and user is impaired. This may be because the annotations are not of high quality, which itself may be for a variety of reasons, including the state of our knowledge (there are fewer papers for this model organism). There is literature (see the paper) that talks about the gradient of the Zipfian distribution indicating a degree of effort or &#8220;willingness&#8221; to communicate. The linkage to communication effectiveness is controversial, but it remains an attractive thought that we can measure annotation quality (and perhaps indirectly ontology quality) through this kind of simple computation.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertdavidstevens.wordpress.com/2012/07/28/doctype-html-public-w3cdtd-xhtml-1-1en/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/0ba008d946882c2cc919d7869864d16a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robertdavidstevens</media:title>
		</media:content>
	</item>
		<item>
		<title>An Expedition in Semantic Publishing</title>
		<link>http://robertdavidstevens.wordpress.com/2012/05/19/an-expedition-in-semantic-publishing/</link>
		<comments>http://robertdavidstevens.wordpress.com/2012/05/19/an-expedition-in-semantic-publishing/#comments</comments>
		<pubDate>Sat, 19 May 2012 10:18:21 +0000</pubDate>
		<dc:creator>Robert Stevens</dc:creator>
				<category><![CDATA[ontology]]></category>

		<guid isPermaLink="false">http://robertdavidstevens.wordpress.com/?p=170</guid>
		<description><![CDATA[Overview To explore what &#8220;semantic publishing&#8221; means I pushed at the boundaries by submitting an ontology of amino acids in the RDF/XML representation of OWL to the Sepublica semantic publishing workshop. The ontology captures the semantics of a domain, it is represented in a Semantic Web language, and the ontology is published on the Web. [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=robertdavidstevens.wordpress.com&#038;blog=11592931&#038;post=170&#038;subd=robertdavidstevens&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<hr />
<h2><a name="_overview"></a>Overview</h2>
<p>To explore what &#8220;semantic publishing&#8221; means I pushed at the boundaries by submitting an ontology of amino acids in the RDF/XML representation of OWL to the <a href="http://sepublica.mywikipaper.org/drupal/">Sepublica semantic publishing workshop</a>. The ontology captures the semantics of a domain, it is represented in a Semantic Web language, and the ontology is published on the Web. So, is it a semantic publication? Can a workshop on semantic publication deal with a semantic publication? The upshot is that my provocative submission does seem to count as a semantic publication, but we do need some words around the published semantics to help us out &#8211; that is, some narrative. Ultimately, we want semantic literature and literature of semantics.</p>
<hr />
<h2><a name="_the_sepublica_narrative"></a>The Sepublica Narrative</h2>
<p>This blog reports on an expedition I made into semantic publishing with my friend <a href="http://www.russet.org.uk/blog">Phil Lord</a>. This was all done in the context of the <a href="http://sepublica.mywikipaper.org/drupal/">Sepublica 2012 semantic publishing workshop</a> at <a href="http://2012.eswc-conferences.org/">ESWC</a> in Crete. It all started as a bit of <em>fun</em> testing the boundaries of what I could get away with but, by provoking discussion, it&#8217;s also had some very interesting effects and reactions. All in all it&#8217;s led to something rather good and fun.</p>
<p>What I do on this blog is to report on the motivation, what I actually did, the responses to it and what&#8217;s come out in the end. First of all, however, I thank the reviewers and the Sepublica organisers for joining in and letting me publish the reviews for the &#8220;semantic publication&#8221; and the email dialogues I had with the Sepublica people (this has made the blog rather long, but I think it supports this narrative).</p>
<p>So, the back story: Phil and I did a <a href="http://www.russet.org.uk/blog/2012/04/three-steps-to-heaven/">&#8220;proper&#8221; article</a> for Sepublica about light-weight semantic publishing in the <a href="http://www.knowledgeblog.org">knowledge blog platform</a>. On submission, I noticed the following on the <a href="http://sepublica.mywikipaper.org/drupal/node/23">Sepublica instructions to authors web page</a>:</p>
<blockquote><p>&#8230;We also invite submissions in XHTML+RDFa or in the format or YOUR semantic publishing tool. However, to ensure a fair review procedure, authors must additionally export them to PDF.</p>
<p align="right"> &#8212; Sepublica Organisers </p>
</blockquote>
<p>My first reaction was &#8220;I wonder what will happen if I submit just an RDF document?&#8221;; that is, an OWL ontology in its RDF/XML syntax. This is where the &#8220;trying it on&#8221; bit comes in; can I take it literally and &#8220;publish&#8221; a document in RDF as a contribution to this workshop? My reasoning went like this:</p>
<ul>
<li> An OWL ontology captures the semantics of a field of interest; </li>
<li> It is a document; </li>
<li> It has an RDF serialisation; </li>
<li> It has a URI, so it can be on the Web and found. </li>
</ul>
<p>So, an OWL ontology is a semantic document, in RDF and published on the Web &#8211; anything on the Web is published&#8230; So, that is indeed what I did. The longest bit of the process was choosing which of the ontologies I had lying around could work for the &#8220;expedition&#8221; into semantic publishing; this must be one of the cheapest publications I&#8217;ve ever done. I chose the <a href="http://robertdavidstevens.wordpress.com/2010/12/18/an-update-to-the-amino-acids-ontology/">Amino Acids Ontology</a>, which is a small ontology that captures the basic biochemistry of amino acids and does so in a way that exploits automated reasoning.</p>
<p>Here&#8217;s the next bit of the story:</p>
<ul>
<li> I chose the amino acids ontology. Phil and I originally made this to show off the wizards in the OWL plugin for Protege 3 and how we could use them to very rapidly create this ontology of amino acids. </li>
<li> I added a Dublin Core &#8220;title&#8221; annotation to give my document a title; </li>
<li> I added myself and Phil as authors (though other people have contributed to the ontology over time, as the annotations on the ontology describe); </li>
<li> I made my own &#8220;abstract&#8221; annotation property to give the document a &#8220;narrative abstract&#8221;; </li>
<li> And that was my semantic publication finished. </li>
</ul>
<p>Here is a fragment of the ontology&#8217;s &#8220;title page&#8221;:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:.2em 0;">
<tr>
<td style="padding:.5em;">
<pre style="margin:0;padding:0;">Annotations:
    title "Semantic Publishing of Knowledge about Amino Acids"@en,
    author "Robert Stevens and Phillip Lord",
    abstract "We semantically publish knowledge about the amino acids commonly
    described within biochemistry. The classification of amino acids is based
    on Taylor's article (PMID:3461222) from 1986 published in the Journal of
    Theoretical Biology. The ontology goes further than the static paper
    version; it combines many aspects of the physicochemical properties Taylor
    uses to classify amino acids to give a rich, multi axial classification of
    amino acids. Taylor's original description of the amino acid's
    physicochemical properties are captured with value partitions and
    restrictions on the amino acid classes themselves. A series of defined
    classes then establishes the multi-axial classification. By publishing
    this knowledge about amino acids as a semantic document in the form of an
    ontology we persue an agenda of disruptive technology in publishing. Blogs
    about the published semantics of amino acids may be found at
    http://robertdavidstevens.wordpress.com/2010/12/18/an-update-to-the-amino-acids-ontology/
    and links following."@en,</pre>
</td>
</tr>
</table>
<p>So, it has some minimal trappings of a traditional publication. This also gives an outline of our attitude to the ontology as a semantic publication; the ontology is a semantic artefact, but we do link out to some blogs that give some narrative on the ontology&#8230;</p>
<hr />
<h2><a name="_submitting_a_semantic_publication_to_sepublica"></a>Submitting a Semantic Publication to Sepublica</h2>
<p>Next came submitting the publication to <a href="http://www.easychair.org">EasyChair</a>. The instructions above said we had to give a PDF version of the submission to ease the review process. As pointed out <a href="http://www.russet.org.uk/blog/2012/04/three-steps-to-heaven/">elsewhere</a>, this is a sad irony of fora on alternative or next generation publishing &#8211; they use &#8220;lumpen PDF&#8221;&#8230; So, I saved my ontology as Manchester OWL Syntax and turned it into PDF. This had two motivations &#8211; one was &#8220;that will show them&#8230;&#8221; (is there anything as useless as a Manchester syntax dump of an ontology converted to PDF?) and the second was to submit this ludicrous document alongside the more sensible RDF version of the ontology. Unfortunately, the EasyChair Sepublica site wasn&#8217;t set up to take anything other than PDF; the organisers changed it for me, but the only way to get RDF in was to zip it up and submit one file. So, I zipped up the RDF and submitted the ontology, but without the silly PDF version.</p>
<p>This is where the dialogue started.</p>
<blockquote><p>Dear both, I was trying to take a look at the paper you submitted &#8220;Semantic Publishing of Knowledge about Amino Acids&#8221; the problem I had was that the uncompressed zip file generates an OWL file (nothing wrong with the OWL file, I opened it protege) but there is not an actual paper -as in a PDF file. Could u resubmit and make sure to include the actual paper.</p>
<p align="right"> &#8212; Sepublica Organisers </p>
</blockquote>
<p>and</p>
<blockquote><p>The ontology is our submission. the workshop pages said that RDF submissions were acceptable and that&#8217;s what we submitted. The ontology is a document that captures, in a computational form, the semantics of amino acids. The URI resolves to a web address from which the semantic publication can be read, so I think this counts. As the instructions for authors said, I did produce a PDF version of our publication, but the RDF one works much better. I think we&#8217;ve fulfilled the instructions to authors&#8201;&#8212;&#8201;is there anything else we need to do?</p>
<p align="right"> &#8212; Robert Stevens </p>
</blockquote>
<p>and the reply was:</p>
<blockquote><p>you are right, no problem</p>
<p align="right"> &#8212; Sepublica Organisers </p>
</blockquote>
<p>Which was the right answer &#8211; good for them. At this point the submission was sent to the reviewers.</p>
<hr />
<h2><a name="_the_reviews"></a>The Reviews</h2>
<p>The instructions sent to the reviewers were:</p>
<blockquote><p>please note: This is a true semantic publication.</p>
<p>It does not quite stick to the rules (as the authors didn&#8217;t submit a PDF export limited to 12 pages), but nevertheless we (= Alex and me) decided not to reject it before reviews.</p>
<p>We recommend that you treat this submission as if it were a paper describing an ontology.</p>
<p>You can read the ontology with your favorite ontology editor (e.g. Prot&eacute;g&eacute;), but we also recommend that you open it in a text editor to see the publication-style &#8220;header&#8221;. The blog post linked from the &#8220;abstract&#8221; should also be considered part of this submission.</p>
<p align="right"> &#8212; Sepublica organisers </p>
</blockquote>
<p>Below is what we got back.</p>
<blockquote><p>Note: This submission has been evaluated as an ontology rather than as a paper. The blog has also been read in order to better evaluate this work.</p>
<p>What is the target research problem? The ontology represents the 20 amino acids used in biology as well as their characteristics such as polarity, size, etc.</p>
<p>What are the strong points and weak points of the paper? The ontology proposed is well documented and highly relevant for bioinformatics and related domains.</p>
<p>Does the paper evaluate its contribution? Is it aware of related work?</p>
<p>Further comments (if applicable) There should be a formal submission of the corresponding paper. I would like to see an evaluation against competency questions. I would also like to know how this ontology has been used. Do the authors have a particular project in mind? How can it be used in conjunction with protein ontologies or others? Why it is better to represent this information in an ontology rather than other formats? I think the ontology itself is interesting but even more would be its use.</p>
<p>Minor issues (if applicable) d be nice to know how that domain can benefit from this ontology.</p>
<p align="right"> &#8212; Reviewer One </p>
</blockquote>
<blockquote><p>The research problem that the authors are somewhat sarcastically addressing is the question of how to publish machine readable (&#8220;semantic&#8221;) documents. Though they don&#8217;t really spell this out in text, they are proposing that the important knowledge in a publication should be be represented and distributed as an OWL ontology that is completely distinct from a traditional body of text that would be distributed as a PDF for human readers&#8230; As they state: &#8220;publishing this knowledge about amino acids as a semantic document in the form of an ontology we persue an agenda of disruptive technology in publishing&#8221;</p>
<p>One interesting effect of their disruptive submission is that, as a reviewer, I am forced to attempt to examine the knowledge content directly without recourse to complain about grammar, document structure or image quality &#8211; which I think is a positive. This raises the problem that, not being a biochemist, I need some way to tell whether their ontology is correct. Sadly, the only real way that I, as a lowly human reader, can evaluate the knowledge content of the ontology is to go back to read the papers from which this knowledge was extracted in the first place&#8230;</p>
<p>I like the authors main point if I may guess it as something like: &#8220;We have a great knowledge representation language ready for use in publishing called OWL and we should use it directly in the publishing process&#8221;. But I don&#8217;t think that we can escape from also publishing knowledge in a form that human readers can easily consume.</p>
<p>So, what to do with this submission? I think it would be most useful for the meeting if the authors would make their proposal for OWL-based semantic publishing explicit by writing an editorial style article (in English) that states their case. They should, of course, include an OWL version of this editorial so that we can verify that their reasoning is sound.</p>
<p align="right"> &#8212; Reviewer Two </p>
</blockquote>
<p>This is all very interesting, but before we unpack the reviews, I should come clean about the disappointment of having the &#8220;paper&#8221; accepted for presentation at Sepublica; Phil and I really wanted it to be rejected on the grounds that there was no publication. This would have enabled us to say that the whole thing is completely ridiculous&#8230; However, our bluff was called and the reviewers and the Sepublica organisers have gone with it and good things look like they&#8217;re coming out of the whole expedition.</p>
<p>Reviewer One says he/she is reviewing it as an ontology not a paper&#8230; even though the ontology (in our view) is a semantic publication. Reviewer One just plays it with a straight bat &#8211; let&#8217;s just treat it as an ontology. It may be that the two should be indistinguishable &#8211; should an ontology be treated any differently from a paper? The axioms of the ontology are a theory about the domain; it doesn&#8217;t have the form of the traditional scientific paper, but can it be treated as such &#8211; is this comment about it not being a paper eventually going to be &#8220;old fashioned&#8221;? The interesting point is that he/she wants descriptions of the ontology&#8217;s use, something that is not in the ontology itself but in the narrative surrounding it (or would be if we had a use other than teaching for this ontology). Phil Lord has talked about <a href="http://www.russet.org.uk/blog/2009/05/literate-owl-well-on-blogs/">literate ontology</a> as an analogy to literate programming; we should be able to have narrative for the ontology surrounding that ontology. There is, however, a distinction to be drawn between what the ontology says about its field of interest and what we want to say about the ontology as an artefact.</p>
<p>Reviewer One also says he/she read the blogs to get the narrative, which sort of plays to this need for a narrative. However, I still think that the basic point that the ontology is a semantic publication holds; it may need more narrative, but I remain to be convinced that &#8220;There should be a formal submission of the corresponding paper.&#8221; Finally, the comment &#8220;Why it is better to represent this information in an ontology rather than other formats?&#8221; is fun for a workshop on semantic publishing &#8211; is this (our ontology) a good way to publish semantics for a field? I claim that this ontology captures a lot of basic biochemistry of amino acids; the background chemistry belongs elsewhere, but this ontology captures an early lecture in biochemistry. The biological and chemical implications of the amino acids&#8217; characteristics are beyond what we&#8217;ve done, but I&#8217;m happy to argue that the ontology as it stands is a good way of publishing basic knowledge about the semantics of amino acids. If it isn&#8217;t, then we&#8217;ve been wasting an awful lot of time on ontologies.</p>
<p>The point about narrative comes out even more in Reviewer Two&#8217;s review. I&#8217;ve never had a paper of mine described as having an element of sarcasm &#8211; I&#8217;m very proud of this achievement. Reviewer Two said:</p>
<blockquote><p>One interesting effect of their disruptive submission is that, as a reviewer, I am forced to attempt to examine the knowledge content directly without recourse to complain about grammar, document structure or image quality &#8211; which I think is a positive.</p>
<p align="right"> &#8212; Reviewer two </p>
</blockquote>
<p>which I do understand, but a good part of this tedious element of reviewing is to make sure the publication can be understood as a publication. Can we do this with an ontology? An ontology should have a tutorial or reference aspect, but we don&#8217;t really know how best to present them for many applications. I&#8217;m prepared to state, however, that OWL isn&#8217;t the way to present an ontology to users (except, perhaps, the authors) and various graph visualisations are only part of the solution, but all of this is another story.</p>
<blockquote><p>This raises the problem that, not being a biochemist, I need some way to tell whether their ontology is correct. Sadly, the only real way that I, as a lowly human reader, can evaluate the knowledge content of the ontology is to go back to read the papers from which this knowledge was extracted in the first place&#8230;</p>
<p align="right"> &#8212; Reviewer Two </p>
</blockquote>
<p>Perhaps one day &#8220;papers&#8221; will be assessed against the ontologies that capture background knowledge. However, this reviewer is right to point out that it is difficult to review an ontology (too much ontology evaluation/review is based on &#8220;would I have done it this way&#8230;&#8221;). A wider question is how does one evaluate/review any semantic publication?</p>
<p>Finally, we have:</p>
<blockquote><p>&#8220;We have a great knowledge representation language ready for use in publishing called OWL and we should use it directly in the publishing process&#8221;. But I don&#8217;t think that we can escape from also publishing knowledge in a form that human readers can easily consume.</p>
<p align="right"> &#8212; Reviewer Two </p>
</blockquote>
<p>which is and isn&#8217;t what we&#8217;re saying. I couldn&#8217;t write my whole publication in OWL or even all of FOL &#8211; nor would I wish to. This all gets to the heart of it; we want semantic publishing, but we also want narrative. What counts as a semantic publication &#8211; a lump of RDF; a trad paper with a bit of RDF or OWL or FOL in it; or a trad paper with some typed links? Whatever the nature of a semantic publication or a semantic scientific publication, we do need narrative.</p>
<p>Reviewer Two ended up saying &#8220;&#8230;make their proposal for OWL-based semantic publishing explicit by writing an editorial style article (in English) that states their case. They should, of course, include an OWL version of this editorial so that we can verify that their reasoning is sound.&#8221;, which is exactly in the right vein; I take my hat off to him/her.</p>
<p>What we did was write a <a href="http://www.cs.man.ac.uk/~stevensr/papers/sepublica-2012.pdf">little position paper</a> outlining what we did &#8211; this makes the point that semantic publishing needs narrative. This goes for semantic scientific publishing, and for data as well. I like the comment about representing the position paper as an OWL ontology to check reasoning. I actually thought about this and it would be a good exercise, but not one I felt I could turn around in the few days available for making our proceedings version &#8211; perhaps it&#8217;s worth pointing out that writing a trad paper is actually easy compared to writing the ontology version &#8211; especially if you take away all the poncing around one has to do when publishing in a trad forum. Getting narrative structure into a semantic document would be fun, or do we want a proper hypertext document where, whatever route one takes through the structure, one gets the same message?</p>
<hr />
<h2><a name="_the_final_bit"></a>The final bit</h2>
<p>The general instructions to authors for Sepublica&#8217;s final, camera-ready version were:</p>
<blockquote><p>Dear Robert,</p>
<p>You have already received the comments by the reviewers in a previous email. Please take them carefully into account when preparing your camera-ready paper for the proceedings.</p>
<p>The final paper and the signed copyright form are due on</p>
<p>FRIDAY APRIL 13 23:59 (Hawaii time)</p>
<p>This is a firm deadline for the production of the proceedings.</p>
<ol type="1">
<li> FINAL PAPER: Please submit the files belonging to your camera-ready paper using your EasyChair author account. Follow the instructions after the login for uploading two files:
<pre style="padding:.5em;color:gray;">(a) either a zipped file containing all your LaTeX sources
    or a Word file in the RTF format, and
(b) PDF version of your camera-ready paper.</pre>
<p>The final submission must be in LNCS format (instructions: <a href="http://www.springer.de/comp/lncs/authors.html">http://www.springer.de/comp/lncs/authors.html</a>). Research papers are strictly limited to 12 pages, position papers to 5 pages, and system/demo descriptions must be between 2 and 5 pages.</p>
</li>
<li> COPYRIGHT: The copyright form can be found below. It is sufficient for one of the authors to sign the copyright form.  You can scan the form into PDF or any other standard image format, but even a text file with your name entered is sufficient.</li>
</ol>
<p align="right"> &#8212; Sepublica Organisers </p>
</blockquote>
<p>and this was my supplementary message from Sepublica:</p>
<blockquote><p>Dear authors,</p>
<p>of course the &#8220;must be LNCS, N pages&#8221; etc. restriction is not applicable in your case.</p>
<p>However, I need something for the old-fashioned PDF version of the proceedings. Please find some suggestions below. If you should just upload the ontology file, I&#8217;m going to print it to a PDF from a text editor &#8211; but there should be nicer ways.</p>
<p>Maybe a title page in LNCS style, up to and including the abstract.</p>
<p>Then, a pretty-print of your ontology might follow.  We are not going to print the proceedings on paper, so we do not really have physical page limits.</p>
<p>Please be innovative! <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p align="right"> &#8212; Sepublica Organisers </p>
</blockquote>
<p>One thing this strongly suggests is that we don&#8217;t really know what to do with a semantic publication. I don&#8217;t think my <a href="http://www.cs.man.ac.uk/~stevensr/papers/amino-acids-ontology.pdf">PDF of the Manchester Syntax</a> for the ontology either counts as a pretty print or is really the way to do a semantic publication anyway. This was, however, when the Sepublica organisers turned the tables on me, effectively saying &#8216;OK, so you&#8217;re publishing semantically &#8211; let&#8217;s get on with it&#8217;. Even though I think I&#8217;ve published semantically, I sort of gave in at this point and did the aforementioned <a href="http://www.cs.man.ac.uk/~stevensr/papers/sepublica-2012.pdf">position paper</a>. I do, however, believe that my ontology is a semantic publication; we just don&#8217;t yet know how to handle semantic publications.</p>
<p>We have a feeling that semantic publishing must be a good thing, but it&#8217;s all rather uncharted territory at the moment. We want material in our scientific publications to be more computationally accessible. We want semantically described data. But what is a semantic publication? How much semantic content does a publication have to have to be a semantic publication? Perhaps the goal for the next Sepublica is to not have PDF as the output, but to challenge the community to do some semantic form of publication to test some boundaries.</p>
]]></content:encoded>
			<wfw:commentRss>http://robertdavidstevens.wordpress.com/2012/05/19/an-expedition-in-semantic-publishing/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/0ba008d946882c2cc919d7869864d16a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robertdavidstevens</media:title>
		</media:content>
	</item>
		<item>
		<title>The First UK Ontology Network Workshop</title>
		<link>http://robertdavidstevens.wordpress.com/2012/04/17/the-first-uk-ontology-network-workshop/</link>
		<comments>http://robertdavidstevens.wordpress.com/2012/04/17/the-first-uk-ontology-network-workshop/#comments</comments>
		<pubDate>Tue, 17 Apr 2012 07:57:06 +0000</pubDate>
		<dc:creator>Robert Stevens</dc:creator>
				<category><![CDATA[ontology]]></category>

		<guid isPermaLink="false">http://robertdavidstevens.wordpress.com/?p=164</guid>
		<description><![CDATA[We have just had what I thought was a successful first full meeting of the UK Ontology Network in Manchester. The idea was put together at a meeting organised by Pierre Grenon and Anthony Galton at the Open University on 30 April 2010. At the 12 April 2012 meeting, organised by James Malone and me, [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><a name="preamble"></a></p>
<p>We have just had what I thought was a successful first full meeting of the <a href="http://www.ukontology.org/">UK Ontology Network</a> in Manchester. The idea was put together at a meeting organised by Pierre Grenon and Anthony Galton at the Open University on 30 April 2010. At the 12 April 2012 meeting, organised by James Malone and me, we had 100 people registered and about 80 people attending. This itself was a good thing to see &#8211; gathering 100 people shows a vibrant ontology community in the UK. One of the best things about the meeting was that we had people in the audience from many communities that use ontologies, not just my home community of biomedicine. We had people from biomedicine, music, geography, government, the National Archives, the BBC and the NHS (and probably some I&#8217;ve left out). We also just had a lot of people doing stuff and, I&#8217;m pleased to say, talking about putting applications together using OWL and automated reasoners, particularly <a href="http://code.google.com/p/elk-reasoner/">ELK</a>. We had clusters of three people giving five-minute talks with a little bit of discussion and we then had some software demos after lunch. We then just had a long session of hanging around talking &#8211; which was good.</p>
<p>We used the hashtag UKON2012 on Twitter (which was active) and our <a href="https://plus.google.com/u/0/113255085666768835976">UKON GooglePlus page</a>. We&#8217;ll put the <a href="http://www.ukontology.org/2012_workshop_program.html">presentations up on the UKON site</a> and prepare some other materials from the Tweets, GooglePlus pages and so on, such as issues, capabilities, and themes of work.</p>
<p>Here I&#8217;ll just talk a bit about a few of the talks:</p>
<p>Tom Graham&#8217;s (BBC) talk about using linked data to generate the BBC&#8217;s Olympics pages showed an impressive process and its result. A lightweight publishing process lets the journalists write their piece, tag it and push it through a pipeline that allows aggregation and rich inter-linking of the BBC&#8217;s Olympic content. The take-home message was that Tom reckons that the Web site couldn&#8217;t have come into being in the timely fashion it has without the use of Semantic Web technologies &#8211; a sign of increasing maturity.</p>
<p>Phil Lord&#8217;s (Newcastle) presentation on the <a href="http://www.knowledgeblog.org">Knowledge Blog&#8217;s</a> publication process generated lots of comments during the breaks &#8211; using another lightweight publication process to gather light metadata and semantics about the page, its contents and the references. It was a nice show and Phil&#8217;s video should be looked at&#8230;</p>
<p>Barry Smith (NCBO) said that most of the content on <a href="http://bioportal.bioontology.org/">BioPortal</a> was &#8220;crap&#8221; and that there were only four good ontologies in the world. All of this was in support of the proposal that lots of training is needed &#8211; a reasonable point, though one that is just true. One of the &#8220;good&#8221; ontologies he named came from Aberystwyth (Larisa Soldatova was in the audience and asked the question to identify the four). This leaves three ontologies and there was some speculation about their identity &#8211; we know Barry likes the FMA &#8211; so let&#8217;s count it as one. This leaves only two other good ontologies in the world. This means that at least all but two of the OBO ontologies are &#8220;crap&#8221; and presumably contribute some of the &#8220;crap&#8221; to BioPortal. Presumably, then, almost all the <a href="http://www.obofoundry.org">OBO Foundry</a> ontologies are &#8220;crap&#8221; too.</p>
<p>Dave De Roure (Oxford) introduced the audience to semantic music. A lot of music data is getting out there as linked data, but with some semantics. Dave told of a music and linked data workshop he set up; expecting 20 participants, he got 200. I&#8217;d interpret this as an appetite for getting stuff out there and exposed for use. One of the jobs of this UKON community is to get it out there in a form that optimises its usefulness and semantic content. Dave also mentioned work that Sean Bechhofer, Kevin Page, he and I had recently started on an OWL knowledgebase of the outputs of digital analysis of all the songs on Sgt. Pepper&#8217;s to give lists of the segments of the songs for query and exploration. Dave ended by pointing out that the music sector is far along the digitisation and tagging route and that other disciplines could well look to it for lessons.</p>
<p>Ian Horrocks (Oxford) gave a good overview of work at Oxford that included a bit of retrospective. One of the good things that Ian ended on was a lot of collaboration and interest from industry &#8211; this is good to see and is an indication of maturity. One of the winners of the day was ELK &#8211; the fast OWL EL reasoner &#8211; that was mentioned several times as enabling work, and we&#8217;re seeing on-line applications using OWL reasoners &#8211; which is a good thing and a further indication of maturity.</p>
<p>Jeremy Rogers (NHS) gave an entertaining talk about the use of SNOMED in the NHS. He mentioned 30 million annotations of patient records with SNOMED terms as a result of visits to family doctors by people in the UK. He also mentioned the worrying aspect of annotation quality and quality assurance in general &#8211; another theme of the day. The under-annotation and mis-annotation was a bit frightening and plays to the need to develop tools and techniques (as well as the ontological/terminological underpinnings) that will give better annotations/codings, not only from SNOMED by NHS people, but by all users of ontologies.</p>
<p>Throughout the day there was a call for tooling to support the use of ontologies in the community. There&#8217;s a need to enable the development and use of OWL ontologies with the same level of sophistication as we have for handling the programme code for software applications. Though this wasn&#8217;t explicitly mentioned, we are not replete with OWL tools. We have Protege as (probably) the most widely used OWL environment &#8211; many people depend on it &#8211; and its funding hangs by a slender thread. The community, of which the UKON meeting is evidence, needs to come together to make sure that there is not only a good tool chain, but that the vital elements of that chain are both secure and have safety in numbers. As a community we should stop thinking that the tools are the responsibility of others and help, by whatever means, to make the tools happen. That this UKON meeting can gather 100 registrants with relative ease from within the UK (and a couple from the US) shows that there is a vibrant community, from the fundamentals of representation languages and automated reasoning to a wide range of application domains.</p>
<p>There will be more UKON meetings&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://robertdavidstevens.wordpress.com/2012/04/17/the-first-uk-ontology-network-workshop/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/0ba008d946882c2cc919d7869864d16a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robertdavidstevens</media:title>
		</media:content>
	</item>
		<item>
		<title>Unlocking OWL Ontologies</title>
		<link>http://robertdavidstevens.wordpress.com/2012/03/16/unlocking-owl-ontologies/</link>
		<comments>http://robertdavidstevens.wordpress.com/2012/03/16/unlocking-owl-ontologies/#comments</comments>
		<pubDate>Fri, 16 Mar 2012 11:39:40 +0000</pubDate>
		<dc:creator>Robert Stevens</dc:creator>
				<category><![CDATA[ontology]]></category>

		<guid isPermaLink="false">http://robertdavidstevens.wordpress.com/?p=155</guid>
		<description><![CDATA[Ontologies, even when presented in the more user-friendly Manchester OWL syntax (the one used in Prot&#233;g&#233;), can be impenetrable to the uninitiated. To address this problem, the SWAT project have developed a system (named OntoVerbal) that automatically translates an OWL ontology into natural language text. We would be most grateful if you would help us [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=robertdavidstevens.wordpress.com&#038;blog=11592931&#038;post=155&#038;subd=robertdavidstevens&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a name="preamble"></a></p>
<p>Ontologies, even when presented in the more user-friendly Manchester OWL syntax (the one used in Prot&eacute;g&eacute;), can be impenetrable to the uninitiated. To address this problem, the <a href="http://swatproject.org/">SWAT project</a> have developed a system (named OntoVerbal) that automatically translates an OWL ontology into natural language text. We would be most grateful if you would help us test the system by participating in a short (20 min) experiment that involves reading 10 such texts and translating them back into the corresponding OWL.</p>
<p>Please click on the <a href="https://www.surveymonkey.com/s/G7HZ25P">link below for further details and instructions</a>:</p>
<p><a href="https://www.surveymonkey.com/s/G7HZ25P">https://www.surveymonkey.com/s/G7HZ25P</a></p>
<p>All participants will be entered into a prize draw for Amazon vouchers:</p>
<ul>
<li> 1st Prize: &pound;50 ($80) </li>
<li> 2nd Prize: &pound;30 ($50) </li>
<li> 3rd Prize: &pound;20 ($30) </li>
</ul>
<hr />
<h2><a name="_please_circulate_this_request_to_anyone_you_think_may_be_interested"></a>Please circulate this request to anyone you think may be interested</h2>
]]></content:encoded>
			<wfw:commentRss>http://robertdavidstevens.wordpress.com/2012/03/16/unlocking-owl-ontologies/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/0ba008d946882c2cc919d7869864d16a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robertdavidstevens</media:title>
		</media:content>
	</item>
		<item>
		<title>Making sure my brother and I have the same grandparents</title>
		<link>http://robertdavidstevens.wordpress.com/2012/03/16/making-sure-my-brother-and-i-have-the-same-grandparents/</link>
		<comments>http://robertdavidstevens.wordpress.com/2012/03/16/making-sure-my-brother-and-i-have-the-same-grandparents/#comments</comments>
		<pubDate>Fri, 16 Mar 2012 09:49:32 +0000</pubDate>
		<dc:creator>Robert Stevens</dc:creator>
				<category><![CDATA[ontology]]></category>

		<guid isPermaLink="false">http://robertdavidstevens.wordpress.com/?p=152</guid>
		<description><![CDATA[I recently closed off another hole in my Family History Knowledgebase (FHKB). OWL&#8217;s open world assumption means that to make many of the desired inferences in an ontology or knowledgebase, one has to make sure the reasoner has no &#8220;possibilities for doubt&#8221;. I&#8217;ve written before about closing down areas of the FHKB with respect to [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><a name="preamble"></a></p>
<p>I recently closed off another hole in my <a href="http://robertdavidstevens.wordpress.com/2010/05/04/the-family-history-knowledge-base/">Family History Knowledgebase (FHKB)</a>. OWL&#8217;s open world assumption means that to make many of the desired inferences in an ontology or knowledgebase, one has to make sure the reasoner has no &#8220;possibilities for doubt&#8221;. I&#8217;ve written before about <a href="http://robertdavidstevens.wordpress.com/2011/02/09/closing-the-fhkbs-abox-of-family-members/">closing down areas of the FHKB</a> with respect to how many children people have. I&#8217;ve also closed parts of the <a href="http://robertdavidstevens.wordpress.com/2010/12/18/an-update-to-the-amino-acids-ontology/">amino acid ontology</a> and <a href="http://ontogenesis.knowledgeblog.org/1001">written in general about closure</a>. The example of grandparents (and parents etc.) in the FHKB is just another example of having to be really tight.</p>
<p>My brother Richard and I have the same parents and grandparents (the latter being William and Iris on my Dad&#8217;s side and Charles and Violet on my Mum&#8217;s side). If I write two defined classes as follows:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:.2em 0;">
<tr>
<td style="padding:.5em;">
<pre style="margin:0;padding:0;">Class: GrandparentOfRobert
        EquivalentTo: Person
                that isParentOf some (Person that isParentOf value Robert)

Class: GrandparentOfRichard
        EquivalentTo: Person
                that isParentOf some (Person that isParentOf value Richard)</pre>
</td>
</tr>
</table>
<p>they both give the same answer of William, Iris, Charles and Violet. The two defined classes are, however, not themselves inferred to be equivalent, even though they appear to have the same extents. This last point is the crucial one &#8211; they only appear to have the same extents; we just have to ignore our domain knowledge. In my description of <tt>Person</tt> I&#8217;ve left it open that there may be more ways of having a parent than having a mother and a father&#8230; So it is possible that I have other parents than my Mum and Dad, and thus any old number of grandparents can exist; the FHKB implies that I have at least two parents and at least four grandparents &#8211; I could have more.</p>
<p>In the FHKB I have the central class of:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:.2em 0;">
<tr>
<td style="padding:.5em;">
<pre style="margin:0;padding:0;">Class: Person
        SubClassOf: hasMother some Woman,
                hasFather some Man</pre>
</td>
</tr>
</table>
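<p>As a minimal sketch, the property hierarchy that this class relies on might be declared along the following lines. The <tt>SubPropertyOf</tt> and <tt>Functional</tt> parts follow the description in this post; the <tt>InverseOf</tt> link to <tt>isParentOf</tt> is an assumption, since that property is used in the defined classes above but its declaration is not shown here.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:.2em 0;">
<tr>
<td style="padding:.5em;">
<pre style="margin:0;padding:0;">ObjectProperty: hasParent
        InverseOf: isParentOf

ObjectProperty: isParentOf

ObjectProperty: hasMother
        SubPropertyOf: hasParent
        Characteristics: Functional

ObjectProperty: hasFather
        SubPropertyOf: hasParent
        Characteristics: Functional</pre>
</td>
</tr>
</table>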
<p><tt>hasMother</tt> and <tt>hasFather</tt> are functional, so an individual person can be related by each of these properties to only one individual (a <tt>Woman</tt> and a <tt>Man</tt> respectively). <tt>hasMother</tt> and <tt>hasFather</tt> are sub-properties of <tt>hasParent</tt>. In this little property hierarchy I&#8217;ve said there are two known ways to have a parent &#8211; by having a mother and by having a father. I haven&#8217;t said there are no other ways and to close things tightly I need to do so. In OWL, I can&#8217;t put a closure axiom on the <tt>hasParent</tt> property. I would like to say <tt>hasParent EquivalentTo: hasMother or hasFather</tt>, just like a <a href="http://ontogenesis.knowledgeblog.org/1001">closure axiom on a class</a>. I can, however, close down the possibilities of how many ways a <tt>Person</tt> can have a parent by doing the following:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:.2em 0;">
<tr>
<td style="padding:.5em;">
<pre style="margin:0;padding:0;">Class: Person
        SubClassOf: hasParent exactly 2 Person,
                hasMother some Woman,
                hasFather some Man</pre>
</td>
</tr>
</table>
<p>and then everything works. I&#8217;ve said a person can have only one mother and one father and now I&#8217;ve said a person can have just two parents; so I&#8217;ve closed off possibilities of having other kinds of parents &#8211; a person must have two parents and one must be a father and one must be a mother (making two). The <tt>exactly</tt> makes the reasoner run like a pig on stilts, but replacing the <tt>exactly</tt> with a <tt>max</tt> makes it run sensibly. With this addition to the FHKB, the classes for the grandparents of Richard and Robert are inferred to be equivalent. An <a href="http://www.cs.man.ac.uk/~stevensr/ontology/grandparent.owl">example FHKB fragment with this closure</a> is available. You can remove the closure axiom on <tt>Person</tt> to see it run rather slowly on classification.</p>
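<p>As a minimal sketch of the faster variant, the <tt>Person</tt> class with <tt>max</tt> substituted for <tt>exactly</tt> would read as below. This is only the substitution described above, not a copy of the linked file. It also relies on an assumption that is not stated in this post, namely that <tt>Woman</tt> and <tt>Man</tt> are disjoint: in that case the mother and father are necessarily two distinct parents, and <tt>max 2</tt> leaves no room for a third.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:.2em 0;">
<tr>
<td style="padding:.5em;">
<pre style="margin:0;padding:0;">Class: Person
        SubClassOf: hasParent max 2 Person,
                hasMother some Woman,
                hasFather some Man</pre>
</td>
</tr>
</table>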
]]></content:encoded>
			<wfw:commentRss>http://robertdavidstevens.wordpress.com/2012/03/16/making-sure-my-brother-and-i-have-the-same-grandparents/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/0ba008d946882c2cc919d7869864d16a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robertdavidstevens</media:title>
		</media:content>
	</item>
		<item>
		<title>iKUP wins the Ontologies Come of Age in the Semantic Web grand challenge</title>
		<link>http://robertdavidstevens.wordpress.com/2012/02/12/ikup-wins-the-ontologies-come-of-age-in-the-semantic-web-grand-challenge/</link>
		<comments>http://robertdavidstevens.wordpress.com/2012/02/12/ikup-wins-the-ontologies-come-of-age-in-the-semantic-web-grand-challenge/#comments</comments>
		<pubDate>Sun, 12 Feb 2012 10:40:37 +0000</pubDate>
		<dc:creator>Robert Stevens</dc:creator>
				<category><![CDATA[ontology]]></category>

		<guid isPermaLink="false">http://robertdavidstevens.wordpress.com/?p=149</guid>
		<description><![CDATA[The Kidney and Urinary Pathway Knowledgebase (KUPKB) (http://www.kupkb.org) and its web front-end, the iKUP browser, have won first prize at the Ontologies Come of Age in the Semantic Web grand challenge held at the International Semantic Web Conference in 2011 at Bonn. The KUPKB uses an application ontology made from some OBO ontologies together with extensions [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><a name="preamble"></a></p>
<p>The <a href="http://www.kupkb.org">Kidney and Urinary Pathway Knowledgebase (KUPKB)</a> (<a href="http://www.kupkb.org">http://www.kupkb.org</a>) and its web front-end, the iKUP browser, have won first prize at the <a href="http://ocas.mywikipaper.org/">Ontologies Come of Age in the Semantic Web</a> grand challenge held at the International Semantic Web Conference in 2011 at Bonn. The KUPKB uses an application ontology made from some OBO ontologies, together with extensions and bespoke fragments of ontology, to form a schema into which we’ve put annotations on genes, proteins and experiments across various &#8217;omic levels. The iKUP browser is a GWT-built front-end that exposes the KUPKB through a faceted browser, letting users browse and search the KUPKB’s contents via the various aspects captured in the <a href="http://www.e-lico.eu/kupo">KUP Ontology</a>.</p>
<p>There are a couple of noteworthy things about our approach:</p>
<ol type="1">
<li>Julie Klein, one of our collaborating biologists in Toulouse, added most of the data from various investigations. We didn’t have her directly adding axioms by hand, but instead used <a href="http://www.sysmo-db.org/rightfield">RightField</a> spreadsheets to help her do this task. RightField is a semantic spreadsheet tool that embeds menus tied to portions of ontologies in a spreadsheet. This means a standard Excel spreadsheet can be used to add ontology terms to data, with only appropriate terms made available to the user; these marked-up spreadsheets are then transformed to KUPKB content by scripts. Julie also added a lot of content to the <a href="http://www.e-lico.org/kupo">KUPKB’s ontology (KUPO)</a> with an extension of RightField called <a href="http://www.populous.org.uk">Populous</a>. Populous is a semantic spreadsheet-type application for describing entities according to various ontologies; the spreadsheet’s contents are then transformed into axioms via the <a href="http://oppl2.sourceforge.net/">Ontology Preprocessor Language (OPPL)</a> and put in the KUPKB.</li>
<li>Simon Jupp, who built the KUPKB, also made the <a href="http://www.kupkb.org">iKUP browser</a>. This is a GWT front-end to the KUPKB that allows searching and faceted browsing (based on the KUPO) so that biologists can find genes and proteins together with associated experimental data. The iKUP browser helps construct SPARQL queries against the KUPKB and also uses a bit of OWL reasoning (with HermiT). The iKUP is the key; it lets biologists use the KUPKB without necessarily knowing they are using Semantic Web technologies &#8211; it is just a Web page.</li>
</ol>
<p>Our OCAS submission was accompanied by a short paper:</p>
<p>Simon Jupp, Julie Klein, Panagiotis Moulos, Joost Schanstra and Robert Stevens. <a href="http://www.cs.man.ac.uk/~stevensr/papers/ocas-2011.pdf">Ontologies Come of Age with the iKUP Browser</a>. In Alexander Garcia Castro, Ken Baclawski, John Bateman, Kim Viljanen, and Christoph Lange, editors, Proceedings of the Workshop Ontologies Come of Age in the Semantic Web, International Semantic Web Conference, number 809 in CEUR Workshop Proceedings, pages 25-28, Aachen, 2011.</p>
<p>In this paper we outlined some criteria for &#8220;coming of age&#8221; and how we think KUPKB and iKUP meet them:</p>
<ol type="1">
<li>When Ontologies / Semantic Web technologies are used outside of their community</li>
<li>When the technology becomes transparent to the user experience</li>
<li>When tools and APIs are mature enough for developers to simply bolt applications together</li>
<li>When questions over performance and scalability go away</li>
</ol>
<p>The first criterion is analogous to the Web itself: the Web came of age once it moved beyond those who know how it works and how to write HTML pages by hand (and once there was something useful on it). When my Mum started using the Web to plan days out for herself and my Dad, then the Web had come of age. Similarly, our biology colleagues can come to iKUP, search and browse without knowing that it is an OWL ontology organising some RDF and using SPARQL to get back tables of data. Fulfilling criterion two is helped by criterion one. The iKUP browser makes use of the Semantic Web technologies transparently &#8211; it is just a web page in a familiar environment (just as RightField is delivered via an Excel spreadsheet) where keywords are typed in and a faceted browser allows users both to go directly to entities of interest and to &#8220;just look around&#8221;. That the facets are provided by an ontology need not be known (and should not be known) by the user. We can only make the iKUP browser if criterion three is met &#8211; we used a series of standard APIs and GWT to make the KUPKB available via the iKUP. This really is a sign of &#8220;coming of age&#8221; &#8211; we can &#8220;bolt together&#8221; applications in fairly short order. That we have met our last criterion is harder to justify &#8211; KUPKB is relatively small scale, so we don’t have too many performance problems. As RDF gets very big it all gets a bit clunky, but that should change. Anyway, by and large, we feel that the Semantic Web technologies really are coming of age and can do the job (if one’s careful) that they are supposed to do.</p>
<p>As we say in our <a href="http://www.cs.man.ac.uk/~stevensr/papers/ocas-2011.pdf">OCAS paper</a>: &#8220;The KUPKB and iKUP show ontologies coming of age by fulfilling some of their promise. The KUPKB has used ontologies to provide a common semantic framework for a broad range of previously semantically heterogeneous data. The use of Semantic Web technologies provides the means to integrate and query these data. The key to the coming of age is the iKUP user interface; without a simple means to access these integrated data, our biologist users would not and could not use the KUPKB; ontologies come of age when they deliver meaningful use to their intended users. This has now happened with the KUPKB, with biologists testing hypotheses generated via the iKUP in laboratories. Though the KUPKB is relatively small, it does what semantic technologies are supposed to do and show what is possible with biology’s rich resource of data once issues of heterogeneity are taken away and the means of delivery to its users is taken into account.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://robertdavidstevens.wordpress.com/2012/02/12/ikup-wins-the-ontologies-come-of-age-in-the-semantic-web-grand-challenge/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/0ba008d946882c2cc919d7869864d16a?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robertdavidstevens</media:title>
		</media:content>
	</item>
	</channel>
</rss>
