Returning to my flower anatomy ontology: the variation, even within one kind of plant, is large, and it causes problems when describing plants. This becomes a particular problem when making the kind of precise descriptions I’ve tried to do in the flower anatomy. For example, when trying to describe a particular plant I’ve found, with the aim of classifying it against my ontology, I find different kinds of leaves (some ovate, some obovate, etc.); I find different numbers of branches, leaf segments and so on. Of course, one usually takes some normative view of a plant (a conceptualisation of some canonical plant), but it would be good to be able to capture the range and variety within a kind of plant in the TBox description.
We can exemplify this with a fairly randomly picked entry from Stace’s New Flora of the British Isles. The entry is:
“COSMOS Cav. Mexican Aster
Annuals; leaves all opposite, 2-3-pinnate with linear to filiform segments; phyllaries in 2 dissimilar rows, the outer narrower, herbaceous with membranous border, the inner membranous; capitula radiate; receptacle flat, with scales; pappus of (0)2(-3) bristles with usually backward-directed barbs; ligules numerous, pinkish-purple, rarely white; disc flowers yellow.”
A New flora of the British Isles 1997 page: 755. New York
— Clive Stace
We can dissect such a description, bearing in mind the universality of OWL’s class level restrictions; that is, each and every instance of class x holds some relationship to at least one instance of some other class.
- “Annuals”; this is OK; all the plants grow and die within one year.
- “leaves all opposite”; again this is OK; all the leaves are “opposite”, that is, at each node there are leaves that are opposite each other – as opposed to alternating from side to side of a twig at each node.
- “2-3-pinnate with linear to filiform segments”; here it gets interesting. The leaves are “2 to 3 pinnate”; that is, a single leaf is made of leaflets and these leaflets are themselves made of leaflets (or with another level of nesting). A classic example in Britain would be the mountain ash or rowan. That the leaves are in “linear to filiform segments” is also difficult. There is a space of leaf shapes that botanists have partitioned into discrete named shapes. Sometimes a leaf shape will be in between two or more named shapes. In this case, the leaf shape is in between linear and filiform, that is, thinner than linear but longer than hair-like. A similar case is “obovate to ovate”, which implies “elliptical”.
- “phyllaries in 2 dissimilar rows, the outer narrower, herbaceous with membranous border, the inner membranous”; Not much variation here, but we would need to capture that the inner and outer rows are dissimilar.
- “capitula radiate; receptacle flat, with scales”; again, a fairly straight-forward “all” description.
- “pappus of (0)2(-3) bristles with usually backward-directed barbs”; the “pappus” is a modified calyx that is bristly or feathery. The notation (0)2(-3) is standard in floras and means “usually 2, exceptionally 0 and up to 3”. The notions of “usually”, “rarely” and “exceptionally” are just hard. “Up to” could be done with a “max” cardinality constraint…
- “ligules numerous, pinkish-purple, rarely white”; this statement contains a lot. We need to describe “pinkish-purple”, and we have to capture the notions of “numerous” and “rarely”. Exact numbers are rather easy in OWL with qualified cardinality constraints, but vague numbers present more of a problem. Just creating some class of “numerous” seems like a real cop-out, but in a sense it is just another compromise like value partitions: we have a continuous spectrum that we just partition into convenient chunks. This would be just like dividing up a number line into convenient chunks: 1, 2, 3, 4, …, numerous. And then there are “rarely” and “often”. This is so vile…
- “disc flowers yellow”; this is OK, except for the usual turmoil of describing colour.
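As a rough illustration of the flora count notation discussed above, here is a minimal Python sketch that unpacks a string such as “(0)2(-3)” into its usual and exceptional bounds, and then emits the hard cardinality restrictions in a Manchester-like notation. The names `CountRange`, `parse_count` and `cardinality_bounds`, the property `hasPart` and the class `Bristle` are all assumptions made for this example, and the pattern only covers the simple forms seen here. The “usually” and “exceptionally” qualifiers survive only as data in the sketch; nothing in it expresses them in OWL.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed shape: optional "(a)" or "(a-)", then "b" or "b-c", then optional "(-d)" or "(d)".
_COUNT = re.compile(r"^(?:\((\d+)-?\))?(\d+)(?:-(\d+))?(?:\(-?(\d+)\))?$")


@dataclass(frozen=True)
class CountRange:
    usual_min: int
    usual_max: int
    rare_min: Optional[int] = None  # exceptional lower extreme, if stated
    rare_max: Optional[int] = None  # exceptional upper extreme, if stated

    @property
    def overall_min(self) -> int:
        return self.usual_min if self.rare_min is None else self.rare_min

    @property
    def overall_max(self) -> int:
        return self.usual_max if self.rare_max is None else self.rare_max


def parse_count(text: str) -> CountRange:
    """Parse a flora-style count such as '(0)2(-3)' (hypothetical helper)."""
    match = _COUNT.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised count notation: {text!r}")
    rare_lo, usual_lo, usual_hi, rare_hi = match.groups()
    return CountRange(
        usual_min=int(usual_lo),
        usual_max=int(usual_hi) if usual_hi is not None else int(usual_lo),
        rare_min=int(rare_lo) if rare_lo is not None else None,
        rare_max=int(rare_hi) if rare_hi is not None else None,
    )


def cardinality_bounds(count: CountRange, prop: str = "hasPart",
                       filler: str = "Bristle") -> List[str]:
    """Return only the hard bounds; 'min 0' is omitted because it says nothing."""
    bounds = []
    if count.overall_min > 0:
        bounds.append(f"{prop} min {count.overall_min} {filler}")
    bounds.append(f"{prop} max {count.overall_max} {filler}")
    return bounds


if __name__ == "__main__":
    pappus = parse_count("(0)2(-3)")
    print(pappus)
    print(cardinality_bounds(pappus))
```

For the pappus entry this keeps “usually 2” as `usual_min` and `usual_max`, and the only restriction it can honestly emit is the “up to 3” maximum.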
Lots to explore in all this; for now, I’ll just choose “2-3-pinnate with linear to filiform segments”.
- The leaf as a whole is divided into parts that are “leaflets”. We could either have two classes “Leaf” and “Leaflet”, with the restriction that all leaflets are part of leaves (but not vice versa), or we just stick with the class of “Leaf” and have a subclass of “Divided leaf” that has parts that are “Leaf”. If we establish the “isPartOf” going in the other direction, we could have a defined class of “Leaflet” that is equivalent to any leaf that is part of a leaf.
- A Pinnate leaf can then be defined as either a Leaf that hasPart some Leaflet or as a Leaf that hasPart some Leaf. The latter is more economical, but the former is more in line with domain vocabulary.
- Sticking with the former, then my 2 pinnate leaf might be Leaf that hasPart some (Leaflet that hasPart some Leaflet).
- My 3 pinnate leaf might be Leaf that hasPart some (Leaflet that hasPart some (Leaflet that hasPart some Leaflet)).
- Being 2 – 3 pinnate then becomes a disjunction of these two class expressions. Clumsy, but not totally horrid.
- The problem, of course, comes with larger amounts of variability. If I had 2 – 4 pinnate, my class expression becomes increasingly clumsy.
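Since the nested expressions in the list above follow a regular pattern, they can at least be generated rather than written by hand. The following is a minimal Python sketch under that assumption; `n_pinnate` and `pinnate_range` are hypothetical helper names, and the output is just text in the same “Leaf that hasPart some …” notation used above, not something loaded into an OWL tool.

```python
def n_pinnate(depth: int, leaf: str = "Leaf", leaflet: str = "Leaflet",
              part: str = "hasPart") -> str:
    """Build the nested class expression for a leaf divided `depth` times."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # Start at the innermost leaflet and wrap outwards.
    expr = leaflet
    for _ in range(depth - 1):
        inner = f"({expr})" if " " in expr else expr
        expr = f"{leaflet} that {part} some {inner}"
    inner = f"({expr})" if " " in expr else expr
    return f"{leaf} that {part} some {inner}"


def pinnate_range(lowest: int, highest: int) -> str:
    """Disjunction of the n-pinnate expressions from `lowest` to `highest`."""
    if lowest > highest:
        raise ValueError("lowest must not exceed highest")
    return " or ".join(f"({n_pinnate(d)})" for d in range(lowest, highest + 1))


if __name__ == "__main__":
    print(n_pinnate(2))
    print(n_pinnate(3))
    print(pinnate_range(2, 4))
```

One caveat worth checking in a reasoner: because “some” is existential, anything matching the 3 pinnate expression also matches the 2 pinnate one, so the disjunction by itself says “at least 2 pinnate” and an upper bound would need to be stated separately.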
One could model the “pinnateness” of a leaf and model a range of pinnateness with max and min cardinalities. This, however, doesn’t capture the physical nature of the sub-divided leaves, especially if I wished to say something about the sub-divisions. For instance, in the mountain ash, I want to say that the final division is arranged “herring-bone” fashion.
At the moment, the clumsy pattern I’ve outlined is the best I’ve come up with. At some point I’ll actually try to do it, then put the ontology up for inspection. Meanwhile I will think more and write more on the other aspects of describing the variation in plant descriptions.