Competence questions, user stories and testing

February 2, 2014

The notion of competency questions as a means of gathering requirements for, and of evaluating, an ontology comes from a 1994 paper by Gruninger and Fox: “These requirements, which we call competency questions, are the basis for a rigorous characterization of the problems that the enterprise model is able to solve, providing a new approach to benchmarking as applied to enterprise modelling and business process engineering. Competency questions are the benchmarks in the sense that the enterprise model is necessary and sufficient to represent the tasks specified by the competency questions and their solution. They are also those tasks for which the enterprise model finds all and only the correct solutions. Tasks such as these can serve to drive the development of new theories and representations and also to justify and characterize the capabilities of existing theories for enterprise modelling.” And: “We use a set of problems, which we call competency questions that serve to characterize the various ontologies and microtheories in our enterprise model. The microtheories must contain a necessary and sufficient set of axioms to represent and solve these questions, thus providing a declarative semantics for the system.” Here we read “enterprise model” as ontology (or, more correctly, an enterprise model may have an ontology as a part, since a KR can contain other things than an ontology…).

 

Below you can see examples of what we gathered as competency questions during some Pizza tutorials. They mostly take the form of example questions:

 

  • Find me pizza with hot spicy peppers
  • What sorts of pizza base are there?
  • What vegetarian pizzas are there?
  • What pizzas are there with more than one type of cheese?
  • What kinds of pizza contain anchovy?

     

     

What we usually do is cluster these in several ways to find the major categories we need in the ontology; we also extract example class labels and so on. This also feeds into the abstractions, gathering together vegetable, fish and meat as types of ingredient. The CQs can also pull out qualities of these ingredients, such as spiciness. Usually there are many versions of the same kind of question. A few example patterns are below (with sketches of the matching DL queries after the list):

     

  • Find pizza with ingredient x
  • Find pizza with ingredient x, but not y
  • Find pizza without ingredient z
  • Find pizza with ingredient that has some quality or other
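
For illustration, here is one way those patterns might be written down as Manchester-syntax DL query strings (the same notation used for the query later in this post, and accepted by, for example, Protégé's DL Query tab), collected in a small Python mapping. X, Y, Z, hasSpiciness and Hot are placeholders rather than names from any particular ontology:

# Hypothetical mapping from CQ pattern to DL query template; the names are placeholders.
CQ_PATTERNS = {
    "pizza with ingredient x":        "Pizza and (hasTopping some X)",
    "pizza with x, but not y":        "Pizza and (hasTopping some X) and (hasTopping only not Y)",
    "pizza without ingredient z":     "Pizza and (hasTopping only not Z)",
    "pizza with a spicy ingredient":  "Pizza and (hasTopping some (hasSpiciness some Hot))",
}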

 

We can view these informal, natural language competency questions as being like user stories in agile software engineering. A typical template for a user story is:

 

As a role I want to do task for a benefit

 

Usually, “benefit” boils down to money. We can adapt the “five whys” technique for problem solving: ask the role holder of the user story why they want the task (return on investment) and, when applied with some skill, one can get to a root justification for the user story. Often it is money, but sometimes users ask for edge cases – this is often especially true of ontology types – and some fun, intricate or complex modelling or logic can ensue for no real return. I’ve done this kind of thing a bit and found it rather useful for weeding out spurious user stories, and also for getting better justifications and thus higher priorities for user stories.

 

I’ll carry on in this blog post with the CQ

 

“Find pizza with anchovy but not capers”

 

We could take our example CQ and do the following (in the context of the Intelligent Pizza Finder):

 

“As a customer I wish to be able to find pizzas that have anchovy but no capers, because I like anchovy and don’t like capers”

 

And abstract this to

 

As a customer I want to find pizzas with and without certain ingredients to make it easier to choose the pizza I want.

 

The benefit here bottoms out in money (spending money on something that is actually desired), but it goes via customer satisfaction: finding which pizza to buy with more ease. Such a user story tells me that my ontology must describe pizzas in terms of their ingredients, and therefore have a description (hierarchy) of ingredients, as well as needing to close down the descriptions of pizzas (a pizza has this and that, and only this and that; that is, no other ingredients). A sketch of such a closure axiom follows.
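
Here is what closing down a pizza description might look like, using the owlready2 Python library; the names and IRI are illustrative rather than the Pizza tutorial’s actual axioms. Each topping gets an existential (“some”) restriction, and the final universal (“only”) restriction is the closure axiom saying there are no other toppings.

from owlready2 import *

onto = get_ontology("http://example.org/pizza-closure.owl")  # placeholder IRI
with onto:
    class Pizza(Thing): pass
    class Topping(Thing): pass
    class TomatoTopping(Topping): pass
    class MozzarellaTopping(Topping): pass
    class hasTopping(Pizza >> Topping): pass

    class MargheritaPizza(Pizza): pass
    # What the margherita has...
    MargheritaPizza.is_a.append(hasTopping.some(TomatoTopping))
    MargheritaPizza.is_a.append(hasTopping.some(MozzarellaTopping))
    # ...and the closure axiom: only these toppings and no others.
    MargheritaPizza.is_a.append(hasTopping.only(TomatoTopping | MozzarellaTopping))

Without the closure axiom, an open-world reasoner cannot rule out a margherita also having, say, capers, so “but no capers” style queries would not return it. Other CQ user stories give me other requirements: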

 

As a vegetarian customer I want to be able to ask for vegetarian pizzas, otherwise I won’t be able to eat anything.

 

This suggests I need abstractions over my ingredients (there’s a sketch of this after this paragraph). User stories can imply other stories; an epic user story can be broken down into smaller (in terms of effort) user stories, and this would seem like a sensible thing to do. If CQs are thought of in terms of user stories, then one can bring in techniques of effort estimation and do some planning poker. We did this quite successfully in the Software Ontology.
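
A sketch of the sort of ingredient abstraction this suggests, again using owlready2 and made-up names: toppings are grouped under vegetable, fish and meat, and a vegetarian pizza is then defined in terms of those groupings rather than of individual ingredients.

from owlready2 import *

onto = get_ontology("http://example.org/pizza-veg.owl")  # placeholder IRI
with onto:
    class Pizza(Thing): pass
    class Topping(Thing): pass
    class VegetableTopping(Topping): pass
    class FishTopping(Topping): pass
    class MeatTopping(Topping): pass
    class hasTopping(Pizza >> Topping): pass

    class VegetarianPizza(Pizza):
        # Any pizza all of whose toppings are neither meat nor fish.
        equivalent_to = [Pizza & hasTopping.only(Not(MeatTopping | FishTopping))]

Making VegetarianPizza a defined (equivalent) class means a reasoner can classify pizzas under it, rather than relying on vegetarian-ness being asserted by hand.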

 

In engineering, and especially Agile software engineering, these CQs or user stories also give me some acceptance tests – those things by which we can test whether the product is acceptable. A competence question obviously fits neatly into this: my ontology should be competent to answer the question. Acceptance tests are run against the software, with inputs and expected outputs; a user story is not complete until its acceptance test(s) pass. For competence questions as acceptance tests, input data doesn’t really make sense, though the results of the competence question do make sense as “output” data.

 

If we take a natural language CQ such as

 

Find me pizza with anchovy, but no capers

 

We may get a DL query like

 

Pizza and (hasTopping some AnchovyTopping) and (hasTopping only not CaperTopping) 

 

Which I can use as a test. I was stumped for a while: without any ontology yet, and without knowing the answer, it is hard to run the test “before” and to know whether it has passed or failed. However, it may all fall out easily enough (and may already have been done in some environments); here’s the scenario (a sketch of how it might be automated follows the list):

 

  1. I have my query: Pizza and (hasTopping some AnchovyTopping) and (hasTopping only not CaperTopping), and as yet no ontology; I’m setting up the ontology in a “test before” testing style.
  2. The test fails; I can pass it by adding Pizza, hasTopping, AnchovyTopping and CaperTopping to my currently empty ontology; the test passes in the sense that the query is now valid
  3. I also add pizzas to my test that I expect to be in the answer – NapolitanaPizza; again, the test fails
  4. I add NapolitanaPizza and the test is valid in that the entities are there in the ontology, but I need to add NapolitanaPizza as a subclass of Pizza for there to be any chance of a pass.
  5. I do the addition, but still the test fails; I need to re-factor to add the restrictions from NapolitanaPizza to its ingredients (TomatoTopping, CheeseTopping, OliveTopping and AnchovyTopping – note that, for this CQ to return it, the closed description must not include CaperTopping, and the topping classes need to be disjoint from CaperTopping so that the reasoner can prove the “only not CaperTopping” part)
  6. My test passes
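
Here is one way that scenario might be automated; this is only a sketch, not an established tool. It uses the owlready2 Python library (whose sync_reasoner() call needs a local Java installation for the bundled HermiT reasoner), expresses the CQ as a defined “query” class, and asserts that the expected pizza gets classified under it. All names and the ontology IRI are made up for the example.

from owlready2 import *

onto = get_ontology("http://example.org/pizza-test.owl")  # placeholder IRI
with onto:
    # Step 2: the entities the query mentions, so that the query is at least valid.
    class Pizza(Thing): pass
    class Topping(Thing): pass
    class TomatoTopping(Topping): pass
    class CheeseTopping(Topping): pass
    class OliveTopping(Topping): pass
    class AnchovyTopping(Topping): pass
    class CaperTopping(Topping): pass
    class hasTopping(Pizza >> Topping): pass

    # Toppings must be disjoint, or the reasoner cannot prove "only not CaperTopping".
    AllDisjoint([TomatoTopping, CheeseTopping, OliveTopping, AnchovyTopping, CaperTopping])

    # The CQ as a defined class:
    # Pizza and (hasTopping some AnchovyTopping) and (hasTopping only not CaperTopping)
    class AnchovyButNoCaperPizza(Thing):
        equivalent_to = [Pizza
                         & hasTopping.some(AnchovyTopping)
                         & hasTopping.only(Not(CaperTopping))]

    # Steps 3-5: the expected answer, with a closed (caper-free) topping description.
    class NapolitanaPizza(Pizza): pass
    for topping in (TomatoTopping, CheeseTopping, OliveTopping, AnchovyTopping):
        NapolitanaPizza.is_a.append(hasTopping.some(topping))
    NapolitanaPizza.is_a.append(hasTopping.only(
        TomatoTopping | CheeseTopping | OliveTopping | AnchovyTopping))

# Step 6: classify and check the expected pizza is an answer to the CQ.
sync_reasoner()  # runs HermiT; requires Java
assert NapolitanaPizza in list(AnchovyButNoCaperPizza.subclasses()), \
    "CQ 'anchovy but no capers' does not return NapolitanaPizza"

In practice asserts like this would live in a unit-test framework and run on every change to the ontology, which is what makes the “test before” style workable.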

 

 

I’m bouncing between the test itself passing a validity check and the ontology passing the test. It’s easier to see these as tests all working in a test-after scenario; it can work in a test-before scenario, but it seems a bit clunky. This could perhaps be sorted out in a sensible environment. I could even (given the right environment) mock up parts of the ontology and supply the query with some test data.

 

My example query does imply other test queries, as it implies other bits of pizza ontology infrastructure: there’s an implication of a pizza hierarchy and an ingredients hierarchy, and we’d want tests for these. Also, not all tests need be DL queries – we have SPARQL too (see, as an example, the test for annotations on entities below).

 

There are other kinds of test too:

  1. Non-logical tests – checks that all classes and properties have labels, that there’s provenance information, and so on (a SPARQL sketch of such a test follows this list).
  2. Tests that patterns are complied with – normalisation, for instance, could include tests for trees of classes with pairwise disjoint siblings.
  3. Tests that classes can be traced back to some kind of top-level ontology class.
  4. Tests on up-to-dateness with imported portions of ontology (Chris Mungall and colleagues describe continuous integration testing in GO).
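
As an example of the first kind of test, here is a hedged sketch of a SPARQL check for missing labels, run with the rdflib Python library; the file name and the assumption that the ontology is saved as RDF/XML are mine, not from any particular set-up.

from rdflib import Graph

g = Graph()
g.parse("pizza.owl", format="xml")  # assumes an RDF/XML serialisation

# Every named class should carry an rdfs:label; collect any that do not.
unlabelled = list(g.query("""
    PREFIX owl:  <http://www.w3.org/2002/07/owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?cls WHERE {
        ?cls a owl:Class .
        FILTER(isIRI(?cls))
        FILTER NOT EXISTS { ?cls rdfs:label ?label }
    }
"""))

assert not unlabelled, "Classes without labels: %s" % unlabelled

Similar queries can check for missing textual definitions or provenance annotations.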

 

Some or all of these can probably be done, and are being done, in some set-ups. However, as I pointed out in a recent blog post, the new wave of ontology environments needs to make these types of testing as easy, as automatable and as reproducible as they are in many software development environments.