Exploring what authors actually do when creating an ontology

May 12, 2014

 

Following our qualitative investigation into the issues people have in using ontologies, we’ve been delving further into what a group of experienced ontology authors actually do when they’re creating an ontology. This is part of a wider goal of exploring the HCI of ontology authoring, so that we can gain a better understanding of how people go about authoring, their patterns of activity, and when they need support in understanding the consequences of their actions in a cognitively and perceptually challenging situation. The work belongs to the EPSRC-funded “What if…?” project, where we’re looking at “what if…?” questions – that is, “what happens if I add this axiom…?”. The work reported here has been done by Markel Vigo, Caroline Jay and myself. In brief, what we’ve done is to:

  • Instrument a version of Protégé 4 so that all actions (button presses, edits, mouse movements etc.) are recorded and time-stamped in a log-file – this is Protégé for user studies (Protégé4US);
  • Use an eye-tracker to record what authors were looking at while they were authoring their ontology;
  • Capture full screen and audio recordings;
  • Ask our participants to undertake a series of ontology authoring tasks.

The tasks we asked the authors to undertake were based on creating an ontology of potato varieties that described their cropping time, yield and culinary role. One of the tricks with this kind of experiment is to find examples that are reasonably accessible to a wide variety of authors; this example came about because I was planting this year’s potatoes. In three stages, we asked authors to add the classes and restrictions necessary for each aspect of potato varieties, then to add increasingly complex defined classes, and finally to run the reasoner and look at the ontology.

The paper entitled “Protégé4US: Harvesting Ontology Authoring Data with Protégé” at the Human Semantic-Web Interaction (HSWI) workshop at ESWC 2014 is available and gives more details on the work; below I pick out a few highlights. The pictures are generated from the log files, which enable reconstruction of what the author has done, from button presses and mouse clicks to the OWL constructs used in the tasks.

For instance, the picture below reconstructs the authoring events as a web diagram, where thicker arrows indicate a higher frequency of transitions between events (circles indicate reflexive transitions). For this particular user, some interesting transitions stand out:

  • The expansion of a class hierarchy is followed by another expansion; similarly, the selection of an entity is followed by another selection. This suggests that users are drilling down the hierarchy; also, they click on classes to view their description.
  • The reasoner is invoked when entities have been modified. For instance, adding a property to a class or making a class defined is often followed by saving the file and invoking the reasoner.
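To give a feel for how a web diagram like this can be derived from a log, here is a minimal sketch that counts transitions between consecutive events. The event names and the toy sequence are made up for illustration; they are not the actual Protégé4US log vocabulary or data from the study.

```python
from collections import Counter

# Hypothetical event sequence; real Protégé4US event names may differ.
events = [
    "expand_hierarchy",
    "expand_hierarchy",
    "expand_hierarchy",
    "select_entity",
    "select_entity",
    "add_property",
    "save_file",
    "invoke_reasoner",
    "select_entity",
]


def transition_counts(sequence):
    """Count how often each event is immediately followed by another event."""
    return Counter(zip(sequence, sequence[1:]))


counts = transition_counts(events)
for (src, dst), n in counts.most_common():
    # A transition from an event to itself would be drawn as a circle.
    marker = " (reflexive)" if src == dst else ""
    print(f"{src} -> {dst}: {n}{marker}")
```

The counts play the role of arrow thickness: the more often one event follows another, the thicker the arrow between them.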


The time diagram below shows a complementary visualisation of the same participant, where the Y-axis indicates the event and the X-axis is the time elapsed in minutes. The blue blocks denote the time in between events and the red dots are mouse events such as mouse hovers or mouse clicks. These have been plotted as well so that we know when there is user activity.
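The time between events can be recovered from the time-stamps alone. The sketch below assumes a hypothetical log of (seconds since start, event name) pairs, which is not necessarily the format Protégé4US writes, and the numbers are invented for illustration.

```python
# Hypothetical (seconds_since_start, event_name) records, not real study data.
log = [
    (0.0, "select_entity"),
    (12.5, "expand_hierarchy"),
    (15.0, "add_property"),
    (95.0, "invoke_reasoner"),
]


def gaps_between_events(records):
    """Return (event, next_event, gap_in_minutes) for consecutive records."""
    ordered = sorted(records)  # sort by time-stamp in case the log is unordered
    return [
        (a_name, b_name, (b_time - a_time) / 60.0)
        for (a_time, a_name), (b_time, b_name) in zip(ordered, ordered[1:])
    ]


gaps = gaps_between_events(log)
longest = max(gaps, key=lambda gap: gap[2])
print(f"Longest pause: {longest[0]} -> {longest[1]} ({longest[2]:.2f} min)")
```

Long gaps with no intervening mouse events are the periods where eye-tracking data becomes the main source of information.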

 


In the strategies described above we said that users click on classes to view their descriptions. This is a hypothesis supported by our preliminary data analysis and by our observations. Eye-tracking data will shed light on what users do during the periods of interaction inactivity, especially in situations in which users are looking at the consequences of their actions in Protégé.

As well as the log-files, we also collected self-reported OWL and Protégé expertise. With this information we were able to explore, for example, correlations between expertise and task completion, between time and task completion, and between number of actions and task completion time; the n-grams of UI actions; the Protégé tabs used; and, as described above, patterns of activity in what authors are doing as they create an ontology. More things are reported in the paper, but this rich recording of what authors do enables us to explore many aspects of authoring and suggests hypotheses for further investigation.
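As an illustration of what n-grams of UI actions means here, the following sketch counts contiguous runs of three actions in a sequence. The action names and the sequence are hypothetical and are not taken from the study logs.

```python
from collections import Counter


def ngrams(sequence, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


# Hypothetical action sequence for one participant.
actions = ["edit", "save", "reason", "edit", "save", "reason", "select"]

trigram_counts = Counter(ngrams(actions, 3))
for trigram, n in trigram_counts.most_common(3):
    print(" > ".join(trigram), n)
```

Frequent n-grams are one way of spotting recurring routines, such as an edit followed by a save and a reasoner run.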

The HSWI paper shows the kinds of analysis it is feasible to do with a tool such as Protégé4US and the things we pull out in the paper are:

  • Identify two types of users based on how they use the tabs of Protégé;
  • Find correlations between interaction events and performance metrics that corroborate our initial insights: a higher number of reasoner invocations and class hierarchy expansions indicates trouble and, thus, longer completion times;
  • Visualise emerging activity patterns: e.g. an ontology is saved before invoking the reasoner and after modifying an entity.
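To make the second point concrete, here is a minimal sketch of correlating an interaction count with completion time. It uses Pearson's coefficient purely as an assumption for illustration (the paper may use a different statistic), and the numbers are invented rather than taken from our participants.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Made-up values, one pair per hypothetical participant.
reasoner_invocations = [1, 2, 3, 4, 5]
completion_minutes = [10, 12, 15, 19, 24]

r = pearson(reasoner_invocations, completion_minutes)
print(f"r = {r:.3f}")
```

A strongly positive coefficient on real data would be in line with the observation that frequent reasoner invocations go with longer completion times, though correlation alone does not say which causes which.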

This suggests that Protégé4US has the potential to deliver data whose analysis will expand our knowledge about the ontology authoring process, identify its pitfalls, inform design guidelines and help develop intelligent authoring tools that anticipate user actions in order to support ontology authoring in the future. Next come more analysis and more interesting ontology authoring tasks, and eventually looking at authors doing their ontology day jobs on the ontologies they actually create. There’s not much HCI around in the ontology and OWL field, especially in the evaluation of ontology tools and in looking at what users actually do in fine detail. This work and Protégé4US are a first step in this direction.