It’s nearly twenty years since I did my D.Phil. work on the principles for auditory access to complex notations. I’ve moved away from research into the HCI of accessibility since then, but it’s always been in my mind, and recent conversations have prompted me to write a bit about some of my thoughts on the underlying issues of using computers (digital presentations of material) when one cannot see.
My D.Phil. research work was motivated by my experience of doing mathematics using algebra notation as part of my master’s in Biological Computation at the University of York. I did this after giving up, as a consequence of sight loss, my previous Ph.D. work in biochemistry. There was a lot of mathematics in the master’s degree; not necessarily that hard as mathematics goes, but made much harder by not being able to see the notations and use them for manipulation and effective, efficient thinking.
At the root of my difficulty, it appeared to me, was not having a piece of paper upon which I could do my algebraic manipulations. The paper remembers the equation for me; I can write down re-arrangements of the symbols and cross things out, all the time only using my brain to work out what to do, both strategically and tactically, then remembering each bit by externalising it on the page in front of me. Without the paper, most of this had to happen in my head – I mimicked some of paper and pencil’s attributes in Braille or, even worse, in text in a text editor, but it isn’t the same at all – and this prompted me to think about exactly why it isn’t the same. I discussed these problems with Alastair Edwards and eventually did my master’s project with him, looking at rendering algebra, written in some linear code, in audio and at browsing that audio presentation. This led on to my D.Phil. research work with Alastair, where I looked at the human-computer interaction of the problem of doing algebra, and other complex notations, in audio.
There’s no need to go into the details of my D.Phil. work here, because I want to look at the basics of interacting with information when one cannot see; in particular, what’s possible (if not beautiful) in terms of interaction and what the real “can’t do very well” problems are that, as far as I can tell, still remain.
Reading “plain” text (words written one after another in normal natural language form) is more or less OK. I use a screenreader and I can read and write straight-forward text without too much of a problem. My window upon the screen of text is rather small; it’s essentially one line. I can get my screenreader to read bigger portions of text, but the quick look, the scan, is still problematic. I can move around the text to find what I want and inspect it with relative ease; the interaction is, to my mind, clunky, but it’s all doable. As soon as one moves away from simple, linear strings of words and into two dimensions, as in algebra notation, and into informationally dense material (again, algebra is dense and complex, or complex because it’s dense), speech-based screenreaders don’t offer an effective reading solution.
This brings me to two of the things that I worked out during my D.Phil.:
- A listening reader tends to be a passive reader. As a listening reader, I tend to lack agility in my control of information flow. In the worst case, e.g., with an audio book, I listen at the rate dictated by the reader, not at the rate my eyes and brain want. Obviously I control information flow with keystrokes that make my screenreader say things, but it’s all a bit clunky, slow and intrusive compared to what one does with one’s eyes – they move around the screen (or paper) in a way that basically gets me to the right portion of text, either word by word or in bigger chunks, without my having to consciously do very much at all. So, speed and accuracy in the control of the flow of information turn the reader from being passive to being active.
- I lack an adequate external memory. The paper or the screen has the text upon it and it remembers it for me, but as it’s slow and clunky to get at it, I rely more on my brain’s memory and that’s a bit fragile. Of course there is an external memory – the information I have access to on a computer – but it only really plays the role of an external memory if there is sensible (fast and accurate) control in the access to that external memory.
The external memory in conjunction with speed and accuracy in control of information flow makes eyes and paper/screen all rather effective. It was these two issues that I addressed in my D.Phil. work.
Despite these issues, access to straight-forward text is OK. I, along with lots of other people, read and write perfectly well with screenreaders and word processors. In the small, the interaction works well, but I find reading and comprehending larger documents much harder work; it’s a burden on my memory, and flipping backwards and forwards in text is relatively hard work – not impossible, but harder work than it was when I could see.
Some of the difficulty I describe with the large-grained view of information comes from the ability, or the lack of it, to glance at material. Typesetters have spent centuries working out styles of layout that make things easy to read; there are visual cues all over pages to aid navigation and orientation. Algebra notation is laid out to group operands in a way that reflects the order of precedence of the operators – it makes a glance at an expression written in algebra easier. Similarly, diagrams need to at least give the illusion of being able to see the whole thing (see below) – the glance at the whole diagram. Work on glancing has been done, including some by myself, and there are ways of doing it for individual information types, but I don’t know of a generic solution, and certainly not one that is available to me for everyday use.
- Glancing at information to assess reading strategies, help orientation and navigation, and make choices in reading is difficult
My final chore is the “looking at two things at once” problem. Eyes give one the impression that two things can be looked at at once. In the small this is true: the field of accurate vision is narrow, but it does see several things in detail at once. Beyond that, the speed and accuracy in control of information flow afforded by the eyes, combined with the layout of information (when done well) on an external memory, means that eyes can move back and forth between items of information rather well. This is hard in speech and audio, where so much layout information is lost. When reading research papers, moving back and forth from the narrative text to the references was easy with eyes; it’s hard with speech (what I do is to have two windows open and move between the windows – this is hard work).
My interaction with spreadsheets always seems very clunky to me. My natural view with a speech-based screenreader is one cell at a time. With sight, looking at column or row headers to see what they are is naturally a matter of flicking one’s eyes up or along to remember the orientation, and that’s fine. I can do this, but the means of doing so is intrusive. Similarly, dealing with any tabular information is painful. The ability to compare rows, columns and cells is central; indexing via column and row headings is vital. I have the keystrokes to do it all in my screenreader, but it’s hard work – in contrast, one flicks one’s eyes back and forth and one appears to be looking at two things at once. Tables are nothing in terms of difficulty compared with diagrams; even if there is access to the material (e.g., simple line graphs, histograms, and node and arc diagrams) one has to build up a picture of the whole thing piecemeal. The “looking at two things at once” ability of eyes makes this task relatively easy, and the inability to do this with speed and accuracy means many interactions are either very hard or impossible.
- Looking at two things at once is nigh on impossible
In conclusion, I think there are still two main unsolved problems in audio interaction with information:
- Glancing at information.
- Looking at two things at once.
Once I have general solutions to these two things, I’ll be a much more effective and efficient reader who is satisfied with my reading.