Steve and Andreas provided comments and text, to which I have not done full justice in this cut and paste. Time pressures led me to let it out in this rather choppy form. When we make a more formal version of it, we will smooth it up (and make it shorter!).
This paper was the result of stepping back to consider where we are going in our overall interface research, and how it fits into the larger picture. It was triggered by the discussions at the all-DLI meeting in December, and I decided that trying to write it up as a DL conference paper would be a good way to articulate the issues. We ended up deciding that it wasn't really appropriate for that context, and wouldn't be in good enough shape for the deadline. For our group discussion, here are the questions that motivated this analysis:
In order to create interfaces to the digital library, we need to understand the nature of the library itself. The word "library" conjures up images of buildings with reading rooms, cabinets full of catalog cards, and neat rows of stacks containing items on shelves. The structure that the user encounters is dominated by considerations of how the collection of materials is organized: their physical form and their organization into a set of subcollections and topic categories, which then correspond to rooms, stacks, and shelves for the physical objects.
The on-line environment has traditionally provided users with a similar collection-structured space, organized around hierarchical file systems, web servers, transfer applications (such as FTP) and the like. But as the diversity and interoperability of the network increase, this dimension becomes less relevant to general information uses. The user is more focused on what can be done with the materials, rather than where they are stored, or how to move them from there to here. Although there have been some attempts to create online collections with the uniform comprehensible structuring of a traditional library, most information needs are better satisfied by the open-ended and far-reaching variety of information sources and services that are available on the Internet. Each of these sources or services has its own structure, rules, and formats, and the means for finding, accessing, and using them are highly varied. For the typical user, the result is potentially bewildering -- like a physical library with dozens of different catalogs of different kinds, with each room full of books organized differently, and so on.
In the Stanford Integrated Digital Library Project [Paepcke et al., 1996] we have adopted the perspective that the digital library is not a specific collection, or an organization of collections, but is instead a unified means for a user to access the full range of information and services on the local and global network. The Infobus structure specifies the underlying protocols and processes, which provide uniform access and manipulation within the distributed and heterogeneous space of collections, information exploration methods, and services on the net. These range from well-defined well-structured collections (such as online journals) to the chaotic diversity of the World Wide Web. The user interface, then, needs to be structured with the user's overall pattern of interactions and activities in mind.
More recent projects have extended the collection and navigation perspective to collections whose structure does not match the traditional structures for organizing print documents. At the University of California Santa Barbara, for example [[Alexandria]], researchers are extending previous work on Geographic Information Systems to provide uniform access to large distributed collections of geographically-organized materials, such as maps, aerial photographs, and satellite survey data. The relevant organizing dimensions are those of location (longitude and latitude), scale, and "footprints" of meaningful spatial objects (cities, states, regions, named rivers and mountains, etc.).
The World Wide Web, until fairly recently, was purely collection structured, where the collection was the set of pages available on the web -- the user interface mechanisms were there for browsing among the document collection and viewing. They emphasized facilities for navigation (moving around in the collection) based on the structure of the elements (links). Since this structure is so much simpler than the conventional library structures, the browser could be extremely simple ("click and go"), which was a major factor in the explosive growth of the web (more on this point later). This, of course, is changing with CGI-BIN scripts and then Java -- more on that later.
To the degree that the digital library primarily contains printed pages for viewing, the digital library makes a similar division, with its emphasis on providing tools for navigation within the collections. The same analysis extends to other media (such as video) as well, if the library function is simply one of picking out the relevant item, and viewing is left to other applications. But if the objects in the library support more active kinds of manipulation, the emphasis is shifted to the structure of what can be done with an item once it has been identified and retrieved.
The document-structured perspective dominates most of the standard application interfaces to which we are accustomed: word processors, spreadsheets, drawing programs, and the like. Although a small amount of the interface is devoted to navigating and managing files, the majority of the mechanisms are for use on a specific document, of whatever type the application supports. The tools are specially tailored to the document type (e.g., text, drawing, or spreadsheet), making use whenever possible of standard mechanisms for universal operations such as selecting, moving and changing objects.
Some of the research in current digital libraries projects has aimed at providing new kinds of document-structured interfaces for working with networked information. In some cases the interfaces are designed for specialized document types (as with the geographic information mentioned above). In other projects, concepts such as "multivalent documents" (MVD) [[Berkeley Diglib]] have been developed, which can be applied to any documents that can be represented in a two-dimensional visual space. A multivalent document can have any number of inter-related "layers" reflecting different aspects of a document (e.g., a scanned image and the corresponding text; a typeset formula and the corresponding computation), and can allow cascaded authorship (e.g., one person penning comments onto another person's document). The MVD interface provides elements for managing the different views and moving among them in a coordinated way.
Another kind of document, being explored by the Informedia project at Carnegie Mellon University [[Informedia]], includes time-based materials such as audio and video recordings. The tools for search and access must be specially designed to reflect the structure of video documents. They include skimming modes, key frame representations, integration of text and picture search, etc. The interface is specific to these tools, for example making it easy to scan a sequence of video through key frames or key clips.
In a library-related example, specialized interfaces (and in fact specialized hardware with multiple CD-ROMs) have been developed for patent search: an information-intensive task with well-defined structure. Others support financial analysis, making structured use of materials such as stock quotes and securities filings. For a significant number of future users of the digital library, much of their access will be of this kind: not a general use of searching, browsing, and manipulating information items, but a highly targeted use of structured materials for pre-defined purposes.
This perspective has not drawn as much attention in the world of digital libraries as have perspectives that offer more general interfaces, for much the same reason that the design of specialized interfaces for "vertical" applications has not drawn the same attention as the development of the general graphical user interface. Markets are small (relatively), so the commercial interest is less. Design tradeoffs in the interface are driven more by domain and task-specific considerations than by general interface mechanisms and principles. Hence they are less amenable to general development and discussion, and require more codevelopment with the specific user population. Also, they tend to be supplanted by the use of more general-purpose interfaces, for example as IBM's web-based patent information has now made the earlier dedicated systems obsolete.
Today's broad-spectrum computer interfaces (operating systems and wide-audience applications) coexist with numerous specialized interfaces. Federal Express provides a web page for its customers, as well as hand-held special devices for its drivers. The same will be true in future digital libraries. Along with the more general interfaces and capacities, there will be a never-ending need for specialized interfaces that optimize a person's use of specific materials for specific tasks.
The intended audience for our digital library does not fall under any of these categories. Our stereotypical user is a researcher or student working in a broad technical field (such as computing). This corresponds to the typical user of today's university libraries (and many corporate libraries). But there is a key difference. In the traditional library, the tools were for finding and accessing materials, with no provision for what the user did with those materials. But our intended user will be accessing the digital library from a workstation or other personal device on which many aspects of the user's work are integrated: not just the library retrieval, but the ongoing interplay of activities involving information exploration and use for getting work done.
In this environment, library aspects need to be integrated into the user's overall management of information and tasks. Multiple collections will be searched and browsed, materials will be collected by the individual for future reading and use, and for sharing with others. Documents will be generated, marked up, and exchanged, and the overall environment will gradually be customized to suit the user's specific needs and preferences. Tools are needed not just to retrieve materials that are "out there" on the net, but also to manage the user's current context and state.
An interface of this kind is structured around the user's overall information usage, and often is structured, at least in part, dynamically by the user. It needs to provide uniform affordances for collecting and manipulating information objects (documents, images, and more) that may come from different sources in varying formats and with different access rights. It needs to provide standard ways to access the open-ended collection of different kinds of information services (searching, categorizing, translating, summarizing,...) that are becoming available in the net. It should make it possible for the user to provide materials and services as well as using ones provided by other people -- the library collection is not a centralized unity, but the union of everything each individual wants to make public to any audience.
At first blush, this seems hopelessly ambitious. How can we build a do-everything interface for all the needs of a digital library user with the range of activities of a student or researcher? How can we anticipate all the different ways in which information might be organized and manipulated, or all the different kinds of tools and services to be provided and used? How can we provide an interface that is good for working with text documents and with maps and with video and with everything else that will come along?
But the same question can be asked of the generic GUI interface. It attempts to provide a basis for all the different kinds of interaction that a user might have with a computer, by creating a common set of conventions and tools that applications-designers can combine in appropriate ways for their specific uses. An analysis of user interfaces in mature word processors and related document centered tools (including those for graphics, spreadsheets and other documents that go beyond simple text), makes it clear that document centered interface designs have gradually converged to include a relatively stable set of visual representations and interaction affordances.
Our research into user interface design for digital libraries is focused on finding similar common ground for the interface aspects that are newly created or brought into prominence by distributed digital library facilities. Although current operating systems and browsers provide many of the uniform tools needed for working with digital information (hierarchical file and folder directories, cut/copy/paste buffers, screen and window management, hotlists, and the like), they lack many of the characteristics needed for a user-structured interface to a distributed collection of information sources and services. The following section describes some of the requirements and problems that have become evident in our work on a digital library testbed.
The collaborative aspect of digital library usage spans a broad spectrum from anonymous and loosely coupled, to tightly connected and highly focused work groups. Systems across this spectrum all suffer a tension between convenience for individuals and the needs of the group.
For example, the World-Wide Web can be viewed as a large collaboratively constructed repository. The rules of participation are informal, but not altogether absent. There is, for example, an expectation that images are kept small enough so as not to be cumbersome. There is also a---frequently unwarranted---expectation that contributors keep their links in order. Since this is a tedious task, individuals tend to lapse, at the expense of the overall community.
More tightly coupled collaborative systems are used within work teams [[ref Andreas CSCW J.]]. Facilities within those systems include group-based rating of news items [[Doug Terry's Tapestry system,...]], collaborative editing, group annotations [Commentor] and computer mediated search sessions performed jointly by Digital Library users and a remotely located reference librarian. The tension between individual needs and group requirements permeates these settings as well. For example, when common electronic mail repositories are maintained by even a small group of users with closely related interests, opinions about the best labelling and the granularity of folders can vary greatly [[Berlin: Where did you put it]]. The tension is between catering to individual preferences and providing uniformity for ease of use by everyone.
A key interface challenge arising in collaborative contexts is how to enforce a minimum of conformity to help the group work together, while not getting in the way of the individual users. The challenge is to provide customizable interactive views onto documents and other resources, which underneath 'do the right thing' to ensure that information stays sharable, searchable, and complete. For the example of a collaboratively maintained local repository this could mean, among other things, that individuals can view and maintain the repository through their own sets of attribute tags, which are translated to a common set whenever possible.
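The idea of personal attribute tags that are translated to a common set can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name TagView, the tag vocabularies, and the rule that untranslatable tags are reported separately rather than dropped. It is not a description of any existing system.

```python
from dataclasses import dataclass, field


@dataclass
class TagView:
    """One user's personal tag vocabulary, mapped onto the group's common tags.

    Hypothetical structure: the mapping is a plain dictionary from
    personal tag to common tag.
    """
    user: str
    to_common: dict = field(default_factory=dict)

    def translate(self, personal_tags):
        """Split personal tags into (common tags, tags with no translation)."""
        common, untranslated = set(), set()
        for tag in personal_tags:
            if tag in self.to_common:
                common.add(self.to_common[tag])
            else:
                untranslated.add(tag)
        return common, untranslated

    def localize(self, common_tags):
        """Show common tags in this user's own words where a mapping exists."""
        reverse = {shared: own for own, shared in self.to_common.items()}
        return {reverse.get(tag, tag) for tag in common_tags}


# Invented example vocabularies for two members of the same group.
alice = TagView("alice", {"todo": "action-item", "bib": "reference"})
bob = TagView("bob", {"pending": "action-item"})

shared, private = alice.translate({"todo", "bib", "misc"})
bobs_view = bob.localize(shared)
```

In this sketch the repository would store only the common tags (plus any private leftovers), so that an item filed by one person remains searchable by another in that person's own vocabulary.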
A number of mechanisms have been proposed for extending basic security mechanisms across systems, such as the use of public key encryption. These mechanisms will be an important component of any solution, and need to be integrated under a coherent user conceptual model that can serve to let users understand where they are and what they can do. In the face of increasing accessibility, user interfaces must make it clear to users what is private, and how to protect their information.
In general the underlying models for access rights that have served since the early days of timesharing are not flexible enough to handle the wide variety of rights controls that are needed in the networked resource environment. In today's network environment, monetary payment mechanisms play a crucial role [[UPAI]] and need to be integrated into a more general model that is based on contracts rather than on protection, providing additional diversity and flexibility [[Rmanage ref]].
The challenge in providing this flexibility for digital library users is to control the complexity that it can introduce to the interface. For example, how conspicuous should information about the flow of money be made in the interface? What about access control? Should there be fine-grained gradations of access rights for all kinds of information objects, or are simpler models sufficient for all or many, and can they be integrated with more comprehensive (but complex) ones in a seamless way?
Our research on information exploration [[ref SenseMaker]] has shown the value of mechanisms that allow users to dynamically configure tabular attribute arrays, selecting not only the attributes to be displayed and the ordering of attributes and of items, but also the grouping of items based on attribute value, and the use of the groupings as a basis for filtering and for specifying additional items. The typical "folder window" has provided a uniform structure for simple attribute display in GUI operating systems. The challenge is to expand its capabilities to handle multiple organizing dimensions and large collections in a comprehensible way.
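The operations on tabular attribute arrays named above (choosing attributes and their order, ordering items, grouping by attribute value, and filtering by group) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: items are represented as plain dictionaries, and the function names and sample records are invented here rather than taken from any of the systems discussed.

```python
from collections import defaultdict


def select_attributes(items, attributes):
    """Keep only the chosen attributes, in the chosen column order."""
    return [{name: item.get(name) for name in attributes} for item in items]


def group_by(items, attribute):
    """Group items by the value they hold for one attribute."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get(attribute)].append(item)
    return dict(groups)


def filter_by_groups(items, attribute, chosen_values):
    """Keep only items whose attribute value belongs to a chosen group."""
    return [item for item in items if item.get(attribute) in chosen_values]


# Invented sample records standing in for search results.
results = [
    {"title": "Maps of the Bay", "source": "web", "year": 1996},
    {"title": "Video skimming", "source": "journal", "year": 1995},
    {"title": "Patent search", "source": "web", "year": 1994},
]

by_source = group_by(results, "source")
web_only = filter_by_groups(results, "source", {"web"})
table = select_attributes(
    sorted(web_only, key=lambda item: item["year"]),  # order the items
    ["year", "title"],                                # order the columns
)
```

Each step takes and returns an ordinary list of items, so a user-driven interface could apply them in any order and any number of times.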
Although this question comes up in the design of any integrated interface, it is particularly difficult in the Internet environment with its continual proliferation of new services, providers, and interfaces. Our goal is to provide a comprehensible way for users to combine these in a user-structured environment. Our DLITE interface [[DLITE details]] is one example of how these choices may be made. We are experimenting with interacting with distributed asynchronous services through a drag-and-drop interface that extends the traditional GUI use of drag and drop for application invocation. In its simple form, this may exclude users from some highly advanced affordances of some services, but it makes interaction with many heterogeneous services feasible for any one user. Design choices are complicated by the fact that Digital Library systems are intended to be used by people with a wide spread of computer savvy. They range from casual users to highly trained librarians. This is in contrast to information systems designed to serve limited, specialized user populations, such as the databases used by customer support personnel.
This wide spread of user expertise suggests that maybe two or more fundamentally different interface models are needed to satisfy the whole range of user populations. We need to learn more about whether this is necessary, or whether the underlying services may be sufficiently customizable in their behavior that at least closely related interfaces may be used by novices and experts alike. Alternatively, one could imagine something analogous to fisheye views in providing levels of service functionality. Simple functionality could be at the center of attention, while more sophisticated affordances would be available in the periphery.
In a multiprocessing, networked services environment, the picture is different. Collections of resources and services are available on the network, to be explored or invoked from the interface. They need to be located and initiated, may have varying availability status, and may take more than one argument to initiate (e.g., a source to be searched and a query to be used within it). Once they are started they may run indefinitely, may or may not need further interaction, and may or may not produce a set of results, either when they complete or incrementally as they run. Users may need to check status, pause, restart, and cancel. Services may hang, stop responding, or disappear. They may have incremental costs, so that it is important to have explicit interactive control over how much work they do. For example, a service might be asked to return a few results and wait for confirmation before generating further ones.
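The last example, a service that returns a few results and then waits for confirmation, can be sketched as a small state machine. The sketch is hypothetical throughout: the ServiceTask class, its status names, and the per-result cost model are assumptions made for illustration, and a real networked service would also have to deal with hanging or disappearing servers, which are left out here.

```python
from enum import Enum
from itertools import islice


class Status(Enum):
    READY = "ready"
    RUNNING = "running"
    PAUSED = "paused"        # waiting for the user to ask for more
    DONE = "done"
    CANCELLED = "cancelled"


class ServiceTask:
    """Hypothetical handle on a long-running service that yields results."""

    def __init__(self, results, cost_per_result=0.0):
        self._results = iter(results)   # any iterable, possibly unbounded
        self.cost_per_result = cost_per_result
        self.cost = 0.0
        self.status = Status.READY

    def next_batch(self, n):
        """Do at most n results' worth of work, then pause for confirmation."""
        if self.status in (Status.DONE, Status.CANCELLED):
            return []
        self.status = Status.RUNNING
        batch = list(islice(self._results, n))
        self.cost += len(batch) * self.cost_per_result
        # A short batch means the source is exhausted; a full batch means
        # there may be more, so the task waits for the next request.
        self.status = Status.PAUSED if len(batch) == n else Status.DONE
        return batch

    def cancel(self):
        """Stop the task; no further results are produced or charged."""
        if self.status is not Status.DONE:
            self.status = Status.CANCELLED


# Usage: ask for two results at a time, then cancel.
task = ServiceTask((f"hit-{i}" for i in range(5)), cost_per_result=0.5)
first = task.next_batch(2)
second = task.next_batch(2)
task.cancel()
```

Because work is done only when a batch is requested, the accumulated cost never runs ahead of what the user has explicitly asked for.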
Unix and other multi-process operating systems provide facilities for process status and management, but these are generally in text command-line form and are not designed for a proxied multi-server environment. They are a first step towards the kinds of facilities that need to be provided but cannot serve as a full conceptual model for the digital library environment.
As one final note on the challenges, it is important to reiterate that they are not specific to services that are oriented to a digital library (which in itself is not a well-defined subclass of computer-based systems). The challenges are relevant to any multitasking distributed service environment, in which design and control are not subject to a centralized integration. We see the work on developing interfaces for digital libraries as a part of a larger picture of developing the next generation of online user environments.
As a model, consider the history of personal computer interfaces. The designers of the first applications available on the Apple II and the DOS-based PC built their interfaces from scratch, using the basic system facilities for screen and keyboard. This meant that the same operation (e.g., opening a file or selecting an item for editing) might be done in quite different ways in different applications.
The Xerox Star [[ref to Star and profile in BDS]] offered a comprehensive interface that operated uniformly across the tools for working with different kinds of documents (text, drawings, tables, formulas, etc.). Every part of the application used the same set of "universal commands" for selection, copying, moving, editing properties, etc., and standard direct manipulation techniques allowed users to manage files and directories (folders and file-cabinets), to print, and to communicate documents to other people. The design goal was to make the overall system (not just one application) easy for people to learn and remember. They could develop a conceptual model [[ref to Liddle in BDS]] that enabled them to determine how to get an operation done, and to predict what would happen across a wide range of contexts.
With the Macintosh, the degree of uniformity was less than with the Star, since the overall system strategy shifted from Xerox's integrated one-supplier approach to an approach that accommodated large numbers of independent software developers on a common platform. Through the Macintosh Human Interface Guidelines [[ref to guidelines and to profile in BDS]] and the associated toolbox that came with the Macintosh OS, Apple attempted to enforce standards across applications, in order to make the Macintosh easy to use for unsophisticated computer users. They promulgated their standards in several ways:
The Macintosh OS provided standard code for things like window management (opening, closing, scrolling, sizing, etc.), which could be used in every interface. Apple made it easy for developers to handle parts of the interface without having to do the work themselves. Of course, an application could have its own interface for aspects of managing windows (and many did), but unless there was an overriding reason, the natural course was to use the standard, which meant that it was uniform across applications.
The Finder (the desktop interface for managing applications and files) provided the user with standard affordances to execute operations. For example, if the icon for a document of appropriate type is dragged onto the icon for an application, the application will be started up with that document as its focus. This might mean different things to different applications (e.g., in some cases the focus document might be the one to edit, while in others it might be a parameters or preferences file.). But developers were expected to do something reasonable (meeting user expectations).
The Guidelines were very explicit about the location and contents of a few crucial menus that were relevant to every application: The File menu, with its items to Save, Print, and Quit; the Edit menu with Cut, Copy, and Paste, and so on. Although the detailed execution of these items might vary from application to application, the user could count on their being in a known place, and on their having relatively uniform effects.
The Macintosh provided basic tools for direct manipulation of objects that could be used in a wide variety of applications. In a word processor, the objects are words, characters, paragraphs, and the like. In a spreadsheet they are cells, formulas, and ranges. In a drawing program, they may be lines, circles, shaded areas, etc.; in a mail program, mailboxes and messages. In each case, the Mac user can expect certain actions to have predictable consequences. Clicking on an object "selects" it, which then makes it the target of subsequent operations. Holding down the shift key while clicking causes a new object to be selected in addition to (rather than replacing) the previous selection. Dragging an object "moves" it, which may be a literal spatial move (as in drawing programs) or a structured transfer (as in moving a message to a mailbox). Dragging with the option key down moves a copy instead of the original, and so forth. These conventions are not followed uniformly in all applications, but as with the other items above, they offer the user a good heuristic for knowing how to get something done in an application without having learned all of the details of its command structure.
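The selection and dragging conventions just described are simple enough to state as code. The following is a minimal sketch of the conventions only, not of any Macintosh toolbox routine. The Selection class, the drag function, and the use of plain lists as containers are all assumptions made for illustration.

```python
class Selection:
    """Tracks which objects are currently selected."""

    def __init__(self):
        self.items = []

    def click(self, obj, shift=False):
        """Plain click replaces the selection; shift-click extends it."""
        if shift:
            if obj not in self.items:
                self.items.append(obj)
        else:
            self.items = [obj]


def drag(selection, source, target, option=False):
    """Move the selected objects to target; with option held, copy them."""
    for obj in selection.items:
        target.append(obj)
        if not option:
            source.remove(obj)   # a move takes the original away


# The same conventions applied to a mail program: messages and mailboxes.
inbox = ["msg1", "msg2", "msg3"]
archive = []

selection = Selection()
selection.click("msg1")
selection.click("msg2", shift=True)   # extend the selection
drag(selection, inbox, archive)       # move both messages

selection.click("msg3")               # replaces the previous selection
drag(selection, inbox, archive, option=True)   # copy, leaving the original
```

Nothing in the sketch depends on the objects being messages. The same two routines would serve for cells in a spreadsheet or shapes in a drawing, which is the point of a shared convention.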
The intent of the Macintosh guidelines was to create a user-centered environment, rather than a platform for multiple independent applications, each with its own environmental design. This approach was copied in IBM's attempt to define a standard character-based interface in CUA [[refs]] and in NextStep, OSF Motif, and the evolving versions of Microsoft Windows. In each case, the goal was to create a uniform environment that provided users with a way of understanding what to do that cut across the multiple applications and application vendors.
In developing a digital library interface, there are several potential responses to this practical barrier:
This alternative is greatly facilitated by the emergence of standards such as Java and ActiveX for making application interfaces available on the Web. A specialized digital library interface (e.g., for browsing videos, searching maps, or marking up OCR'ed pages) can be provided to anyone with a web browser. The general capabilities provided by environment builders will be supplemented by interfaces taking advantage of the specialized structure of the collection, document, or task.
This is the alternative we have taken in developing the DLITE interface. It reflects a clear distinction between the goals that are appropriate for product development and those that are appropriate for research. In the product world, the key issue is competitive advantage. Ideas (as embodied in products) will be adopted only if the product dominates the competition as an overall package (including its capabilities, the conversion and learning costs, predictions of future extensions, relation to legacy systems, etc.). In the research world, the ideas themselves are the medium of impact, with demonstrations and implementations playing the critical role of developing and communicating the ideas. Success in this dimension is not measured by number of units sold or distributed, but in the degree to which the underlying ideas and designs become accepted and disseminated in future systems.
For many of the reasons discussed previously, it is unlikely that the commercial developer community will be motivated to put digital library interface research as one of their priorities in design. However, there does exist a community of digital library researchers, such as those in the Digital Library Initiative projects [[DLI ref]], who are developing interfaces for a variety of collections, documents, and tasks, and who have common needs for many of the interaction elements identified in section 7. It is possible to appeal to this community and to build up a body of applications that would carry much more influence than those built by any single project. The mechanisms for specifying and promulgating the standard elements could include any or all of those described in section 4, ranging from plug-in code components to guideline specifications.
This is the fundamental strategy we have been following with the InfoBus architecture [[Infobus ref]], with its protocols for accessing information sources and services. We have been collaborating with other digital library projects and with industrial partners (such as Knight-Ridder and Xerox) to integrate their services and collections onto the InfoBus, and in turn to provide their user interfaces with access to resources from all of the sites.
And finally there is the overriding concern of complexity. Anyone with experience in software development will shudder at the breadth of the list in section 7, knowing that any system attempting to handle all (or even most) of the items in depth will be pushing the bounds of complexity that make implementation feasible. The HTTP and HTML standards stand as a stark case in point. They represented drastic reductions of capability from text formatting standards and information transport protocol standards that already existed. The list of things that they couldn't handle was huge, ranging from simple text placement and section numbering to user login and session maintenance. But the web succeeded, not in spite of these simplifications, but because of them. The key insights were in recognizing which hard problems could simply be ignored, making implementation easier while providing an overall functionality that gave substantial new value.
Apple Computer. Human Interface Guidelines: The Apple Desktop Interface. Reading, MA: Addison-Wesley, 1987.
[Sensemaker] Michelle Baldonado ....
Bush, Vannevar. As we may think. Atlantic. 176 (July, 1945), 101-108.
[DLI] Summaries of the six DLI projects from the May 1996 Special Issue on Digital Libraries in the Institute of Electrical and Electronics Engineers, IEEE Computer Magazine.
Edwin Hutchins, James Hollan, and Donald Norman, Direct Manipulation Interfaces, in D. Norman and S. Draper, eds., User Centered System Design, Erlbaum, 1986, pp. 87-124.
[SuperBook] Egan, D.E.; Remde, J.R.; Gomez, L.M.; Landauer, T.K.; Eberhardt, J.; Lochbaum, C.C. Formative design-evaluation of SuperBook. ACM Transactions on Information Systems (Jan. 1989) vol.7, no.1, p. 30-57. See also <http://superbook.bellcore.com/SB/>
Gaver, Bill. Technology Affordances, CHI 91 Conference Proceedings, ACM Press, 1991, pp. 79-84.
Highwire Press <http://highwire.stanford.edu>
Johnson, Jeff, Terry Roberts, William Verplank, David C. Smith, Charles Irby, Marian Beard, and Kevin Mackey. Xerox Star: A retrospective, IEEE Computer 22:9 (September, 1989), 11-29.
Journal Storage Project (JSTOR) <http://jstor.umdl.umich.edu/>
[Informedia] Christel, M., Kanade, T., Mauldin, M., Reddy, R., Sirbu, M., Stevens, S., & Wactlar, H. Informedia Digital Video Library. Communications of the ACM, 38(4):57-58 (1995).
[UPAI] Steve Ketchpel....
Liddle, David. Starting with the User Experience. In T. Winograd (ed.), Bringing Design to Software. Reading MA: Addison-Wesley, 1996. xx-xx.
Norman, Donald. The Design of Everyday Things. New York: Basic Books, 1988.
[Berkeley] Virginia E. Ogle and Robert Wilensky, Testbed Development for the Berkeley Digital Library Project, D-lib Magazine, July 1996, ISSN 1082-9873.
Paepcke, Andreas, Steve Cousins, Hector Garcia-Molina, Scott Hassan, Steven Ketchpel, Martin Röscheisen, and Terry Winograd (1996), "Using distributed objects for digital library interoperability", IEEE Computer, 29:5 (May, 1996) 61-68.
[MVD] Thomas A. Phelps and Robert Wilensky, Multivalent Digital Documents in UC Berkeley's Digital Library Project, SIGLINK, Vol. 4, No. 2, September 1995.
Winograd, Terry (ed.). Bringing Design to Software. Reading MA: Addison-Wesley, 1996.