Perspectives on Interfaces for Digital Libraries

Terry Winograd, Andreas Paepcke, and Steve Cousins
Stanford University
Version of January 27

DRAFT FOR DISCUSSION, DO NOT CITE OR DISTRIBUTE

Meta-Note

Since I am putting this out as an internal discussion paper, I will be direct and informal in some of the things that I would be more cautious about in publication. Please do not pass this URL along to people outside of our internal working group.

Steve and Andreas provided comments and text, to which I have not done full justice in this cut and paste. Time pressures led me to let it out in this rather choppy form. When we make a more formal version of it, we will smooth it up (and make it shorter!).

This paper was the result of stepping back to consider where we are going in our overall interface research, and how it fits into the larger picture. It was triggered by the discussions at the all-DLI meeting in December, and I decided that trying to write it up as a DL conference paper would be a good way to articulate the issues. We ended up deciding that it wasn't really appropriate for that context, and wouldn't be in good enough shape for the deadline. For our group discussion, here are the questions that motivated this analysis:

  1. Other DLI projects are doing interface work specifically directed towards their materials: maps, multi-layer documents, video browsing, etc. Ours is at a much more general level, which corresponds to the kinds of things done by OS interface developers (Mac, Windows, etc.). By trying to be so general are we losing any leverage we might have by being a digital library, and trying instead to compete with Microsoft and Apple?

  2. Why does new work need to be done at all at this general level? Are there specific aspects of the digital library that can lead us to make new contributions to the design of interfaces, or are we just adapting the old GUI ideas to our environment?

  3. Assuming we have some interesting new ideas at this general level, what role can work like ours play in the bigger picture with forces like Microsoft, Netscape, etc., who are also developing bigger and better interfaces that integrate web search, desktop, etc.? Are we just playing in an academic sandbox?

  4. Is there any realistic potential to actually disseminate the interface concepts we are doing, and, if so, how could that be done? How does it relate to the testbed, and to the other DL projects?

  5. Given that there are always an unbounded number of issues and details in improving an interface, how do we decide on priorities? What will really give mileage for effort expended, given that we are at least an order of magnitude understaffed for polished interface development and testing?

In trying to answer these questions, I ended up drawing some distinctions between different kinds of interfaces and interface strategies, and those form the bulk of this paper. The conclusions for our own project are left open for our discussion.

Contents

  1. Introduction
  2. Interface Perspectives
  3. Digital Library Challenges for User Interface Design
  4. Bringing Uniformity to User-Structured Interfaces
  5. Practical Strategies
  6. References

1. Introduction

The past few years have seen a rapidly growing research effort in digital libraries. In a sense, the concept is old, predating the modern digital computer [Bush, 1945], yet we are just now seeing the emerging outlines of the internetworked global digital library of the future. In conjunction with designing the underlying mechanisms and the social structure that will support the operation of the digital library, researchers need to design appropriate mechanisms for interaction between individual library users and the full range of mechanisms and services they will encounter. In developing our digital library testbed [Paepcke, 1996], we have found that the digital library environment is different in a number of ways from the traditional workstation environments that shaped current graphical user interface (GUI) technologies. This paper examines the new challenges and their implications for interaction and interface design.

In order to create interfaces to the digital library, we need to understand the nature of the library itself. The word "library" conjures up images of buildings with reading rooms, cabinets full of catalog cards, and neat rows of stacks containing items on shelves. The structure that the user encounters is dominated by considerations of how the collection of materials is organized: their physical form and their organization into a set of subcollections and topic categories, which then correspond to rooms, stacks, and shelves for the physical objects.

The on-line environment has traditionally provided users with a similar collection-structured space, organized around hierarchical file systems, web servers, transfer applications (such as FTP) and the like. But as the diversity and interoperability of the network increase, this dimension becomes less relevant to general information uses. The user is more focused on what can be done with the materials, rather than where they are stored, or how to move them from there to here. Although there have been some attempts to create online collections with the uniform comprehensible structuring of a traditional library, most information needs are better satisfied by the open-ended and far-reaching variety of information sources and services that are available on the Internet. Each of these sources or services has its own structure, rules, and formats, and the means for finding, accessing, and using them are highly varied. For the typical user, the result is potentially bewildering -- like a physical library with dozens of different catalogs of different kinds, with each room full of books organized differently, and so on.

In the Stanford Integrated Digital Library Project [Paepcke et al., 1996] we have adopted the perspective that the digital library is not a specific collection, or an organization of collections, but is instead a unified means for a user to access the full range of information and services on the local and global network. The Infobus structure specifies the underlying protocols and processes, which provide uniform access and manipulation within the distributed and heterogeneous space of collections, information exploration methods, and services on the net. These range from well-defined well-structured collections (such as online journals) to the chaotic diversity of the World Wide Web. The user interface, then, needs to be structured with the user's overall pattern of interactions and activities in mind.



2. Interface Perspectives

The potential space of user interfaces to a digital library can be viewed in terms of four fundamental perspectives: collection-structured, document-structured, task-structured, and user-structured. This distinction is not a sharp taxonomy of interfaces as a whole: a single interface may combine several perspectives (in fact it often needs to), and some interfaces may not have a single primary perspective. However we can identify some clearly contrasting points within this space, and the classification can provide insight into the tradeoffs that are made in each design.

2.1 Collection-structured

A collection-structured interface is usually associated with a particular collection of materials, providing users with tools for searching, browsing, and viewing the collection. It is the interface approach that is most closely aligned with the traditional notion of libraries, in some cases serving as a relatively straightforward extension of the traditional manual organizing methods (often with the addition of textual search). Clear examples include the web-based interfaces to journal collections, such as those of Highwire Press [Highwire] and the Journal Storage Project [JSTOR]. The interface is designed to provide access to the specific collection, and provides navigation tools based on the structure of journal issues, tables of contents, and individual articles. Before the advent of the web, a number of specialized collection browsers were developed for books, software and documentation, online help, government document collections, and the like (see, for example, SuperBook [SuperBook]).

More recent projects have extended the collection and navigation perspective to collections whose structure does not match the traditional structures for organizing print documents. At the University of California Santa Barbara, for example [[Alexandria]], researchers are extending previous work on Geographic Information Systems to provide uniform access to large distributed collections of geographically-organized materials, such as maps, aerial photographs, and satellite survey data. The relevant organizing dimensions are those of location (longitude and latitude), scale, and "footprints" of meaningful spatial objects (cities, states, regions, named rivers and mountains, etc.).

Until fairly recently, the World Wide Web was purely collection-structured, where the collection was the set of pages available on the web -- the user interface mechanisms were there for browsing among the document collection and viewing. They emphasized facilities for navigation (moving around in the collection) based on the structure of the elements (links). Since this structure is so much simpler than the conventional library structures, the browser could be extremely simple ("click and go"), which was a major factor in the explosive growth of the web (more on this point later). This, of course, is changing with CGI-BIN scripts and then Java -- more on that later.

2.2 Document-structured

In earlier discussions of digital libraries, documents were essentially taken for granted. Once you had a page on your screen, all that needed to be provided by the interface was a tool for viewing. Returning to the physical library metaphor, this corresponds to a model in which library patrons use the library facilities to retrieve materials, and then take them home (or to a carrel) to do whatever kind of work they want to. The facilities provided by the library itself don't extend into working with the materials, just getting them.

To the degree that the digital library primarily contains printed pages for viewing, the digital library makes a similar division, with its emphasis on providing tools for navigation within the collections. The same analysis extends to other media (such as video) as well, if the library function is simply one of picking out the relevant item, and viewing is left to other applications. But if the objects in the library support more active kinds of manipulation, the emphasis is shifted to the structure of what can be done with an item once it has been identified and retrieved.

The document-structured perspective dominates most of the standard application interfaces to which we are accustomed: word processors, spreadsheets, drawing programs, and the like. Although a small amount of the interface is devoted to navigating and managing files, the majority of the mechanisms are for use on a specific document, of whatever type the application supports. The tools are specially tailored to the document type (e.g., text, drawing, or spreadsheet), making use whenever possible of standard mechanisms for universal operations such as selecting, moving and changing objects.

Some of the research in current digital libraries projects has aimed at providing new kinds of document-structured interfaces for working with networked information. In some cases the interfaces are designed for specialized document types (as with the geographic information mentioned above). In other projects, concepts such as "multivalent documents" (MVD) [[Berkeley Diglib]] have been developed, which can be applied to any documents that can be represented in a two-dimensional visual space. A multivalent document can have any number of inter-related "layers" reflecting different aspects of a document (e.g., a scanned image and the corresponding text; a typeset formula and the corresponding computation), and can allow cascaded authorship (e.g., one person penning comments onto another person's document). The MVD interface provides elements for managing the different views and moving among them in a coordinated way.

Another kind of document, being explored by the Informedia project at Carnegie Mellon University [[Informedia]], includes time-based materials such as audio and video recordings. The tools for search and access must be specially designed to reflect the structure of video documents. They include skimming modes, key frame representations, integration of text and picture search, etc. The interface is specific to these tools, for example making it easy to scan a sequence of video through key frames or key clips.

2.3 Task-structured

One of the characteristics of traditional libraries is their breadth and diversity of uses. The library is a large multi-source multi-user collection, not targeted to any one particular task. When we look at computer interfaces, on the other hand, there is a wide range of specificity. Some interfaces, such as the system-level interfaces of the Macintosh or Windows 95, are used by everyone who does anything at all general with the machine. On the other hand, some interfaces are based on software, and even hardware, suited to a specific task or collection of tasks. In some extreme cases, such as the hand-held computers used by the drivers for overnight package delivery, the entire device is designed to support a small set of specific tasks. The interface structure and interactions are optimized for the specific work patterns that the user executes over and over again.

In a library-related example, specialized interfaces (and in fact specialized hardware with multiple CD-ROMs) have been developed for patent search: an information-intensive task with well-defined structure. Others support financial analysis, making structured use of materials such as stock quotes and securities filings. For a significant number of future users of the digital library, much of their access will be of this kind: not a general use of searching, browsing, and manipulating information items, but a highly targeted use of structured materials for pre-defined purposes.

This perspective has not drawn as much attention in the world of digital libraries as have perspectives that offer more general interfaces, for much the same reason that the design of specialized interfaces for "vertical" applications has not drawn the same attention as the development of the general graphical user interface. Markets are small (relatively), so the commercial interest is less. Design tradeoffs in the interface are driven more by domain and task-specific considerations than by general interface mechanisms and principles. Hence they are less amenable to general development and discussion, and require more codevelopment with the specific user population. Also, they tend to be supplanted by the use of more general-purpose interfaces, for example as IBM's web-based patent information has now made the earlier dedicated systems obsolete.

Today's broad-spectrum computer interfaces (operating systems and wide-audience applications) coexist with numerous specialized interfaces. Federal Express provides a web page for its customers, as well as hand-held special devices for its drivers. The same will be true in future digital libraries. Along with the more general interfaces and capacities, there will be a never-ending need for specialized interfaces that optimize a person's use of specific materials for specific tasks.

2.4 User-structured

If a person were interested in using only one well-structured collection, or one kind of document, or doing one structured task, then he or she would be best served by an interface designed around one of the three prior perspectives. By focusing on the particular characteristics of the document, collection, or task, the designer can provide well-tuned operations and straightforward mappings from the interface to the underlying materials.

The intended audience for our digital library does not fall under any of these categories. Our stereotypical user is a researcher or student working in a broad technical field (such as computing). This corresponds to the typical user of today's university libraries (and many corporate libraries). But there is a key difference. In the traditional library, the tools were for finding and accessing materials, with no provision for what the user did with those materials. But our intended user will be accessing the digital library from a workstation or other personal device on which many aspects of the user's work are integrated: not just the library retrieval, but the ongoing interplay of activities involving information exploration and use for getting work done.

In this environment, library aspects need to be integrated into the user's overall management of information and tasks. Multiple collections will be searched and browsed, materials will be collected by the individual for future reading and use, and for sharing with others. Documents will be generated, marked up, and exchanged, and the overall environment will gradually be customized to suit the user's specific needs and preferences. Tools are needed not just to retrieve materials that are "out there" on the net, but also to manage the user's current context and state.

An interface of this kind is structured around the user's overall information usage, and often is structured, at least in part, dynamically by the user. It needs to provide uniform affordances for collecting and manipulating information objects (documents, images, and more) that may come from different sources in varying formats and with different access rights. It needs to provide standard ways to access the open-ended collection of different kinds of information services (searching, categorizing, translating, summarizing,...) that are becoming available in the net. It should make it possible for the user to provide materials and services as well as using ones provided by other people -- the library collection is not a centralized unity, but the union of everything each individual wants to make public to any audience.

At first blush, this seems hopelessly ambitious. How can we build a do-everything interface for all the needs of a digital library user with the range of activities of a student or researcher? How can we anticipate all the different ways in which information might be organized and manipulated, or all the different kinds of tools and services to be provided and used? How can we provide an interface that is good for working with text documents and with maps and with video and with everything else that will come along?

But the same question can be asked of the generic GUI. It attempts to provide a basis for all the different kinds of interaction that a user might have with a computer, by creating a common set of conventions and tools that application designers can combine in appropriate ways for their specific uses. An analysis of user interfaces in mature word processors and related document-centered tools (including those for graphics, spreadsheets and other documents that go beyond simple text) makes it clear that document-centered interface designs have gradually converged to include a relatively stable set of visual representations and interaction affordances.

Our research into user interface design for digital libraries is focused on finding similar common ground for the interface aspects that are newly created or brought into prominence by distributed digital library facilities. Although current operating systems and browsers provide many of the uniform tools needed for working with digital information (hierarchical file and folder directories, cut/copy/paste buffers, screen and window management, hotlists, and the like), they lack many of the characteristics needed for a user-structured interface to a distributed collection of information sources and services. The following section describes some of the requirements and problems that have become evident in our work on a digital library testbed.



3. Digital Library Challenges for User Interface Design

Our explorations of existing and anticipated digital library interfaces have identified many different interface aspects that could benefit from coherent standard conceptual models, representations, and action affordances. In looking at these items, it is important to remember the overall environment, which differs greatly from the one that motivated the familiar graphical user interface. Mechanisms that worked well for a single user on a single machine, with access to files under that user's control, to be generated and manipulated with a few standard applications, are no longer adequate in the distributed networked multi-process environment.

3.1 Sharing and Collaboration

Research on groupware and computer-supported collaborative work has focused on mechanisms for selective sharing and dissemination of information and for real-time communication associated with computer-based information. Systems such as Lotus Notes have dealt with some of these questions for many years, and sharing and communication mechanisms are now being incorporated into the next generation of web browsers, such as Netscape's Communicator.

The collaborative aspect of digital library usage spans a broad spectrum from anonymous and loosely coupled, to tightly connected and highly focused work groups. Systems across this spectrum all suffer a tension between convenience for individuals and the needs of the group.

For example, the World-Wide Web can be viewed as a large collaboratively constructed repository. The rules of participation are informal, but not altogether absent. There is, for example, an expectation that images are kept small enough so as not to be cumbersome. There is also a (frequently unwarranted) expectation that contributors keep their links in order. Since this is a tedious task, individuals tend to lapse, at the expense of the overall community.

More tightly coupled collaborative systems are used within work teams [[ref Andreas CSCW J.]]. Facilities within those systems include group-based rating of news items [[Doug Terry's Tapestry system,...]], collaborative editing, group annotations [Commentor] and computer mediated search sessions performed jointly by Digital Library users and a remotely located reference librarian. The tension between individual needs and group requirements permeates these settings as well. For example, when common electronic mail repositories are maintained by even a small group of users with closely related interests, opinions about the best labelling and the granularity of folders can vary greatly [[Berlin: Where did you put it]]. The tension is between catering to individual preferences and providing uniformity for ease of use by everyone.

A key interface challenge arising in collaborative contexts is how to enforce a minimum of conformity to help the group work together, while not getting in the way of the individual users. The challenge is to provide customizable interactive views onto documents and other resources, which underneath 'do the right thing' to ensure that information stays sharable, searchable, and complete. For the example of a collaboratively maintained local repository this could mean, among other things, that individuals can view and maintain the repository through their own sets of attribute tags, which are translated to a common set whenever possible.
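
The tag-translation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shared vocabulary, the PersonalView class, and the example tags are all hypothetical and do not describe any existing system.

```python
from dataclasses import dataclass, field

# Hypothetical shared vocabulary for a group repository (illustrative only).
COMMON_TAGS = {"budget", "meeting-notes", "travel"}


@dataclass
class PersonalView:
    """One user's private labels, mapped onto the shared tags where possible."""
    to_common: dict = field(default_factory=dict)  # personal tag -> common tag

    def store_tags(self, personal_tags):
        """Translate personal tags for storage; unmapped tags stay personal."""
        shared, private = set(), set()
        for tag in personal_tags:
            common = self.to_common.get(tag, tag)
            if common in COMMON_TAGS:
                shared.add(common)
            else:
                private.add(tag)
        return shared, private

    def display_tags(self, shared_tags):
        """Show shared tags under the user's own labels when a mapping exists."""
        from_common = {c: p for p, c in self.to_common.items()}
        return {from_common.get(tag, tag) for tag in shared_tags}


# Hypothetical user who says "money" and "trips" where the group says
# "budget" and "travel"; "urgent" has no shared counterpart.
alice = PersonalView({"money": "budget", "trips": "travel"})
shared, private = alice.store_tags(["money", "trips", "urgent"])
```

In this sketch the repository stores only the shared tags, so items stay searchable by the whole group, while tags with no common counterpart remain part of the individual's view.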

3.2 User Identity, Privacy, and Payment

Most workstation users are not limited to materials on their individual computer, but are connected to some kind of service provider or local area network, which enables them to connect to other resources. This 'home provider' provides a variety of services, one of which is the identification of a user with a standard context (the login context or directory for the user) and the authentication of the identity of the person logging in (e.g., through passwords). Existing simple mechanisms work well, to the extent that all of the relevant resources are under the control of the same provider as the context for the user. They need to be extended when trust, authentication, and user identity are distributed across system boundaries.

A number of mechanisms have been proposed for extending basic security mechanisms across systems, such as the use of public key encryption. These mechanisms will be an important component of any solution, and need to be integrated under a coherent user conceptual model that can serve to let users understand where they are and what they can do. In the face of increasing accessibility, user interfaces must make it clear to users what is private, and how to protect their information.

In general the underlying models for access rights that have served since the early days of timesharing are not flexible enough to handle the wide variety of rights controls that are needed in the networked resource environment. In today's network environment, monetary payment mechanisms play a crucial role [[UPAI]] and need to be integrated into a more general model that is based on contracts rather than on protection, providing additional diversity and flexibility [[Rmanage ref]].

The challenge in providing this flexibility for digital library users is to control the complexity that it can introduce to the interface. For example, how conspicuous should information about the flow of money be made in the interface? What about access control? Should there be fine-grained gradations of access rights for all kinds of information objects, or are simpler models sufficient for all or many, and can they be integrated with more comprehensive (but complex) ones in a seamless way?

3.3 Dynamic Multi-Item Attribute Visualization

By the early 1980s, the Xerox Star had a standard means (in fact a dedicated button on the keyboard) to visualize the properties of any item that could be selected on the display. This general ability is important in every kind of application that has distinguishable objects and properties. In many cases, it is useful to simultaneously see selected properties of each item in a collection rather than of a single item. The most obvious example in the standard graphical interface is the use of a window for displaying the contents of a folder as a table of selected attributes, such as file size and dates.

Our research on information exploration [[ref SenseMaker]] has shown the value of mechanisms that allow users to dynamically configure tabular attribute arrays, selecting not only the attributes to be displayed and the ordering of attributes and of items, but also the grouping of items based on attribute value, and the use of the groupings as a basis for filtering and for specifying additional items. The typical "folder window" has provided a uniform structure for simple attribute display in GUI operating systems; the challenge is to expand its capabilities to handle multiple organizing dimensions and large collections in a comprehensible way.
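
To make these configuration choices concrete, the following Python sketch treats a view as a function of the chosen columns, sort order, grouping attribute, and filter. It is an illustration only, not the SenseMaker design; the function configure_view, the field names, and the sample items are hypothetical.

```python
from collections import defaultdict

# Hypothetical result items; the field names are illustrative.
items = [
    {"title": "Infobus overview", "year": 1996, "source": "journal"},
    {"title": "Map browsing", "year": 1995, "source": "web"},
    {"title": "Video skimming", "year": 1996, "source": "web"},
]


def configure_view(items, columns, sort_by, group_by=None, keep=None):
    """Return {group value: rows}, each row limited to the chosen columns."""
    groups = defaultdict(list)
    for item in sorted(items, key=lambda it: it[sort_by]):
        if keep is not None and not keep(item):
            continue  # filtered out by the user's predicate
        key = item[group_by] if group_by else "all"
        groups[key].append([item[c] for c in columns])
    return dict(groups)


# Group by source, showing title and year, ordered by title.
by_source = configure_view(items, ["title", "year"], sort_by="title",
                           group_by="source")

# Use one group as a filter: only web items from 1996 onward.
recent_web = configure_view(
    items, ["title"], sort_by="year",
    keep=lambda it: it["source"] == "web" and it["year"] >= 1996,
)
```

The same item list yields different tables as the user changes the arguments, which is the sense in which the array is configured dynamically rather than fixed by the application.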

3.4 Invoking Diverse Distributed Autonomous Services

Digital library components vary greatly in their interaction affordances. This is true even for facilities which have quite similar base functionality. For example, while all commercial information providers basically provide for searching over a repository, they differ in specialized affordances. Some may allow simultaneous searches over multiple subcollections, while others don't. Some may provide wild cards in their search language, while others don't. Some rank return values, others sort by date. The design of any system that brings such different but closely related services together under one user interface must include choices about functionality to leave out, or alternatively, to be locally augmented by the interface to provide a degree of uniformity.
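
The option of locally augmenting a service can be sketched as follows. This is a minimal Python illustration under stated assumptions: both provider classes and the uniform_search function are hypothetical, and only "*" and "?" are treated as wildcards.

```python
import fnmatch


class PrefixOnlyService:
    """Hypothetical provider whose native search matches prefixes only."""
    supports_wildcards = False

    def __init__(self, titles):
        self.titles = titles

    def search(self, prefix):
        return [t for t in self.titles if t.startswith(prefix)]


class WildcardService:
    """Hypothetical provider with native wildcard search."""
    supports_wildcards = True

    def __init__(self, titles):
        self.titles = titles

    def search(self, pattern):
        return [t for t in self.titles if fnmatch.fnmatchcase(t, pattern)]


def uniform_search(service, pattern):
    """Offer wildcard search for every service, filtering locally if needed."""
    if service.supports_wildcards:
        return service.search(pattern)
    # Send only the literal text before the first wildcard to the service,
    # then apply the full pattern to the returned candidates in the interface.
    prefix = pattern.split("*", 1)[0].split("?", 1)[0]
    candidates = service.search(prefix)
    return [t for t in candidates if fnmatch.fnmatchcase(t, pattern)]


titles = ["digital library", "digital video", "map"]
from_prefix_service = uniform_search(PrefixOnlyService(titles), "digital*y")
from_wildcard_service = uniform_search(WildcardService(titles), "digital*y")
```

The cost of this kind of augmentation is that the weaker service may return many candidates that are then discarded locally, which matters when results carry a per-item charge.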

Although this question comes up in the design of any integrated interface, it is particularly difficult in the Internet environment with its continual proliferation of new services, providers, and interfaces. Our goal is to provide a comprehensible way for users to combine these in a user-structured environment. Our DLITE interface [[DLITE details]] is one example of how these choices may be made. We are experimenting with interacting with distributed asynchronous services through a drag-and-drop interface that extends the traditional GUI use of drag and drop for application invocation. In its simple form, this may exclude users from some highly advanced affordances of some services, but it makes interaction with many heterogeneous services feasible for any one user. Design choices are complicated by the fact that Digital Library systems are intended to be used by people with a wide spread of computer savvy. They range from casual users to highly trained librarians. This is in contrast to information systems designed to serve limited, specialized user populations, such as the databases used by customer support personnel.

This wide spread of user expertise suggests that maybe two or more fundamentally different interface models are needed to satisfy the whole range of user populations. We need to learn more about whether this is necessary, or whether the underlying services may be sufficiently customizable in their behavior that at least closely related interfaces may be used by novices and experts alike. Alternatively, one could imagine something analogous to fisheye views in providing levels of service functionality. Simple functionality could be at the center of attention, while more sophisticated affordances would be available in the periphery.

3.5 Service Availability, Invocation, Flow, and Status

In the standard personal computer environment, the accessible volumes contain some number of application programs. An application can be started, either with an initial document (drag and drop) or with a default (by double clicking on it or selecting it and choosing Open from the File Menu). A simple list of all loaded applications (regardless of their current state) is available in a menu. Applications do not return results when they complete, but interact with users and files as they run.

In a multiprocessing, networked services environment, the picture is different. Collections of resources and services are available on the network, to be explored or invoked from the interface. They need to be located and initiated, may have varying availability status, and may take more than one argument to initiate (e.g., a source to be searched and a query to be used within it). Once they are started they may run indefinitely, may or may not need further interaction, and may or may not produce a set of results, either when they complete or incrementally as they run. Users may need to check status, pause, restart, and cancel. Services may hang, stop responding, or disappear. They may have incremental costs, so that it is important to have explicit interactive control over how much work they do. For example, a service might be asked to return a few results and wait for confirmation before generating further ones.
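
The "return a few results and wait for confirmation" pattern can be sketched as a small client-side handle. This Python example is an illustration only: the ServiceTask class, its status values, and the in-memory result list standing in for a remote service are all hypothetical.

```python
from enum import Enum


class Status(Enum):
    RUNNING = "running"
    WAITING = "waiting for confirmation"
    DONE = "done"
    CANCELLED = "cancelled"


class ServiceTask:
    """Client-side handle for one invocation of a (hypothetical) remote service."""

    def __init__(self, results, batch_size=2):
        self._results = iter(results)  # stands in for a remote result stream
        self._batch_size = batch_size
        self.status = Status.WAITING
        self.delivered = []

    def more(self):
        """Confirm that the service may produce (and charge for) one more batch."""
        if self.status is not Status.WAITING:
            return []
        self.status = Status.RUNNING
        batch = []
        for _ in range(self._batch_size):
            try:
                batch.append(next(self._results))
            except StopIteration:
                break
        self.delivered.extend(batch)
        # A short batch means the stream is exhausted.
        full = len(batch) == self._batch_size
        self.status = Status.WAITING if full else Status.DONE
        return batch

    def cancel(self):
        """Stop the task; results already delivered are kept."""
        if self.status is not Status.DONE:
            self.status = Status.CANCELLED


task = ServiceTask(["r1", "r2", "r3"], batch_size=2)
first = task.more()   # the user asks for the first batch
```

Because no work happens until more() is called, the interface can show the status and any accumulated cost between batches and let the user decide whether to continue or cancel. A real handle would also need to represent services that hang or disappear, which this sketch omits.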

Unix and other multi-process operating systems provide facilities for process status and management, but these are generally in text command-line form and are not designed for a proxied multi-server environment. They are a first step towards the kinds of facilities that need to be provided but cannot serve as a full conceptual model for the digital library environment.

As one final note on the challenges, it is important to reiterate that they are not specific to services that are oriented to a digital library (which in itself is not a well-defined subclass of computer-based systems). The challenges are relevant to any multitasking distributed service environment, in which design and control are not subject to a centralized integration. We see the work on developing interfaces for digital libraries as a part of a larger picture of developing the next generation of online user environments.



4. Bringing Uniformity to User-Structured Interfaces

Stating challenges is one thing; figuring out how to deliver solutions to them is another. The history of user interface design has been full of important problems that weren't really solved, good ideas that never made it into practice, and in some cases the successful integration of new models. The larger challenge, then, is to find ways to bring uniform models into the world of practice.

As a model, consider the history of personal computer interfaces. The designers of the first applications available on the Apple II and the DOS-based PC built their interfaces from scratch, using the basic system facilities for screen and keyboard. This meant that the same operation (e.g., opening a file or selecting an item for editing) might be done in quite different ways in different applications.

The Xerox Star [[ref to Star and profile in BDS]] offered a comprehensive interface that operated uniformly across the tools for working with different kinds of documents (text, drawings, tables, formulas, etc.). Every part of the application used the same set of "universal commands" for selection, copying, moving, editing properties, etc., and standard direct manipulation techniques allowed users to manage files and directories (folders and file-cabinets), to print, and to communicate documents to other people. The design goal was to make the overall system (not just one application) easy for people to learn and remember. Users could develop a conceptual model [[ref to Liddle in BDS]] that enabled them to determine how to get an operation done, and to predict what would happen across a wide range of contexts.

With the Macintosh, the degree of uniformity was less than with the Star, since the overall system strategy shifted from Xerox's integrated one-supplier approach to an approach that accommodated large numbers of independent software developers on a common platform. Through the Macintosh Human Interface Guidelines [[ref to guidelines and to profile in BDS]] and the associated toolbox that came with the Macintosh OS, Apple attempted to enforce standards across applications, in order to make the Macintosh easy to use for unsophisticated computer users. They promulgated their standards in several ways:

4.1 Standard components

The Macintosh OS provided standard code for things like window management (opening, closing, scrolling, sizing, etc.), which could be used in every interface. Apple made it easy for developers to handle parts of the interface without having to do the work themselves. Of course, an application could have its own interface for aspects of managing windows (and many did), but unless there was an overriding reason, the natural course was to use the standard, which meant that it was uniform across applications.

4.2 Standard hooks

The Finder (the desktop interface for managing applications and files) provided the user with standard affordances to execute operations. For example, if the icon for a document of appropriate type is dragged onto the icon for an application, the application will be started up with that document as its focus. This might mean different things to different applications (e.g., in some cases the focus document might be the one to edit, while in others it might be a parameters or preferences file). But developers were expected to do something reasonable (meeting user expectations).

4.3 Standard menus and menu items

The Guidelines were very explicit about the location and contents of a few crucial menus that were relevant to every application: the File menu, with its items to Open, Save, Print, and Quit; the Edit menu, with Cut, Copy, and Paste; and so on. Although the detailed execution of these items might vary from application to application, the user could count on their being in a known place, and on their having relatively uniform effects.

4.4 Standard conventions for manipulation

The Macintosh provided basic tools for direct manipulation of objects that could be used in a wide variety of applications. In a word processor, the objects are words, characters, paragraphs, and the like. In a spreadsheet they are cells, formulas, and ranges. In a drawing program, they may be lines, circles, shaded areas, etc.; in a mail program, mailboxes and messages. In each case, the Mac user can expect certain actions to have predictable consequences. Clicking on an object "selects" it, which then makes it the target of subsequent operations. Holding down the shift key while clicking causes a new object to be selected in addition to (rather than replacing) the previous selection. Dragging an object "moves" it, which may be a literal spatial move (as in drawing programs) or a structured transfer (as in moving a message to a mailbox). Dragging with the option key down moves a copy instead of the original, and so forth. These conventions are not followed uniformly in all applications, but as with the other items above, they offer the user a good heuristic for knowing how to get something done in an application without having learned all of the details of its command structure.
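The value of these conventions is that they can be stated once, independently of what the objects are. As a rough sketch (not Apple's toolbox code; the Workspace class and its click and drag methods are hypothetical names invented for this illustration), the four conventions just described fit in a few lines that know nothing about whether the objects are messages, cells, or shapes:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Workspace:
    """Hypothetical application-neutral model: named containers of objects."""
    containers: Dict[str, List[str]] = field(default_factory=dict)
    selection: Set[str] = field(default_factory=set)

    def click(self, obj: str, shift: bool = False) -> None:
        """Click replaces the selection; shift-click adds to it."""
        if shift:
            self.selection.add(obj)
        else:
            self.selection = {obj}

    def drag(self, obj: str, src: str, dst: str, option: bool = False) -> None:
        """Drag moves an object; option-drag leaves the original in place."""
        if not option:
            self.containers[src].remove(obj)
        self.containers[dst].append(obj)


# The same model read as a mail program: mailboxes holding messages.
mail = Workspace(containers={"inbox": ["msg1", "msg2"], "archive": []})
mail.click("msg1")
mail.click("msg2", shift=True)                         # both now selected
mail.drag("msg1", "inbox", "archive")                  # move
mail.drag("msg2", "inbox", "archive", option=True)     # copy
```

Substituting "drawing" and "clipboard" for the mailboxes, or cells for the messages, changes nothing in the class, which is the sense in which the conventions give users a heuristic that carries across applications.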

The intent of the Macintosh guidelines was to create a user-centered environment, rather than a platform for multiple independent applications, each with its own environmental design. This approach was copied in IBM's attempt to define a standard character-based interface in CUA [[refs]] and in NextStep, OSF Motif, and the evolving versions of Microsoft Windows. In each case, the goal was to create a uniform environment that provided users with a way of understanding what to do that cut across the multiple applications and application vendors.



5. Practical Strategies

How practical is it to get any of the designs adopted beyond the confines of our own testbed? Apple's Macintosh guidelines succeeded in large part because they were used by developers for whom the Macintosh was a novel technology, and nothing on the market offered competing alternatives for the kinds of interfaces they were developing. In today's world that is far from the case. The development of interfaces of all kinds is driven by the commercial concerns and visions of the companies that dominate the market: Microsoft, Apple, Sun, and newcomers like Netscape. Even if it could be argued that an interface developed in one of the digital library projects is in some sense "better," it will not see general adoption as a user-centered environment. The fight for control over the environment is a high-stakes, highly contested battle among the software giants.

In developing a digital library interface, there are several potential responses to this practical barrier:

5.1 Leave the user-centered interface design to the commercial players, and focus on developing task-centered, collection-centered, or document-centered interfaces that depend on the specific characteristics of your materials.

This alternative is greatly facilitated by the emergence of standards such as Java and ActiveX for making application interfaces available on the Web. A specialized digital library interface (e.g., for browsing videos, searching maps, or marking up OCR'ed pages) can be provided to anyone with a web browser. The general capabilities provided by environment builders will be supplemented by interfaces taking advantage of the specialized structure of the collection, document, or task.

5.2 Build a demonstration system that clearly presents the important aspects of a user-centered design, and has sufficient use to test their value and importance.

This is the alternative we have taken in developing the DLITE interface. It reflects a clear distinction between the goals that are appropriate for product development and those that are appropriate for research. In the product world, the key issue is competitive advantage. Ideas (as embodied in products) will be adopted only if the product dominates the competition as an overall package (including its capabilities, the conversion and learning costs, predictions of future extensions, relation to legacy systems, etc.). In the research world, the ideas themselves are the medium of impact, with demonstrations and implementations playing the critical role of developing and communicating the ideas. Success in this dimension is not measured by number of units sold or distributed, but in the degree to which the underlying ideas and designs become accepted and disseminated in future systems.

5.3 Foster a developer community that will adopt (and evolve) the models and mechanisms.

For many of the reasons discussed previously, it is unlikely that the commercial developer community will be motivated to make digital library interface research one of their design priorities. However, there does exist a community of digital library researchers, such as those in the Digital Library Initiative projects [[DLI ref]], who are developing interfaces for a variety of collections, documents, and tasks, and who have common needs for many of the interaction elements identified in section 7. It is possible to appeal to this community and to build up a body of applications that would carry much more influence than those built by any single project. The mechanisms for specifying and promulgating the standard elements could include any or all of those described in section 4, ranging from plug-in code components to guideline specifications.

This is the fundamental strategy we have been following with the InfoBus architecture [[Infobus ref]], with its protocols for accessing information sources and services. We have been collaborating with other digital library projects and with industrial partners (such as Knight-Ridder and Xerox) to integrate their services and collections onto the InfoBus, and in turn to provide their user interfaces with access to resources from all of the sites.

There are many questions to be answered before embarking on the third strategy as a model for interface development, concerning both the technical details and the social/intellectual environment in which digital library research is being carried out. Developers are always reluctant to give up any of their autonomy in choosing design tradeoffs for the less tangible benefits of compatibility with others. Some of the dimensions (such as those dealing with user identity, trust, long-term persistence, etc.) can only be developed if there are institutions that transcend a single university or company, which can provide a trusted basis for a domain of transactions. The form that these will take on the Internet is still undetermined.

And finally there is the overriding concern of complexity. Anyone with experience in software development will shudder at the breadth of the list in section 7, knowing that any system attempting to handle all (or even most) of the items in depth will be pushing the limits of the complexity at which implementation remains feasible. The HTTP and HTML standards stand as a stark case in point. They represented drastic reductions of capability from text formatting standards and information transport protocol standards that already existed. The list of things that they couldn't handle was huge, ranging from simple text placement and section numbering to user login and session maintenance. But the web succeeded, not in spite of these simplifications, but because of them. The key insights were in recognizing which hard problems could simply be ignored, making implementation easier while providing an overall functionality that gave substantial new value.

Conclusion

At today's point in the history of interface designs, we may be at a stage that is analogous to the period when Apple II and DOS interfaces were proliferating. There are lots of good ideas, little uniformity (of model or of mechanism), and an opportunity to create a good deal more coherence. We will never find a single universal digital library interface, but we can create the tools and models on which to build a family of interfaces that provide for the challenges we have discussed.


References (still rough and fragmentary)

[ADL] The Alexandria Rapid Prototype: Building a digital library for spatial information. Proceedings of the 1995 ESRI User Conference, Environmental Systems Research Institute, Inc., May 22-25, 1995.
http://alexandria.sdc.ucsb.edu/

Apple Computer. Human Interface Guidelines: The Apple Desktop Interface. Reading, MA: Addison-Wesley, 1987.

[Sensemaker] Michelle Baldonado ....

Bush, Vannevar. As we may think. Atlantic. 176 (July, 1945), 101-108.

[DLI] Summaries of the six DLI projects from the Special Issue on Digital Libraries, IEEE Computer, May 1996.
http://dli.grainger.uiuc.edu/national.htm

Edwin Hutchins, James Hollan, and Donald Norman, Direct Manipulation Interfaces, in D. Norman and S. Draper, eds., User Centered System Design, Erlbaum, 1986, pp. 87-124.

[SuperBook] Egan, D.E.; Remde, J.R.; Gomez, L.M.; Landauer, T.K.; Eberhardt, J.; Lochbaum, C.C. Formative design-evaluation of SuperBook. ACM Transactions on Information Systems 7:1 (January, 1989), 30-57. See also <http://superbook.bellcore.com/SB/>

Gaver, Bill. Technology Affordances, CHI 91 Conference Proceedings, ACM Press, 1991, pp. 79-84.

Highwire Press <http://highwire.stanford.edu>

Johnson, Jeff, Terry Roberts, William Verplank, David C. Smith, Charles Irby, Marian Beard, and Kevin Mackey. Xerox Star: A retrospective, IEEE Computer 22:9 (September, 1989), 11-29.

Journal Storage Project (JSTOR) <http://jstor.umdl.umich.edu/>

[Informedia] Christel, M., Kanade, T., Mauldin, M., Reddy, R., Sirbu, M., Stevens, S., & Wactlar, H. Informedia Digital Video Library. Communications of the ACM, 38(4):57-58 (1995).

[UPAI] Steve Ketchpel....

Liddle, David. Starting with the User Experience. In T. Winograd (ed.), Bringing Design to Software. Reading MA: Addison-Wesley, 1996. xx-xx.

Norman, Donald. The Design of Everyday Things. New York: Basic Books, 1988.

[Berkeley] Virginia E. Ogle and Robert Wilensky, Testbed Development for the Berkeley Digital Library Project, D-Lib Magazine, July 1996, ISSN 1082-9873.

Paepcke, Andreas, Steve Cousins, Hector Garcia-Molina, Scott Hassan, Steven Ketchpel, Martin Röscheisen, and Terry Winograd. Using distributed objects for digital library interoperability, IEEE Computer 29:5 (May, 1996), 61-68.

[MVD] Thomas A. Phelps and Robert Wilensky, Multivalent Digital Documents in UC Berkeley's Digital Library Project, SIGLINK, Vol. 4, No. 2, September 1995.

Winograd, Terry (ed.). Bringing Design to Software. Reading MA: Addison-Wesley, 1996.