Hypothesis
1. Immediate audio annotation of digital photographs is one of the most convenient ways to quickly and efficiently tag images for future search.
2. The information users put into the voice annotation is relevant to the photos and can be used efficiently to search for them.

Evaluation Plan
We have already conducted the first set of user tests to evaluate our hypotheses. In this set of evaluations, we used a simple prototype consisting of a camera and a microphone to assess whether users are comfortable with using audio recording to annotate images. We asked the users to operate the camera in three modes: recording the audio before taking the image, recording the audio while taking the image, and recording the audio immediately after taking the image. We observed the following aspects:
1. Which mode, if any, are the users most comfortable using?
2. How do the users respond to indications of a poor GPS signal or a poor audio recording level?
3. What content do users record to annotate the images, and how long, on average, are the voice recordings?
For the second stage of evaluations, we plan to test the relevance of the audio annotations to the actual images. For that stage, we will convert the recorded audio annotations into text tags for the images (either by hand, introducing random errors of the kind associated with speech recognition, or by using actual speech recognition software), and ask the users who took the images to search for specific ones; a rough code sketch of this conversion step appears after the Further Development section. From our initial observations, we believe that the typical audio annotations the users record may not be sufficiently relevant because they contain too little information.

Current prototype
The current prototype consists of an ordinary digital camera and a digital recorder/player (with an external microphone/earphone headset) capable of time-stamping recorded audio files. The camera has simulated "Record" and "Play" buttons. The user testing the system wears the headset, and as he or she takes images and presses the additional buttons to record audio annotations, the experimenter operates the digital recorder. In addition, the user is given information about the state of the system (simulated GPS signal level, simulated audio recording level) via changing paper slides overlaid on the camera screen.

Further Development
To complete the evaluation of the second part of our hypothesis, we plan to build a piece of software that uses the data collected in the first set of tests to let users search for the images they took. The software will allow searching for images by keyword, location, and other image characteristics (date/time taken, exposure), as sketched below. We will use the software to test whether the information the users added to the images with their voice is actually relevant to the content. Since some time will have passed since the images were taken, the users are expected to have only a vague recollection of the image content and of what they recorded, which closely simulates a real-life situation.
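To make the second-stage preparation concrete, the sketch below shows one way it could work: each photo is paired with the closest time-stamped audio annotation, and, for the hand-transcription condition, random word-level errors are injected to mimic a speech recognizer's mistakes. This is only a minimal illustration; the function names, the data layout (lists of timestamped tuples), and the error model are our own placeholder assumptions, not part of the prototype.

```python
import random
from datetime import timedelta


def pair_annotations(photos, annotations, max_gap_seconds=60):
    """Pair each photo with the closest audio annotation by timestamp.

    photos:      list of (datetime, filename) tuples
    annotations: list of (datetime, transcript) tuples
    Annotations farther than max_gap_seconds from a photo are ignored.
    """
    paired = {}
    for photo_time, photo_file in photos:
        best = None
        best_gap = timedelta(seconds=max_gap_seconds)
        for audio_time, transcript in annotations:
            gap = abs(audio_time - photo_time)
            if gap <= best_gap:
                best, best_gap = transcript, gap
        paired[photo_file] = best  # None if no annotation is close enough
    return paired


def corrupt_transcript(transcript, error_rate=0.2, seed=None):
    """Crudely simulate speech-recognition errors: at the given rate,
    randomly drop a word (deletion) or scramble it (substitution)."""
    rng = random.Random(seed)
    words = []
    for word in transcript.split():
        roll = rng.random()
        if roll < error_rate / 2:
            continue                   # deletion error
        elif roll < error_rate:
            words.append(word[::-1])   # stand-in for a substitution error
        else:
            words.append(word)
    return " ".join(words)
```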
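Similarly, a minimal sketch of the planned retrieval software is given below, under stated assumptions: each image has a simple record holding the words from its transcribed annotation, the capture time, an optional GPS fix, and the exposure, and the search filters those records by keyword, date range, and a crude location test. The record layout and the search function are hypothetical placeholders; the actual tool may store and query the data differently.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class PhotoRecord:
    """One image plus the metadata collected during the tests (hypothetical layout)."""
    filename: str
    taken_at: datetime
    tags: List[str] = field(default_factory=list)    # words from the transcribed annotation
    location: Optional[Tuple[float, float]] = None   # (lat, lon) from the simulated GPS
    exposure: Optional[str] = None


def search(records, keyword=None, after=None, before=None, near=None, radius_deg=0.01):
    """Filter photo records by annotation keyword, capture date, and rough location."""
    results = []
    for rec in records:
        if keyword and keyword.lower() not in (t.lower() for t in rec.tags):
            continue
        if after and rec.taken_at < after:
            continue
        if before and rec.taken_at > before:
            continue
        if near:
            if rec.location is None:
                continue
            # crude "nearby" test: a small bounding box in degrees
            if (abs(rec.location[0] - near[0]) > radius_deg
                    or abs(rec.location[1] - near[1]) > radius_deg):
                continue
        results.append(rec)
    return results
```

In the planned study, the tags field would be filled from the (possibly corrupted) transcripts produced in the step sketched above, so that search quality reflects both what the users said and realistic recognition errors.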
Related Work
Naaman, Mor, et al. "Context Data in Geo-Referenced Digital Photo Collections." This paper describes a method of adding a variety of metadata to images based on time and location information. The researchers also conduct a user study to assess which aspects of the metadata are most useful when looking for a particular image. See also other papers by Mor Naaman.
Mills, Timothy J., et al. "Shoebox: A Digital Photo Management System." This paper describes a piece of software developed at AT&T Laboratories that is used to browse and search images carrying tags transcribed from audio annotations. The paper reports that retrieval over images annotated in that fashion was faster and more precise than manual retrieval. http://www-lce.eng.cam.ac.uk/publications/files/tr.2000.10.pdf
Show&Tell Project, University at Buffalo. This is an image annotation system that, along with tagging images on many other characteristics, includes methods for parsing and interpreting the speech annotations. The authors report positive feedback about the software from users in the intelligence analyst community.
Projects of interest also include: FotoFile, CompuPic.