Digital Imaging: a practical handbook.

Stuart D. Lee. London: Facet Publishing, 2002.
ISBN: 1-85604-353-3. GBP 24.95.

Review by Edward Vanhoutte

Now that digital imaging is becoming an increasingly important technique in electronic scholarly editing, handbooks and guides to good practice on the subject are most welcome to scholarly editors working in the electronic age. Originally written for the electronic library community, this book proves useful for anyone who wants to understand the theoretical and, above all, the practical implications of a digital imaging project. Drawing on his own experience as project director of, among others, the Wilfred Owen Multimedia Digital Archive and as author of a Mellon Foundation funded scoping study into possible future digitization activities at the University of Oxford, Stuart Lee, who is the head of the Learning Technologies Group of Oxford University Computing Services, offers structured guidance for everyone who is about to embark on a digitization project or is interested in this growth area. In five pragmatic chapters, the handbook follows the life cycle of a digital imaging project from inception to completion and alerts the reader to the possible pitfalls of such a project. Thanks to its clear structure and language, the book can also be used as teaching material.

Taking a hypothetical project as an example (a didactic technique which Lee repeats throughout the book and which emphasizes its handbook character), the first chapter introduces some of the basic questions and definitions underlying the term 'digitization' and the digital project. Starting from the concise definition 'Digitization is the conversion of an analog signal or code into a digital signal or code' (3) and an introductory treatment of the questions What is digitization? and Why do we digitize material?, this chapter provides the reader with a step-by-step overview of a typical digitization project life cycle, which is further developed in the following chapters. Going beyond the digitization chain, which focuses on the purely technical side of digitization and which Peter Robinson described in his The digitization of primary textual sources (Oxford, 1993), the digitization project life cycle described in this book also includes the instigation of the project, the assessment, selection, and preparation of the material, and the editing, delivery, and support of the digitized material, calling in expertise from subject and conservation experts, digital and film photographers, cataloguers, IT specialists, management, and administration. This raises the question of costing, which is dealt with quite extensively in chapter four and for which a handy reckoner is provided. The reckoner allows one to calculate not only the costs of hardware, software, and the digitization work as such, but also the cost of supporting staff who prepare, select, and repair the material and devote their time to clearing copyrights. This reckoner is typical of the handbook character of this volume. The procedures explained in the book are often illustrated, summarized, and schematised by checklists, decision matrices, flowcharts, and tables. Some questionnaires which can help prospective project leaders to carry out structured assessments are appended to the book.

The second chapter then looks at the first two stages of a digitization project: instigation, and selection and assessment. In scholarly editing projects, digitization is very often undertaken in reaction to a specific request. Rather than relying on blurry photocopies of documentary source material or on black and white microfilms, scholarly editors nowadays usually request full colour images of inaccessible material for their transcription work. The input for a collation process can also be derived from captured images, and very often the publisher of a scholarly edition will want to include several (full colour) images in the publication, be it in printed or electronic form. To meet all these demands, digitization projects must be instigated. Lee calls this reactive digitization. Here, the editor, or the person in charge of the production side of the edition, will need active knowledge of digitization techniques in order to select and assess the material and suggest plausible digitization strategies and methodologies, or to call in and instruct a digitization expert. Passive knowledge of digitization, standards, and techniques will suffice when the editor works with a collection which has already been digitized by its governing body for reasons of preservation and access, which Lee calls proactive digitization. Whichever scenario applies to the editorial project, someone will have to assess and select the items for digitization and provide an answer to the basic question What are you digitizing? Following the decision matrix proposed in this second chapter, the user must assess each item in an archive or collection against criteria such as the copyright status of the material, the need for better access or for preservation, and the extent to which the project adheres to institutional and commercial strategies.

Once the collection has been prioritized or selected, the digitization of the material can start. The third chapter explains why digitization is not a simple process by presenting a technical overview which covers dots, pixels, resolution, compression and interpolation, digital image file formats, digitization hardware (which is dealt with quite extensively), and software. The second part of this chapter focuses on the digitization assessment stage. By carrying out such an assessment exercise, a digitization expert, armed with the knowledge outlined so far in the book, will be able to suggest the best digitization procedure and establish whether the digitization should be from the original or from a surrogate. Through the assessment exercise, the expert will see 'whether there is sufficient hardware, software, and expertise available to complete the project, bearing in mind the constraints imposed by time and money.' (70) The feasibility checklist included in this chapter proves, once again, to be an important instrument for assessing and prioritizing projects.

Following the life cycle of the digitization project as introduced in the first chapter of the book, the fourth chapter focuses on the next steps: the preparation of the material, a crucial phase in the digitization process which is very often overlooked, and its actual digitization. The extensive digitization ready reckoner for time and cost, presented at the end of the chapter, systematically revisits the chapter's contents and yields a rough idea of the cost of the complete digital capture of the archive. Depending on the condition, extent, and format of the material, the digitization methods and image formats selected, and major strategic decisions (in-house digitization or outsourcing), this reckoner will give you a rough estimate of the time and cost of copyright clearance, preparation of the material, staff, hardware and software, and digitization up to the point of assembling the digital archive, which is the focus of the fifth and last chapter.

Once the digitization process is completed, the project is not over yet. The creation of metadata, archiving, and the provision of user access to the collection will have to be considered. At the beginning of this chapter, Lee compares the standard text-based approach to cataloguing digital images with content-based image retrieval (CBIR), which is a fairly new concept. A CBIR system automatically extracts information from the image, which can then be indexed and searched. 'In essence, CBIR software which uses advanced algorithms, can process an image by extracting information about the colours used, the textures and the various shapes in the picture.' (105) Practically all current digitization projects use the text-based approach to cataloguing. Indeed, CBIR software is still limited, and users feel somewhat lost when they have to search a catalogue not by lemma but by shape. As the systems and the software become more advanced, there is certainly a future for CBIR. The input of a text-based catalogue consists largely of (automatically extracted) metadata, which can be as concise or as extensive as the needs of the project dictate. Lee gives a detailed analysis of different metadata systems and of what should or can be recorded, and introduces SGML, XML, and HTML as possible markup languages for metadata systems. Next, he touches upon the TEI (Text Encoding Initiative) and EAD (Encoded Archival Description) document type definitions. Delivering the image or the full text (through OCR) to the user is the next challenge addressed in this concluding chapter. Different delivery systems and methods are discussed and illustrated with case studies of successful on-line projects. The chapter closes with a well-documented section on copyright and image protection (watermarking), and some notes on archiving digital files.

The conclusion brings everything together and revisits the complete life cycle of a digital imaging project, presenting a final workflow chart for the project. To this Lee adds his hypothetical, utopian vision of an ideal digitization service.

The remainder of the book provides three appendices: an annotated list of international digital imaging projects, a series of questionnaires to be used in surveying potential digitization activities, and a select reading list for further study. An index completes this excellent handbook, which takes the novice on a ride through a whole new world of digitization and points out the difficulties and pitfalls of digitization projects to the advanced reader.

© 2004 CTB, and Edward Vanhoutte

This text is also published in Dirk Van Hulle & Wim Van Mierlo (eds.), Variants. Reading Notes. The Journal of the European Society for Textual Scholarship. 2/3 (2004). Amsterdam/New York, NY: Rodopi, 2004, pp. 352-356.

