Printed words on digital approaches towards editing written works.

Edward Vanhoutte

A review of

When in 1980 Susan Hockey and Robert Oakman each published an introductory volume on the use of computing for humanities students, respectively A Guide to Computer Applications in the Humanities (Hockey 1980) and Computer Methods for Literary Research (Oakman 1980), textual scholars were at once provided with two reference chapters on textual criticism and textual editing with the computer. Technologies and insights have changed since, but both chapters still serve as standard introductions to the field.

A comparison of Hockey (1980) with her Electronic Texts in the Humanities (Hockey 2000), which appeared two decades later, reveals the main shift in the debate on electronic (assisted) editing over time. Hockey rightly presents this shift on a chronological before-and-after axis, with her book somewhere in the middle, explaining the before and introducing the after. The chapter therefore refocuses when the author states that "It is not surprising that editors have for long viewed the computer as a useful tool for typesetting complex editions. [...] But more and more editors are attracted to the idea of distributing and publishing editions electronically." (Hockey 2000, 131 & 132). This shift also shows in the renaming of the chapter. Whereas the 1980 chapter was named 'Textual Criticism' and focused primarily on the use of the computer in the collation of manuscripts and in establishing the relationships between manuscripts, the 2000 chapter is named 'Textual Criticism and Electronic Editions', and two thirds of it is devoted to genuine electronic editions, or electronic editions pur sang, which are created, live, and are distributed electronically.

This shift is of course the result of the development of new tools and techniques, not to mention the advent of the desktop PC in the eighties with its exponentially growing storage capacity and processing speed, and the launch of the World Wide Web as a universal and graphical disseminating and publishing medium only in 1993. But the major contribution to the field of electronic editing, and to humanities computing in general, was made by the work of the Text Encoding Initiative (TEI), which constituted a methodological shift in the textual sciences, and which is now maintained by the TEI Consortium. With the publication of the voluminous green P3 Guidelines for Electronic Text Encoding and Interchange in 1994 (Sperberg-McQueen & Burnard 1994), scholarly editors were presented with recommendations for the 'Transcription of Primary Sources' (chapter 18) and the encoding of the 'Critical Apparatus' (chapter 19). A frequently heard critique from scholarly editors who work with modern material is that the guidelines are too clearly focused on the production of transcriptions and editions of older material. A close look at the workgroups which created the document type definition subsets for these two chapters shows why that is: the majority of the members were scholars of older texts. Adapting these guidelines to the transcription and edition of modern material may be a difficult exercise and may need some stretching of the guidelines, but it is by no means impossible. One of the basic principles which the Guidelines defend is that every scholar should have the freedom to express his or her own theory of the text by means of text encoding and markup. Therefore, the TEI provides humanities scholars of any discipline, language or writing system with a very powerful extension mechanism with which the TEI Document Type Definitions (DTDs) can be adapted at will.
A clever piece of software called 'The Pizza Chef' even allows you to generate your customized TEI DTDs on-line. The work of the TEI will no doubt become even more important in the future, now that the TEI Consortium has issued a completely revised and XML-compatible version of its Guidelines for Electronic Text Encoding and Interchange (TEI P4) (Sperberg-McQueen & Burnard 2002) in a fashionable blue and silver.

The revision of the Guidelines has resulted in a couple of changes when compared to TEI P3. Apart from the typography, which has been changed with mixed success, the elements are now introduced with their respective attributes, which was not the case in P3. The frequently consulted alphabetical reference list of classes, entities and elements at the end of the second volume has been cleared of some systematic errors and omissions, and the editors have changed the format of this section substantially "we hope for the better" (1059). It remains to be seen whether this typographical revamping indeed presents the reference section in a better way. But then again, the freely consultable on-line version of the Guidelines is an extremely useful and user-friendly manual, for those who do not mind reading on screen, that is. Further changes to the P3 version are documented in the prefatory notes at the back of the book and comprise the complete XML-ization of the TEI fragments with backwards TEI-SGML compatibility (tag omission is not allowed anymore), the validity check on all the examples scattered throughout the guidelines, and a new chapter 2, 'A Gentle Introduction to XML'. A new <ab> (anonymous block) element was introduced "to contain any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph." (719).

For the textual critic who worked with TEI in creating electronic editions back in the SGML era, two questions remain with the publication of P4. Firstly, the riddle of the different content models of <add> and <del> (specialPara and phrase.seq respectively) has not been solved, and secondly, the TEI community would be greatly helped by an XSLT stylesheet for the transformation of normalized (thus capitalized) SGML legacy data to case-sensitive TEI XML. The TEI Consortium has commissioned a working group to look into the problems of migrating legacy data from SGML to XML. It is to be hoped that the work of this group will enable projects to remain compatible with the upcoming P5 version of the TEI.
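The case problem mentioned above can be illustrated with a small sketch. It is written in Python, not as the XSLT stylesheet wished for here, and it rests on several assumptions: the legacy data has already been turned into well-formed XML with capitalized names, the lookup table CANONICAL_NAMES is a hypothetical and deliberately tiny excerpt of mixed-case TEI names, and the function names are invented for this illustration. It does not address other migration problems such as omitted tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical, deliberately tiny lookup table: capitalized legacy names
# mapped to mixed-case TEI element names. A real migration would need
# the complete list of element and attribute names.
CANONICAL_NAMES = {
    "TEI.2": "TEI.2",
    "TEIHEADER": "teiHeader",
    "FILEDESC": "fileDesc",
    "TITLESTMT": "titleStmt",
}


def canonical_name(name):
    """Return the mixed-case form of a name, or plain lower case if unlisted."""
    return CANONICAL_NAMES.get(name.upper(), name.lower())


def recase(xml_text):
    """Rename all elements and attributes of a capitalized document."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = canonical_name(element.tag)
        # Copy the attributes first, then re-add them under their new names.
        old_attributes = list(element.attrib.items())
        element.attrib.clear()
        for key, value in old_attributes:
            element.set(canonical_name(key), value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    legacy = (
        '<TEI.2><TEIHEADER TYPE="text"><FILEDESC/></TEIHEADER>'
        "<TEXT><BODY><P>a <ADD>b</ADD></P></BODY></TEXT></TEI.2>"
    )
    print(recase(legacy))
```

The point of the lookup table is that simple lower-casing is not enough: names such as teiHeader or fileDesc are mixed-case, so a case-sensitive XML parser needs the exact spelling for every element.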

The development and application of the TEI recommendations, and the critique of them, have resulted in some new philosophical and philological challenges which are articulated in new ways of thinking about texts, text ontology, heuristics, the semantics of markup, the preservation of data, the academic evaluation of humanities computing, and the definition of a field or a community. In contrast with the existence of introductory books on tools and techniques for the manipulation and analysis of electronic texts, no comprehensive introductory volume exists on the more philosophical questions. They can, however, be found in (internal) reports, articles and (conference) papers, as was the case in the early days of the use of the computer as a tool in humanities subjects. Next to the ongoing debates at the main conferences and colloquia, a couple of important contributions towards this new field of reflective thinking about Humanities Computing and electronic scholarly editing have appeared in various essay volumes over the last two years.

The volume New Media and the Humanities. Research and Applications (Fiormonte & Usher, 2001) publishes 14 essays as the proceedings of the first seminar Computers, Literature and Philology, which took place in Edinburgh in September 1998. The CliP seminar saw its fifth birthday in 2002, and has established itself as an international forum for discussion on the interdisciplinary and intercultural characteristics of Humanities Computing. In the words of the editors' introduction:

The papers cover the gamut of current activity from the study of philosophical implications, through the technical aspects of encoding, the practical difficulties of applying standardised procedures to individual projects, to the crucial organisation of informatics communities in the humanities, and the use of computers in the teaching of literature. (Fiormonte & Usher 2001, viii).

That "current" must of course be read in the context of the 1998 colloquium.

At least six of these essays, one of which is published in Italian (Massimo Guerrieri's 'Per una edizione informatica dei Mottetti di Eugenio Montale: varianti e analisi statistica'), are of direct interest to the theory and practice of electronic editing. Claire Warwick in her contribution '"Reports of my death have been greatly exaggerated": Scholarly editing in the digital age' (49-56) warns against the danger of considering electronic texts merely as data, which may result in the creation of non-scholarly repositories of electronic texts, of which we all know many examples. The creation of such digital repositories or digital libraries should be governed by a theory of non-critical editing, to which very little attention is paid in the current handbooks and survey articles on scholarly editing. The specialized journals mainly, if not exclusively, focus on the theory and practice of critical editing. In his article 'Literal transcription – Can the text ontologist help' (23-30) Allen Renear points out that the distribution of attention to and resources for the study of critical editing and the study of non-critical editing is in inverse proportion to their relative practical importance. He suggests that a theory of literal transcription can be derived from a principled methodology such as a form of text ontology. If literal transcription "gives us the text, the whole text, and nothing but the text" (25), Renear argues, we should ask the question "What is text?" Providing an answer to this question is in itself a practical matter which should be project driven. Only on the basis of an ontology of the text can a single encoding scheme be designed which makes a selection from the huge variety of scholarly knowledge at our disposal, to be expressed in a single semiotic system.
Whereas the goals of critical and non-critical editing are clearly distinguishable (transcription aims at the representation of the physical object, while the idealised version of the object which results from editing may well never have existed physically), Lou Burnard contends that "the process by which they are achieved seem strikingly similar, involving the same essentially interpretative relationship between the agent/reader and the object/text." (35), and that they just use different markup schemes, i.e. different research agendas which result from a hermeneutic activity. In his essay 'On the hermeneutic implications of text encoding' (31-38), Burnard emphasizes the importance of a single encoding scheme to the emerging discipline of digital transcription: "By using a single formalism we reduce the complexity inherent in representing the interconnectedness of all aspects of our hermeneutic analysis, and thus facilitate a polyvalent analysis." (37). Through this mediation by a set of codes, the computer can process human interpretation. This is corroborated by Fabio Ciotti, who in his meditation on 'Text encoding as a theoretical language for text analysis' (39-47) states that "the transcription process of a text is oriented from the very beginning by criteria chosen by who performs it." (43). The question of text ontology and the choice of markup scheme is touched upon again when Ciotti considers encoding language as a theoretical language which is used by the scholar "to build up theories or models of textual phenomena he is interested in, and to explain his interpretative hypothesis about a certain object of study (the text and its feature at a certain level of analysis)." (45). Staffan Björk and Lars Erik Holmquist add to this textual debate an attention to the visualisation of textual material which exists in different variants.
After describing the principles of information retrieval systems and visualization techniques, they introduce in their article 'Exploring the Literary Web: The Digital Variants Browser' an interesting tool which is in use at the Department of Italian at the University of Edinburgh. Through contracts with contemporary Italian authors, the Italian Department makes different stages of writing available to students of text, so that they can recognize and study the complex writing phenomena underlying the final version of a literary work. The Digital Variants Browser adapts the flip zooming technique to a split zooming technique which allows comparison between different versions or writing stages of a text as a focus in context. Further, the graphical presentation of the material is equipped with powerful searching capabilities which highlight the search string in all variant texts. The student can then concentrate on an occurrence of special interest by making use of the focus + context technique.

The volume New Media and the Humanities further consists of introductory articles on humanities computing and electronic philology by Willard McCarty and Francisco A. Marcos Marín, on hypertext as a critical discourse by Federico Pellizzi, on computational linguistics by Antonio Zampolli and Elisabeth Burr, on researching and teaching literature by Giuseppe Gigliozzi, on sounds and their structure in Italian narrative poetry by David Robey, and on the postmodern web by Licia Calvi. The book is an intellectually entertaining collection of essays, both in its contents and in its form. The different papers invite the reader to find the common denominator amongst the essays and to confront the different views and standpoints demonstrated. The form of the printing invites the reader to solve the many typos and the references which do not match the list of works cited at the back of the book.

Up to now we have mainly dealt with a text-centred approach to humanities computing and electronic editing, but electronic critical and non-critical editing is inherently an interactive enterprise between image and text. Several textual scholars have called for a digital facsimile to accompany its transcription or edition in an electronic paradigm, emphasizing the materialist and social theory of text. The computer as "a venue for representation" (4) is studied in the February 2002 thematic issue of Computers and the Humanities on 'image-based Humanities Computing' (Kirschenbaum 2002). Apart from the editor's introduction, the issue presents two essays on specific applications of image-based humanities computing, one report on software development, two broader reflective pieces of writing by Mary Keeler and Jerome McGann, and an excellent selective bibliography on the subject by Bethany Nowviskie (109-131). The weakness of the latter, however, is that the first part of the bibliography (Research and Theory) mentions only three non-English entries. The misleading implication that image-based Humanities Computing is an exclusively Anglo-American enterprise is counterbalanced by a most interesting paper by Eric Lecolinet, Laurent Robert and François Role. In their 'Text-Image Coupling for Editing Literary Resources' (49-73) they demonstrate a system for the interactive transcription of manuscript material and the coupled browsing of image and text as complex hypermedia. Based on human-computer interaction and document analysis techniques on the one hand and the use of XML/TEI on the other, the text/image coupling scheme facilitates both the creation and the distribution of edited materials. To that end, the paper addresses universally available web-based browsing tools based on XSL transformations as well as specific visualization software.
The authors also introduce the Ubit toolkit, a new GUI toolkit they developed for creating XML editing and browsing tools. No doubt the system, which needs the transcription of a manuscript to be lineated, works well for older or simple modern material, but the editor interested in the creation of editions of complex modern (manuscript) material has not been catered for in this article. Promisingly, the first figure of the article shows a page from a complex manuscript by Flaubert, but the five further figures, which show the system at work, use a simple medieval manuscript and a simple handwritten text without any authorial additions, deletions, substitutions or revisions.

In the first two articles we find an interesting disagreement on the issue of image manipulation and the editor's interference. Kevin Kiernan, Brent Seales and James Griffioen rely heavily on image manipulation in restoring and reconstructing the badly damaged and dispersed manuscript of the Life of St. Basil the Great (7-26). Their e-foliation, ultraviolet techniques and image processing make this "wreck of a manuscript" (7) legible and accessible for further research. Joseph Viscomi, on the other hand, strongly defends a documentary approach which represents the artifact as it appears today, and refuses to correct foxing and other visible flaws in the building of the much acclaimed William Blake Archive (27-48). The latter article outlines the imaging protocols in use at the Blake Archive for the reproduction of Blake's pictorial poetry. This paper, together with the others which form this very fine (though extremely expensive) volume, is required reading for all editors interested in some sort of image-based or image-inclusive electronic edition.

Digital text and digital images form the basic data for digital libraries. Electronic editions, whether published on stand-alone media or on-line, can become part of the collection of a library, be it a digital one or not. Though not primarily aimed at textual scholars, Deegan and Tanner's very fine book on digital libraries provides useful insights and background reading for every humanities scholar dealing with electronic data. In nine course-book-like chapters, beginning and ending with a neat introduction and conclusion respectively, the authors introduce the issues at stake in realizing a digital future for librarians, and they suggest strategies for tackling the problems of digitization, collections development, cost, metadata, protocols and standards, end-user access, preservation, and digital librarians. The non-technical style of this book, its many quotes, references, and pointers, and the extensive bibliography and glossary make it a strong candidate, together with Susan Hockey's Electronic Texts in the Humanities, for required student reading in any course on humanities computing or electronic texts.

The textual critic and electronic editor will find the two introductory chapters ('Digital futures in current contexts' and 'Why digitize?'), together with the fourth chapter on the economic factors, the chapter on metadata (chapter 5), and chapter eight, 'Preservation', extremely useful and pleasant reading. The last numbered chapter (9), which focuses on new roles for digital librarians for the Information Age, calls the modern digital librarian a knowledge mediator, information architect, hybrid librarian, and knowledge preserver, names which can all be applied to the changing role of the editor in the electronic paradigm. The rest of this chapter can also be transferred to the field of scholarly editing and provides the (aspiring) electronic editor with useful hints. For instance, Deegan and Tanner see the key to success in the digital context in the development of technical and management skills (project planning, risk management), for which training and education are of vital importance. Together with an adjusted organizational culture and the knowledge of the editing profession already present, the suggestions found in this book on digital libraries could well exceed their direct implications for the library world, and form an indispensable strategic handbook for editing professionals. Or is an electronic edition some sort of digital library after all?


© Edward Vanhoutte, 2002.
This text was published in H.T.M. van Vliet & P.M.W. Robinson (eds.), Variants. The Journal of the European Society for Textual Scholarship. 1 (2002). Turnhout: Brepols, 2002, p. 266-274.

XHTML author: Edward Vanhoutte
Last revision: 19/11/2003
