A Short Story about XML Schemas, Digital Preservation and Format Libraries

Steve Knight

2012, Vol. 7, No. 1, pp. 72-80



One morning we came in to work to find that one of our servers had made 1.5 million attempts to contact an external server in the preceding hour. The calls turned out to be generated by the Library’s digital preservation system (Rosetta) as it attempted to validate XML Schema Definition (XSD) declarations included in the XML files of the Library’s online newspaper application, Papers Past, which we were in the process of loading into Rosetta. This paper describes our response and outlines the issues we had to canvass before arriving at a suitable solution: the digital preservation status of these XSDs; their impact on validation tools such as JHOVE; and where these objects should reside if they are considered material to the digital preservation process.

The International Journal of Digital Curation. ISSN: 1746-8256
The IJDC is published by the University of Edinburgh
and is a publication of the Digital Curation Centre.