A Short Story about XML Schemas, Digital Preservation and Format Libraries

  • Steve Knight

Abstract

One morning we came in to work to find that one of our servers had made 1.5 million attempts to contact an external server in the preceding hour. The calls turned out to be generated by the Library’s digital preservation system (Rosetta) as it attempted to validate the XML Schema Definition (XSD) declarations included in the XML files of Papers Past, the Library’s online newspaper application, which we were in the process of loading into Rosetta. This paper describes our response to this situation and outlines the issues that had to be canvassed before we could arrive at a suitable solution: the digital preservation status of these XSDs; their impact on validation tools such as JHOVE; and where these objects should reside if they are considered material to the digital preservation process.
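To illustrate the mechanism behind those calls, the following is a minimal sketch, not the approach taken in the paper, of how schema references declared in an XML file can be listed and mapped to a local copy rather than fetched over the network. The function names, the `xsd-cache` directory layout, and the `example.org` namespace and schema URL are all hypothetical; only the standard `xsi:schemaLocation` and `xsi:noNamespaceSchemaLocation` attributes are taken from the XML Schema specification.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.parse import urlparse

# Standard XML Schema instance namespace, which carries the schema location hints.
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_locations(xml_text):
    """Return the schema URLs that an XML document points a validator at."""
    root = ET.fromstring(xml_text)
    locations = []
    for elem in root.iter():
        # xsi:schemaLocation holds whitespace-separated (namespace, URL) pairs;
        # every second token is a URL.
        pairs = elem.get(f"{{{XSI}}}schemaLocation", "").split()
        locations.extend(pairs[1::2])
        no_ns = elem.get(f"{{{XSI}}}noNamespaceSchemaLocation")
        if no_ns:
            locations.append(no_ns)
    return locations


def local_copy(url, cache_dir):
    """Map a schema URL to a path in a local cache (hypothetical host/path layout)."""
    parsed = urlparse(url)
    return Path(cache_dir) / parsed.netloc / parsed.path.lstrip("/")


# Hypothetical sample document; the namespace and schema URL are placeholders.
SAMPLE = """<issue xmlns="http://example.org/ns/newspaper"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://example.org/ns/newspaper http://example.org/schemas/newspaper.xsd"/>"""

for url in schema_locations(SAMPLE):
    path = local_copy(url, "xsd-cache")
    status = "cached" if path.exists() else "missing"
    print(f"{url} -> {path} ({status})")
```

Each file loaded without such a local mapping can trigger one remote request per declared schema, which is how a bulk ingest might plausibly add up to a very large number of external calls.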
Published
09-Mar-2012
Section
Papers (Peer-reviewed)