Selecting Efficient and Reliable Preservation Strategies
This article addresses the problem of formulating efficient and reliable operational preservation policies that ensure bit-level information integrity over long periods, and in the presence of a diverse range of real-world technical, legal, organizational, and economic threats. We develop a systematic, quantitative prediction framework that combines formal modeling, discrete-event-based simulation, hierarchical modeling, and then use empirically calibrated sensitivity analysis to identify effective strategies.
Specifically, the framework formally defines an objective function for preservation that maps a set of preservation policies and a risk profile to a set of preservation costs, and an expected collection loss distribution. In this framework, a curator’s objective is to select optimal policies that minimize expected loss subject to budget constraints. To estimate preservation loss under different policy conditions optimal policies, we develop a statistical hierarchical risk model that includes four sources of risk: the storage hardware; the physical environment; the curating institution; and the global environment. We then employ a general discrete event-based simulation framework to evaluate the expected loss and the cost of employing varying preservation strategies under specific parameterization of risks.
The framework offers flexibility for the modeling of a wide range of preservation policies and threats. Since this framework is open source and easily deployed in a cloud computing environment, it can be used to produce analysis based on independent estimates of scenario-specific costs, reliability, and risks.
We present results summarizing hundreds of thousands of simulations using this framework. This exploratory analysis points to a number of robust and broadly applicable preservation strategies, provides novel insights into specific preservation tactics, and provides evidence that challenges received wisdom.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright for papers and articles published in this journal is retained by the authors, with first publication rights granted to the University of Edinburgh. It is a condition of publication that authors license their paper or article under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence.