Out of the Jar into the World! A Case Study on Storing and Sharing Vertebrate Data


In 2018, the Deep Blue Repositories and Research Data Services (DBRRDS) team at the University of Michigan Library began working with the University of Michigan Museum of Zoology (UMMZ) to provide a persistent and sustainable (i.e., non-grant funded, institutionally supported) solution for UMMZ’s part of the National Science Foundation’s (NSF) openVertebrate (oVert) initiative. The objective of oVert is to digitize scientific collections of thousands of vertebrate specimens stored in jars on museum shelves and make the data freely accessible to researchers, students, classrooms, and the general public anywhere in the world. The University of Michigan (U-M) is one of five scanning centers working on oVert and will contribute scans of more than 3,500 specimens from the UMMZ collections (Erickson 2017).

In addition to ingesting scans, the project involved developing methods to work around two significant system constraints: Deep Blue Data’s file structure (flat files only, no folders) and the closed use of Specify, UMMZ’s specimen database, for specimen metadata. DBRRDS had to create a completely new workflow for handling batch deposits at regular intervals, develop scripts to reorganize the data according to a third-party data model, and augment the metadata using a third-party resource, the Global Biodiversity Information Facility (GBIF).

This paper describes the following aspects of the UMMZ CT Scanning Project partnership in greater detail: data generation, metadata requirements, workflows, code development, lessons learned, and next steps.

[This paper is a conference pre-print presented at IDCC 2020 after lightweight peer review.]

