Data Management in the Long Tail: Science, Software, and Service
Scientists in all fields face challenges in managing and sustaining access to their research data. The larger and longer term the research project, the more likely that scientists are to have resources and dedicated staff to manage their technology and data, leaving those scientists whose work is based on smaller and shorter term projects at a disadvantage. The volume and variety of data to be managed varies by many factors, only two of which are the number of collaborators and length of the project. As part of an NSF project to conceptualize the Institute for Empowering Long Tail Research, we explored opportunities offered by Software as a Service (SaaS). These cloud-based services are popular in business because they reduce costs and labor for technology management, and are gaining ground in scientific environments for similar reasons. We studied three settings where scientists conduct research in small and medium-sized laboratories. Two were NSF Science and Technology Centers (CENS and C-DEBI) and the third was a workshop of natural reserve scientists and managers. These laboratories have highly diverse data and practices, make minimal use of standards for data or metadata, and lack resources for data management or sustaining access to their data, despite recognizing the need. We found that SaaS could address technical needs for basic document creation, analysis, and storage, but did not support the diverse and rapidly changing needs for sophisticated domain-specific tools and services. These are much more challenging knowledge infrastructure requirements that require long-term investments by multiple stakeholders.
Copyright for papers and articles published in this journal is retained by the authors, with first publication rights granted to the University of Edinburgh. It is a condition of publication that authors license their paper or article under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence.