International Journal of Digital Curation <p>The IJDC publishes pre-prints, peer-reviewed papers, articles and editorials on digital curation, research data management and related issues. &nbsp;It complements the International Conference on Digital Curation (IDCC) and includes selected proceedings.</p> en-US <p>Copyright for papers and articles published in this journal is retained by the authors, with first publication rights granted to the University of Edinburgh. It is a condition of publication that authors license their paper or article under a <a href="" rel="license">Creative Commons Attribution Licence</a>.<br><br><a href="" rel="license"><img style="border-width: 0;" src="" alt="Creative Commons License"></a></p> (IJDC Editorial Team) (University of Edinburgh Library Learning Services) Mon, 31 Dec 2018 00:00:00 +0000 OJS 60 Measuring FAIR Principles to Inform Fitness for Use <p class="abstract-western">For open science to flourish, data and any related digital outputs should be discoverable and re-usable by a variety of potential consumers. The recent FAIR Data Principles produced by the Future of Research Communication and e-Scholarship (FORCE11) collective provide a compilation of considerations for making data findable, accessible, interoperable, and re-usable. The principles serve as guideposts to ‘good’ data management and stewardship for data and/or metadata. On a conceptual level, the principles codify best practices that managers and stewards would find agreement with, exist in other data quality metrics, and already implement. This paper reports on a secondary purpose of the principles: to inform assessment of data’s FAIR-ness or, put another way, data’s fitness for use. Assessment of FAIR-ness likely requires more stratification across data types and among various consumer communities, as how data are found, accessed, interoperated, and re-used differs depending on types and purposes. 
This paper’s purpose is to present a method for qualitatively measuring the FAIR Data Principles through operationalizing findability, accessibility, interoperability, and re-usability from a re-user’s perspective. The findings may inform assessments that could also be used to develop situationally-relevant fitness for use frameworks.</p> Bradley Wade Bishop, Carolyn Hank ##submission.copyrightStatement## Sat, 22 Dec 2018 21:23:50 +0000 Media Digitization and Preservation Initiative: A Case Study <p class="abstract-western">Since its creation nearly a decade ago, the Digital Curation Centre (DCC) Curation Lifecycle Model has become the quintessential framework for understanding digital curation. Organizations and consortia around the world have used the DCC Curation Lifecycle Model as a tool to ensure that all the necessary stages of digital curation are undertaken, to define roles and responsibilities, and to build a framework of standards and technologies for digital curation. Yet, research on the application of the model to large-scale digitization projects as a way of understanding their efforts at digital curation is scant. This paper reports on findings of a qualitative case study analysis of Indiana University Bloomington’s multi-million-dollar Media Digitization and Preservation Initiative (MDPI), employing the DCC Curation Lifecycle Model as a lens for examining the scope and effectiveness of its digital curation efforts. Findings underscore the success of MDPI in performing digital curation by illustrating the ways it implements each of the model’s components. 
Implications for the application of the DCC Curation Lifecycle Model in understanding digital curation for mass digitization projects are discussed as well as directions for future research.</p> Devan Ray Donaldson, Allison McClanahan, Leif Christiansen, Laura Bell, Mikala Narlock, Shannon Martin, Haley Suby ##submission.copyrightStatement## Sun, 23 Dec 2018 21:06:00 +0000 Privacy Concerns in Qualitative Video Data Reuse <div class="WordSection1"> <p class="abstract-western">In this article, we examine how data producers’ and reusers’ privacy concerns shape their views about data sharing and reuse in the field of education, with an emphasis on video records of practice. We find that data producers and reusers were concerned about the risks that qualitative data, and video records of practice in particular, present to themselves, their colleagues, and the subjects represented in the data. Specifically, they emphasized risks relating to the privacy of the subjects – teachers and students who appear in the videos. In response to these risks, data producers have engaged in a number of strategies to minimize risk and/or mitigate potential harm, including: (1) education and training; (2) using informed consent to facilitate and/or restrict data sharing; and (3) limiting data capture/production. We discuss the implications that our findings have for digital repositories, and for efforts to facilitate the sharing and reuse of qualitative video data in education.</p> </div> Rebecca D. Frank, Allison R. B. Tyler, Anna Gault, Kara Suzuka, Elizabeth Yakel ##submission.copyrightStatement## Mon, 19 Aug 2019 03:13:59 +0100 Getting to Beta <p>Libraries and archives are increasingly producing subject-based digital collections alongside, but separate from, their main digital collections. These smaller projects are often treated as digital one-offs; they are created, launched, promoted, and then largely forgotten. 
The authors of this study argue that small-scale digital collections instead be treated as test cases for their institutions’ main digitization programs. Because they are lightweight and have relatively low stakes, these collections get pushed through the system quickly and can illuminate its workings and shortcomings in a snapshot form. The authors treat their own experience in developing the Animal Welfare Act History Digital Collection at the National Agricultural Library as a case study in using a digital collection to test and revise an institution’s digitization program. In so doing, this study suggests how agile projects like the AWAHDC can be core components in digital curation policies and their implementation.</p> Kathryn Gucer, Kristina Adams, Chuck Schoppet, Ricardo Punzalan ##submission.copyrightStatement## Sat, 13 Apr 2019 20:13:30 +0100 Participatory Prototype Design: Developing a Sustainable Metadata Curation Workflow for Maternal Child Health Research <div class="WordSection1"> <p class="abstract-western">This paper describes the findings from a participatory prototype design project, where the authors worked with maternal and child health (MCH) researchers and stakeholders to develop an MCH metadata profile and sustainable curation workflow. This work led to the development of three prototypes: 1) a study catalogue hosted in Dataverse, 2) a metadata and research records repository hosted in REDCap and 3) a metadata harvesting tool/dashboard hosted within the Shiny RStudio environment. We present a brief overview of the methods used to develop the metadata profile, curation workflow and prototypes. Researchers and other stakeholders were participant-collaborators throughout the project. 
The participatory process involved a number of steps, including but not limited to: initial project design and grant writing; scoping and mapping existing practices, workflows and relevant metadata standards; creating the metadata profile; developing semi-automated and manual techniques to harvest and transform metadata; and end project sustainability/future planning. In this paper, we discuss the design process and project outcomes, limitations and benefits of the approach, and implications for researcher-oriented metadata and data curation initiatives.</p> </div> Amanda Harrigan, Saurabh Vashishtha, Sharon Farnel, Kendall Roark ##submission.copyrightStatement## Fri, 28 Dec 2018 12:59:44 +0000 Giving Datasets Context: a Comparison Study of Institutional Repositories that Apply Varying Degrees of Curation <p class="abstract-western">This research study compared four academic libraries’ approaches to curating the metadata of dataset submissions in their institutional repositories and classified each into one of four categories: no curation, pre-ingest curation, selective curation, and post-ingest curation. The goal is to understand the impact that curation may have on the quality of user-submitted metadata. The findings were 1) the metadata elements varied greatly between institutions, 2) offering authors more options to contribute metadata did not result in more metadata being contributed, 3) pre- or post-ingest curation processes could have a measurable impact on the metadata, but their effects are difficult to separate from other factors, and 4) datasets submitted to a repository with pre- or post-ingest curation more often included documentation.</p> Amy Koshoffer, Amy E. 
Neeser, Linda Newman, Lisa R Johnston ##submission.copyrightStatement## Fri, 21 Dec 2018 21:28:50 +0000 Complexities of Digital Preservation in a Virtual Reality Environment, the Case of Virtual Bethel <p class="abstract-western" lang="en-US">The complexity of preserving virtual reality environments combines the challenges of preserving singular digital objects, the relationships among those objects, and the processes involved in creating those relationships. A case study involving the preservation of the Virtual Bethel environment is presented. This case is active and ongoing. The paper provides a brief history of the Bethel AME Church of Indianapolis and its importance, then describes the unique preservation challenges of the Virtual Bethel project, and finally provides guidance and preservation recommendations for Virtual Bethel, using the National Digital Stewardship Alliance Levels of Preservation. A discussion of the limitations of the guidance and recommendations follows.</p> Angela P. Murillo, Lydia Spotts, Andrea Copeland, Ayoung Yoon, Zebulun M Wood ##submission.copyrightStatement## Fri, 21 Dec 2018 18:56:00 +0000 Disciplinary Data Publication Guides <p class="abstract-western">Many academic disciplines have very comprehensive standards for data publication and clear guidance from funding bodies and academic publishers. In other cases, whilst much good-quality general guidance exists, there is a lack of information available to researchers to help them decide which specific data elements should be shared. 
This is a particular issue for disciplines with very varied data types, such as engineering, and presents an unnecessary barrier to researchers wishing to meet funder expectations on data sharing. This article outlines a project to provide simple, visual, discipline-specific guidance on data publication, undertaken at the University of Bristol at the request of the Faculty of Engineering.</p> Zosia Beckles, Stephen Gray, Debra Hiom, Kirsty Merrett, Kellie Snow, Damian Steer ##submission.copyrightStatement## Thu, 27 Dec 2018 12:52:28 +0000 Secure Data for the Future: A Risk Assessment <p class="abstract-western">Guaranteeing secure and authentic future access to digital data is a major concern for those who work with data now and those who are responsible for keeping it accessible in the future. There is a wide range of threats to digital data that these people need to take into consideration. The PreservIA project aimed to assess the risks of using analogue 35mm film to store and preserve digital information and to define its strengths and weaknesses for long-term secure preservation of all kinds of digital data.</p> <p class="abstract-western">The research project examined the application of the Piql technology to ensure the security, integrity and authenticity of the information stored on a unique storage medium. 
PiqlFilm has been designed for a life span of 500 years or more and the research tries to assess how well this solution could maintain the authenticity and availability of the information, independently of internal and external changes in the surrounding environment over time.</p> <p class="abstract-western">The research project has been designed using a scenario-based approach and the morphological method of scenario development is used to define a set of scenarios covering the risks to the service.</p> <p class="abstract-western">The scenario classes used were accident, technical error, natural disaster, crime, sabotage, espionage, terrorism, armed conflict and nuclear war. A scenario template has been included for the purpose of describing current and future scenarios. The final scenario analysis identified potential vulnerabilities.</p> <p class="abstract-western">The paper shows briefly how the holistic preservation approach of Piql Preservation Services operates, defines a methodology to select the scenarios for the assessment and then studies the vulnerabilities and security challenges of the solution in those scenarios. The project also includes a comparison with other existing storage media to evaluate their robustness to the addressed scenarios in relation to Piql technology.</p> Bendik Bryde, Roberto Gonzalez ##submission.copyrightStatement## Wed, 01 May 2019 13:56:25 +0100 Operationalizing the Replication Standard <p class="abstract-western">In response to widespread concerns about the integrity of research published in scholarly journals, several initiatives have emerged that are promoting research transparency through access to data underlying published scientific findings. Journal editors, in particular, have made a commitment to research transparency by issuing data policies that require authors to submit their data, code, and documentation to data repositories to allow for public access to the data. 
In the case of the American Journal of Political Science (AJPS) Data Replication Policy, the data also must undergo an independent verification process in which materials are reviewed for quality as a condition of final manuscript publication and acceptance.</p> <p class="abstract-western">Aware of the specialized expertise of the data archives, AJPS called upon the Odum Institute Data Archive to provide a data review service that performs data curation and verification of replication datasets. This article presents a case study of the collaboration between AJPS and the Odum Institute Data Archive to develop a workflow that bridges manuscript publication and data review processes. The case study describes the challenges and the successes of the workflow integration, and offers lessons learned that may be applied by other data archives that are considering expanding their services to include data curation and verification services to support reproducible research.</p> Thu-Mai Lewis Christian, Sophia Lafferty-Hess, William G Jacoby, Thomas Carsey ##submission.copyrightStatement## Sun, 23 Dec 2018 21:34:51 +0000 From Passive to Active, From Generic to Focussed: How Can an Institutional Data Archive Remain Relevant in a Rapidly Evolving Landscape? <p class="abstract-western">Founded in 2008 as an initiative of the libraries of three of the four technical universities in the Netherlands, the 4TU.Centre for Research Data (4TU.Research Data) has provided a fully operational, cross-institutional, long-term archive since 2010, storing data from all subjects in applied sciences and engineering. Presently, over 90% of the data in the archive is geoscientific data coded in netCDF (Network Common Data Form) – a data format and data model that, although generic, is mostly used in climate, ocean and atmospheric sciences. 
In this practice paper, we explore the question of how 4TU.Research Data can stay relevant and forward-looking in a rapidly evolving research data management landscape. In particular, we describe the motivation behind this question and how we propose to address it.</p> Maria J. Cruz, Jasmin Böhmer, Egbert Gramsbergen, Marta Teperek, Alastair Dunning, Madeleine de Smaele ##submission.copyrightStatement## Wed, 22 May 2019 17:13:18 +0100 Designing and Building Interactive Curation Pipelines for Natural Hazards in Engineering Data <p class="abstract-western">To design data curation pipelines within DesignSafe-CI, we gathered requirements and sought regular guidance from a group of experts in different aspects of natural hazards engineering research. Upon achieving understanding of experimental, simulation, hybrid simulation and field reconnaissance research workflows, we created four data models to guide data organization and developed specialized vocabularies as metadata. We then translated the models and metadata to interface design (front-end), and selected the infrastructure resources that would support curation and publication functions (back-end). We used iterative design and testing, including the use of interactive mockups of the GUI, to communicate and elicit feedback from the experts, and mapped real datasets to the mockups to evaluate the fitness of the data models and the clarity of the curation tasks. 
To address the problem of big data interfaces, we provide data representations that highlight the structure of the datasets and the possibility to browse their components in relation to provenance.</p> Maria Esteva, Craig Jansen, Josue Balandrano Coronel ##submission.copyrightStatement## Mon, 19 Aug 2019 03:13:59 +0100 The Impact on Authors and Editors of Introducing Data Availability Statements at Nature Journals <p class="abstract-western">This article describes the adoption of a standard policy for the inclusion of data availability statements in all research articles published at the Nature family of journals, and the subsequent research which assessed the impacts that these policies had on authors, editors, and the availability of datasets. The key findings of this research project include the determination of average and median times required to add a data availability statement to an article; and a correlation between the way researchers make their data available, and the time required to add a data availability statement.</p> Rebecca Grant, Iain Hrynaszkiewicz ##submission.copyrightStatement## Thu, 27 Dec 2018 17:45:51 +0000 Curating Scientific Workflows for Biomolecular Nuclear Magnetic Resonance Spectroscopy <p class="abstract-western"><span style="color: #000000;">This paper describes our recent and ongoing efforts to enhance the curation of scientific workflows to improve reproducibility and reusability of biomolecular nuclear magnetic resonance (bioNMR) data. Our efforts have focused on both developing a workflow management system, called CONNJUR Workflow Builder (CWB), as well as refactoring our workflow data model to make use of the PREMIS model for digital preservation. This revised workflow management system will be available through the NMRbox cloud-computing platform for bioNMR. In addition, we are implementing a new file structure which bundles the original binary data files along with PREMIS XML records describing the provenance of the data. 
These are packaged together using a standardized file archive utility. In this manner, the provenance and data curation information is maintained together along with the scientific data. The benefits and limitations of these approaches, as well as future directions, are discussed in this paper.</span></p> Douglas Heintz, Michael R Gryk ##submission.copyrightStatement## Fri, 19 Apr 2019 12:36:54 +0100 Tiny Data: Building a Community of Practice around Humanities Datasets <p class="abstract-western"><span style="color: #000000;">Quantitative data, the foundation of scientific research, have been in the foreground of discussions about data creation, curation, and publication pipelines. However, data for humanistic and social scientific inquiries take many forms, including physical and ephemeral primary resources (books, objects, performances, interactions); qualitative, free-form observations; as well as quantitative, structured data and metadata. At the Vanderbilt University Jean and Alexander Heard Library, we started the Tiny Data Working Group (TDWG) in 2016 to tackle some of the humanistic research data creation and curation issues in a constructive, collaborative, and interdisciplinary format. The present paper considers what it means to be FAIR with humanities data, as well as how to build a community of data-literate humanists, based on our experiences with the TDWG</span><span style="color: #000000;">.</span></p> Veronica Ikeshoji-Orlati, Mary Anne Caton, Suellen Stringer-Hye ##submission.copyrightStatement## Tue, 30 Apr 2019 22:04:49 +0100 Data Curation Network: A Cross-Institutional Staffing Model for Curating Research Data <p class="abstract-western">Funders increasingly require that data sets arising from sponsored research must be preserved and shared, and many publishers either require or encourage that data sets accompanying articles are made available through a publicly accessible repository. 
Additionally, many researchers wish to make their data available regardless of funder requirements both to enhance their impact and also to propel the concept of open science. However, the data curation activities that support these preservation and sharing activities are costly, requiring advanced curation practices, training, specific technical competencies, and relevant subject expertise. Few colleges or universities will&nbsp;be able to hire and sustain all of the data curation expertise locally that its researchers will require, and even those with the means to do more will benefit from a collective approach that will allow them to supplement at peak times, access specialized capacity when infrequently-curated types arise, and stabilize service levels to account for local staff transition, such as during turn-over periods. The Data Curation Network (DCN) provides a solution for partners of all sizes to develop or to supplement local curation expertise with the expertise of a resilient, distributed network, and creates a funding stream to both sustain central services and support expansion of distributed expertise over time. This paper presents our next steps for piloting the DCN, scheduled to launch in the spring of 2018 across nine partner institutions. Our implementation plan is based on planning phase research performed from 2016-2017 that monitored the types, disciplines, frequency, and curation needs of data sets passing through the curation services at the six planning phase institutions. Our DCN implementation plan includes a well-coordinated and tiered staffing model, a technology-agnostic submission workflow, standardized curation procedures, and a sustainability approach that will allow the DCN to prevail beyond the grant-supported implementation phase as a curation-as-service model.</p> Lisa R Johnston, Jake Carlson, Cynthia Hudson-Vitale, Heidi Imker, Wendy Kozlowski, Robert Olendorf, Claire Stewart, Mara Blake, Joel Herndon, Timothy M. 
McGeary, Elizabeth Hull ##submission.copyrightStatement## Wed, 26 Dec 2018 18:34:52 +0000 Making Everything Available. British Library Research Services and Research Data Strategy <p class="abstract-western">The way that researchers generate, analyse and share information keeps evolving at a rapid pace. To ensure that it is well equipped to serve its global user base for years to come, the British Library is transforming the way it works too, from the physical buildings to its digital service portfolio. One key programme, Everything Available, will ensure the Library’s continued support for research with services to enable access to information in an open and timely manner. This paper will describe the activities planned within Everything Available, with a particular focus on the aims of the Library’s recently refreshed Research Data Strategy. It will give an insight into the challenges and opportunities faced by a National Library in providing relevant services in an ‘open’ world.</p> Rachael Kotarski, Torsten Reimer ##submission.copyrightStatement## Thu, 27 Dec 2018 15:11:27 +0000 Remediation Data Management Plans <p><span class="Apple-converted-space">&nbsp;</span>Data <span class="Apple-converted-space">&nbsp;</span> Management <span class="Apple-converted-space">&nbsp;</span> Plans <span class="Apple-converted-space">&nbsp;</span> (DMPs) <span class="Apple-converted-space">&nbsp;</span> have <span class="Apple-converted-space">&nbsp;</span> been <span class="Apple-converted-space">&nbsp;</span> used <span class="Apple-converted-space">&nbsp;</span> in <span class="Apple-converted-space">&nbsp;</span> the <span class="Apple-converted-space">&nbsp;</span> last <span class="Apple-converted-space">&nbsp;</span> decade <span class="Apple-converted-space">&nbsp;</span> to <span class="Apple-converted-space">&nbsp;</span> encourage <span class="Apple-converted-space">&nbsp;</span> good<span class="Apple-converted-space">&nbsp;</span>data <span 
class="Apple-converted-space">&nbsp;</span> management <span class="Apple-converted-space">&nbsp;</span> practices <span class="Apple-converted-space">&nbsp;</span> among <span class="Apple-converted-space">&nbsp;</span> researchers. <span class="Apple-converted-space">&nbsp;</span> DMPs <span class="Apple-converted-space">&nbsp;</span> are <span class="Apple-converted-space">&nbsp;</span> widely <span class="Apple-converted-space">&nbsp;</span> used, <span class="Apple-converted-space">&nbsp;</span> preventive <span class="Apple-converted-space">&nbsp;</span> tools<span class="Apple-converted-space">&nbsp;</span>that <span class="Apple-converted-space">&nbsp;</span> encourage <span class="Apple-converted-space">&nbsp;</span> good <span class="Apple-converted-space">&nbsp;</span> data <span class="Apple-converted-space">&nbsp;</span> management <span class="Apple-converted-space">&nbsp;</span> practices. <span class="Apple-converted-space">&nbsp;</span> DMPs <span class="Apple-converted-space">&nbsp;</span> are <span class="Apple-converted-space">&nbsp;</span> traditionally <span class="Apple-converted-space">&nbsp;</span> used <span class="Apple-converted-space">&nbsp;</span> to <span class="Apple-converted-space">&nbsp;</span> manage<span class="Apple-converted-space">&nbsp;</span>data <span class="Apple-converted-space">&nbsp;</span> during <span class="Apple-converted-space">&nbsp;</span> the <span class="Apple-converted-space">&nbsp;</span> planning <span class="Apple-converted-space">&nbsp;</span> stage <span class="Apple-converted-space">&nbsp;</span> of <span class="Apple-converted-space">&nbsp;</span> the <span class="Apple-converted-space">&nbsp;</span> project, <span class="Apple-converted-space">&nbsp;</span> often <span class="Apple-converted-space">&nbsp;</span> required <span class="Apple-converted-space">&nbsp;</span> for <span class="Apple-converted-space">&nbsp;</span> grant <span class="Apple-converted-space">&nbsp;</span> proposals, <span 
class="Apple-converted-space">&nbsp;</span> and <span class="Apple-converted-space">&nbsp;</span> prior<span class="Apple-converted-space">&nbsp;</span>to <span class="Apple-converted-space">&nbsp;</span> data <span class="Apple-converted-space">&nbsp;</span> collection. <span class="Apple-converted-space">&nbsp;</span> In <span class="Apple-converted-space">&nbsp;</span> this <span class="Apple-converted-space">&nbsp;</span> paper <span class="Apple-converted-space">&nbsp;</span> we <span class="Apple-converted-space">&nbsp;</span> will <span class="Apple-converted-space">&nbsp;</span> use <span class="Apple-converted-space">&nbsp;</span> a <span class="Apple-converted-space">&nbsp;</span> case <span class="Apple-converted-space">&nbsp;</span> study <span class="Apple-converted-space">&nbsp;</span> to <span class="Apple-converted-space">&nbsp;</span> argue <span class="Apple-converted-space">&nbsp;</span> that <span class="Apple-converted-space">&nbsp;</span> Data <span class="Apple-converted-space">&nbsp;</span> Management<span class="Apple-converted-space">&nbsp;</span><span class="Apple-converted-space">&nbsp;</span>Plans <span class="Apple-converted-space">&nbsp;</span> can <span class="Apple-converted-space">&nbsp;</span> be <span class="Apple-converted-space">&nbsp;</span> useful <span class="Apple-converted-space">&nbsp;</span> in <span class="Apple-converted-space">&nbsp;</span> improving <span class="Apple-converted-space">&nbsp;</span> the <span class="Apple-converted-space">&nbsp;</span> management <span class="Apple-converted-space">&nbsp;</span> of <span class="Apple-converted-space">&nbsp;</span> the <span class="Apple-converted-space">&nbsp;</span> data <span class="Apple-converted-space">&nbsp;</span> of <span class="Apple-converted-space">&nbsp;</span> research <span class="Apple-converted-space">&nbsp;</span> projects <span class="Apple-converted-space">&nbsp;</span> that<span class="Apple-converted-space">&nbsp;</span><span 
class="Apple-converted-space">&nbsp;</span>have <span class="Apple-converted-space">&nbsp;</span> moved <span class="Apple-converted-space">&nbsp;</span> beyond <span class="Apple-converted-space">&nbsp;</span> the <span class="Apple-converted-space">&nbsp;</span> planning <span class="Apple-converted-space">&nbsp;</span> stage <span class="Apple-converted-space">&nbsp;</span> of <span class="Apple-converted-space">&nbsp;</span> the <span class="Apple-converted-space">&nbsp;</span> research <span class="Apple-converted-space">&nbsp;</span> life <span class="Apple-converted-space">&nbsp;</span> cycle. <span class="Apple-converted-space">&nbsp;</span> In <span class="Apple-converted-space">&nbsp;</span> particular, <span class="Apple-converted-space">&nbsp;</span> we <span class="Apple-converted-space">&nbsp;</span> focus<span class="Apple-converted-space">&nbsp;</span><span class="Apple-converted-space">&nbsp;</span>on <span class="Apple-converted-space">&nbsp;</span> the <span class="Apple-converted-space">&nbsp;</span> case <span class="Apple-converted-space">&nbsp;</span> of <span class="Apple-converted-space">&nbsp;</span> active <span class="Apple-converted-space">&nbsp;</span> projects <span class="Apple-converted-space">&nbsp;</span> where <span class="Apple-converted-space">&nbsp;</span> data <span class="Apple-converted-space">&nbsp;</span> has <span class="Apple-converted-space">&nbsp;</span> already <span class="Apple-converted-space">&nbsp;</span> been <span class="Apple-converted-space">&nbsp;</span> collected <span class="Apple-converted-space">&nbsp;</span> and <span class="Apple-converted-space">&nbsp;</span> is <span class="Apple-converted-space">&nbsp;</span> still <span class="Apple-converted-space">&nbsp;</span> being<span class="Apple-converted-space">&nbsp;</span><span class="Apple-converted-space">&nbsp;</span>analyzed. 
<span class="Apple-converted-space">&nbsp;</span> We <span class="Apple-converted-space">&nbsp;</span> discuss <span class="Apple-converted-space">&nbsp;</span> the <span class="Apple-converted-space">&nbsp;</span> differences <span class="Apple-converted-space">&nbsp;</span> and <span class="Apple-converted-space">&nbsp;</span> commonalities <span class="Apple-converted-space">&nbsp;</span> in <span class="Apple-converted-space">&nbsp;</span> structure <span class="Apple-converted-space">&nbsp;</span> between <span class="Apple-converted-space">&nbsp;</span> preventive<span class="Apple-converted-space">&nbsp;</span>Data <span class="Apple-converted-space">&nbsp;</span> Management <span class="Apple-converted-space">&nbsp;</span> Plans <span class="Apple-converted-space">&nbsp;</span> and <span class="Apple-converted-space">&nbsp;</span> remedial <span class="Apple-converted-space">&nbsp;</span> Data <span class="Apple-converted-space">&nbsp;</span> Management <span class="Apple-converted-space">&nbsp;</span> Plans, <span class="Apple-converted-space">&nbsp;</span> and <span class="Apple-converted-space">&nbsp;</span> describe <span class="Apple-converted-space">&nbsp;</span> in <span class="Apple-converted-space">&nbsp;</span> detail <span class="Apple-converted-space">&nbsp;</span> the<span class="Apple-converted-space">&nbsp;</span>additional <span class="Apple-converted-space">&nbsp;</span> considerations <span class="Apple-converted-space">&nbsp;</span> that <span class="Apple-converted-space">&nbsp;</span> are <span class="Apple-converted-space">&nbsp;</span> needed <span class="Apple-converted-space">&nbsp;</span> when <span class="Apple-converted-space">&nbsp;</span> writing <span class="Apple-converted-space">&nbsp;</span> remedial <span class="Apple-converted-space">&nbsp;</span> Data <span class="Apple-converted-space">&nbsp;</span> Management <span class="Apple-converted-space">&nbsp;</span> Plans:<span class="Apple-converted-space">&nbsp;</span>the 
goals and audience of the document, the data inventory, and an implementation plan.</p> Clara Llebot Thu, 02 May 2019 12:19:39 +0100 National Research Infrastructure - Funder or Partner? <p class="abstract-western">Since 2009 the Australian National Data Service (ANDS) has evolved and matured as a national infrastructure project. This has involved a change in its engagement model, primarily a move from a compliance- and milestone-driven model towards that of a partnering organisation. In 2013 ANDS streamlined its contract management and reporting process and initiated the Institutional Engagement program to help partnering organisations achieve their research data ambitions. These, amongst other initiatives, helped ANDS move towards operating as a collaborator and partner, rather than solely as a funder.</p> <p class="abstract-western">Between 2013 and 2017 ANDS changed its engagement model during four of its funding programs by offering both funding and expertise to projects. However, the uptake of expertise was not as successful in the earlier programs as anticipated. As a result, ANDS introduced changes in how it engaged, including working more closely with project partners at the project initiation stage. 
These changes improved ANDS’ ability to become embedded as a trusted and invested partner in the project team. Feedback provided by project partners during surveys and interviews suggests the shift from funder to partner is slowly evolving and moving in the right direction. To continue this process, ANDS, RDS and Nectar have adopted a Partnership Strategy as part of delivering their aligned business plan in 2018.</p> Angeletta Leggio Wed, 01 May 2019 23:32:26 +0100 Building Open-Source Digital Curation Services and Repositories at Scale <p class="abstract-western" lang="en-US">This article shares several in-progress open-source research and development approaches that seek to design, build, and test digital curation services and repositories with the potential to scale (the IMLS-funded Fedora DRAS-TIC and the NSF-funded Brown Dog). We also discuss the creation of a big records testbed of justice, human rights, and cultural heritage collections (100 TB and 100 million records), the emergence of Computational Archival Science (CAS), and the resulting efforts at integrating digital curation education and research. We ultimately seek to develop a sustainable community of users and developers, with solutions that serve the international library, archives, and scientific data management communities. We are also focused on digital curation training and education in these innovative environments.</p> Richard Marciano, Gregory Jansen, Will Thomas, Sohan Shah, Michael Kurtz Thu, 27 Dec 2018 16:35:06 +0000 The Administrative Load of Sharing Sensitive Data - Challenges and Solutions? <p class="abstract-western">Sharing data openly has become a straightforward process at the University of Bristol. 
The University’s top funders mandate or recommend data sharing as a condition of funding, and many publishers require access to research data to enable results of published articles to be verified. The University has provided a dedicated data repository to support this since 2015, and demand for open publication has risen steadily since its inception. However, an increasing number of requests for sharing data relate to data that has ethical, legal or commercial sensitivities and so cannot be published openly.</p> <p class="abstract-western">Rather than discuss the wide-ranging ethical implications of data sharing, this practice paper will focus on the secure sharing of sensitive data that has ethical approval and, where required, has the necessary consent in place, from the perspective of an institution that has already decided to undertake the work inherent in sharing sensitive data. The specific purpose is to detail the workflow and administrative tasks integral to this and to highlight the types of challenges encountered.</p> Kirsty Merrett, Zosia Beckles, Stephen Gray, Debra Hiom, Kellie Snow, Damian Steer Wed, 22 May 2019 16:41:46 +0100 Incorporating Software Curation into Research Data Management Services: Lessons Learned <p class="abstract-western">Many large research universities provide research data management (RDM) support services for researchers. These may include support for data management planning, best practices (e.g., organization, support, and storage), archiving, sharing, and publication. However, these data-focused services may under-emphasize the importance of the software that is created to analyse the data. This is problematic for several reasons. First, because software is an integral part of research across all disciplines, neglecting it undermines the ability of that research to be understood, verified, and reused by others (and perhaps even by the researchers themselves). 
Second, it may result in less visibility and credit for those involved in creating the software. A third reason is related to stewardship: if there is no clear process for how, when, and where the software associated with research can be accessed and who will be responsible for maintaining such access, important details of the research may be lost over time.</p> <p class="abstract-western">This article presents the process by which the RDM services unit of a large research university addressed the lack of emphasis on software and source code in its existing service offerings. The greatest challenges were related to the need to incorporate software into existing data-oriented service workflows while minimizing additional resources required, and the nascent state of software curation and archiving in a data management context. The problem was addressed from four directions: building an understanding of software curation and preservation from various viewpoints (e.g., video games, software engineering), building a conceptual model of software preservation to guide service decisions, implementing software-related services, and documenting and evaluating the work to build expertise and establish a standard service level.</p> Fernando Rios Fri, 28 Dec 2018 12:20:42 +0000 Keep Calm and Fill in Your DMP: Lessons Learnt from a Swiss DMP-Template Initiative <p class="abstract-western">Aligning with other funders such as Horizon 2020, the Swiss National Science Foundation (SNSF) requires researchers who apply for project funding to provide a Data Management Plan (DMP) as an integral part of their research proposal. In an attempt to assist and guide researchers filling out this document, and to provide a service as efficient as 
possible, the libraries of the Ecole Polytechnique Fédérale de Lausanne (EPFL) and ETH Zurich took the lead in elaborating a DMP template with content suggestions and recommendations. In this practice paper, we describe the collaborative effort between the two Swiss federal institutes of technology, namely EPFL and ETH Zurich, as well as some partners of the national Data Life Cycle Management (DLCM) project, which resulted in a document that our researchers report to be very helpful.</p> Lorenza Salvatori, Ana Sesartic, Nathalie Lambeng, Eliane Blumer Thu, 27 Dec 2018 19:02:05 +0000 Data Mining Research with In-copyright and Use-limited Text Datasets: Preliminary Findings from a Systematic Literature Review and Stakeholder Interviews <p class="abstract-western">Text data mining and analysis has emerged as a viable research method for scholars, following the growth of mass digitization, digital publishing, and scholarly interest in data re-use. Yet the texts that comprise datasets for analysis are frequently protected by copyright or other intellectual property rights that limit their access and use. This article discusses the role of libraries at the intersection of data mining and intellectual property, asserting that academic libraries are vital partners in enabling scholars to effectively incorporate text data mining into their research. We report on activities leading up to an IMLS-funded National Forum of stakeholders and discuss preliminary findings from a systematic literature review, as well as initial results of interviews with forum stakeholders. 
Emerging themes suggest the need for a multi-pronged distributed approach that includes a public campaign for building awareness and advocacy, development of best practice guides for library support services and training, and international efforts toward data standardization and copyright harmonization.</p> Megan Senseney, Eleanor Dickson, Beth Namachchivaya, Bertram Ludäscher Thu, 27 Dec 2018 17:13:16 +0000 A Landscape Survey of ActiveDMPs Stephanie Simms, Sarah Jones, Tomasz Miksa, Daniel Mietchen, Natasha Simons, Kathryn Unsworth Thu, 27 Dec 2018 18:43:11 +0000 Data Stewardship Addressing Disciplinary Data Management Needs <p class="abstract-western">One of the biggest challenges for multidisciplinary research institutions which provide data management support to researchers is addressing disciplinary differences (Akers and Doty, <a class="western">2013</a>). Centralised services need to be general enough to cater for all the different flavours of research conducted in an institution. At the same time, focusing on the common denominator means that subject-specific differences and needs may not be effectively addressed. In 2017, Delft University of Technology (TU Delft) embarked on an ambitious Data Stewardship project, aiming to comprehensively address data management needs across a multi-disciplinary campus. In this article we describe the principles behind the Data Stewardship project at TU Delft and the progress so far, identify the key challenges, and explain our plans for the future.</p> Marta Teperek, Maria J. 
Cruz, Ellen Verbakel, Jasmin Böhmer, Alastair Dunning Thu, 27 Dec 2018 12:27:42 +0000 Embedded Metadata Patterns Across Web Sharing Environments <p class="abstract-western">This research project sought to determine whether and how embedded metadata followed a digital object as it was shared on social media platforms. It used ExifTool, a variety of social media platforms and user profiles, and the embedded metadata extracted from selected New York Public Library (NYPL) and Europeana images, PDFs from open access science journals, and captured mobile phone images. The goal of the project was to clarify which embedded metadata fields, if any, migrated with the object as it was shared across social media.</p> Santi Thompson, Michele Reilly Thu, 27 Dec 2018 19:36:21 +0000 Research Data Management Practices: Synergies and Discords between Researchers and Institutions <p class="abstract-western">The aim of this study was to explore the synergies and discords in attitudes towards research data management (RDM) drivers and barriers for both researchers and institutions. Previous work has studied RDM from a single perspective, but has not compared researchers’ and institutions’ perspectives. We carried out qualitative interviews with researchers as well as institutional representatives to identify drivers and barriers, and to explore synergies and discords of both towards RDM. We mapped these to a data lifecycle model and found that the discords occur at early stages in the lifecycle of data and the synergies occur at the later stages. This means that for future successful RDM, the points of discord at the start of the data lifecycle must be overcome. 
We conclude by proposing key recommendations that could help institutions address both researcher and institutional RDM needs.</p> Sally Vanden-Hehir, Helena Cousijn, Hesham Attalla Sun, 23 Dec 2018 20:25:27 +0000 Are Research Datasets FAIR in the Long Run? <p class="abstract-western">Currently, initiatives in Germany are developing infrastructure to accept and preserve dissertation data together with the dissertation texts (at state level: bwDATA Diss<sup><a class="sdfootnoteanc" name="sdfootnote1anc"></a>1</sup>; at federal level: eDissPlus<sup><a class="sdfootnoteanc" name="sdfootnote2anc"></a>2</sup>). In contrast to specialized data repositories, these services will accept data from all kinds of research disciplines. To uphold the FAIR data principles (Wilkinson et al., <a class="western">2016</a>), preservation plans are required, because ensuring accessibility, interoperability and re-usability even for a minimum ten-year data retention period can become a major challenge. File formats matter for both longevity and re-usability. In order to ensure access to data, the data’s encoding, i.e. their technical and structural representation in the form of file formats, needs to be understood. Hence, due to a fast technical lifecycle, interoperability, re-use and in some cases even accessibility depend on the data’s format and our future ability to parse or render it.</p> <p class="abstract-western">This leads to several practical questions regarding quality assurance, potential access options and necessary future preservation steps. 
In this paper, we analyze datasets from public repositories and apply a file-format-based long-term preservation risk model to support workflows and services for non-domain-specific data repositories.</p> <div id="sdfootnote1"> <p class="western"><a class="sdfootnotesym-western" name="sdfootnote1sym"></a>1</p> <p class="sdfootnote-western">BwDATADiss-bw Data for Dissertations: <a class="western" href=""></a></p> </div> <div id="sdfootnote2"> <p class="sdfootnote-western"><a class="sdfootnotesym-western" name="sdfootnote2sym"></a>2 EDissPlusDFG-Project – Electronic Dissertations Plus: <a class="western" href=""></a></p> </div> Dennis Wehrle, Klaus Rechert Tue, 30 Apr 2019 17:16:14 +0100 Emerging Roles for Optimising Re-Use of Open Government Data Fanghui Xiao, Liz Lyon, Ning Zou, Robert M. Gradeck Fri, 03 May 2019 14:41:47 +0100