Two Decades, Same Story? Insights and Future Directions in Long Tail Data Curation
DOI: https://doi.org/10.2218/ijdc.v19i1.1057
Abstract
This paper examines the evolution of the concept of long tail research data in the scholarly literature. The “long tail” concept, originally used to describe “niche” digital products that hold a significant market share in aggregate, was first applied to research data in 2007 to refer to a vast array of smaller, heterogeneous data collections that cumulatively represent a substantial portion of scientific knowledge. These datasets are frequently overlooked because of inadequate data management practices and insufficient institutional support. Bridging the discussions of data curation in library & information science (LIS) and in domain-specific contexts, this paper identifies several themes in these discussions and offers insights, or provocations, that encourage researchers to rethink existing frameworks and methods and to find new approaches that would help both researchers and data professionals. This review seeks to enhance understanding of long tail data as both a concept and a field, while also informing current and future research and practice.
License
Copyright (c) 2025 Inna Kouper, Gretchen Stahlman

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright for papers and articles published in this journal is retained by the authors, with first publication rights granted to the University of Edinburgh. It is a condition of publication that authors license their paper or article under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence.