Two Decades, Same Story? Insights and Future Directions in Long Tail Data Curation

DOI:

https://doi.org/10.2218/ijdc.v19i1.1057

Abstract

This paper examines the evolution of the concept of long tail research data in the scholarly literature. The “long tail” concept, originally used to describe “niche” digital products that hold a significant market share in aggregate, was first applied to research data in 2007 to refer to a vast array of smaller, heterogeneous data collections that cumulatively represent a substantial portion of scientific knowledge. These datasets are frequently overlooked because of inadequate data management practices and insufficient institutional support. Bridging the discussions of data curation in library and information science (LIS) and in domain-specific contexts, this paper identifies several themes in these discussions and offers insights, or provocations, that encourage researchers to rethink existing frameworks and methods and to find new approaches that would benefit both researchers and data professionals. This review seeks to enhance understanding of long tail data as both a concept and a field, while also informing current and future research and practice.

Published

2025-06-05

Section

Conference Papers