Data reuse, the reuse of scientific data to solve new research problems, covers both the reinterpretation of data collected by other researchers and the re-examination of original research data by researchers using other analytical techniques. As big data, research infrastructure, and the informatization of the research environment move scientific research toward the fourth research paradigm, data reuse has become an effective route to new scientific discovery and knowledge innovation. Its public value continues to grow as a strategic resource for national scientific and technological innovation and for scientific research infrastructure. Research on data reuse has received much attention over the past 20 years, but a knowledge system for this subject area has not yet been established, and the area lacks systematic planning and forward-looking prediction.
This study combines bibliometric methods with knowledge-mapping tools (such as HistCite and CiteSpace) to process and analyze a large body of research literature objectively and intuitively. The literature was collected from the Web of Science database using the keywords "data reuse", "data re-use", "data reusing", "reusing data", "reusing of data", "secondary data use", and "data re-usability"; the cutoff date for data collection was March 20, 2021. In total, 364 papers were included in the analysis.
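The keyword search described above can be sketched as a single query string. This is a minimal illustration only: the abstract does not say which search field was used or how the terms were combined, so the `TS` (topic) field tag, the `OR` combination, and the helper name `build_topic_query` are all assumptions.

```python
# Keywords exactly as listed in the abstract.
KEYWORDS = [
    "data reuse",
    "data re-use",
    "data reusing",
    "reusing data",
    "reusing of data",
    "secondary data use",
    "data re-usability",
]


def build_topic_query(keywords, field_tag="TS"):
    """Join quoted phrases with OR under one field tag.

    Assumption: the terms were searched as alternatives in the topic
    field; the abstract does not state this.
    """
    quoted = " OR ".join(f'"{kw}"' for kw in keywords)
    return f"{field_tag}=({quoted})"


query = build_topic_query(KEYWORDS)
print(query)
```

Quoting each phrase keeps multi-word terms such as "secondary data use" from being split into independent words by the search engine.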
The main findings and theoretical contributions of this study are as follows:
(1) Existing research on data reuse can be characterized by "two main lines", "three stages", "four forces", and "five core fields", which describe its development path, evolution process, driving factors, and research structure, respectively. In terms of development path, data reuse research proceeds mainly along two main lines, which run through three evolutionary stages: germination (before 2006), development (2007-2014), and rapid growth (2015-). The keyword co-occurrence analysis shows that data reuse research has five core fields: basic theoretical research, the relationship between data sharing and reuse, user behavior and scientific research management, data reuse ethics, and data reuse in various disciplines.
(2) The knowledge system of data reuse research consists of four layers: the guarantee platform layer, the theoretical foundation layer, the research branch layer, and the method tool layer. The development of digital scientific research and data infrastructure, changes in data behavior, scientific research evaluation, and the development of big data technology are the frontiers and growth points for developing these four layers and their methods and tools. They also constitute the four driving forces behind the in-depth development of scientific data reuse: the needs of big scientific research and the formation of a digital scientific research environment, the development of data-intensive scientific discovery, the recognition of scientific data achievements, and the development of digital technology.
(3) Future research on data reuse has a window of opportunity in five areas: the public academic value of scientific data, the behavior and mechanisms of data reuse, the influence of data reuse, data reuse policy, and data reuse in different fields. We expect the academic community to keep following these research topics and to provide theoretical support for improving scientific data reuse in practice.
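The keyword co-occurrence analysis mentioned in finding (1) rests on counting how often two keywords appear on the same paper. The sketch below shows that counting step only; it is not how CiteSpace or HistCite work internally, and the function name `cooccurrence_counts` and the sample records are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(records):
    """Count unordered keyword pairs that appear together in a record.

    Each record is the list of keywords attached to one paper.
    Keywords are lower-cased and de-duplicated within a record so that
    a pair is counted at most once per paper.
    """
    counts = Counter()
    for keywords in records:
        unique = sorted({kw.strip().lower() for kw in keywords if kw.strip()})
        counts.update(combinations(unique, 2))
    return counts


# Hypothetical keyword lists, not data from the study.
sample_records = [
    ["data reuse", "data sharing", "metadata"],
    ["Data Reuse", "data sharing"],
    ["data reuse", "ethics"],
]

counts = cooccurrence_counts(sample_records)
for pair, n in counts.most_common(3):
    print(pair, n)
```

Pairs with high counts are the candidates for clustering into core fields; the clustering itself is left to the mapping tool.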
Key words: scientific data / data reuse / fourth research paradigm / citation analysis / knowledge map