Case study: Publishing a data paper - Research Data Service
Author: Stephen Gray, PhD Archaeology
Date: 2015
Version: 1.2
URI: data.bris.ac.uk
IPR: Copyright © 2015 University of Bristol

Introduction

This case study outlines the basic process and motivations for publishing a data paper within a data journal.

What is a data paper and what is a data journal?

A data paper is essentially a one- or two-page description of a publicly available dataset which spells out its re-use potential. In addition to the dataset itself receiving a digital object identifier (DOI), an associated data paper receives a DOI of its own. This means either can be cited in an academic context.

Data journals publish data papers. The majority of data journals (including Internet Archaeology) are actually ‘mixed’ (i.e. they publish both ‘traditional’ and data papers) but some dedicated data journals are beginning to emerge (Nature’s Scientific Data [1], for example).

Overview

My research involves conducting low-level aerial surveys of archaeological sites using an unmanned aerial vehicle (UAV). My data consists of sets of vertical photographs and associated metadata. The metadata allows each image to be georeferenced (located to precise spatial coordinates) and explains how and when each image was captured. This data can be used by me or by others to create orthorectified maps, 3D models of structures or digital elevation models (DEMs) of landscape topography.

Two datasets [2] from two separate surveys have been submitted to FigShare [3], a commercial, free-to-use data repository. One of these, the Blackquarries Hill Long Barrow survey dataset (see Fig 1 for the FigShare dataset record), was also written up as a data paper and the paper submitted to the journal Internet Archaeology [4].

Motivations for publishing a data paper

Publishing a data paper is a similar process to publishing an academic journal article; data is peer reviewed, amended and made publicly available under a citable unique identifier.
This process is familiar to most researchers and so contributes to the academic legitimacy of data as a valuable research output. It may be that data papers and the ‘data journals’ which publish them are a temporary phenomenon, useful only until a new and truly data-centric form of review and citation evolves. However, in my opinion it’s one of the best ways we have at the moment to demonstrate that data has value.

Because low-level aerial survey generates so much data, typically far more than is required for my immediate purposes, publishing data for open re-use is an attractive prospect for me. In order to make this possible I ask landowners for permission to publish the data. I also favour non-proprietary formats, wherever possible, so that data re-users do not need expensive software. Lastly, I ensure that the metadata I create is sufficient to support the needs of data re-users and not only my own immediate needs. If a survey dataset can be made available for re-use, I will submit it to a data repository that can provide ongoing access.

Fig 1: FigShare dataset record for Blackquarries Hill Long Barrow survey data

[1] http://www.nature.com/sdata/
[2] UAV Aerial Survey - Clifton Camp (ST56557330), http://figshare.com/articles/UAV_Aerial_Survey_Clifton_Camp_ST56557330_/1269223 and UAV Aerial Survey - Blackquarries Hill Long Barrow (ST77509320), http://figshare.com/articles/UAV_Aerial_Survey_Blackquarries_Hill_Long_Barrow_ST77509320_/1275172
[3] www.figshare.com
[4] http://intarch.ac.uk/

The data paper publication process

Publishing a data paper is very similar to publishing any peer-reviewed academic paper. Internet Archaeology ask for the information listed below. The word limit is 2000 words.
- Title
- Authorship (including contact details and ORCID identifier)
- DOI of deposited dataset
- Content of the dataset
- Background to the dataset: include context, main aims/objectives of the dataset (and/or project) and general data methodology
- Summary description (if required, e.g. if the dataset is excavation data)
- Scope (including period terms or dates and geographical context; you should also note any data 'gaps', i.e. what is not covered)
- Future work and re-use potential of the dataset, e.g. avenues of possible further analysis, integration with other datasets etc.
- Details of how the dataset relates to other publications/archives (including physical archives)
- References
- Acknowledgements and funding statement

Internet Archaeology also provide helpful guidance (http://intarch.ac.uk/authors/datapapers.html) on the submission process.

The peer review process

A data paper differs from a traditional journal article in that it explicitly credits the reviewer and makes their comments available to all. These comments include potential areas of future research. Reviewers are also looking for factual accuracy (or an identified degree of error), reusability of formats, and clearly documented and reproducible workflows for data collection and modification.

Corrections may be requested following peer review. If changes relate to the data paper, they can simply be carried out. However, if they relate to the dataset itself, authors are encouraged to speak with the host repository. In most cases it will not be possible to remove already-published datasets, in case they have already been used and cited. Instead, an amended dataset will be deposited with the repository and the metadata from the old dataset will point to the new dataset as the current version. In my case only one piece of factual information had to be changed (the National Record of the Historic Environment (NRHE) monument number) and this was within the data paper.
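The amendment route described above, where a new dataset is deposited and the old record points to it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FigShare's or any other repository's actual interface: the `DatasetRecord` class, the `superseded_by` field, both function names and the `10.9999/...` DOIs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DatasetRecord:
    """A hypothetical repository record; field names are illustrative only."""
    doi: str
    title: str
    superseded_by: Optional[str] = None  # DOI of the amended version, if any


def deposit_amendment(registry: Dict[str, DatasetRecord],
                      old_doi: str,
                      amended: DatasetRecord) -> None:
    """Add the amended dataset and point the old record at it.

    The old record is never deleted, so existing citations still resolve.
    """
    if old_doi not in registry:
        raise KeyError(f"unknown dataset: {old_doi}")
    if amended.doi in registry:
        raise ValueError(f"DOI already registered: {amended.doi}")
    registry[amended.doi] = amended
    registry[old_doi].superseded_by = amended.doi


def current_version(registry: Dict[str, DatasetRecord],
                    doi: str) -> DatasetRecord:
    """Follow superseded_by links until the latest record is reached."""
    record = registry[doi]
    while record.superseded_by is not None:
        record = registry[record.superseded_by]
    return record


# Example with made-up DOIs (assumptions, not real identifiers).
registry = {
    "10.9999/example.v1": DatasetRecord("10.9999/example.v1", "Survey data"),
}
deposit_amendment(
    registry,
    "10.9999/example.v1",
    DatasetRecord("10.9999/example.v2", "Survey data (amended)"),
)
print(current_version(registry, "10.9999/example.v1").doi)  # 10.9999/example.v2
```

The point of the design is that a citation of the original DOI keeps working: the old record stays in place and simply directs readers to the current version.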
As with all ‘gold’ Open Access publications, an article processing charge is payable; for Internet Archaeology this is £100. If research is RCUK-funded this cost is covered by the university. The final version of the data paper was published in March 2015 (see Fig 2).

Fig 2: Extract from the final data paper

Reflections on the process

Although I’ve shared my research data before, the data paper which accompanies the Blackquarries Hill data is the first I’ve created. I created it because I feel strongly that this particular site deserves to be better known. By publishing the data paper in Internet Archaeology, I’m encouraging archaeologists and heritage managers to carry out further work on the barrow.

To some degree this whole exercise has been an experiment in finding an effective way to communicate my research. It’s too early to tell, but I’ll be very interested to see if publishing a data paper leads to more downloads of the associated dataset or puts me in touch with potential collaborators. If so, I will certainly publish more data papers.

More information

- Leonardo Candela’s Data Journals: a Survey (2014) is a comprehensive list of data journals (http://onlinelibrary.wiley.com/doi/10.1002/asi.23358/full).
- Re3Data (http://service.re3data.org) is a list of research data repositories, organised by academic discipline.
- M.A. Parsons and P.A. Fox question the definition of data ‘publication’ in Is Data Publication the Right Metaphor? (https://www.jstage.jst.go.jp/article/dsj/12/0/12_WDS-042/_article).