Sharing and archiving of publicly funded research data
11/04/14

Report to the Research Council of Norway

SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM

For information on obtaining additional copies, permission to reprint or translate this work, and all other correspondence, please contact:

DAMVAD
info@damvad.com
damvad.com

Copyright 2014

Contents

1 Executive Summary
1.1 Mandate
1.2 Main findings
1.3 Recommendations
2 Sammendrag (in Norwegian)
2.1 Mandat
2.2 Sentrale funn
2.3 Anbefalinger
3 Background
3.1 Mandate
3.2 Context
3.2.1 Data is vital to research
3.2.2 Growing consensus on the importance of sharing publicly funded data
3.3 Structure of the report
4 Former studies used to develop the hypotheses
4.1 The consensus on the importance of access to data
4.2 Lack of recognition, time and proper infrastructure
4.3 Variations across disciplines and ages
4.4 Input from researchers and data managers in Norway
4.5 Hypotheses
5 Methodology
5.1 Conceptual clarifications
5.1.1 Scope
5.1.2 Financing
5.1.3 Research data
5.1.4 Archiving
5.1.5 Open access to research data
5.2 Selecting the population
5.3 Survey process
5.4 Response rate
5.5 A significant proportion of the researchers actively chose not to participate
6 Descriptive statistics
7 Researchers use data generated by other researchers
7.1 Data formats vary across research disciplines
7.2 Numerical data are easier to restore
7.3 Researchers frequently use other researchers’ data
7.4 Researchers mainly use data produced by other researchers from the same institution
7.5 Researchers would like even better access to other researchers’ data
8 Research data is rarely archived in data centres
8.1 Most data is archived on portable storage units or institutional servers
8.2 Storage reflects costs of recreation
8.3 Most researchers are satisfied with their current archiving solution
8.4 Those who are not satisfied point to security risks
8.5 Archiving activities are financed as a part of project- and institutional funding
9 Most researchers share research data
9.1 Researchers are positive to the principle of open access
9.2 Many researchers are left undecided
9.3 Health trusts are positive towards the effects of sharing data on research
9.4 Researchers share their research data, but upon request
9.5 More openness within humanities
9.6 More openness among more experienced researchers
10 Lack of time, infrastructure and incentives hamper further sharing of data
10.1 Variety of barriers
10.2 Relatively small differences across sector
10.3 Textual records are more sensible
10.4 Researchers see little support from management
10.5 Limited institutional support
10.6 Researchers call for better infrastructure, citation systems and guidelines
10.7 Researchers working internationally find time to be a bigger challenge
10.8 Researchers welcome data sharing as a part of publishing
11 Main findings and recommendations
11.1 Main findings
11.2 Recommendations
References
Appendix

1 Executive Summary

1.1 Mandate

Data is an important asset in the knowledge society and is vital to research. Open access to research data allows data to be used for different purposes, including purposes other than those originally intended. Sharing and archiving of data allows for further research, re-analysis, validation and research cooperation on complex matters. Consequently, open access to research data can enable both new research and innovation and the dissemination of knowledge.

The debate about open access to research data is by no means new. It has intensified in recent years due to a growing amount of data and the growing possibilities offered by information technology, along with growing recognition of the value of data.

The Organisation for Economic Co-operation and Development (OECD) has developed guidelines on the sharing of publicly funded research data. Publicly funded research data could be considered a public good, and as such should be available to the greatest extent possible, not reserved for the individual researcher or institution.

Nonetheless, the sharing and archiving of research data faces technical, financial, legal and cultural obstacles and questions that remain unanswered.

The objective of this study is to gain a better understanding of the current practice of researchers in Norway in sharing and archiving research data, as well as of the barriers to doing so. The study also proposes possible approaches to overcome these barriers.

The study will serve as a contribution to the Research Council of Norway's work on developing a strategy and guidelines for sharing and archiving of publicly funded research data in Norway.

1.2 Main findings

Overall, the findings in this report support those of other international surveys.

A total of 1,474 researchers completed the survey. This constitutes 21.8 percent of the selected survey population. Another 604 researchers actively indicated that they did not want to participate in the survey. In total, that is a response rate of 30.6 percent. An analysis of respondents indicates that they are highly representative across institution types and subject areas.

Norwegian researchers frequently use and share research data with each other. As many as 64 percent of researchers had used research data from other researchers in the last three years.

The researchers mostly used research data generated by other researchers from the same institution, though this is closely followed by data from researchers at institutions outside Norway and from other researchers nationally.

The remaining 36 percent of researchers report that they have not used data gathered by other researchers. Of these, 71.5 percent report that they would have liked to make use of other researchers’ data. The numbers indicate untapped potential for increased and improved sharing of data.

Only 10 percent of the researchers had neither used research data generated by other researchers over the past three years nor wished to use data generated by others.

The survey confirms that researchers in Norway see the benefits of the sharing and archiving of research data. Around 80 percent of the respondents agreed that open access to research data enhances research, and that it is an ethical obligation of research to make research data available for validation. These are also the two reasons for open access that most researchers agreed with.

Further, 77 percent agree that open access to research data facilitates the education of students and new researchers, and 74 percent agree that open access stimulates research collaboration.

Although most researchers agree on the benefits of sharing data, many researchers are also undecided about whether publicly funded research data should be considered public property. The remaining 20 percent do not agree that open access to research data will enhance research: 15 percent are undecided and around 5 percent disagree. This high proportion of undecided researchers may reflect the complexity of the issue and the distance between good intentions and practical solutions that address storage, ownership and credit, replicability of use, and other obstacles.

The survey included an open answer option where respondents could write free text. Inputs in this section show that many researchers find the issue of open access challenging and complex.

Most researchers share their research data with other researchers. Yet research data is generally shared under certain conditions (e.g., only upon request, under a non-disclosure agreement, in an anonymized format). Researchers want to control who gets access to their data and how it is used. With each researcher setting the terms, there is a risk that the researcher becomes a gatekeeper.

When asked about the barriers to sharing even more of their data, researchers emphasized the following:
1. Preparing data for open access takes up valuable time.
2. I do not have an adequate technical infrastructure.
3. Open access to research data might reduce my options for scientific publications in the future.

These responses indicate, inter alia, that before they can embrace the idea of sharing data, researchers need adequate and user-friendly infrastructure, guidelines and procedures, and certainty about intellectual property rights.

Contrary to our hypothesis, we did not find any major differences across sectors, fields of research or years of professional experience.

The study further finds that 85 percent of the respondents archive their data on their own devices or on an institutional server. The figures do not vary across sectors, disciplines or scientific experience.

The survey responses suggest significant differences in the way in which research managers deal with the sharing and archiving of data. Consequently, researchers see a need for greater institutional support.

1.3 Recommendations

The study reveals multiple obstacles and, therefore, that there is no single solution as to how to increase the sharing and archiving of research data. Both this and former studies suggest that there is a need for
work directed at the level of researchers, data managers and research funders, as well as government/international levels.

Overall, researchers agree in principle as to the value of sharing data. However, increased sharing is hampered by uncertainty about how to go about it technically, and by the feeling that it takes valuable time away from research and that it will reduce academic credentials.

The flip side of these barriers is a set of possible solutions. These include:
- Better infrastructure.
- Implementing a system for citation.
- Implementing guidelines, training and standards for sharing data.

The Research Council of Norway can play a key role. Specific recommendations include raising awareness, finding ways to recognize data sharing, putting in place standards, rules and best practice, providing technical infrastructure, and making funding available for necessary infrastructure and training.

Our recommendations are summarized in Figure 1.

First, we suggest that the Research Council of Norway actively work to raise awareness of the benefits and pitfalls of the archiving and sharing of research data.

In particular, it is important to exemplify potential opportunities and their value, inter alia by using best practice cases. Focus should be placed on showing that sharing and archiving is also worthwhile for researchers.

It is also important to communicate that the archiving of data does not necessarily imply full open access to research data for all, but should be seen more as a premise for the sharing of data.

Second, our study indicates that a lack of incentives for, and crediting of, data sharing is a barrier. This could be addressed by clarifying and implementing a system for citation, but also by outlining the inherent responsibility and expectation on the part of researchers.

For example, the Research Council of Norway can introduce a requirement for data management plans and support the implementation of systems for crediting, in order to raise awareness, experience and recognition among researchers. Ideally, such measures should be easy to use, similar to international systems, and work alongside the system for scientific publication.

Third, many researchers lack knowledge as to what data to share and archive and how to do so. This includes information about what form the data should be archived in and how proper information about the data should be assigned.

There is a need for guidelines, standards and training on the sharing and archiving of research data. Defining what data to share and what is worth archiving (or not) could help clarify the debate. These guidelines and standards should be developed in close interaction with researchers, institutions and legal experts. Such work should be inspired by work initiated internationally, to avoid creating a Norwegian bureaucracy alongside international standards.

Furthermore, selective investments in infrastructure and technical skills are necessary. Both interviews and studies suggest that the infrastructure for sharing and archiving data is fragmented, overlapping and insufficient. Our study also suggests that many researchers archive most of their data on their own servers or portable computers.

Better infrastructure could increase the motivation for archiving data at data archiving centres. This could provide a more secure means of archiving data, and the data could be more easily restored. Finally, archiving will lay the ground for the sharing of more research data.
Infrastructure investments should involve all relevant stakeholders while also ensuring a robust infrastructure which will serve the needs of the future.

FIGURE 1 Problems, solutions and recommendations
Source: DAMVAD

2 Sammendrag (summary in Norwegian, translated)

2.1 Mandat (Mandate)

Data is a valuable resource in today's knowledge society. Open access to data makes it possible to use data for different purposes and for purposes other than those originally intended. Open access to data can thus lay the foundation for the development of new products, new services and democracy.

Data is also a central foundation for research. Sharing and archiving research data allows for further research, replication of analyses, validation and research collaboration on complex issues.

The debate on open access to research data is not new. It has, however, intensified in recent years owing to a growing amount of data and new opportunities for analysing large data sets that follow from technological development.

The Norwegian authorities, together with international organisations such as the OECD and the EU, want to promote more sharing and archiving of publicly funded research data.

There are two main reasons for this. First, publicly funded research data can be regarded as a public good that should be used to the greatest extent possible and not be reserved for the individual researcher or institution. Second, better use of research data can strengthen the quality of Norwegian research and its use of resources.

A number of studies show that the sharing and archiving of research data involves technical, financial, cultural and legal obstacles. The aim of this study is to gain a better understanding of how researchers in Norway share and archive research data, and of the challenges they face in sharing and archiving more.

The study will serve as a knowledge base for the Research Council of Norway in its work on developing a strategy for the sharing and archiving of publicly funded research data in Norway.

2.2 Sentrale funn (Main findings)

The study is based on a survey of researchers in Norway. A total of 1,474 researchers completed the survey, and 604 researchers actively indicated that they did not wish to take part.

The survey shows that many Norwegian researchers use and share research data with one another. As many as 64 percent of the researchers surveyed have used research data from other researchers in the last three years.

Researchers mainly use research data generated by other researchers at the same institution, closely followed by data from researchers at institutions outside Norway and from researchers at other institutions in Norway.

Conversely, 36 percent of the researchers had not used other researchers' data. Of these, 71.5 percent state that they would like to use other researchers' data. This indicates clear potential for more data sharing.

Only 10 percent of all the researchers have neither used data generated by other researchers in the last three years nor wish to use data generated by others.

The survey confirms that researchers in Norway see the benefit of sharing and archiving research data. Around 80 percent of the respondents agree that open access to research data strengthens research, and that it is an ethical obligation to make research data available for validation. These are the two reasons for open access that most researchers agree with.

Further, 77 percent of the researchers agree that open access to research data benefits the education of students and new researchers, and 74 percent agree that open access stimulates research collaboration.

Although most researchers agree on the benefits of sharing data, our survey also suggests that many researchers are unsure about those benefits. Twenty percent do not agree that open access to research data will strengthen research: 15 percent are unsure and around 5 percent disagree. This high share of unsure researchers may reflect the complexity of the issue.

The free-text answers in the survey also reveal that many researchers find the question of open access challenging, and that many are positive and many negative towards open access.

Most researchers share their research data with other researchers; 64 percent have used data generated by other researchers in the last three years. Research data is, however, generally shared under certain restrictions (only on request, under a confidentiality agreement or in anonymised form). The findings suggest that researchers want to control who gets access to their data and how the data is used.

When asked about the obstacles to sharing even more of their data, the researchers answer that the central barriers are:
- Preparing data for open access takes up valuable time
- No access to adequate technical infrastructure
- Open access to data may reduce the opportunity for scientific publications in the future

Respondents are concerned that making data public will require resources for which they cannot be compensated.

There may be several reasons why researchers judge that preparing data takes up valuable time. Examples are a lack of access to suitable infrastructure, and limited use and knowledge of standards and guidelines on which research data should be shared and how.

Contrary to what we expected, we find few differences in experiences and barriers across sectors, research fields or years of scientific experience.

Archiving research data is central to validating research results. Archiving can also facilitate re-analysis and further research if the data is made available.

The study further finds that 85 percent of the respondents archive their data locally, either on their own portable data storage device or on an institutional server. The share varies little across sectors, disciplines or years of experience.

Our study suggests that researchers rarely feel that their institution's management encourages and facilitates the sharing and archiving of data. Only 12 percent state that their management encourages data sharing “to a high degree”. Only 4 percent state that their organisation has the necessary solutions and guidelines for sharing data “to a high degree”.

The survey shows a strong connection between the barriers researchers experience to sharing data and the solutions they recommend for more sharing. Researchers see a need for:
- Better infrastructure
- Systems for citing and crediting data
- Development of guidelines, training and standards for sharing data

Again, we find only minimal differences across sectors, disciplines and scientific experience.

2.3 Anbefalinger (Recommendations)

Our study suggests that there are several obstacles to the sharing and archiving of data. There is therefore no single solution for how data can be shared and archived to a greater extent. Both this and earlier studies suggest a need for work at several levels, directed at researchers, research institutions, data centres and research funders, and at government level.

Many researchers agree with the principle of sharing data. At the same time, the survey shows that sharing is held back because it takes up valuable time, because infrastructure is lacking, and because sharing may reduce the opportunity for future publication. The Research Council can play a key role in overcoming these barriers.

First, we recommend that the Research Council work actively to raise awareness of the benefits and pitfalls of archiving and sharing research data. We recommend spreading knowledge and awareness of the opportunities that more data sharing offers, while also communicating that archiving data does not necessarily imply fully open access to the data for everyone.

Second, our study shows that researchers lack incentives to share data. We recommend that the Research Council work on implementing a system for citing data, and that such a system be designed in line with international practice.

Third, our study indicates that researchers lack knowledge about which data should be shared and archived, and how. There appears to be a need for guidelines, standards and training on sharing and archiving research data. We recommend that guidelines and norms be developed in close interaction with researchers, institutions and legal experts, and that this work draw on work already started internationally, to avoid creating a Norwegian bureaucracy alongside international standards.

Fourth, our study suggests that researchers are largely satisfied with their archiving solutions. Even so, many see a lack of infrastructure as an obstacle to more storage and archiving. Both interviews and studies suggest a need for more, better and better-adapted infrastructure for sharing and archiving. The need for infrastructure is also supported by the fact that many researchers archive their research data locally, either on their own data storage devices or on institutional servers.

Better infrastructure could increase the motivation to share and archive data. Infrastructure investments should involve all relevant stakeholders while also ensuring a robust infrastructure that will serve future needs.
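The headline percentages quoted in the summary can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the survey population is not stated in the summary, so it is backed out from the rounded 21.8 percent figure (an assumption), and the variable names are hypothetical. Under that assumption the response rate including active refusals comes out within a few tenths of a percentage point of the reported 30.6 percent, and the share of researchers who neither used nor wanted other researchers' data comes out close to the reported 10 percent.

```python
# Figures quoted in the summary.
completed = 1474          # researchers who completed the survey
declined = 604            # researchers who actively declined to participate
completed_share = 0.218   # completed surveys as a share of the survey population

# Assumption: the population size is not given in the summary, so it is
# approximated here from the rounded 21.8 percent figure.
implied_population = completed / completed_share

# Response rate when active refusals are counted as responses.
response_rate = (completed + declined) / implied_population

# Share of all researchers who had not used others' data (36 percent)
# and, among those, did not say they wanted to (100 - 71.5 percent).
non_users = 0.36
wanted_access = 0.715
neither_used_nor_wanted = non_users * (1 - wanted_access)

print(f"implied survey population: about {implied_population:.0f}")
print(f"response rate including refusals: {response_rate:.1%}")
print(f"neither used nor wanted others' data: {neither_used_nor_wanted:.1%}")
```

Because the inputs are rounded percentages, small deviations from the published figures are expected; the check only shows that the reported numbers are mutually consistent.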
12 SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM 3 Background This chapter presents the background and context sociological obstacles. While overall policy goals for the report. and benefits are agreed, many questions still stand in the way of effective and successful implementa- 3.1 Mandate tion of the principles of open access to research data. Data constitutes knowledge and is a valuable asset in the knowledge society. Sharing of research data The Norwegian Government has mandated the Re- allows for the use of data for purposes other than search Council of Norway to explore and facilitate originally intended, linking of data across different work on sharing and archiving of research data. data sets and validation of data. Accessible data also underpins democratic processes by making in- The Research Council of Norway is the National formation available to a wider audience. strategic and funding agency for research activities in Norway. The goal of the Research Council is to Retrieving information and allowing new genera- strengthen the Norwegian research and innovation tions of researchers to “stand on the shoulders of system and its infrastructure through the effective giants” is the very essence of research (PARSE.In- use of public resources. As previously, noted, en- sight. 2012). hanced access to research data can be seen as a measure to help achieve these goals. The Norwegian Government - alongside international organizations such as the OECD and EU – The Research Council of Norway is also principal seeks to promote more sharing and archiving of re- source of expertise and advice on research policy search data. In its most recent White Paper 1 on re- for the Norwegian Government, the central govern- search policy, the Ministry of Education noted that: ment administration and the overall research community, including universities, research institutes “Better access to research data helps facilitate re- and health trusts. 
search and to increase the quality of research. The government wishes to facilitate increased availabil- In the autumn of 2012, the Research Council of Nor- ity of publicly funded research data.” way initiated an internal project called "Principles for open access to publicly funded research data", led Better utilization of research data could thus by the Department for Research Infrastructure. The strengthen the quality of Norwegian research and main objective of the project was to provide a ensure a more efficient use of resources. Conse- knowledge base for further work shaping the Coun- quently, enhanced access to research data is a key cil's policy in line with the OECD guidelines from measure to reach overall research policy objectives. 2007. 2 Yet archiving and sharing of research data brings to A working group has been formed and a number of the fore a number of technical, financial, legal and activities are being undertaken in close cooperation 1 Meld. St. 18 (2012–2013) Report to the Storting, “Long lines - 2 knowledge provides opportunities” freely translated by DAMVAD. OECD (2007) Principles and Guidelines for Access to Research Data from Public Funding SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM 13 with the research communities and data managers 3.2 Context to explore how open access to research data can be strengthened. 3.2.1 Data is vital to research It is in this context that the Research Council of Nor- Research can be defined in many ways. In the way has commissioned DAMVAD to undertake a OECD Frascati manual3 research is defined as survey among researchers in Norway. The objective of the survey is to gain a better understanding of re- "(…) creative work undertaken on a systematic ba- searchers’ current practices and position regarding sis in order to increase the stock of knowledge, in- the archiving and sharing of research data. 
cluding knowledge of man, culture and society, and the use of this stock of knowledge to devise new ap- The topic clearly involves a broad range of stake- plications." holders, including the government, research organizations, researchers, research institutes and civil Other definitions can be used, but regardless of the society. This study exclusively investigates the definition applied, data remains a vital part of re- viewpoint of researchers. search. The two overall questions investigated in this study Data is vital to researchers in investigating events, are: features, and correlations, in adjusting findings from 1. How do researchers in Norway share and archive research data? previous research, solving new or existing problems, supporting theorems and developing new theories for the benefit of society. 2. What are the obstacles to the increased sharing and archiving of research data? The debate on open access to research data is not new. The concept and related policy goals were in- Based on results and analysis on the two questions, stitutionalized by the establishment of the World the study discusses measures to reduce or over- Data Centre system, in preparation for the Interna- come identified barriers. tional Geophysical Year of 1957-1958.4 The study feeds into the Research Council of Nor- The International Council of Scientific Unions (now way’s work on developing strategies and guidelines the International Council for Science) established for sharing and archiving of research data in Nor- several World Data Centres to minimize the risk of way. data loss and maximize data accessibility, further recommending in 1955 that all research data should be made available in machine-readable form.5 OECD (2002) “Frascati Manual: proposed standard practice for surveys on research and experimental development”, 6th edition. Retrieved 27 May 2012 from www.oecd.org/sti/frascatimanual. 3 National Research Council (2008). 
“Earth Observations from Space: The First 50 Years of Scientific Achievements.” The National Academies Press. 4 5 World Data Center System (2009-09-18). "About the World Data Center System". NOAA, National Geophysical Data Center. 14 SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM The debate on open access to research data has been intensified in recent years following the growing amount of data and growing number of data pos- FIGURE 3.1 Data management as an integrated part of Research life cycle sibilities offered by information technology. The rapidly increasing amount of data allows for the analysis of complex issues involving large datasets. New technology generates big data which carry significant data analysis opportunities, but also challenges in terms of storage, communication and processing software, and ownership issues. Examples include information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, etc., generates “big data”.6 Information technology and the Internet have increased the amount of available data. This implies new and more extensive opportunities for collecting, analysing, storing and sharing data. Information technology has also affected the way in Source: JISC Research 3.0: driving the knowledge economy. which research is done. Science has become more collaborative, data-intensive and computational, This new, data-intensive research environment of leaving academic researchers with new data man- scientific study has been called the “fourth para- agement needs that have to be addressed as an in- digm” of scientific inquiry, where “all science litera- tegrated part of the data lifecycle.7 ture is online, all of the science data is online and they interoperate with each other” (Hey et al. 
2009):8

“We must all accept that science is data and that data are science, and thus provide for and justify the need for the support of much-improved data curation”9

6 “Big data” is a term used for large and complex data sets; see, for example, http://mike2.openmethodology.org/wiki/Big_Data_Definition.
7 JISC Research 3.0: driving the knowledge economy and Tenopir et al. (2011)
8 Tony Hey, Stewart Tansley and Kristin Tolle, eds. (2009): “The Fourth Paradigm: Data-Intensive Scientific Discovery”
9 Hanson, Sugden & Alberts (2011): “Making Data Maximally Available”, Science Vol 331 11

3.2.2 Growing consensus on the importance of sharing publicly funded data

There is growing consensus that data in all its forms represents a significant resource in today's knowledge society. Access to data, and the infrastructure allowing for the utilization of data, has become a resource that should be protected and utilized in an efficient manner.

A growing number of governments, organizations and research funders are actively working to increase openness to data. This is not limited to research data but extends to all publicly funded data.

The relevance of sharing publicly funded research data rests on two arguments.

Firstly, publicly funded research data should be utilized to the greatest extent possible and not be reserved for individual researchers or institutions. Further, open access to research data can be a means of utilising resources more efficiently.

“Sharing and open access to publicly funded research data not only helps to maximize the research potential of new digital technologies and networks, but provides greater returns from the public investment in research.”
OECD Guidelines (2007)

In 2004, the governments of the 30 OECD countries as well as China, Israel, Russia and South Africa adopted the “Declaration on Access to Research Data from Public Funding”. In this declaration, they recognized the importance of access to research data and invited the OECD to develop a set of OECD guidelines based on commonly agreed principles to facilitate optimal cost-effective access to digital research data from public funding.

The OECD principles were endorsed by the OECD Council in December 2006 and published in 2007. The OECD "Principles and Guidelines for Access to Research Data from Public Funding" (2007) essentially recommends that research data generated through publicly funded research be made publicly available to others:

“The value of data lies in their use. Full and open access to scientific data should be adopted as the international norm for the exchange of scientific data derived from publicly funded research.”
National Research Council study, Bits of Power. Cited in the OECD Guidelines (2007)

A “recommendation” is a legal instrument of the OECD that is not legally binding and which is often referred to as “soft law”. As such, there are no legal obligations towards publishing data. However, when a recommendation is endorsed by a country, the country is obligated to work towards fulfilling that recommendation.

The Norwegian government has endorsed the OECD guidelines in, for example, the previous white paper on research from 2009:

“Increased availability of research data, both in Norway and in the partner countries, helps to facilitate research and disseminate knowledge across borders. This is fundamental to the quality and something the government wants to facilitate. The Government intends to follow up on the OECD principles and guidelines for access to publicly funded research data.”
St. Meld 30 (2008-2009) Report to the Storting, “Climate for research.” Freely translated by DAMVAD
Alongside the work at the OECD level, the European Commission is also working towards more openness of research data. Efforts have been made both in terms of building competences10 and infrastructure as well as in developing European-wide policies and guidelines.

In 2010, the High Level Expert Group on Scientific Data submitted its report “Riding the wave. How Europe can gain from the rising tide of scientific data” to the European Commission.

“Riding the Wave” offers a vision of how Europe, through the efficient use of research resources, can strengthen research and innovation in Europe and, thereby, strengthen Europe’s competitiveness in the global economy. Since the beginning of its Seventh Framework Programme (FP7) for research and innovation in 2008, the European Commission has operated an Open access pilot to ensure open access to research publications from the FP7-funded projects.

Based on these experiences, the European Commission has communicated that not only publications but also research data from the EU-funded projects should be openly available (when possible) in the future.

In December 2013, the European Commission published “Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020”.11 Such initiatives are likely to affect Norwegian researchers in the times to come.

Although the OECD guidelines have been endorsed in Norway, it has largely been left to the various institutions and disciplines to develop methods for implementing them.

In addition, the infrastructure for sharing and archiving research data in Norway is fragmented, with a decentralized system of local, regional, national and international data centres. There are wide variations between different subjects and disciplines.

The government of Norway and the Research Council of Norway now see a need for more coordinated efforts to ensure that more data are shared and archived. However, knowledge as to practices and the obstacles faced is needed. This study will serve as input for such work.

3.3 Structure of the report

Following this chapter on the mandate for and context of the report, the report provides a brief summary of the main findings from former studies and interviews in Chapter 4. Chapter 5 gives a detailed description of the methodology applied, covering conceptual clarifications, the selection of the population and the survey process, etc.

The results of the surveys are presented in Chapters 6 through 10. The results are presented in the following order: presentation of the respondents, the respondents’ practices regarding data usage and generation, the respondents’ practices regarding data archiving and, last but not least, the respondents’ practices and obstacles in relation to the sharing of data. The main findings and recommendations are included in the final chapter.

10 For example, through the funding of Parse.Insight, a two-year project co-funded by the European Union under the Seventh Framework Programme. It was concerned with the preservation of digital information in science, from primary data through analysis to the final publications resulting from the research.
11 http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf

4 Former studies used to develop the hypotheses

Numerous studies have devoted themselves to the definition and importance of sharing research data (Borgman 2012, Kowalczyk & Shankar 2011).

Several studies have addressed the technical aspects of infrastructure and data management (Tenopir 2012, Graaf et al. 2011), while strategy papers and policy documents have focused on the research process and proposed policies for the promotion of data sharing (PARSE.Insight 2012, EC 2012, Hey et al. 2009).

Various studies have also focused on the practices of and barriers to sharing and archiving from the viewpoint of researchers.

This chapter summarizes the main findings from studies dealing with the current practices of and obstacles to sharing and archiving from a research point of view. The findings from previous studies have yielded significant insights into the matter and have been used in the development of the hypotheses and questions of our study.

4.1 The consensus on the importance of access to data

Several studies find that researchers acknowledge the benefits of open access to research data that is publicly funded.

A European Commission12 study from 2012 on “scientific information in the digital age” found strong support for open access to publicly financed research data among various stakeholders. Nine out of ten respondents stated that research data that is publicly available and publicly funded has to be - as a matter of principle - available for re-use and free of charge on the Internet.

Similarly, studies find support for the importance of archiving data. PARSE.Insight's (2012) European study on data archiving (preservation) concurs that the preservation of research output is important, the reasons being that it may stimulate the advancement of science and that it allows for the re-analysis and validation of research13.

4.2 Lack of recognition, time and proper infrastructure

Previous studies show, however, that data are often unavailable for various reasons. One of the key challenges for sharing research data concerns the legal issues involved. Data must be stored and shared in a way that safeguards privacy. Laws and regulatory policies in this area comprise provisions that have their origin in general social considerations and the need to protect citizens.

There may also be other legal challenges relating to who owns the rights to data when multiple funders are involved in a given research activity.

12 The EC Online. The online survey on “scientific information in the digital age” was open from July 2011 to September 2011. The team received 1,140 responses in total from all Member States, except Ireland, Malta, Slovenia and Slovakia. 37 percent of all responses were submitted by German respondents. The responses represented the different stakeholders, 429 of which were individual researchers; six respondents (not limited to researchers) hailed from Norway.
13 Apparently, validation of research is a growing global concern, see “Trouble at the Lab”, The Economist (October 2013).
14 In Tenopir et al. (2011), the survey was open from October 2009 to July 2010. Initially, the investigators used a snowball sampling method. They sent an email cover letter to DataONE team members (about 35 individuals throughout the world, but primarily in the United States). To increase international response, surveys were sent by an academic publisher to its database of over 7,000 previous authors. Ultimately, 1,329 respondents answered at least one question. It is not unreasonable to estimate that the survey instrument reached 15,000 people, in which case the response rate was approximately 9 percent.

Tenopir et al. (2011) conducted a survey among 1,329 scientists,14 exploring current data sharing practices and perceptions of the barriers to - and enabling of - data sharing.
In this survey, the principal reasons stated by scientists for not sharing data were insufficient time and a lack of funding.

The respondents in the EU study referred to previously also stated funding as a central barrier. In addition, the lack of credit given to researchers for making data available was raised as a concern. Most of the researchers in the EU study (81.1 percent) rated insufficient credit as a very important or important barrier to accessing research data, followed by lack of funding to develop and maintain the necessary data infrastructures (78.7 percent) and insufficient national or regional strategies (74.6 percent).

The European-wide study Parse.Insight15 found that researchers often had major concerns about legal issues, misuse of data and incompatible data types, all of which interfered with data-sharing practices.

Enke et al. (2012) found a diverse mix of both technological (e.g., a lack of appropriate databases/mechanisms) and sociological (e.g., time, funding, etc.) causes that may impede scientists from sharing data. The main reason for not sharing data (cited in their international survey on data sharing in the field of biodiversity) was “loss of control” over the data, followed closely by the amount of time that would need to be invested in sharing data sets.

Studies indicate that sharing research data reflects personal factors, such as attitudes and culture. Tenopir et al. (2012) found that barriers to sharing research data were deeply rooted in the practices and culture of the research process, as well as in the researchers themselves. These factors can include privacy concerns, concerns about publishing opportunities, and the desire to retain exclusive rights to data.

Many journals require authors to share their data with other investigators, either by depositing the data in a public repository or else by making it freely available upon request. Caroline J. Savage and Andrew J. Vickers (2009) endeavoured to determine how well authors comply with such policies by requesting data from authors who had published in one of two journals that had clear data-sharing policies. They received only one of 10 raw data sets requested. This suggests that journal policies requiring data sharing do not lead to authors making their data sets available to independent investigators.

Researchers who choose to withhold datasets often have specific reasons for doing so. Savage and Vickers (2009) noted that these reasons included concerns about patient privacy (for medical fields), concerns about future publishing opportunities and the desire to retain exclusive rights to data that had taken many years to produce.

The studies presented above have provided insights into research practices and views regarding sharing and archiving.

Nonetheless, the studies suggest that sharing entails various barriers. We sum up the findings in a simple illustration in Figure 4.1. There seems to be a diverse mix of barriers involved, of which privacy issues, losing control over data, lack of credit, time for preparation and lack of proper infrastructure appear to be the most important (highlighted in Figure 4.1).

15 Parse.Insight (2010) was a two-year project co-funded by the European Commission under the Seventh Framework Programme (FP7) on Research Infrastructures. Major surveys were held within three stakeholder domains: research, publishing and data management. The survey includes 1,389 responses from researchers, 262 responses from data managers and 178 responses from publishing. All parts of Europe were represented in these surveys.

4.3 Variations across disciplines and ages

Although the studies often highlight many of the same obstacles, the importance of each barrier might differ between the studies. This can be an indicator of differences in the nature of barriers across the respondent groups, but could also be a consequence of the different methodologies applied. Surveys are typically sensitive to the way questions are articulated and the context in which they are placed.

Consequently, one should be careful when comparing different studies. This said, some studies have investigated the differences between different types of respondents within the same study and still found variations, especially across disciplines and age ranges.

Some research disciplines are typically more reluctant to share data than others. Tenopir et al. (2011) found that the actual rate of data sharing varied considerably according to subject discipline, age, and geographic location. Researchers in medicine and social science were the least likely to share data. Atmospheric scientists were the most inclined to make their data available to others.

Interestingly, when asked whether a lack of access to other researchers' or institutions' data was a major impediment to their research, social scientists agreed that this was the case more than other respondents (80 percent compared to 60 percent across disciplines).

Such a lack of data sharing may also be a question of competition. Campbell et al. (2002) found that fields with increased opportunities for commercial applications, such as genetics, were less likely to share data when compared to less competitive fields.

Younger researchers tend to be less likely to share data. This may be due to concerns regarding their career path. Tenopir et al. (2011) found differences in responses based on the age of respondents. Younger people were less likely to make their data available to others, whereas people above 50 years old showed more interest in sharing their data.

FIGURE 4.1 Barriers to sharing research data

Legal
•Privacy
•Shared ownership of data
•Lack of knowledge of legal issues related to data

Sociological
•Lack of incentives/credit to researcher
•Concerns about researchers freeriding on data gathered by other researchers
•Fear of losing control over data
•Fear of losing scientific edge
•Fear others might not understand data

Technical
•Lack of infrastructure
•Sharing data is time-consuming
•Lack of standards for sharing and preparing metadata
•Lack of technical skills

Source: DAMVAD based on Tenopir et al. (2011), Enke et al. (2012), EC (2012), Kvale (2012), PARSE.Insight (2010).

These results correspond to findings in Kvale's (2012)16 study of life science researchers in Norway. Kvale (2012) found the argument that publicly funded research should become public property to be stronger among researchers with more experience. However, the proposition that the sharing of research data might stimulate inter-disciplinary collaborations stood out as an argument with much stronger support among younger researchers than more experienced ones.

4.4 Input from researchers and data managers in Norway

As part of preparing this report, DAMVAD participated in a workshop organized by the Research Council on sharing and archiving of research data in Norway in October 2013. Interviews and participation in this workshop provided certain insights and allowed for a detailed discussion of the practice of sharing and archiving in Norway.

Further, informants amongst researchers and data managers in Norway offered additional insights into the barriers to sharing and archiving in the Norwegian context.

Many of the barriers (such as issues relating to privacy, lack of credit and time) identified in the former studies were confirmed. Data managers are also typically concerned about the technical aspects, describing the Norwegian data management infrastructure as fragmented and overlapping. The informants point to a lack of central coordination, a lack of established standards for data formats and metadata17, and a lack of professional data curators responsible for facilitating sharing and archiving on behalf of researchers.

A list of informants is included in the appendix.

4.5 Hypotheses

The international literature, workshop, and preliminary interviews served as a basis for the formation of hypotheses to explore through the survey. We present the hypotheses in Table 4.1.

Together, the different hypotheses allow for a detailed analysis of the practice of sharing and archiving of research data in Norway, what the main barriers to sharing and archiving are, and how these barriers can be reduced.

16 A survey was conducted by Kvale as a part of her Master's thesis on data sharing in the life sciences of researchers at the Norwegian University of Life Sciences in 2012. The questions in the survey were largely similar to the questions included in the Parse.Insight survey in 2009. Of the 650 researchers and PhD students at the Norwegian University of Life Sciences (UMB) selected as a sample population for the questionnaire, 147 respondents (or 23 percent) replied.
17 Metadata is "data about data", i.e. information or content of data.

TABLE 4.1 Hypotheses to be tested in the survey
Researchers see the benefit of accessing other researchers' data, but want to retain control of their own data.
There are various barriers to sharing research data (legal, technical, ethical and financial).
Some data cannot be shared; nonetheless, a lack of incentives, time and infrastructure remain central obstacles to sharing research data.
Research data is archived for later reanalysis and validation.
Sharing and archiving activities are financed as a part of research project funding.
The barriers differ significantly between sector, discipline and age.
Younger researchers are more negative about sharing research data than older scientists.
Researchers in the institute sector are more concerned with future revenue, whereas researchers in the university sector are more concerned with losing scientific edge.
Researchers in disciplines using numerical data are more experienced with sharing data.
Internationally-oriented researchers are more open to sharing data than those that primarily work alone.
Management supports the sharing and archiving of data.
Work to increase sharing and archiving of research data needs to take place on many levels: policy level (guidelines, standards etc.), infrastructure/data management level, institutional level and research level.
Source: DAMVAD

5 Methodology

This chapter describes the methodology of the survey: definitions used, how we selected the survey population, and analysis of respondents.

5.1 Conceptual clarifications

This study contributes to a field of growing interest from researchers, research organizations, governments and civil society. Several studies have sought to investigate the field from a range of angles using different methodologies, concepts and terms.

We have largely used the terms and definitions offered in the OECD guidelines and made the delineations necessary to make the study relevant for the work of the Research Council of Norway.

5.1.1 Scope

The researchers relevant to the study included researchers working at research institutes, universities and university colleges and health trusts (Helseforetak) in Norway. Researchers outside such institutions (e.g., researchers employed in private companies) are not included in the survey. This delineation ensured that the study focused on the activities of those researchers one might expect to be publicly financed.

5.1.2 Financing

The study will serve as input to the Council's work on drawing up guidelines for publicly funded research data. The survey has therefore sought to focus on publicly funded research data, not on data that have been gathered for other reasons (such as for commercialization). This is in line with OECD guidelines.

As an introduction to the survey, we informed the respondents about the definition of publicly funded research data:

Publicly funded research data is defined as the use and generation of research data that is publicly funded (e.g., fully or partly funded by the Research Council of Norway, hospital trusts, universities and colleges, ministries and other public entities). Research that is fully funded by private or international organizations is not included in this survey.

5.1.3 Research data

Various definitions of research data can be found in literature on the topic. This study uses the term in accordance with the OECD guidelines. As part of the introduction to the survey, we informed the respondents about the definition of research data:

Research data are defined in accordance with the OECD guidelines for open access to research data, in which research data comprises factual records (numerical scores, textual records, images and sounds) used as primary sources for scientific research and which are commonly accepted in the scientific community as necessary to validate research findings. A research data set constitutes a systematic, partial representation of the subject being investigated.

This term does not cover the following: laboratory notebooks, preliminary analyses and drafts of scientific papers, plans for future research, peer reviews, or personal communications with colleagues or physical objects.

The OECD guidelines are primarily aimed at research data in digital, computer-readable format. It is in this format that the greatest potential lies for improvements in the efficient distribution of data and its application to research, largely because of the marginal costs of transmitting data through the Internet. However, it could also apply to analogue research data in situations where the marginal cost of giving access to such data can be kept reasonably low.

PARSE.Insight (2010) used the term digital research data for all output in research. In practical terms, raw data, processed data, publications and post-publication materials are all covered by the same term. We have not used the term ‘digital’, as we would like to cover the entire range of research data. Moreover, we did not wish the respondents to make subjective valuations as to what type of data the survey covers. One can imagine research data that has not been made digital but which can be digitalized in the future.

It is common to use several data sources in research. It is useful to delineate between source data and output data. Source data is data that already exist independently of the research to be undertaken. This may be information that is collected for a different purpose (e.g., administrative data or clinical data) or physical or digitized collections of objects and texts (such as libraries, text corpuses and other scientific collections).

Output data is data generated through research. This can be data generated through new analysis or a compilation of existing data sources, but it can also be completely new data generated through new data collection. Typically, such data will be data from experiments, simulations, field work or interviews. However, the distinction between primary (output) and secondary (source) data can sometimes be subjective and contextual. As such, we have not applied this distinction between types of research data in the survey.

One of the research questions posed includes the term ‘metadata’. Metadata can be understood as structured data and information about data, of any sort in any media, which imposes order on a disordered information universe. Typically, metadata comprises index files and data dictionaries that store administrative information.

5.1.4 Archiving

‘Storage’, ‘archiving’ and ‘preservation’ are all terms used to describe how access to data at some later point in time is ensured. Although no clear distinction between the three terms can be made, storage might be understood as the saving of data during a project, archiving as the medium- to long-term saving of data after a project, and preservation as professional saving for even longer periods. This study focuses on the viewpoints of researchers and how they deal with their research data; in this study, we have used the term ‘data archiving’ to denote storage beyond the lifetime of a project.

As an introduction to the survey, we informed the respondents as to how archiving is defined:

Data archiving refers to the long-term storage of scientific data and methods. Typically, data are archived at the end of a research project or else after a scientific publication or report has been prepared.

Parse.Insight (2010) used the term ‘digital preservation’ to refer to a set of processes and activities that ensure continued access to information in digital form.
It denotes the process of storing digital information in such a way that it remains accessible, understandable and usable over the long term (usually five, 10 or 50 or more years). The survey explored several related activities, such as taking into account environmental changes (preservation watch), preservation planning (what needs to be done and when) and preservation actions (e.g., migration and emulation).

We have chosen not to use the term digital preservation in the survey, as it can be understood as an activity for professional data managers rather than as an action that researchers undertake in their everyday research activities.

5.1.5 Open access to research data

We have used the term ‘open access’ to research data in line with OECD guidelines, which state that ‘openness’ refers to access on equal terms for the international research community at the lowest possible cost, preferably at no more than the marginal cost of dissemination.

The OECD guidelines also state that open access to research data from public funding should be easy, timely, user-friendly and - preferably - Internet-based.

The latter part of these guidelines can be seen as a normative judgement rather than a definition of the term; therefore, to avoid misunderstandings and differences in interpretation, we have not included this part in the survey. Instead, we have used the following definition, of which the respondents were also informed as an introduction to the survey:

Open access to research data is the practice of providing access on equal terms at the lowest possible cost, preferably at no more than the marginal cost of dissemination.

The term open access is in some instances used only for open access to research publications and not to data. Therefore, we have specified throughout the survey that we deal with open access to research data.

5.2 Selecting the population

To ensure robustness of survey results, it was important to obtain a representative number of completed answers from each sub-population (i.e., the university sector, research institutes and health trusts). With representative sub-samples, we are able to compare different groups of respondents.

We sampled our population by randomly selecting researchers from CRIStin.18 In addition to the mentioned sub-populations, we sought representativeness within research disciplines in research institutes, universities and university colleges. All the sub-populations had a representative number of completed surveys once the survey had ended, with the exception of the Humanities within research institutes.

18 The Research Information System CRIStin is a tool aimed at the recording and promotion of publication data, projects, units and competency profiles. The system is also used to report publication points.

5.3 Survey process

In-depth understanding and knowledge reduces the risk of misinterpreting questions and ensures that a survey can cover all areas of the topic in question. Prior to designing the survey, we conducted extensive desk research including a literature review. Based on this, we formed a survey grounded on a proper understanding of the obstacles and barriers to open access for research data.

Getting the researchers’ views also helped to define the questions and their response alternatives. As such, we conducted explorative and in-depth interviews and participated in a workshop on data management organized by the Research Council of Norway. With the information provided, we were able to confirm the hypotheses and gain a better understanding of the status of the archiving of and open access to publicly funded research data in Norway.

An initial draft of the survey was developed in Enalyzer Survey Solution. We tested the draft extensively, both internally and in collaboration with the Research Council of Norway. These tests helped to ensure that the survey addressed the central hypotheses. Further, it was important that the questions asked should be unambiguous and easy for the respondent to understand. Finally, it was of particular importance that the survey should draw a clear distinction between what information was needed and what information would be useful to have. Thus, we did not want a survey that was too long or contained irrelevant information.

TABLE 5.1 Population, invites, response rates and the degree of representativeness

Sector and discipline                    Population   Invites   Response rate   Degree of representation
Universities and university colleges
  Humanities                                  2,360       876           22.9%                     114.9%
  Agriculture and fishery                       699       576           24.1%                      93.3%
  Mathematics and natural science             1,599       599           31.1%                     110.1%
  Medical science                             3,779       716           28.2%                     112.2%
  Social science                              2,488       746           28.6%                     121.0%
  Technology                                  1,767       557           28.7%                      93.6%
Health trusts
  Medical science                             1,867       501           28.9%                      84.3%
Research institutes
  Humanities                                    101        83           38.6%                      50.0%
  Agriculture and fishery                     1,334       438           41.8%                     110.2%
  Mathematics and natural science               555       407           33.9%                      97.9%
  Medical science                               588       411           37.2%                     107.0%
  Social science                                564       386           41.2%                     112.0%
  Technology                                  1,162       486           34.4%                     102.5%
Total                                        18,863     6,782           30.6%

Source: DAMVAD
Note: The degree of representativeness indicates how close the survey is to being representative for each sub-population, allowing for a 6 percent margin of error at a 90 percent confidence level. This means that, within a 6 percent margin, the analyst can be 90 percent confident that the sample is representative of the population.
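The note to Table 5.1 refers to a 6 percent margin at a 90 percent confidence level, but the report does not state the formula behind the degree of representation. As a minimal sketch only, assuming the common Cochran sample-size formula with a finite population correction and a worst-case proportion of 0.5, the number of completed surveys needed in a sub-population could be estimated as below. The function names `required_sample_size` and `degree_of_representation` are hypothetical and are not taken from the report.

```python
import math
from statistics import NormalDist


def required_sample_size(population: int, margin: float = 0.06,
                         confidence: float = 0.90,
                         proportion: float = 0.5) -> float:
    """Cochran's sample size with a finite population correction.

    Assumption: this is a standard textbook formula, not necessarily
    the one used in the report.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size needed for an infinitely large population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    # Reduce the requirement to account for a finite population.
    return n0 / (1 + (n0 - 1) / population)


def degree_of_representation(completed: int, population: int) -> float:
    """Completed surveys as a share of the required sample size."""
    return completed / required_sample_size(population)


if __name__ == "__main__":
    # Population sizes are taken from Table 5.1; the method is assumed.
    for population in (101, 555, 2360, 3779):
        needed = required_sample_size(population)
        print(f"N={population}: about {math.ceil(needed)} completed surveys "
              f"({needed / population:.1%} of the population)")
```

Under these assumptions, a small sub-population needs a much larger share of its members to respond than a large one, which is one reason a small group can fall short of full representation even with a comparatively high response rate.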
After developing the survey internally, DAMVAD invited 68 randomly drawn researchers from CRIStin to test the survey electronically using Enalyzer Survey Solution. Ten researchers completed the pilot, which provided us with good feedback.

After adjusting the pilot survey, the final survey was launched by email through Enalyzer Survey Solution on the 18th of December 2013 (week 51).

5.4 Response rate

DAMVAD invited 9,262 researchers to participate in the survey, of which 2,480 had email addresses that were no longer working. This left us with 6,782 active invites. 1,474 researchers completed the survey, while 604 actively chose not to participate.

The response rate for the population as a whole was 30.6 percent, while it varied between 23 percent and 42 percent within different sub-populations. Table 5.1 includes a complete overview of the number of invites, the response rates and the population size of our sample.

One of the main objectives of the survey was to ensure representation in all the relevant sub-populations. The number of completed surveys required for a representative sample varied according to the size of the total population. As the population size increases, the number of completed surveys needed for a representative sample, as a percentage of the population, falls. That is, for small populations, a large portion of the actual population needs to complete the survey in order to generate a representative sample. The degree of representation is smaller for medical science performed at health trusts (84.3 percent) than for medical science performed at research institutes (107 percent).

5.5 A significant proportion of the researchers actively chose not to participate

Approximately 600 researchers actively chose not to participate in the survey. Health trusts saw the highest share of researchers not willing to participate, as shown in Figure 5.1.

The share of respondents that did not want to participate in the survey was higher than what we have experienced in other surveys. There are variations across sectors and research disciplines. 16.6 percent of those working in health trusts did not wish to participate in the survey. Likewise, 14.6 percent of those working in agriculture and fishery in the research institute sector did not wish to participate.

Researchers at universities were keener to participate. An average of six percent did not wish to participate, with the lowest share in mathematics and the natural sciences, where five percent did not wish to participate.

FIGURE 5.1 A significant share of researchers actively chose not to participate.
Source: DAMVAD

6 Descriptive statistics

This chapter presents the characteristics of the survey respondents. Information about the researchers that have completed the survey is useful both to assess the robustness of the findings and to illustrate the complexity of the researcher population.

A sufficient number of respondents within each category is important for later discussions and comparisons of findings and differences between sectors, research disciplines and scientific experience.

Table 6.1 shows the distribution of the respondents across sectors.

TABLE 6.1 Participation across sector (“At what type of institution is your main occupation?”)

Sector                                Freq.    Pct.
Research institute                      649   44.0%
University and university college       607   41.2%
Hospital trust                          158   10.7%
Other                                    60    4.1%
Source: DAMVAD
Note: “Other” covers private organisations, non-profits and foundations.

Research institutes and universities together cover 85 percent of the respondents. Eleven percent are in hospital trusts, while the last four percent comprise others, covering inter alia companies.

The distribution between research institutes and universities is relatively even, which allows for comparisons between the two respondent groups. The number of respondents from hospital trusts is lower than for the two other sectors.

One concern with the distribution by affiliation in Table 6.1 is the number of respondents from hospital trusts. From table 5.1 we saw that we only reached an 84.3 percent degree of representation. Though 84.3 percent is relatively high, it still does not qualify as fully statistically representative. With 158 observations, we still find that we can use the category when comparing with other sectors. Nevertheless, we will keep in mind the limitations of this category.

Table 6.2 shows the gender distribution of the respondents. Male respondents are over-represented, at about 60 percent, yet the respondents still constitute a representative sample of both genders.

TABLE 6.2 Participation across gender (“What is your gender?”)

Gender     Freq.    Pct.
Female       602   40.8%
Male         872   59.2%
Total      1,474    100%

Source: DAMVAD

Although the survey allows for analysis based on gender, gender is not used extensively to compare the results. This dimension is interesting, but less relevant to specific policies and strategies going forward, where most efforts will need to cut across gender.

Table 6.3 shows the distribution of respondents across research disciplines. Social sciences and health sciences are the disciplines with the highest number of responses, whereas humanities have the fewest. In total, we estimate that the different categories are well represented, enabling robust analysis and comparisons across different research disciplines.

TABLE 6.3 Participation across research discipline (“Which is your primary research discipline?”)

Field of research           Freq.    Pct.
Social science                323   21.9%
Health science                319   21.6%
Mathematics and science       271   18.4%
Technology                    178   12.1%
Farming and fishery           159   10.8%
Humanities                    122    8.3%
Other                         102    6.9%
Total                       1,474    100%

Source: DAMVAD
Note: “Other” typically covers multi-disciplinary research.

Finally, table 6.4 shows the distribution of respondents by scientific experience. We measure scientific experience in terms of the number of years the respondents have been conducting research (i.e., the number of years since and including their PhD). The distribution of the respondents is balanced on this aspect as well. One fourth have conducted research for 11 to 20 years, and almost the same share have conducted research for more than 20 years.

TABLE 6.4 Participation across scientific experience (“For how many years has research constituted a major part of your work (including PhD or similar)?”)

Scientific experience    Freq.    Pct.
Less than 3 years          123    9.5%
3 - 6 years                298   23.1%
7 - 10 years               233   18.1%
11 - 20 years              330   25.6%
More than 20 years         306   23.7%
Total                    1,290    100%

Source: DAMVAD

7 Researchers use data generated by other researchers

This section includes the findings related to how researchers generate and use data. This constitutes an important context for understanding the possibilities and limitations for archiving, but especially for sharing.

We have applied a definition of research data in line with the OECD guidelines, which makes the distinction between numerical records, textual records, sounds, images, videos and graphics.

The questions about data formats are included in the survey for two reasons. In particular, they offer an interesting perspective as to what kind of research data are most commonly used. Further, they also allow for the investigation of whether researchers’ views on the sharing and archiving of data differ across data formats.

7.1 Data formats vary across research disciplines

Three-quarters of the respondents generated numerical data (e.g., quantitative data, data models, data series, statistics, etc.). Health trusts in particular use numerical data in their research. Of all the respondents, almost 60 percent stated that they mainly generate numerical data. This is especially true for agriculture and fishing, as well as in mathematics, the natural sciences and medicine. A total of 23 percent generate textual records and 5 percent most frequently generated images, sounds, videos and so forth.

In humanities, numerical data is rare. Researchers in humanities typically base their research on textual records (qualitative data, field reports, interviews, social studies, etc.), images, sound and the like, or else they do not generate data at all. Only 14 percent of the respondents within the humanities stated that they mainly generate numerical data. This is not shown in the figure.

Compared with health trusts, only half of the respondents working in universities generated numerical scores. Researchers at universities make more use of textual records, with 28 percent stating that they mainly generated textual records. This is shown in figure 7.1 on the following page.

Textual records are more common within social sciences and humanities. Almost 50 percent of the responding researchers in these fields stated that they mainly generate textual records, in contrast to mathematics and natural sciences where very few (7 percent) primarily generated textual records.

TABLE 7.1 Type of data generated (“What is the main format of your research data?”)

                                         Freq.    Pct.
Numerical scores                           865   58.7%
Textual records                            337   22.9%
Images, sounds, videos and graphics         72    4.9%
I do not generate any research data         62    4.2%
Other                                      138    9.4%
Total                                    1,474    100%

Source: DAMVAD

Some researchers report that they do not generate data at all. This is true for 6 percent of the respondents at universities, 3 percent at research institutes and 2 percent at health trusts. There are differences between research disciplines as to who does not generate data. For example, no data is reported by 1 percent within agriculture and fishing, but by 7 percent within technology.

The distribution of the respondents also shows that approximately 10 percent within each group have answered ‘other’. Going through the survey, this most often implies that they generate both numerical and textual data.

We found little evidence of differences in the type of data generated by experience, which means that we will not present or comment upon the types of data generated by researchers with different levels of experience.

7.2 Numerical data are easier to restore

As illustrated in Figure 7.2, numerical data are easier to restore. Fifteen percent answered that the numerical data could be restored very easily, and almost 50 percent stated that they could restore their numerical data with the same effort as they used when producing the data.

Textual records are the type of data that is hardest to restore. Almost 50 percent answered that textual records are either impossible or at least difficult to restore.
FIGURE 7.1 Data format, by institution (“What is the main format of the research data you generate?”)
Source: DAMVAD

FIGURE 7.2 Numerical data is easily restored (“If your research data gets lost, how easily can you recreate it?”)
Source: DAMVAD

7.3 Researchers frequently use other researchers’ data

The survey also asked researchers about the extent to which they use other researchers’ data in their work, and the extent to which they share their own data with other researchers.

Many researchers have utilized research data of other researchers. Almost two thirds of the responding researchers had utilized research data provided by other researchers within the past three years.

TABLE 7.2 Use of other researchers’ data (“Have you within the last three years used research data gathered by other researchers?”)

         Freq.    Pct.
No         508   36.0%
Yes        904   64.0%
Total    1,412    100%

Source: DAMVAD

Across affiliations, most researchers use data gathered by other researchers.

FIGURE 7.3 Researchers use other researchers’ data, by sector (“Have you within the last three years used research data gathered by other researchers?”)
Source: DAMVAD
Note: The figure only includes those that have answered “yes” to the question: “Have you within the last three years used research data gathered by other researchers?”

Figure 7.3 shows that 68 percent of the respondents within research institutes have used research data gathered by other researchers within the last three years. This is a little less common within health trusts, yet 52 percent of the respondents there have used data gathered by other researchers within the last three years.

Differences are more important across research disciplines. The use of other researchers’ data seems more commonplace within mathematics and natural sciences, and less so within humanities and medical science. This corresponds to our hypothesis and international studies across disciplines (Tenopir, 2011).

Specifically, 50 percent of the respondents within humanities and 44 percent in medical science report not having used research data gathered by other researchers within the last three years. In comparison, the share is 24 percent within mathematics and the natural sciences, and 32 percent within agriculture and fishery.

FIGURE 7.4 Researchers’ use of other researchers’ data, across disciplines (“Have you within the last three years used research data gathered by other researchers?”)
Source: DAMVAD

7.4 Researchers mainly use data produced by other researchers from the same institution

Researchers do not travel far in their search for research data. Two thirds of the researchers used research data produced by other researchers from the same institution.

However, many respondents also utilize data gathered by researchers from international institutions. Across all respondents, 56 percent stated that they used data from other researchers at international institutions.

TABLE 7.3 Researchers use data produced by other researchers at their institute (“Whose research data have you used the most within the past 3 years?” Multiple answers allowed)

                                                                     Freq.    Pct.
Research data from other researchers at my institution                 605   67.4%
Research data from other researchers at national institutions          435   48.4%
Research data from other researchers at international institutions     503   56.0%
Other                                                                    25    2.8%
Total                                                                 1,568    175%

Source: DAMVAD

7.5 Researchers would like even better access to other researchers’ data

As illustrated in Table 7.2, 36 percent of researchers have not used data gathered by other researchers. Table 7.4 contrasts this finding with their reported interest in using data.

Almost three-quarters (71.5 percent) of these respondents would like to make use of other researchers’ data. In other words, there is a substantial unmet demand for research data generated by other researchers. As few as 144 respondents (10 percent of the total respondent group) have not used research data generated by other researchers in the past three years and do not wish to use data generated by others.

Nine out of ten respondents either want, or are already using, research data gathered by other researchers.

TABLE 7.4 Researchers that have not used other researchers’ data, but would like to do so (“If «no» to the above question: Would you like to make use of research data gathered by other researchers or institutions?”)

         Freq.    Pct.
No         144   28.5%
Yes        362   71.5%
Total      506    100%

Source: DAMVAD

8 Research data is rarely archived in data centres

Data archiving refers to the long-term storage of scientific data and methods, that is, data which are archived at the end of a research project or else after a scientific publication or report has been prepared.

Asked about the most common way of archiving data, respondents report that the vast majority of research data is stored
locally, either on researchers’ own personal computers, USB sticks or CD/DVD/floppy disks, or on local servers at their institutes. More than 80 percent archive data locally (Table 8.1).

Archiving of data is an important prerequisite for validation of research findings. Infrastructure for archiving data can also be an important enabling factor for sharing data.

This chapter presents the findings related to researchers’ practice concerning archiving of research data.

8.1 Most data is archived on portable storage units or institutional servers

Various systems for data archiving exist. One can easily imagine that researchers use a variety of data archiving solutions. Sometimes data are archived at the institutional server, other times at a national data archive centre.

One out of ten stored their data at central data archive centres, either at their organizations or at national centres. Finally, less than two percent used archive solutions outside of Norway.

These findings are both surprising and cause for concern. The major concern relates to data security. If sensitive data is stored on CD/DVDs or personal computers, they are vulnerable to Internet-based intrusions. Institutional servers are better at keeping intruders out, but they are still not as good, or as secure, as more professional data archive centres (either local or national) which specialize in taking care of sensitive data.

TABLE 8.1 Data archiving (“What is the most common way of archiving your research data after results are ready or beyond the life of a project?”)

Where do you mainly store the data you generate?                  Freq.    Pct.
Portable storage unit                                               235   18.2%
Institutional server                                                850   65.6%
Data is submitted to digital archive centre in my organisation      108    8.3%
Data is submitted to a national digital archive centre               23    1.8%
Data is submitted to an international digital archive centre         22    1.7%
Do not archive                                                       34    2.6%
Other                                                                23    1.8%
Total                                                             1,295    100%

Source: DAMVAD

Further, there is an issue concerning the restoration of data if they are lost. If a USB stick is lost or if a personal computer crashes, it can be rather difficult to restore the data, and months of work can be lost.

There are some differences between types of affiliations. Storing locally on portable storage units is more common at the universities compared to the institute sector. Researchers at research institutes more often use institutional servers to store their data.

But in general figure 8.1 confirms that 85 percent of respondents mainly store their data on a portable storage unit or at the institutional server. The figure is 82 percent for universities and university colleges, 84 percent for research institutes and 88 percent for hospital trusts.

That in turn leaves a rather limited share of respondents that archive their data at archiving centres, either nationally or internationally. At universities and university colleges, 3 percent mainly store their data at national archiving centres. The corresponding figure for research institutes is 1 percent, whereas 2 percent within health trusts mainly store their data at national archiving centres.

FIGURE 8.1 Data archiving across institution (“What is the most common way of archiving your research data after results are ready or beyond the life of a project?”)
Source: DAMVAD

FIGURE 8.2 Data archiving by research discipline (“What is the most common way of archiving your research data after results are ready or beyond the life of a project?”)
Source: DAMVAD

These differences between sectors in the use of portable storage units and institutional servers are largely explained by differences between research disciplines. Humanities, where portable storage is more common, are strongly represented in the university sector, while agriculture and fishery are strongly represented in the research institutes and report a higher share of centralized storage.

Within humanities, 34 percent store their data on a portable storage unit. In agriculture and fishing, 81 percent of researchers store their data on institutional servers. This is illustrated in Figure 8.2.

8.2 Storage reflects costs of recreation

The implications of losing data are particularly significant for data that would be costly or impossible to regenerate.

Data that can be regenerated with the same effort as its initial creation is more commonly stored on portable storage units or institutional servers than by other storage methods. 9 percent of data stored on a portable unit can be restored very easily, whereas 27 percent of data stored at international archives and data centres can easily be restored.

On the other hand, we see that 29 percent of the respondents who do not archive their data do not have the opportunity to restore data. That is only the case for 9 percent of those storing their data at national archiving centres.

In total we see that those researchers that do not archive their data also find it harder or even impossible to restore data. Almost 60 percent of those not archiving their data cannot restore their data without extensive efforts. This should be compared to those archiving at national or international archiving centres, where 50 percent or more can restore their data with the same effort or even a lesser effort.
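The headline shares quoted in section 8.1 can be recomputed from the frequencies in Table 8.1. The sketch below is only an illustration: the grouping into local storage and archive-centre storage, the shortened category labels and the variable names are assumptions made here, not categories defined in the survey.

```python
# Frequencies from Table 8.1 (most common way of archiving research data).
table_8_1 = {
    "Portable storage unit": 235,
    "Institutional server": 850,
    "Archive centre in own organisation": 108,
    "National archive centre": 23,
    "International archive centre": 22,
    "Do not archive": 34,
    "Other": 23,
}

total = sum(table_8_1.values())


def share(*categories):
    """Combined share of the given categories among all respondents."""
    return sum(table_8_1[c] for c in categories) / total


local = share("Portable storage unit", "Institutional server")
organisational_or_national = share(
    "Archive centre in own organisation", "National archive centre"
)
international = share("International archive centre")

print(f"n = {total}")
print(f"local storage: {local:.1%}")
print(f"organisational or national centre: {organisational_or_national:.1%}")
print(f"international centre: {international:.1%}")
```

With this grouping, local storage comes to about 84 percent, organisational or national centres to about 10 percent and international centres to under 2 percent, which is consistent with the statements “more than 80 percent”, “one out of ten” and “less than two percent” in the text.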
FIGURE 8.3 Data archiving by data regeneration (“What is the most common way of archiving your research data after results are ready or beyond the life of a project?”)
Source: DAMVAD

8.3 Most researchers are satisfied with their current archiving solution

Most respondents seem to be satisfied with their current solution for the archiving of data. Two thirds stated that they were satisfied with their current solutions, while 15 percent did not know.

TABLE 8.2 Satisfaction with current archiving solutions (“Are you satisfied with your archiving solution?”)

                Freq.    Pct.
Yes               841   66.7%
No                235   18.6%
I don’t know      185   14.7%
Total           1,261    100%

Source: DAMVAD

8.4 Those who are not satisfied point to security risks

One-fifth report that they are not satisfied. Of these, half point to a lack of security as a problem. Others pointed out that archiving is too complicated, that there is not enough capacity, and even that there are too many possible solutions.

Interestingly, more than 25 percent answered ‘other’ to the question about satisfaction. When answering ‘other’, the respondents were able to add comments describing what they meant by this. The frequently used arguments are categorized and included in table 8.5.

TABLE 8.3 Satisfaction with current archiving solutions (“If “not”, why are you not satisfied with your archiving solution?” Multiple answers allowed)

                                 Freq.    Pct.
Too complicated to use              41   17.5%
Too expensive                        3    1.3%
Too little capacity                 51   21.8%
Too many archiving solutions        64   27.4%
Not secure enough                  115   49.1%
Other                               64   27.4%
Total                              338    144%

Source: DAMVAD

These are all important barriers to the active use of archiving solutions as a part of sharing data. Many researchers deal with sensitive information, and hence, security is essential. The presence of too many archiving solutions means that it can be difficult for researchers to know where to archive their data and that it can be time-consuming for researchers to archive their data. Further, it is noticeable that some find such systems too complicated to use, which again means that researchers have to spend valuable time archiving their data.

8.5 Archiving activities are financed as a part of project and institutional funding

Archiving activities are financed mainly as a part of institutional funding. Further, many such activities are financed on a project-by-project basis.

Nearly 11 percent respond “other” to this question. Many of these argue that archiving activities are not funded or that they do not know how they are funded. Some even say that they have paid for archiving solutions themselves.

TABLE 8.4 How archiving activities are financed (“How are your archiving activities financed?”)
                                                        Freq.     Pct.
Part of research projects                                 435    34.4%
Part of funding for research-based operative tasks         70     5.5%
Part of institutional funding                             626    49.5%
Other                                                     134    10.6%
Total                                                   1,265   100.0%

Source: DAMVAD

TABLE 8.5 Frequently used arguments in the open answers on why researchers are not satisfied with their existing archiving solution

Argument posed, followed by quotes from the respondents

The archiving solution does not enable sharing data with others
    Not easily accessible by other researchers

Too complicated and time consuming to use archiving systems
    Too time consuming to do all the back-up solutions are non-standard ad-hoc
    No common procedure for archiving makes it difficult and often not properly done.
    Lack of routines about how to store raw data.

Lack of security and stability of the archiving systems
    Damage to hard drives pose a risk
    The back-up regime is not reliable
    Data has been lost due to change in storing technology, e.g. magnetic tapes were discarded without transfer of content to a new media.
    We do not trust the back up and use our own external hard disk
    Data can be lost at system upgrades etc.

Source: DAMVAD

9 Most researchers share research data

Research is cumulative in the sense that research often builds on previous research. Similarly, researchers often use the research data of other researchers, in line with the principles of the OECD guidelines.

9.1 Researchers are positive to the principle of open access

Researchers clearly see the benefits of the sharing and archiving of research data. About 80 percent of the respondents agree that open access to research data enhances research.

In addition, 79 percent agree that it is an ethical obligation of research to make research data available for validation. These are the two reasons for open access to research data that most researchers agree with. Only 6.5 percent agreed that open access to research data would lead to less interesting research.

Further, 77 percent agree that open access to research data facilitates the education of students and new researchers, and 74 percent agree that it stimulates research collaboration.

Below is a comment underpinning a positive attitude towards sharing data:

“As a matter of principle, generated data of a certain magnitude (small-scale surveys exempted) on publicly funded projects should be shared with the rest of the research community. A data set can in most cases be used and analysed for diverse purposes. In my view this is a matter of research ethics and should be included in the guidelines of the National Committee for Research Ethics in the Social Sciences and the Humanities (NESH).”

9.2 Many researchers are left undecided

Although most researchers agree on the benefits of sharing data, many are undecided as to whether publicly funded research data should be open or whether it should be considered public property. Table 9.1 illustrates this.

The roughly 20 percent who do not agree that open access to research data will enhance research, or that it is an ethical obligation of research to make research data available for validation, consist of 15 to 16 percent who are undecided and 4 to 5 percent who disagree.

Similarly, 53 percent agree that publicly funded research data should be public property, 31 percent are undecided and 16 percent disagree. In both cases, the share of undecided respondents is higher than the share who disagree.

The relatively high number of researchers who did not want to participate in the survey can also be seen as an indication of the complexity of the issue.

The views of both those who support and those who disagree with the overall principle of open access to research data are elaborated in the comments below:

“I don't see any challenges. Free access to everything.”

“Researchers should not hoard their data, especially if publicly funded. After publishing their work - the data should ideally be available to others for robustness testing, replication, and the exploration of new hypotheses.”

“Researchers will often dislike open access because they are afraid of having made mistakes that might be revealed or having missed important patterns in the data that others might get good publications from. More important however, is that science is a social process where checking and challenging each other is what moves us forward - even if the process itself can be painful for the participants.”

“I see this as a hyped up issue. As long as interpretations are needed to make sense of the data, there is no way those data are useful for others unless the original researchers are also part of a new study involving the data.”

TABLE 9.1 Attitudes towards open access to research data (“Please indicate if you agree to the following statements related to open access to research data:”)

Open access to research data will enhance research
    Agree 1,098 (80.2%); Undecided 209 (15.3%); Disagree 62 (4.5%)
Open access to research data will stimulate more research collaborations
    Agree 1,012 (73.9%); Undecided 264 (19.3%); Disagree 93 (6.8%)
Open access to research data will make research less interesting
    Agree 89 (6.5%); Undecided 207 (15.1%); Disagree 1,073 (78.4%)
Open access to research data will facilitate education of students and new researchers
    Agree 1,050 (76.7%); Undecided 255 (18.6%); Disagree 64 (4.7%)
Publicly funded research data should not be public property
    Agree 213 (15.6%); Undecided 428 (31.3%); Disagree 728 (53.2%)
Lack of open access to research data has restricted my ability to answer scientific questions
    Agree 290 (21.2%); Undecided 482 (35.2%); Disagree 597 (43.6%)
It is a research-ethical obligation to make data available
    Agree 1,084 (79.2%); Undecided 217 (15.9%); Disagree 68 (5.0%)

Source: DAMVAD
Note: We have collapsed the positive statements in the survey “I strongly agree” and “I agree” and called it “Agree” in the table. Likewise we have collapsed “I strongly disagree” and “I disagree” and called it “Disagree” in the table.

9.3 Health trusts are positive towards the effects of sharing data on research

There are only small differences across sectors, research disciplines and professional experience.

When looking across sectors, researchers at health trusts are a bit more positive towards the effects of sharing data.

Figure 9.1 illustrates this. The figure shows respondents’ positions in relation to the question on whether open access to research data would enhance research.

FIGURE 9.1 Attitude towards open access to research data (“Open access to research data will enhance research”)
Source: DAMVAD
Note: We have collapsed the positive statements in the survey “I strongly agree” and “I agree” and called it “agree”.

When looking across disciplines, respondents in agriculture and fishing are less positive and respondents within humanities are most positive. As illustrated in figure 9.2, only 2 percent within humanities disagreed with the statement that “open access to research data will stimulate more research collaboration”, while the number is 14 percent within agriculture and fishing.

Respondents from agriculture and fishing, alongside those within social sciences, were also the most undecided. As many as 27 percent within social sciences and 24 percent within agriculture and fishing declared themselves undecided as to the statement “open access to research data will stimulate more research collaboration.”

FIGURE 9.2 Attitude towards open access to research data (“Open access to research data will stimulate more research collaboration”)
Source: DAMVAD
Note: We have collapsed the positive statements in the survey “I strongly agree” and “I agree” and called it “agree” in the figure. Likewise we have collapsed “I strongly disagree” and “I disagree” and called it “Disagree” in the figure.

Differences across professional experience are negligible and thus not reported here.

9.4 Researchers share their research data, but upon request

Many researchers use research data generated by others. Even more researchers support the idea of sharing. Logically, one would expect that many researchers also share research data with other researchers.

The survey confirms that most researchers share data with other researchers. Only 16 percent of the respondents stated that most of their research data is not available to other researchers. Further, 16 percent of the generated research data is available to everyone, while 12 percent of the generated research data is only available to other researchers.

About half of the respondents state that their data is available to other researchers, but only upon request or under certain conditions. Researchers typically prefer to keep track of who is accessing their data and for what purpose. Consequently, each researcher becomes a gatekeeper for her own data.

There are many reasons for being more restrictive in practice about one’s own data than in principle. One is to ensure data is understood and used correctly. One researcher commented that:

“Generally, there is no big impediment against sharing my research data. I feel, however, that in most cases it is best done on a case-by-case basis upon a personal request because this allows me to give adequate explanation of the data and will best ensure that my contribution is adequately acknowledged.”

TABLE 9.2 Data availability (“Which of the following applies to the accessibility of most of your research data?”)

                                                                          Freq.    Pct.
Data is available to all                                                    195   15.7%
Data is available to other researchers                                      148   11.9%
Available for other researchers, but only upon request                      472   37.9%
For other researchers, but under a license or non-disclosure agreement       73    5.9%
Could be made available with appropriate changes                            122    9.8%
Data is not available                                                       206   16.5%
Other                                                                        29    2.3%
Total                                                                     1,245    100%

Source: DAMVAD

9.5 More openness within humanities

There seems to be more openness towards data sharing within humanities compared to other research disciplines.

In medical sciences and social sciences, 12 percent report that they generate data that is readily available to others. The corresponding share within humanities is one third (see figure 9.3). Indeed, one might argue that the research data and sharing possibilities are fundamentally different between the medical sciences and the humanities.

For example, and as illustrated in figure 7.1, 76 percent of data generated within the medical sciences is numerical data and 12 percent textual records.

of data. Researchers in humanities are somewhat more inclined to unconditionally share data than
their colleagues in social sciences. This is in contrast to humanities, where 14 percent is numerical data and 48 percent consist of textual Sharing data that is not otherwise available is more scores. common within health and social sciences, at 24 and 25 percent respectively. Only 6 percent within The comparison is perhaps more interesting be- mathematics stated that most of their research data tween social sciences and humanities. is not available. For agriculture and fishing, humanities and technology, the share is between 12 per- The two have an equal share of respondents gener- cent and 15 percent ating textual scores, though they are somewhat different when it comes to numerical data. These two disciplines differ in their approaches to the sharing Figure 9.3 Data availability across research discipline (“Which of the following applies to the accessibility of most of your research data?”) 70% 60% 60% 57% 54% 52% 50% 50% 43% 40% 33% 30% 20% 24% 25% 19% 14% 17% 16% 12% 12% 10% 10% 17% 14% 13% 9% 7% 15% 12% 6% 0% Data is available Humanities Data is available for other researchers Agriculture and fishery Data is available on demand Mathematics and natural science Medical science Data is not available Social science Technology Source: DAMVAD Note: The statement “Data is available on demand” consists of the following possible answers: “ Available for other researchers, but only upon request”, “ For other researchers, but under a license or non-disclosure agreement”, “ Could be made available with appropriate changes” 46 SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM 9.6 More openness among more experienced researchers Researchers with more experience appear more confident in sharing data. Among the respondents with more than 20 years of research experience, 19 pct report that their data is free to use. In comparison, 11 percent of the less experienced researchers make their freely available. 
FIGURE 9.4 Data availability across experience (“Which of the following applies to the accessibility of most of your research data?”)
[Bar chart: share of respondents answering “Data is available”, “Data is available for other researchers”, “Data is available on demand” and “Data is not available”, by years of research experience: less than 3 years, 3 - 6 years, 7 - 10 years, 11 - 20 years, more than 20 years.]
Source: DAMVAD
Note: The statement “Data is available on demand” consists of the following possible answers: “Available for other researchers, but only upon request”, “For other researchers, but under a license or non-disclosure agreement”, “Could be made available with appropriate changes”.

10 Lack of time, infrastructure and incentives hamper further sharing of data

Although researchers already share data with other researchers, the study indicates that there is potential for more sharing.

Former studies and interviews point to a number of barriers for further sharing of research data. A central objective of the study is to identify the main barriers for more sharing of research data in Norway. This chapter presents the main findings on barriers and obstacles for sharing of research data in Norway and possible ways to reduce these barriers.

10.1 Variety of barriers

As expected, the survey documents a variety of obstacles to sharing research data. Some research data cannot be shared due to issues of privacy, commercial issues or shared ownership. These aspects are, however, not the most important barriers.

The time involved is a main barrier to sharing data. Almost one-third of the respondents pointed out that preparing data for open access takes away valuable time for research.

TABLE 10.1 Main barriers towards sharing research data (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers). Ordered by frequency.

Barrier | Frequency | Pct.
Preparing data for open access takes away valuable time for research | 386 | 31.4%
Lack of technical infrastructure | 300 | 24.4%
Reduce possibilities of future scientific publications | 300 | 24.4%
I am afraid other researchers will not understand my data | 259 | 21.0%
I cannot give access due to sensitivity issues | 249 | 20.2%
I cannot give access due to shared ownership | 212 | 17.2%
I don't know | 188 | 15.3%
I am afraid data will be misused | 147 | 11.9%
I cannot give access due to intellectual property rights | 135 | 11.0%
Open access to research data might have a negative economic impact for me and my institution | 85 | 6.9%
It would be unethical | 82 | 6.7%
I cannot give access due to commercial issues | 80 | 6.5%
I do not believe my research data is of interest to others | 73 | 5.9%
I do not believe data is secure at a data centre, journal site or alike | 59 | 4.8%
Other | 53 | 4.3%
Total | 2,608 |

Source: DAMVAD

One-quarter of the respondents pointed out that lacking technical infrastructure is a challenge for the sharing of data. One way of reducing the time constraint would be to improve the technical infrastructure.

Further, many researchers are concerned that sharing data would reduce their possibilities regarding future scientific publications (24 percent).

More than 20 percent of the respondents were afraid that others would not understand their data.

Only about one-fifth stated that they could not share data due to sensitivity issues, and slightly fewer because of shared ownership of the data.

These findings support those in other international surveys, such as Kvale (2012) and Tenopir (2011).

Given the opportunity to make additional comments if they chose the category ‘other’, the following comments were made:

“I am in the process of making my research data as public as possible. This takes a lot of time, and although I can't see any problem with it, there are little rewards except scientific/ethic satisfaction.”

“Preparing data would be very time consuming.”

“Research projects are often under financed and setting up data and metadata to enable open access take extra time usually spent in the last part of the project when the project run out of time and money.”

Many respondents focused on how open access could reduce their chances of producing scientific publications, arguing that researchers should have exclusive ownership to their data for an extended period of time:

“Data can be made available to others, but only after our institutions have had a reasonable period (3 years, for example) to analyse and publish in order to justify the high costs for data gathering. Or else, no institution will pay for data gathering.”

“In general I may want some time of non-access (say 1-3 years) giving us, the researchers carrying out the project, the possibility of presenting results/documentation first, but then, afterwards, I would be thrilled if others would apply my/our data for re-analysis or new types of analyses. / We do try to support master students when they request use of our data, and I would also try to support other researchers in case of requests. / I do not know about funding of open access activity, thus any such activity will imply problems for my/our hour list.”

Some respondents pointed out the risk of others not understanding their data, and that it would require a significant effort to set up metadata such that others would understand them:

“Preparing data so others can easily use it in the right way takes a lot of time. Often this time is not budgeted for and therefore the necessary data preparation is not possible within the given time frame for a project without taking away research time, using additional funding or using private time. / Making data available without a sufficiently detailed description of methods and the data generation may lead to misinterpretations of data and possibly wrong use of data.”

“The limited resources and funding available for long term field experiments requires very large extra input of labour from scientists as compared to the actual hours we are paid for. One of the few incentives to continue to do this is to collect and have unique access to the data. If anyone can use the same data without contributing to the work that is required in writing applications, designing and setting up the experiments and collecting the actual data, a large part of the incentive for research is gone. Then what is left is lots of hard labour with very low hourly wages and limited credit for the ideas or the results - who wants that? Such a situation comes through as very unfair.”

Some also comment on data sensitivity. The example below shows that it is not only a question of how sensitive your data are - it is also a question of respect for the informants:

“Most of my data could in principle be made available. Because of what I have informed my respondents about, the purpose of the study, and what the data is going to be used for, it would not be ethically defendable to share the data with others for other purposes than originally planned.”

10.2 Relatively small differences across sector

The survey does not indicate significant differences between respondent groups in terms of observed barriers. This section summarizes the observed differences.

Time constraints are a less important barrier for respondents working at health trusts than at other types of institutions. Researchers in health trusts are more concerned about the sensitivity of the data and the lack of infrastructure. Figure 10.1 shows the results.

FIGURE 10.1 Main barriers for increased sharing of research data (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers). Across sector. Only includes the five major obstacles.
[Bar chart: share of respondents citing each obstacle (making data available takes away valuable time for research; lack of technical infrastructure; open access would reduce possibilities of scientific publications; concerns connected to misinterpretation of data; cannot give access due to sensitivity issues), by sector: universities and university colleges, research institutes, health trusts (hospitals).]
Source: DAMVAD

Researchers at institutes and universities are, on the other hand, more concerned with time, but also with the risk that others might misinterpret their data. Within humanities and medical science, the respondents are not particularly concerned about the misinterpretation of their data.

Some differences across disciplines

When looking across disciplines, lack of technical infrastructure constitutes a particular challenge within humanities, mathematics and the natural sciences. Researchers in these disciplines are significantly more concerned about this challenge than are researchers in social sciences and technology.

Sensitivity issues were the key reason for not being able to share data within social sciences and health science. Figure 10.2 shows the results.

Time is especially scarce for respondents within agriculture and fishing as well as those within mathematics and natural science.

FIGURE 10.2 Main barriers for increased sharing of research data (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers). Across research discipline. Only includes the five major obstacles.
[Bar chart: share of respondents citing each of the five obstacles, by research discipline: humanities, agriculture and fishery, mathematics and natural science, medical science, social science, technology.]
Source: DAMVAD

Younger researchers more concerned with sensitivity

In terms of professional experience, two differences are worth noting. First, the less experienced respondents did not think of time as a challenge - they might be more familiar with technology and various solutions for sharing files. The results are shown in figure 10.3.

On the other hand, the less experienced respondents were more attentive and alert to possible sensitivity issues concerning their data. This seems to be less of an issue for the more experienced respondents. It might be a result of the less experienced respondents' lack of experience with legal issues or their fear of misuse.

Otherwise, the survey suggests small differences in terms of the perceived current barriers and challenges for sharing data across years of experience.

10.3 Textual records are more sensitive

This section discusses variations in responses across different data formats and the challenges the respondents foresaw.
In figure 10.4 below, we can see that textual records involve more data that are sensitive. There are stronger concerns as to whether textual records will be misinterpreted. The nexus of the textual records is thus more important than that of numerical scores. Furthermore, the respondents stated that they could not give access to textual scores due to sensitivity issues.

On the other hand, there are time issues relating to making numerical scores available. Moreover, those respondents mainly working with numerical scores stated that open access to research data would reduce their possibilities regarding future scientific publications.

FIGURE 10.3 Main barriers for increased sharing of research data (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers). Across experience. Only includes the five major obstacles.
[Bar chart: share of respondents citing each obstacle (making data available takes away valuable time for research; lack of technical infrastructure; open access would reduce possibilities of scientific publications; concerns connected to misinterpretation of data; cannot give access due to sensitivity issues), by years of research experience: less than 3 years, 3 - 6 years, 7 - 10 years, 11 - 20 years, more than 20 years.]
Source: DAMVAD

FIGURE 10.4 Main barriers for more sharing of research data (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers). Across data type. Only includes the five major obstacles.
[Bar chart: share of respondents citing each of the five obstacles, by data type: numerical scores; textual records; images, sounds, videos and graphics.]
Source: DAMVAD

10.4 Researchers see little support from management

The survey shows that there is a perceived lack of support for open access to research data from management.

There has been little focus on implementing open access to research data at the organizational level. Less than 50 percent of the respondents reported that management encouraged open access to research data.

Less than one-fifth (16 percent) of the respondents reported that their organization provided training on best practices for sharing research data. Guidelines and standards existed for 27 percent of the respondents. Finally, 30 percent stated that their organization provided tools, technical support and infrastructure facilitating open access to research data.

One of the respondents highlights the problem with the following quote:

“Institutions usually focus on measurement of performance through amount of publications and coarse counting of results, leaving aspects of collecting, investigation of, archiving and handling of data to be the responsibility of individual researchers/groups. There seem to be little focusing on long term archiving and data handling.”

TABLE 10.2 Does management or the organisation support open access to research data? (“To what extent do you experience that open access to research data is implemented in your organization”)

Statement | To a high degree Freq. | To a high degree Pct. | To some extent Freq. | To some extent Pct. | Not at all Freq. | Not at all Pct. | I do not know Freq. | I do not know Pct.
Management encourage that our research data should be open | 152 | 11.9% | 435 | 34.1% | 348 | 27.3% | 340 | 26.7%
My organization provides training on best practice for open access to research data | 16 | 1.3% | 187 | 14.7% | 690 | 54.1% | 382 | 30.0%
My organization has guidelines and standards for data format and for assigning information to data | 59 | 4.6% | 286 | 22.4% | 470 | 36.9% | 460 | 36.1%
My organization provides the necessary tools, technical support and technical infrastructure for open access to research data | 52 | 4.1% | 336 | 26.4% | 432 | 33.9% | 455 | 35.7%

Source: DAMVAD

10.5 Limited institutional support

Our study suggests significant differences in the way research management deals with the sharing and archiving of data.

Management at research institutes facilitates open access to research data to a larger extent than is the case at universities and health trusts. Around 56 percent of the respondents within research institutes stated that their management, either to a high or to some degree, “encourage[d] that our [the respondent’s] research data should be open.” Only 30 percent of the respondents from health trusts stated that their management encouraged open access to research data.

FIGURE 10.5 Does management encourage open access to research data? (“Management encourage that our research data should be open”)
[Bar chart: share of respondents answering “To a high degree” and “To some extent”, by sector: universities and university colleges, research institutes, hospital trusts (hospitals).]
Source: DAMVAD

FIGURE 10.6 Does the organisation provide support and solutions for open access to research data? (“My organization provides the necessary tools, technical support and technical infrastructure for open access to research data”)
[Bar chart: share of respondents answering “To a high degree” and “To some extent”, by sector: universities and university colleges, research institutes, hospital trusts (hospitals).]
Source: DAMVAD

Research institutes have the best preconditions for sharing research data. The survey shows that 39 percent of respondents at research institutes stated that their organization, either to a high or to some extent, provided the necessary tools, technical support and technical infrastructure for open access to research data. At universities, the share is 27 percent and at health trusts it is 18 percent.

Very few answered that their organization “to a high degree” provided the necessary solutions for open access. A high share of respondents state either that they do not know whether their organization provided the necessary solution or that in fact it did not do so.

There are small or no differences across professional experience. As such the figures are not presented in this report.

10.6 Researchers call for better infrastructure, citation systems and guidelines

The survey also asked the respondents about possible measures that would facilitate more sharing of research data. These answers constitute important inputs to recommended actions that can facilitate more open access to research data.

Better infrastructure is the solution respondents most frequently point to for increased access to research data (42 percent).

Further, respondents state that the implementation of a citation system would facilitate increased sharing and availability of data. A better citation system for research data would make it easier to give credit to researchers preparing and generating data.

In addition to increased funding, respondents highlighted various aspects of formal competence and technical support. The respondents called for the implementation of guidelines, standards and training on open access to research data in order to increase sharing of data. The solutions pointed out are also in line with the challenges identified, especially those concerned with the time constraints cited earlier.

TABLE 10.3 Solutions to facilitate increased sharing of data (“What efforts would make open access to publicly funded research data more interesting for you?” Maximum 3 answers)

Solutions for increased sharing of data | Frequency | Pct.
Better infrastructure for open access to research data | 536 | 41.7%
Implementation of a system for citation | 510 | 39.7%
More resources allocated for open access to research data activities | 324 | 25.2%
Implementation of guidelines | 309 | 24.0%
More training on open access to research data | 281 | 21.9%
Implementation of standards | 266 | 20.7%
Don't know | 208 | 16.2%
Make open access to research data an indicator in the funding scheme | 158 | 12.3%
Guidelines to how long I can attain ownership to data before sharing | 140 | 10.9%
Make it mandatory to explain how data will be made available | 103 | 8.0%
Not allowed to share anyways | 78 | 6.1%
Total | 2,913 | 227%

Source: DAMVAD

Few differences across sectors

As for the proposed solutions, we see few differences across sectors, with one important exception. The respondents based at health trusts are particularly likely to state that guidelines for open access to research data would facilitate increased sharing of research data. This input is perhaps not surprising, given their emphasis on sensitivity of data. Guidelines could focus on how to share sensitive data and what kinds of data can be shared. As noted by some respondents, education might also involve how to handle open access to research data.
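The percentages in Table 10.3 add up to 227 percent because each respondent could give up to three answers. The arithmetic can be sketched as follows, as a minimal illustration rather than the report's own method. The frequencies are copied from the table; the number of respondents is not stated there, so `implied_respondents` is an assumption inferred from the first row (536 answers reported as 41.7 percent).

```python
# Frequencies copied from Table 10.3 (up to three answers per respondent).
table_10_3 = {
    "Better infrastructure for open access to research data": 536,
    "Implementation of a system for citation": 510,
    "More resources allocated for open access to research data activities": 324,
    "Implementation of guidelines": 309,
    "More training on open access to research data": 281,
    "Implementation of standards": 266,
    "Don't know": 208,
    "Make open access to research data an indicator in the funding scheme": 158,
    "Guidelines to how long I can attain ownership to data before sharing": 140,
    "Make it mandatory to explain how data will be made available": 103,
    "Not allowed to share anyways": 78,
}

# Assumption: the respondent base is inferred from the first row of the
# table (536 answers = 41.7 percent); it is not given in the report excerpt.
implied_respondents = round(536 / 0.417)

# Each share is the number of respondents choosing an answer divided by the
# number of respondents, not by the number of answers given.
shares = {
    label: round(100 * count / implied_respondents, 1)
    for label, count in table_10_3.items()
}

total_answers = sum(table_10_3.values())
answers_per_respondent = total_answers / implied_respondents

for label, pct in shares.items():
    print(f"{pct:5.1f}%  {label}")
print(f"Total answers: {total_answers}")
print(f"Implied respondents (assumption): {implied_respondents}")
print(f"Sum of shares: {100 * answers_per_respondent:.0f}%")
```

Under this assumption each respondent gave a little over two answers on average, which is why the column total exceeds 100 percent. The same reading applies to the other multiple-answer tables in this chapter.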
FIGURE 10.7 Solutions to facilitate increased sharing of data, across sector (“What efforts would make open access to publicly funded research data more interesting for you?” Maximum 3 answers)
[Bar chart: share of respondents pointing to each solution (better infrastructure; more training; implementation of a citation system; more resources; implementation of guidelines for open access), by sector: universities and university colleges, research institutes, health trusts (hospitals).]
Source: DAMVAD

A blurred picture across discipline

In general, infrastructure is the dominant proposed solution. It is more important for some research disciplines than others. Technology and mathematics and natural sciences, with 46 and 45 percent respectively, point to infrastructure as the most important solution.

Agriculture and fishery have a stronger focus on developing citation systems than other disciplines. In figure 10.8, we see that whereas 46 percent within agriculture and fishery state citation as an important factor for making open access to research data more interesting for them, this is only the case for 30 percent within humanities and 33 percent within social science.

For humanities, training and to some extent guidelines would have more influence on making open access to research data more interesting. Around 30 percent within humanities point to training and guidelines. Within mathematics and natural science the corresponding shares are 16 percent and 19 percent.

FIGURE 10.8 Solutions to facilitate increased sharing of data, across research discipline (“What efforts would make open access to publicly funded research data more interesting for you?” Maximum 3 answers)
[Bar chart: share of respondents pointing to each of the five solutions, by research discipline: humanities, agriculture and fishery, mathematics and natural science, medical science, social science, technology.]
Source: DAMVAD

Little difference across experience

Interestingly, but not surprisingly, the less experienced researchers favour training. They want to learn how to share data. More experienced researchers are more engaged in matters concerning lack of resources.

In figure 10.9 we see that around 30 percent of the more experienced researchers point to more resources as a measure for making open access to research data more interesting, while the share is only around 11 percent for the inexperienced researchers.

Further, we see that whereas around 20 percent of the more experienced researchers point to more training as a measure for making open access to research data more interesting, the share is about 30 percent among the less experienced researchers.

Across experience, it is thus clear that infrastructure and a citation system are the most preferred measures for making open access to research data more interesting to the researchers.

FIGURE 10.9 Solutions to facilitate increased sharing of data, across experience (“What efforts would make open access to publicly funded research data more interesting for you?” Maximum 3 answers)
[Bar chart: share of respondents pointing to each of the five solutions, by years of research experience: less than 3 years, 3 - 6 years, 7 - 10 years, 11 - 20 years, more than 20 years.]
Source: DAMVAD

10.7 Researchers working internationally find time to be a bigger challenge

We hypothesized that researchers working in different settings have different perceptions of the barriers to sharing data.

We asked each of the respondents to state the proportions of their working hours where they:
- Work alone
- Work in collaboration with colleagues within their institution
- Work in collaboration with researchers at international institutions

Working mainly alone was defined as the researcher working alone more than 40 percent of their time. The same threshold delimits researchers collaborating with others within their own institution. Finally, we defined researchers working internationally as those spending more than 20 percent of their time in collaboration with researchers at international institutions.

FIGURE 10.10 Main barriers for increased sharing of research data, across researchers’ way of working (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers).
[Bar chart: share of respondents citing each obstacle (making data available takes away valuable time for research; lack of technical infrastructure; open access would reduce possibilities of scientific publications; concerns connected to misinterpretation of data; cannot give access due to sensitivity issues), by way of working: alone (> 40 pct.), collaboration within the institution (> 40 pct.), international collaboration (> 20 pct.).]
Source: DAMVAD

One might think that there are differences in attitudes towards sharing data depending on researchers’ way of working. However, we find little evidence that this is the case.

The main difference between researchers’ ways of working is found in the challenge that making data available takes away time for research. The respondents mainly working alone did not see this as a big issue. On the other hand, the respondents collaborating internationally stated that this was the most important issue. There is an obvious explanation for this. Because researchers mainly working alone share data to a lesser extent than others do, they face fewer issues in sharing their data.

On the other hand, the researchers collaborating internationally have to deal with international standards and practice. With few existing standards, multiple archiving solutions and different legislation across borders, the time involved in sharing data is a significant challenge.

FIGURE 10.11 Solutions to facilitate increased sharing of data, on how researchers work (“What efforts would make open access to publicly funded research data more interesting for you?” Maximum 3 answers)
[Bar chart: share of respondents pointing to each solution (better infrastructure; more training; implementation of a citation system; more resources; implementation of guidelines for open access), by way of working: alone, collaboration within the institution, international collaboration.]
Source: DAMVAD

10.8 Researchers welcome data sharing as a part of publishing

We have seen that the respondents would like to use data generated by others, but that they lack the incentives to make their own data available for others. Many scientific journals increasingly require that data should be made available as a part of the publishing process. But only 11 percent of researchers have already experienced this practice.

Making data available through scientific journals could lead to increased interest from other researchers. As illustrated below, 50 percent see that increased focus on making data available as a part of scientific publications could mean that their research becomes more interesting for others to follow.

Another positive outcome would be that the research could be quality assured. This is important for every researcher, as their research aims at creating new knowledge based on solid evidence. Roughly half the respondents (54 percent) agreed with this. Slightly less than thirty percent stated that their data and publications would be cited more.

Interestingly enough, 11 percent did not see any benefits in making data available as a part of scientific publications.

If we add the 9 percent who do not know and take the residual, the remaining 80 percent can be considered positive towards making data available as a part of scientific research.

TABLE 10.4 Researchers welcome sharing data as part of scientific publications. (“Do you welcome the trend of making data available as a part of scientific publications?”)

Answer | Frequency | Pct.
Yes, it could mean that my research could be more interesting for others to follow | 665 | 50.8%
Yes, it is a sign that my research can be quality assured | 703 | 53.7%
Yes, it could mean that my data and or my publications will be more cited | 367 | 28.1%
No, I see no benefit for me | 144 | 11.0%
I do not know | 112 | 8.6%
Other | 63 | 4.8%
Total | 2,054 | 157%

Source: DAMVAD

11 Main findings and recommendations

The objective of this study has been to gain a better understanding of researchers’ current practice on sharing and archiving of research data. In addition, we analyse the various fears and barriers involved from a researcher’s point of view and how these barriers might be overcome. The following chapter summarizes the main findings and recommendations for the Research Council of Norway.

11.1 Main findings

Researchers share data

Our study shows that Norwegian researchers frequently use and share research data with each other.

As many as 64 percent of researchers have used research data from other researchers over the last three years. Only one in ten who had not used research data generated by other researchers over the past three years did not wish to use this type of data.

Researchers see benefits in sharing their data

The survey confirms that researchers in Norway see benefits of sharing and archiving their research data. Around 80 percent of the respondents agreed that open access to research data enhances research and that it is an ethical obligation to make their data available for validation. These are also the two reasons for open access to research data agreed to by most researchers.

Further, 77 percent agreed that open access to research data facilitates the education of students and new researchers, and 74 percent that open access to research data stimulates research collaboration.
Many researchers are undecided Researchers mostly use research data generated by other researchers from their own institution, but Although most researchers agree as to the benefits only slightly more than by data from researchers at of sharing their data, many researchers are unde- other institutions outside of Norway. cided as to whether publicly funded research data should be considered public property. Potential to increase sharing of research data Of the remaining 20 percent who did not agree that About one-third (36 percent) of the researchers have not used data gathered by other researchers. Of these, 71.5 percent reported that they would like to make use of other researchers’ data. This indicates a clear potential for increasing sharing of research data. search, 15 percent were undecided and about 5 percent disagreed. This large proportion of undecided respondents may reflect the complexity of the issue. Additionally, more than 600 respondents actively The numbers indicate untapped potential for increased and improved sharing of data. open access to research data would enhance re- decided not to participate in the survey. This reluctance to participate might also be seen as an indication that questions regarding open access to research data are perceived as being irrelevant to the SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM 63 individual respondents (i.e., he or she is not an ac- 3. Open access to research data might reduce re- tive researcher) or that he or she is undecided as to searchers’ possibilities for future scientific pub- the issue. lications. These 600 non-respondents correspond to 30 per- These responses indicate, inter alia, that research- cent of the actual respondents. 
If they are regarded ers lack adequate and user-friendly infrastructure, as undecided respondents, the share of research- guidelines and procedures, and certainty about im- ers without a clear position on the issue of open ac- material rights in order to embrace the idea of shar- cess to research data is quite significant. ing data. The survey included an open answer option where Contrary to our hypothesis, we did not find any ma- respondents could write free text. Inputs in this sec- jor differences across sectors, fields of research or tion show that many researchers find the issue of years of professional experience. open access to research data challenging and complex. Many researches are clearly positive towards sharing, but many researchers are also negative, as Archiving data on local computers and institutional servers we have tried to state in the report. The study found that 85 percent of the respondents Researcher want to remain in control of their data archived their data on their own devices or at an institutional server. This figure do not vary across sec- Most researchers share their research data with tors, disciplines or professional experience, which is other researchers. Yet research data is generally something of a paradox, since storing data on their shared under certain conditions (e.g., only upon re- own devices cannot be regarded the most secure quest, under a non-disclose agreement, in an anon- means of storage. This is especially apparent inso- ymized format). Researchers want to control who far as many of the respondents were concerned gets access to their data and how they use it. With about security and the sensitivity of their data. each researcher setting the term, there is a risk that Researchers see few initiatives from their management she becomes a gatekeeper. 
Lack of proper infrastructure and incentives for sharing The survey responses suggest significant differences in the way in which research managers deal with the sharing and archiving of data. Conse- When asked about the barriers to sharing more of quently, researchers see a need for greater institu- their data, the central barriers according to the re- tional support. searchers are: 1. Preparing data for open access takes valuable time away from research-activities. 2. Respondents do not have adequate technical infrastructure. Only 16 percent (within the research institutions) and 6 percent (within health trusts) perceived to a significant extent - that their management encouraged them to share data. Moreover, only six percent (within the research institutions) and two percent (within health trusts) perceived a significant degree 64 SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM of solutions and technical support for sharing their data. 3. Implementing guidelines, training and standards for sharing data. Again, we find very limited differences across sec- There is a need for better infrastructure and a credit system for researchers tors, disciplines and professional experience. The survey indicates a strong relationship between 11.2 Recommendations the major barriers to sharing data and the researchPrevious studies suggest that there are multiple ob- ers’ proposed solutions to overcome them. stacles and, hence, no single solution to increase The flipside of these barriers are possible solution. the sharing and archiving of research data. Yet we These include: will present some recommendations here and in fig- 1. Better infrastructure. ure 11.1. 2. Implementing a system for citation. FIGURE 11.1 Problems, solutions and recommendations Source: DAMVAD SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM 65 Former studies, as well as this analysis, suggest premise for sharing data. 
Even if the two matters dif- that there is a need for work directed at both the fer, they are closely linked, and should be seen in level of researchers, research institutions, research relation to one another. funders and government/international levels. Most researchers use other researchers data. Initiatives need to take place in parallel. For exam- Most researchers are also willing to let others re- ple, taking action to make more researchers share use their generated data if certain conditions and data without the proper infrastructure will most likely restrictions fulfilled. prove counterproductive. Thus, there is a strong need for a coordinated effort. The researchers are the ones who gather and analyse the data, and who will archive and share the We see that the Research Council of Norway can data in the end. Researchers want to know what play a key role promoting open access to research happens to their research data. As such, it is im- data in Norway. portant to raise awareness among researchers. Raising awareness The sharing and archiving of research data entails many obstacles and questions in which need to be However, the study also indicate that researcher need support and many does not see this support from their management. Thus, it is also important to raise awareness at the institutional level. answered. Many respondents were undecided or did not wish to participate in the survey. This might suggest that researcher’s consider sharing and ar- Giving credit as well as responsibility to researchers chiving of research data as a complex and difficult topic. The study indicates that a lack of incentives and credit for gathering data are a barrier for increased We would suggest that the Research Council of sharing of research data. Norway actively work to raise awareness on the issue, covering both the benefits and pitfalls of archiv- These findings correspond findings in former na- ing and sharing research data. 
tional and international studies (i.e., Kvale, 2012). In particular, exemplifying potential opportunities and value is important, inter alia, by using best prac- The respondents would be more willing to share tice cases. Emphasis should be on showing that work. One obvious way of crediting researchers sharing and archiving is worthwhile for researchers. would be by support the implementation of a cita- data if they received credit for their data generation tion or reference system for data. Accreditation is In this respect, there also seems to be a need for an important motivation for researchers. certainty as to the differences between archiving and open access to research data. The archiving process does not necessarily imply full open access to research data for all - it should be considered a 66 SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM “References can be seen as a kind of normative Council of Norway withholds funding until data is payment” properly shared and archived. Ingwersen (2011) Indicators for the Data Usage Index (DUI): an incentive for publishing primary biodiversity data through global information infrastructure We do not recommend implementing such stringent measures at the current stage, as it would require There is no well-established citation system for re- considerable work in terms of its design and in terms search data in Norway, giving researchers few incentives to prioritize time for preparation of data for sharing. The lack of a well-established citation system is also an international issue. We thus see the benefit the such systems should be coordinated at of having the proper infrastructure in place. Without proper guidelines and a sound infrastructure, such as system could be counterproductive. For recent years, research communities has been the international level. left to establish methods and practices for sharing Ideally, the system should be easy to use and work alongside existing systems for publishing. 
Tenopir (2011) suggests promoting good sharing practices among researchers. For example, obtaining copies of articles using a researchers’ data is one example of conditions that would encourage sharing and promoting best practice. and archiving their research data. We are concerned that this leads to a suboptimal organization of solutions. As stated earlier, we do not find large differences across sectors or research disciplines. Hence, we cannot support arguments leading to the design of tailored solutions for each specific sector or individual research discipline. Yet the work must still be inclusive of all research communities, as they The Council could also introduce some kind of requirements on researchers. Lord et al. (2006) study large-scale data sharing in life sciences based on have the knowledge and will have to implement the supposed strategies and solutions. Guidelines, rules and best practice ten case studies, and found that a laissez-faire approach to the collection and distribution of data re- Our study suggests that many researchers lack sults in waste, as such data will not entail sufficient knowledge as to what data to share and archive. In information to enable re-use. addition, researchers lack knowledge as to what form the data should have, and how proper infor- A key recommendation from Lord et al. (2006) is an mation about the data should be assigned. insistence on a data management plan that clearly defines responsibilities and goals and awareness of Thus, the study suggest a need for better guide- the needs and practices of data management. lines, standards and education relating to sharing and archiving research data. Such guidelines and The Research Council can introduce requirement of standards should be developed in close interaction data management plans as a part of the traditional with researchers, institutions and legal experts. We application procedure. 
recommend that implementation of guidelines and standards should be inspired by work initiated inter- It is also possible to make sharing of research data nationally to avoid creating a Norwegian bureau- a part of the financial system for basic funding. Fur- cracy alongside international standards. ther, it could be a system in which the Research SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM 67 One way of promoting the use of shared data would Finally, it would lay the ground for increased sharing be by creating solid and informative platform for of research data. Debate on infrastructure invest- metadata. A metadata-platform can be a low key ac- ment should involve all relevant stakeholders while tivity, as it can be seen as a first step towards more ensuring a robust infrastructure that in turn will complex infrastructure solutions. In addition, we serve the needs of the future. We are somewhat perceive that many researchers are not aware of the cautious as to the design and scale of such a sys- possibilities of accessing data gathered by other re- tem because it could be a matter of cost and benefit. searchers. Better metadata can overcome this is- We thus see that more information on ambition’s is sue. needed. We would also suggest starting to work on data se- An ideal data infrastructure for science research lection (i.e., on defining which data are worthwhile would have a long list of technical characteristics. and which are not). Even though our study does not We refer to the wish list included in the EC white suggest any major differences in the practices and paper on scientific data, “Riding the Wave”. barriers across research disciplines and sectors, the open answers, however, indicated a strong need for better understanding and guidance as to which data to archive and share, and in which form to do so. 
In particular, researchers who mainly use textual data (e.g., interviews), have difficulties deciding which data to share and preserve. Infrastructure and funding Interviews and studies both suggests that the infrastructure for the sharing and archiving of data is fragmented, overlapping and inadequate. Many are satisfied with the current archiving solutions, yet researchers seems to archive most of their data on their own institutional servers or local storage devices. We found no differences across sectors or research disciplines on the topic of storage. Given the large share of storing data locally, there is clearly a need for better infrastructure solutions. Better infrastructure could increase the motivation for archiving data at data archiving centres, which could provide more secure means for archiving data and data could be restored easier. 68 SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM TEXTBOX 11.1 A WISH LIST FOR E-INFRASTRUCTURE Open deposit, allowing user-community centres to store data easily Bit-stream preservation, ensuring that data authenticity will be guaranteed for a specified number of years Format and content migration, executing CPU-intensive transformations on large data sets at the command of the communities Persistent identification, allowing data centres to register a huge amount of markers to track the origins and characteristics of the information Metadata support to allow effective management, use and understanding Maintaining proper access rights as the basis of all trust A variety of access and curation services that will vary between scientific disciplines and over time Execution services that allow a large group of researchers to operate on the stored date High reliability, so researchers can count on its availability Regular quality assessment to ensure adherence to all agreements Distributed and collaborative authentication, authorisation and accounting A high degree of interoperability at format and 
semantic level Adapted from the PARADE (Partnership for Accessing data in Europe) White Paper (2009)19 19 Partnership for Accessing Data in Europe (PARADE) is a consortium targeting to build efficient services addressing data management needs of multiple research communities. Strategy for a European Data Infrastructure (White Paper) was published in October 2009 SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM 69 References Ball, A. (2012). ‘How to License Research Data’. DCC How-to Guides. Edinburgh: Digital Curation Centre Borgman Christine L. (2012). The Conundrum of sharing research data. Journal of the American Society for Information Science and Technology, 64 (6): 1059-1078 Committee on Scientific Accomplishments of Earth Observations from Space, National Research Council (2008). Earth Observations from Space: The First 50 Years of Scientific Achievements. The National Academies Press. p. 6. ISBN 0-309-11095-5. Retrieved 2010-11-24. Creswell, J. W. (2008). Educational Research: Planning, conducting, and evaluating quantitative and qualitative research (3rd ed.). Upper Saddle River: Pearson. EC (2012). Online survey on scientific information in the digital age, http://ec.europa.eu/research/sciencesociety/document_library/pdf_06/survey-on-scientific-information-digital-age_en.pdf Campbell, E. G. et al. (2002). Data withholding in academic genetics: evidence from a national survey, Journal of the American Medical Association 287, no. 4 (2002): 473–480. E-science (2005). Large-scale data sharing in the life sciences: Data standards, incentives, barriers and funding models (The “Joint Data Standards Study”), http://www.nesc.ac.uk/technical_papers/UKeS-2006-02.pdf EU (2010). Riding the wave. How Europe can gain from the rising tide of scientific data, http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf Berman, F. & Cerf, V. (2013). Who Will Pay for Public Access to Research Data? 
http://www.greatplains.net/download/attachments/8486930/SCIENCE2013AUGPAYINGFOROPENACCESS.pdf
Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020, Version 1.0, 11 December 2013, http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
Hanson, Sugden & Alberts (2012). Making data maximally available, Science 331 (11 February 2011).
Hey et al. (2009). The Fourth Paradigm: Data-Intensive Scientific Discovery.
Ingwersen, P. (2011). Indicators for the Data Usage Index (DUI): an incentive for publishing primary biodiversity data through global information infrastructure.
JISC. Research 3.0: driving the knowledge economy, http://www.jisc.ac.uk/whatwedo/campaigns/res3.aspx
Kowalczyk, Stacy & Shankar, Kalpana (2011). Data sharing in the sciences. Ann. Rev. Info. Sci. Tech., 45: 247–294.
Kvale, L. (2012). Data Sharing in the Life Sciences - A Study of Researchers at The Norwegian University of Life Sciences (Master's thesis), https://oda.hio.no/jspui/bitstream/10642/1269/2/Kvale_Live_Handlykken.pdf
Meld. St. 18 (2012–2013). Report to the Storting, "Long lines - knowledge provides opportunities".
Enke, N. et al. (2012). The user's view on biodiversity data sharing - investigating facts of acceptance and requirements to realize a sustainable use of research data, Ecological Informatics 11 (September 2012): 25–33. doi:10.1016/j.ecoinf.2012.03.004.
OECD (2002). Frascati Manual: proposed standard practice for surveys on research and experimental development, 6th edition. Retrieved 27 May 2012 from www.oecd.org/sti/frascatimanual.
OECD (2007). Principles and Guidelines for Access to Research Data from Public Funding, http://www.oecd.org/sti/sci-tech/oecdprinciplesandguidelinesforaccesstoresearchdatafrompublicfunding.htm
PARSE.Insight (2010). PARSE.insight, http://www.parse-insight.eu/
Lord, P. et al. (2006). Large-scale data sharing in the life sciences: Data standards, incentives, barriers and funding models (The "Joint Data Standards Study").
Savage and Vickers (2009). Empirical Study of Data Sharing by Authors Publishing in PLoS Journals, PLoS ONE 4, no. 9 (2009): e7078. doi:10.1371/journal.pone.0007078.
St. Meld 30 (2008-2009). Report to the Storting, "Climate for research."
Tenopir et al. (2012). Academic Libraries and Research Data Services: Current Practices and Plans for the Future. An ACRL White Paper.
Tenopir et al. (2011). Data Sharing by Scientists: Practices and Perceptions, http://www.biomedcentral.com/1471-2105/12/S15/S3
UiO (2013). Håndtering av forskningsinfrastruktur ved Universitetet i Oslo, http://www.uio.no/om/organisasjon/ledelsen/styret/moter/kart_prot2013/04.23/infrastruktur.pdf
World Data Center System (2009-09-18). "About the World Data Center System". NOAA, National Geophysical Data Center. Retrieved 2010-11-24.

Appendix

Participants at workshop on open access and data management, Research Council of Norway, October 25th, 2013

Øystein Godøy, Norwegian Meteorological Institute
Dagmar Langeggen, BI Norwegian Business School
Andreas Jaunsen, UNINETT Sigma
Koenraad De Smedt, University of Bergen
Vigdis Kvalheim, Norwegian Social Science Data Services (NSD)
Olav Hagen Sataslåtten, The National Archives' Central Office
Frode Arntsen, BIBSYS
Helge Sagen, Institute of Marine Research
Per Magnus, The Norwegian Institute of Public Health
Terje Risberg, Statistics Norway
Dag Undlien, University of Oslo and Oslo University Hospital
Jan Bjaalie, University of Oslo
Live Kvale, University of Oslo
Asbjørn Mo, Research Council of Norway
Roar Skålin, Research Council of Norway
Inngunn Sagebø, Research Council of Norway
Øystein Godøy, Research Council of Norway
Siri Lader Brun, Research Council of Norway

Additional interviews

Øystein Godøy, Norwegian Meteorological Institute
Per Magnus, The Norwegian Institute of Public Health
Gunnar Simonsen, University hospital of Tromsø
Bjarne Strøm, Norwegian University of Science and Technology
Helge Sagen, Institute of Marine Research

Badstuestræde 20, DK-1209 Copenhagen K, Tel. +45 3315 7554
Norsk adresse 123, N-2390 Oslo, Tel +47 2345 1254