Sharing and archiving of publicly funded research data

11/04/14
Report to the Research Council of Norway
SHARING AND ARCHIVING OF PUBLICLY FUNDED RESEARCH DATA | DAMVAD.COM
For information on obtaining additional copies, permission
to reprint or translate this work, and all other correspondence,
please contact:
DAMVAD
info@damvad.com
damvad.com
Copyright 2014
Contents

1 Executive Summary
1.1 Mandate
1.2 Main findings
1.3 Recommendations
2 Sammendrag (Norwegian summary)
2.1 Mandat (Mandate)
2.2 Sentrale funn (Main findings)
2.3 Anbefalinger (Recommendations)
3 Background
3.1 Mandate
3.2 Context
3.2.1 Data is vital to research
3.2.2 Growing consensus on the importance of sharing publicly funded data
3.3 Structure of the report
4 Former studies used to develop the hypotheses
4.1 The consensus on the importance of access to data
4.2 Lack of recognition, time and proper infrastructure
4.3 Variations across disciplines and ages
4.4 Input from researchers and data managers in Norway
4.5 Hypotheses
5 Methodology
5.1 Conceptual clarifications
5.1.1 Scope
5.1.2 Financing
5.1.3 Research data
5.1.4 Archiving
5.1.5 Open access to research data
5.2 Selecting the population
5.3 Survey process
5.4 Response rate
5.5 A significant proportion of the researchers actively chose not to participate
6 Descriptive statistics
7 Researchers use data generated by other researchers
7.1 Data formats vary across research disciplines
7.2 Numerical data are easier to restore
7.3 Researchers frequently use other researchers’ data
7.4 Researchers mainly use data produced by other researchers from the same institution
7.5 Researchers would like even better access to other researchers’ data
8 Research data is rarely archived in data centres
8.1 Most data is archived on portable storage units or institutional servers
8.2 Storage reflects costs of recreation
8.3 Most researchers are satisfied with their current archiving solution
8.4 Those who are not satisfied point to security risks
8.5 Archiving activities are financed as a part of project- and institutional funding
9 Most researchers share research data
9.1 Researchers are positive to the principle of open access
9.2 Many researchers are left undecided
9.3 Health trusts are positive towards the effects of sharing data on research
9.4 Researchers share their research data, but upon request
9.5 More openness within humanities
9.6 More openness among more experienced researchers
10 Lack of time, infrastructure and incentives hamper further sharing of data
10.1 Variety of barriers
10.2 Relatively small differences across sector
10.3 Textual records are more sensible
10.4 Researchers see little support from management
10.5 Limited institutional support
10.6 Researchers call for better infrastructure, citation systems and guidelines
10.7 Researchers working internationally find time to be a bigger challenge
10.8 Researchers welcome data sharing as a part of publishing
11 Main findings and recommendations
11.1 Main findings
11.2 Recommendations
References
Appendix
1 Executive Summary

1.1 Mandate

Data is an important asset in the knowledge society and is vital to research. Open access to research data allows for the use of data for different purposes and for purposes other than originally intended. Sharing and archiving of data allows for further research, re-analysis, validation and research cooperation on complex matters. Consequently, open access to research data can enable both new research and innovation and the dissemination of knowledge.

The debate about open access to research data is by no means new. It has intensified in recent years due to a growing amount of data and the growing possibilities offered by information technology, along with growing recognition of the value of data.

The Organisation for Economic Co-operation and Development (OECD) has developed guidelines on the sharing of publicly funded research data. Publicly funded research data could be considered a public good, and as such should be available to the greatest extent possible, not reserved for the individual researcher or institution.

Nonetheless, the sharing and archiving of research data faces technical, financial, legal and cultural obstacles and questions that remain unanswered.

The objective of this study is to gain a better understanding of the current practice of researchers in Norway in sharing and archiving research data, as well as of the barriers to sharing and archiving. The study also proposes possible approaches to overcome these barriers.

The study will serve as a contribution to the Research Council of Norway's work on developing a strategy and guidelines for sharing and archiving of publicly funded research data in Norway.

1.2 Main findings

Overall, findings in this report support findings in other international surveys.

A total of 1,474 researchers completed the survey. This constitutes 21.8 percent of the selected survey population. Another 604 researchers actively indicated that they did not want to participate in the survey. In total, that is a response rate of 30.6 percent. An analysis of respondents indicates a high degree of representativeness across institution types and subject areas.

Norwegian researchers frequently use and share research data with each other. As many as 64 percent of researchers had used research data from other researchers in the last three years.

The researchers mostly used research data generated by other researchers from the same institution, though this is closely followed by data from researchers at other institutions outside of Norway and from other researchers nationally.

The remaining 36 percent of researchers report that they have not used data gathered by other researchers. Of these, 71.5 percent report that they would have liked to make use of other researchers’ data. The numbers indicate untapped potential for increased and improved sharing of data.

Only 10 percent of the researchers had not used research data generated by other researchers over the past three years and did not wish to use data generated by others.
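As a rough consistency check on the shares reported in the main findings, the 10 percent figure can be reproduced from the 36 percent and 71.5 percent figures. The sketch below is illustrative only: the variable names are our own, and it assumes that the rounded percentages published in the summary can be multiplied directly, which ignores rounding and any item non-response.

```python
# Shares reported in the summary (assumed exact here, although they are rounded).
not_used_share = 36.0    # percent of respondents who had not used others' data
would_like_share = 71.5  # percent of that group who would have liked to use it

# Share of all respondents who neither used nor wished to use others' data.
neither_share = not_used_share * (100.0 - would_like_share) / 100.0

# Share of all respondents who had not used others' data but would have liked to.
untapped_share = not_used_share * would_like_share / 100.0

print(f"Neither used nor wished to use: {neither_share:.1f} percent")
print(f"Did not use but would have liked to: {untapped_share:.1f} percent")
```

Under these assumptions the first share comes out at roughly 10 percent of all respondents, in line with the figure quoted in the summary, while about a quarter of all respondents represent the untapped potential mentioned there.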
The survey confirms that researchers in Norway see the benefits of the sharing and archiving of research data. Around 80 percent of the responding researchers agreed that open access to research data enhances research, and that it is an ethical obligation of research to make research data available for validation. These are also the two reasons for open access agreed to by most researchers.

Further, 77 percent agree that open access to research data facilitates the education of students and new researchers and 74 percent agree that open access stimulates research collaboration.

Although most researchers agree on the benefits of sharing data, many researchers are also undecided about whether publicly funded research data should be considered public property. The remaining 20 percent do not agree that open access to research data will enhance research: 15 percent are undecided and around 5 percent disagree. This high proportion of undecided researchers may reflect the complexity of the issue and the distance between good intentions and practical solutions that address storage, ownership and credit, replicability of use, and other obstacles.

The survey included an open answer option where respondents could write free text. Inputs in this section show that many researchers find the issue of open access challenging and complex.

Most researchers share their research data with other researchers. Yet research data is generally shared under certain conditions (e.g., only upon request, under a non-disclosure agreement, in an anonymized format). Researchers want to control who gets access to their data and how it is used. With each researcher setting the terms, there is a risk that the researcher becomes a gatekeeper.

When asked about the barriers to sharing even more of their data, researchers emphasized the following:
1. Preparing data for open access takes up valuable time.
2. I do not have an adequate technical infrastructure.
3. Open access to research data might reduce my options for scientific publications in the future.

These responses indicate, inter alia, that researchers lack adequate and user-friendly infrastructure, guidelines and procedures, and certainty about immaterial rights in order to embrace the idea of sharing data.

Contrary to our hypothesis, we did not find any major differences across sectors, fields of research or years of professional experience.

The study further finds that 85 percent of the respondents archive their data on their own devices or else on an institutional server. The figures do not vary across sectors, disciplines or scientific experience.

The survey responses suggest significant differences in the way in which research managers deal with the sharing and archiving of data. Consequently, researchers see a need for greater institutional support.

1.3 Recommendations

The study reveals multiple obstacles and, therefore, that there is no single solution as to how to increase the sharing and archiving of research data. Both this and former studies suggest that there is a need for
work directed at the level of researchers, data managers and research funders, as well as at government and international levels.

Overall, researchers agree in principle as to the value of sharing data. However, increased sharing is hampered by uncertainty about how to go about it technically, by the feeling that it takes valuable time away from research, and by the concern that it will reduce academic credentials.

The flipside of these barriers is a set of possible solutions. These include:
• Better infrastructure.
• Implementing a system for citation.
• Implementing guidelines, training and standards for sharing data.

The Research Council of Norway can play a key role. Specific recommendations include raising awareness, finding ways to recognize data sharing, putting in place standards, rules and best practice, providing technical infrastructure, and making funding available for necessary infrastructure and training.

Our recommendations are summarized in Figure 1.

First, we suggest that the Research Council of Norway actively work to raise awareness of the benefits and pitfalls of the archiving and sharing of research data.

In particular, exemplifying potential opportunities and their value is important, inter alia, by using best practice cases. Focus should be placed on showing that sharing and archiving is also worthwhile for researchers.

It is also important to communicate that the archiving of data does not necessarily imply full open access to research data for all, but should be seen more as a premise for the sharing of data.

Second, our study indicates that a lack of incentives for the crediting of data is a barrier. This could be addressed by clarifying and implementing a system for citation, but also by outlining the inherent responsibility and expectation on the part of researchers.

For example, the Research Council of Norway can introduce a requirement for data management plans and support the implementation of systems for crediting, in order to raise awareness, experience and recognition among researchers. Ideally, such measures should be easy to use, similar to international systems and work alongside the system for scientific publication.

Third, many researchers lack knowledge as to what data to share and archive and how to do so. This includes information about what form the data should be archived in and how proper information about the data should be assigned.

There is a need for guidelines, standards and training on the sharing and archiving of research data. Defining what data to share and what is worth archiving (or not) could help clarify the debate. These should be developed in close interaction with researchers, institutions and legal experts. Such work should be inspired by work initiated internationally to avoid creating a Norwegian bureaucracy alongside international standards.

Furthermore, selective investments in infrastructure and technical skills are necessary. Both interviews and studies suggest that the infrastructure for sharing and archiving data is fragmented, overlapping and insufficient. Our study also suggests that
many researchers archive most of their data on their
own servers or portable computers.
Better infrastructure could increase the motivations
for archiving data at data archiving centres. This
could provide a more secure means of archiving
data and the data could be more easily restored.
Finally, archiving will lay the groundwork for the sharing of
more research data. Infrastructure investments
should involve all relevant stakeholders while also
ensuring a robust infrastructure which will serve the
needs of the future.
FIGURE 1
Problems, solutions and recommendations
Source: DAMVAD
2 Sammendrag (Norwegian summary, translated into English)

2.1 Mandat (Mandate)

Data is a valuable resource in today's knowledge society. Open access to data makes it possible to use data for different purposes and for purposes other than originally intended. Open access to data can thus lay the foundation for the development of new products, new services and the development of democracy.

Data is also a central foundation for research. Sharing and archiving of research data makes possible further research, the reproduction of analyses, validation and research cooperation on complex problems.

The debate about open access to research data is not new. However, the debate has intensified in recent years because of a growing amount of data and new possibilities for analysing large volumes of data as a result of technological development.

Norwegian authorities, together with international organizations such as the OECD and the EU, want to promote more sharing and archiving of publicly funded research data.

There are two main reasons for this:

First, publicly funded research data can be regarded as a public good that should be utilized to the greatest possible extent and not be reserved for the individual researcher or institution.

Second, better utilization of research data can strengthen the quality of, and the use of resources in, Norwegian research.

A number of studies show that the sharing and archiving of research data is associated with technical, financial, cultural and legal obstacles. The aim of this study is to gain a better understanding of how researchers in Norway share and archive research data, and of the challenges they face in relation to increased sharing and archiving.

The study will serve as a knowledge base for the Research Council of Norway (Forskningsrådet) in its work on developing a strategy for the sharing and archiving of publicly funded research data in Norway.

2.2 Sentrale funn (Main findings)

The study is based on a survey among researchers in Norway. A total of 1,474 researchers completed the survey. A further 604 researchers actively indicated that they did not wish to take part in the survey.

The survey shows that many Norwegian researchers use and share research data with each other. As many as 64 percent of the researchers in the survey have used research data from other researchers in the last three years.

The researchers mainly use research data generated by other researchers from the same institution, closely followed by researchers at other institutions outside Norway and by researchers at other institutions nationally.

Conversely, 36 percent of the researchers had not used other researchers' data. Of these, 71.5 percent state that they would like to make use of other researchers' data. This clearly indicates a potential for increased sharing of data.

Only 10 percent of all the researchers have neither used data generated by other researchers during the last three years nor wish to use data generated by others.

The survey confirms that researchers in Norway see the benefit of sharing and archiving research data. Around 80 percent of the respondents agree that open
access to research data strengthens research, and that it is an ethical obligation to make research data available for validation. These are the two reasons for open access with which most researchers agree.

Further, 77 percent and 74 percent of the researchers, respectively, agree that open access to research data is beneficial in the education of students and new researchers, and that open access stimulates research collaboration.

Although most researchers agree on the benefits of sharing data, our survey also suggests that many researchers are uncertain about the benefits of sharing data. Twenty percent do not agree that open access to research data will strengthen research: 15 percent are undecided and around 5 percent disagree. This high share of undecided researchers may reflect the complexity of the issue.

The open answers in the survey also reveal that many researchers find the question of open access challenging, and that many researchers are positive and many negative towards open access.

Most researchers share their research data with other researchers. 64 percent have used data generated by other researchers during the last three years. However, research data is generally shared under certain restrictions (only upon request, under a confidentiality agreement or in an anonymized form). The findings suggest that researchers want to control who gets access to their data and how the data is used.

When asked about the obstacles to sharing even more of their data, the researchers answer that the main barriers are:
• Preparing data for open access takes up valuable time
• No access to adequate technical infrastructure
• Open access to data may reduce the opportunities for scientific publications in the future

The respondents are concerned that making data public will require resources for which they cannot be compensated.

There may be several reasons why the researchers estimate that preparing data takes up valuable time. Examples are a lack of access to suitable infrastructure, and a lack of use and knowledge of standards and guidelines on which research data should be shared and how.

Contrary to what we expected, we find few differences in experiences and barriers across sectors, fields of research or years of scientific experience.

Archiving of research data is central to validating research results. Archiving of data can also facilitate re-analyses and further research if the data is made available.

The study further finds that 85 percent of the respondents archive their data locally, either on their own portable data storage device or on an institutional server. The share varies little across sectors, disciplines or years of experience.

Our study suggests that the researchers experience only to a small degree that the management of their institution encourages and facilitates the sharing and archiving of data. Only 12 percent state that their management encourages data sharing "to a high degree". Only 4 percent state that their organization has the necessary solutions and guidelines for sharing data "to a high degree".

The survey shows a strong connection between the barriers researchers experience in sharing data and the solutions the researchers recommend for increased sharing of data.
Researchers see a need for:
• Better infrastructure
• Systems for citing and crediting data
• Development of guidelines, training and standards for sharing data

Again, we find only minimal differences across sectors, disciplines and scientific experience.

2.3 Anbefalinger (Recommendations)

Our study suggests that there are several obstacles to the sharing and archiving of data. Hence there is no single solution to how data can be shared and archived to a greater extent. Both this and earlier studies suggest that work is needed at several levels, directed at researchers, research institutions, data centres and research funders, and at the level of the authorities.

Many researchers agree with the principle of sharing data. At the same time, the survey shows that sharing is hindered by the fact that it takes up valuable time, by a lack of infrastructure, and by the possibility that sharing may reduce the opportunities for future publication. The Research Council of Norway can play a key role in overcoming these barriers.

First, we recommend that the Research Council of Norway work actively to raise awareness of the benefits and pitfalls of archiving and sharing research data. We recommend spreading knowledge and awareness of the opportunities offered by increased data sharing, while also communicating that archiving data does not necessarily imply full open access to the data for all.

Second, our study shows that researchers lack incentives to share data. We recommend that the Research Council of Norway work on implementing a system for data citation, and that such a system be designed in line with international practice.

Third, our study indicates that the researchers lack knowledge about which data should be shared and archived and how this should be done. There appears to be a need for guidelines, standards and training on the sharing and archiving of research data. We recommend that guidelines and norms be developed in close interaction with researchers, institutions and legal experts. We recommend that such work be inspired by work started internationally, to avoid creating a Norwegian bureaucracy alongside international standards.

Fourth, our study suggests that the researchers are largely satisfied with their archiving solutions. Nevertheless, many see a lack of infrastructure as an obstacle to increased storage and archiving. Both interviews and studies suggest that there is a need for better and more tailored infrastructure for sharing and archiving. The need for infrastructure is also supported by the fact that many researchers archive their research data locally, either on their own data storage devices or on institutional servers.

Better infrastructure could increase the motivation for sharing and archiving data. Investments in infrastructure should involve all relevant stakeholders and at the same time ensure a robust infrastructure that will serve future needs.
3 Background

This chapter presents the background and context for the report.

3.1 Mandate

Data constitutes knowledge and is a valuable asset in the knowledge society. Sharing of research data allows for the use of data for purposes other than originally intended, linking of data across different data sets and validation of data. Accessible data also underpins democratic processes by making information available to a wider audience.

Retrieving information and allowing new generations of researchers to “stand on the shoulders of giants” is the very essence of research (PARSE.Insight 2012).

The Norwegian Government, alongside international organizations such as the OECD and EU, seeks to promote more sharing and archiving of research data. In its most recent White Paper [1] on research policy, the Ministry of Education noted that:

“Better access to research data helps facilitate research and to increase the quality of research. The government wishes to facilitate increased availability of publicly funded research data.”

Better utilization of research data could thus strengthen the quality of Norwegian research and ensure a more efficient use of resources. Consequently, enhanced access to research data is a key measure to reach overall research policy objectives.

Yet archiving and sharing of research data brings to the fore a number of technical, financial, legal and sociological obstacles. While overall policy goals and benefits are agreed, many questions still stand in the way of effective and successful implementation of the principles of open access to research data.

The Norwegian Government has mandated the Research Council of Norway to explore and facilitate work on sharing and archiving of research data.

The Research Council of Norway is the national strategic and funding agency for research activities in Norway. The goal of the Research Council is to strengthen the Norwegian research and innovation system and its infrastructure through the effective use of public resources. As previously noted, enhanced access to research data can be seen as a measure to help achieve these goals.

The Research Council of Norway is also the principal source of expertise and advice on research policy for the Norwegian Government, the central government administration and the overall research community, including universities, research institutes and health trusts.

In the autumn of 2012, the Research Council of Norway initiated an internal project called "Principles for open access to publicly funded research data", led by the Department for Research Infrastructure. The main objective of the project was to provide a knowledge base for further work shaping the Council's policy in line with the OECD guidelines from 2007. [2]

A working group has been formed and a number of activities are being undertaken in close cooperation with the research communities and data managers to explore how open access to research data can be strengthened.

It is in this context that the Research Council of Norway has commissioned DAMVAD to undertake a survey among researchers in Norway. The objective of the survey is to gain a better understanding of researchers’ current practices and position regarding the archiving and sharing of research data.

The topic clearly involves a broad range of stakeholders, including the government, research organizations, researchers, research institutes and civil society. This study exclusively investigates the viewpoint of researchers.

The two overall questions investigated in this study are:
1. How do researchers in Norway share and archive research data?
2. What are the obstacles to the increased sharing and archiving of research data?

Based on the results and analysis of these two questions, the study discusses measures to reduce or overcome identified barriers.

The study feeds into the Research Council of Norway’s work on developing strategies and guidelines for sharing and archiving of research data in Norway.

[1] Meld. St. 18 (2012–2013) Report to the Storting, “Long lines - knowledge provides opportunities”, freely translated by DAMVAD.
[2] OECD (2007) Principles and Guidelines for Access to Research Data from Public Funding.

3.2 Context

3.2.1 Data is vital to research

Research can be defined in many ways. In the OECD Frascati Manual [3], research is defined as

"(…) creative work undertaken on a systematic basis in order to increase the stock of knowledge, including knowledge of man, culture and society, and the use of this stock of knowledge to devise new applications."

Other definitions can be used, but regardless of the definition applied, data remains a vital part of research.

Data is vital to researchers in investigating events, features, and correlations, in adjusting findings from previous research, solving new or existing problems, supporting theorems and developing new theories for the benefit of society.

The debate on open access to research data is not new. The concept and related policy goals were institutionalized by the establishment of the World Data Centre system, in preparation for the International Geophysical Year of 1957-1958. [4]

The International Council of Scientific Unions (now the International Council for Science) established several World Data Centres to minimize the risk of data loss and maximize data accessibility, further recommending in 1955 that all research data should be made available in machine-readable form. [5]

[3] OECD (2002) “Frascati Manual: proposed standard practice for surveys on research and experimental development”, 6th edition. Retrieved 27 May 2012 from www.oecd.org/sti/frascatimanual.
[4] National Research Council (2008). “Earth Observations from Space: The First 50 Years of Scientific Achievements.” The National Academies Press.
[5] World Data Center System (2009-09-18). "About the World Data Center System". NOAA, National Geophysical Data Center.
The debate on open access to research data has intensified in recent years, following the growing amount of data and the growing number of possibilities offered by information technology.

The rapidly increasing amount of data allows for the analysis of complex issues involving large datasets. New technology generates “big data”, which carries significant opportunities for data analysis, but also challenges in terms of storage, communication and processing software, and ownership issues. Sources of big data include information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras and microphones. [6]

Information technology and the Internet have increased the amount of available data. This implies new and more extensive opportunities for collecting, analysing, storing and sharing data.

Information technology has also affected the way in which research is done. Science has become more collaborative, data-intensive and computational, leaving academic researchers with new data management needs that have to be addressed as an integrated part of the data lifecycle. [7]

FIGURE 3.1
Data management as an integrated part of Research life cycle
Source: JISC Research 3.0: driving the knowledge economy.

This new, data-intensive research environment of scientific study has been called the “fourth paradigm” of scientific inquiry, where “all science literature is online, all of the science data is online and they interoperate with each other” (Hey et al. 2009): [8]

“We must all accept that science is data and that data are science, and thus provide for and justify the need for the support of much-improved data curation” [9]

[6] “Big data” is a term used for large and complex data sets; see, for example, http://mike2.openmethodology.org/wiki/Big_Data_Definition.
[7] JISC Research 3.0: driving the knowledge economy and Tenopir et al. (2011)
[8] Tony Hey, Stewart Tansley and Kristin Tolle, eds. (2009): “The Fourth Paradigm: Data-Intensive Scientific Discovery”
[9] Hanson, Sugden & Alberts (2011) “Making Data Maximally Available”, Science Vol 331 11
3.2.2 Growing consensus on the importance of sharing publicly funded data

There is growing consensus that data in all its forms represents a significant resource in today's knowledge society. Access to data and the infrastructure allowing for the utilization of data has become a resource that should be protected and utilized in an efficient manner.

A growing number of governments, organizations and research funders are actively working to increase openness to data. This is not limited to research data but all publicly funded data.

The relevance of sharing publicly funded research data rests on two arguments.

Firstly, publicly funded research data should be utilized to the greatest extent possible and not be reserved for individual researchers or institutions. Further, open access to research data can be a means to utilise resources more efficiently.

“Sharing and open access to publicly funded research data not only helps to maximize the research potential of new digital technologies and networks, but provides greater returns from the public investment in research.”
OECD Guidelines (2007)

In 2004, the governments of the 30 OECD countries as well as China, Israel, Russia and South Africa adopted the “Declaration on Access to Research Data from Public Funding”. In this declaration, they recognized the importance of access to research data and invited the OECD to develop a set of OECD guidelines based on commonly agreed principles to facilitate optimal cost-effective access to digital research data from public funding.

The OECD principles were endorsed by the OECD Council in December 2006 and published in 2007. The OECD "Principles and Guidelines for Access to Research Data from Public Funding" (2007) essentially recommends that research data generated through publicly funded research is to be made publicly available to others:

“The value of data lies in their use. Full and open access to scientific data should be adopted as the international norm for the exchange of scientific data derived from publicly funded research.”
National Research Council study, Bits of Power. Cited in the OECD Guidelines (2007)

A “recommendation” is a legal instrument of the OECD that is not legally binding and which is often referred to as “soft law”. As such, there are no legal obligations towards publishing data. However, when a recommendation is endorsed by a country the country is obligated to work towards fulfilling that recommendation.

The Norwegian government has endorsed the OECD guidelines in, for example, the previous white paper on research from 2009:

“Increased availability of research data, both in Norway and in the partner countries, helps to facilitate research and disseminate knowledge across borders. This is fundamental to the quality and something the government wants to facilitate. The Government intends to follow up on the OECD principles and guidelines for access to publicly funded research data.”
St. Meld 30 (2008-2009) Report to the Storting, “Climate for research.” Freely translated by DAMVAD
Alongside the work at the OECD level, the European Commission is also working towards more openness of research data. Efforts have been made both in terms of building competences10 and infrastructure as well as in developing European-wide policies and guidelines.

In 2010, the High Level Expert Group on Scientific Data submitted its report “Riding the wave. How Europe can gain from the rising tide of scientific data” to the European Commission.

“Riding the Wave” offers a vision of how Europe, through the efficient use of research resources, can strengthen research and innovation in Europe and, thereby, strengthen Europe’s competitiveness in the global economy. Since the beginning of its Seventh Framework Programme (FP7) for research and innovation in 2008, the European Commission has operated an Open access pilot to ensure open access to research publications from the FP7-funded projects.

Based on these experiences, the European Commission has communicated that not only publications but also research data from the EU-funded projects should be openly available (when possible) in the future.

In December 2013, the European Commission published “Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020”.11 Such initiatives are likely to affect Norwegian researchers in the times to come.

Although the OECD guidelines have been endorsed in Norway, it has largely been left to the various institutions and disciplines to develop methods for implementing them.

In addition, the infrastructure for sharing and archiving research data in Norway is fragmented, with a decentralized system of local, regional, national and international data centres. There are wide variations between different subjects and disciplines.

The government of Norway and the Research Council of Norway now see a need for more coordinated efforts to ensure that more data are shared and archived. However, knowledge as to practices and the obstacles faced is needed. This study will serve as input for such work.

3.3 Structure of the report

Following this chapter on the mandate for and context of the report, the report provides a brief summary of the main findings from former studies and interviews in Chapter 4. Chapter 5 gives a detailed description of the methodology applied, covering conceptual clarifications, the selection of the population and the survey process, etc.

The results of the surveys are presented in Chapters 6 through 10. The results are presented in the following order: presentation of the respondents, the respondents’ practices regarding data usage and generation, the respondents’ practices regarding data archiving and, last but not least, the respondents’ practices and obstacles in relation to the sharing of data. The main findings and recommendations are included in the final chapter.
10 For example, through the funding of Parse.Insight, a two-year project co-funded by the European Union under the Seventh Framework Programme. It was concerned with the preservation of digital information in science, from primary data through analysis to the final publications resulting from the research.
11 http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
4 Former studies used to develop the hypotheses

Numerous studies have devoted themselves to the definition and importance of sharing research data (Borgman 2012, Kowalczyk & Shankar 2011).

Several studies have addressed the technical aspects of infrastructure and data management (Tenopir 2012, Graaf et al. 2011), while strategy papers and policy documents have focused on the research process and proposed policies for the promotion of data sharing (PARSE.Insight 2012, EC 2012, Hey et al. 2009).

Various studies have also focused on the practices of and barriers to sharing and archiving from the viewpoint of researchers.

This chapter summarizes the main findings from studies dealing with the current practices of and obstacles to sharing and archiving from a research point of view. The findings from previous studies have yielded significant insights into the matter and have been used in the development of the hypotheses and questions of our study.

4.1 The consensus on the importance of access to data

Several studies find that researchers acknowledge the benefits of open access to research data that is publicly funded.

A European Commission12 study from 2012 on “scientific information in the digital age” found strong support for open access to publicly financed research data among various stakeholders. Nine out of ten respondents stated that research data that is publicly available and publicly funded has to be, as a matter of principle, available for re-use and free of charge on the Internet.

Similarly, studies find support for the importance of archiving data. PARSE.Insight's (2012) European study on data archiving (preservation) concurs that the preservation of research output is important, the reasons being that it may stimulate the advancement of science and that it allows for the re-analysis and validation of research13.

4.2 Lack of recognition, time and proper infrastructure

Previous studies show, however, that data are often unavailable for various reasons. One of the key challenges for sharing research data concerns the legal issues involved. Data must be stored and shared in a way that safeguards privacy. Laws and regulatory policies in this area comprise provisions that have their origin in general social considerations and the need to protect citizens.

There may also be other legal challenges relating to who owns the rights to data when multiple funders are involved in a given research activity.

Tenopir et al. (2011) conducted a survey among 1,329 scientists,14 exploring current data sharing practices and perceptions of the barriers to - and
12 The EC Online. The online survey on “scientific information in the digital age” was open from July 2011 to September 2011. The team received 1,140 responses in total from all Member States, except Ireland, Malta, Slovenia and Slovakia. 37 percent of all responses were submitted by German respondents. The responses represented the different stakeholders, 429 of which were individual researchers; six respondents (not limited to researchers) hailed from Norway.
13 Apparently, validation of research is a growing global concern, see “Trouble at the Lab”, The Economist (October 2013).
14 In Tenopir et al. (2011), the survey was open from October 2009 to July 2010. Initially, the investigators used a snowball sampling method. They sent an email cover letter to DataONE team members (about 35 individuals throughout the world, but primarily in the United States). To increase international response, surveys were sent by an academic publisher to its database of over 7,000 previous authors. Ultimately, 1,329 respondents answered at least one question. It is not unreasonable to estimate that the survey instrument reached 15,000 people, in which case the response rate was approximately 9 percent.
enabling of - data sharing. In this survey, the principal reasons stated by scientists for not sharing data were insufficient time and a lack of funding.

The respondents in the EU study previously referred to also cited funding as a central barrier. In addition, lack of credit given to researchers for making data available was raised as a concern. Most of the researchers in the EU study (81.1 percent) rated insufficient credit given as a very important or important barrier to accessing research data, followed by lack of funding to develop and maintain the necessary data infrastructures (78.7 percent) and insufficient national or regional strategies (74.6 percent).

The European-wide study Parse.Insight15 found that researchers often had major concerns about legal issues, misuse of data and incompatible data-types, all of which interfered with data-sharing practices.

Enke et al. (2012) found a diverse mix of both technological (e.g., a lack of appropriate databases/mechanisms) and sociological (e.g., time, funding, etc.) causes that may impede scientists from sharing data. The main reason for not sharing data (cited in their international survey on data sharing in the field of biodiversity) was “loss of control” over the data, followed closely by the amount of time that would be needed to invest in sharing data sets.

Studies indicate that sharing research data reflects personal factors, such as attitudes and culture. Tenopir et al. (2012) found that barriers to sharing research data were deeply rooted in the practices and culture of the research process, as well as in the researchers themselves. These factors can include privacy concerns, concerns about publishing opportunities, and the desire to retain exclusive rights to data.

Many journals require authors to share their data with other investigators, either by depositing the data in a public repository or else by making it freely available upon request. Caroline J. Savage and Andrew J. Vickers (2009) endeavoured to determine how well authors comply with such policies by requesting data from authors who had published in one of two journals that had clear data-sharing policies. They received only one of 10 raw data sets requested. This suggests that journal policies requiring data sharing do not lead to authors making their data sets available to independent investigators.

Researchers who choose to withhold datasets often have specific reasons for doing so. Savage and Vickers (2009) noted that these reasons included concerns about patient privacy (for medical fields), concerns about future publishing opportunities and the desire to retain exclusive rights to data that had taken many years to produce.

The studies presented above have provided insights into research practices and views regarding sharing and archiving.

Nonetheless, the studies suggest that data sharing entails various barriers. We sum up the findings in a simple illustration in Figure 4.1. There seems to be a diverse mix of barriers involved, of which privacy issues, losing control over data, lack of credit, time for preparation and lack of proper infrastructure appear to be the most important (highlighted in Figure 4.1).
15 Parse.Insight (2010) was a two-year project co-funded by the European Commission under the Seventh Framework Programme (FP7) on Research Infrastructures. Major surveys were held within three stakeholder domains: research, publishing and data management. The survey includes 1,389 responses from researchers, 262 responses from data managers and 178 responses from publishing. All parts of Europe were represented in these surveys.
4.3 Variations across disciplines and ages

Although the studies often highlight many of the same obstacles, the importance of the barrier itself might differ between the studies. This can be an indicator of differences in the nature of barriers across the respondent group, but could also be a consequence of the different methodologies applied. Surveys are typically sensitive to the way questions are articulated and which context they are placed in.

Consequently, one should be careful when comparing different studies. This said, some studies have investigated the differences between different types of respondents within the same study and still found variations, especially across disciplines and age ranges.

Some research disciplines are typically more reluctant to share data than others. Tenopir et al. (2011) found that the actual rate of data sharing varied considerably according to subject discipline, age, and geographic location. Researchers in medicine and social science were the least likely to share data. Atmospheric scientists were the most inclined to make their data available to others.

Interestingly, when asked whether a lack of access to other researchers' or institutions' data was a major impediment to their research, social scientists agreed that this was the case more than other respondents (80 percent compared to 60 percent across disciplines).

Such a lack of data sharing may also be a question of competition. Campbell et al. (2002) found that fields with increased opportunities for commercial applications, such as genetics, were less likely to share data when compared to less competitive fields.

Younger researchers tend to be less likely to share data. This may be due to concerns regarding their career path. Tenopir et al. (2011) found differences in responses based on the age of respondents. Younger people were less likely to make their data available to others, whereas people above 50 years old showed more interest in sharing their data.
FIGURE 4.1
Barriers to sharing research data
Legal
•Privacy
•Shared ownership of data
•Lack of knowledge on legal issues related to data
Sociological
•Lack of incentives/credit to researcher
•Concerns about researchers freeriding on data gathered by other researchers
•Fear of losing control over data
•Fear of losing scientific edge
•Fear others might not understand data
Technical
•Lack of infrastructure
•Sharing data is time-consuming
•Lack of standards for sharing and preparing metadata
•Lack of technical skills
Source: DAMVAD based on Tenopir et al. (2011), Enke et al. (2012), EC (2012), Kvale (2012), PARSE.Insight (2010).
These results correspond to findings in Kvale's (2012)16 study of life science researchers in Norway. Kvale (2012) found the argument that publicly funded research should become public property to be stronger among researchers with more experience. However, the proposition that the sharing of research data might stimulate inter-disciplinary collaborations stood out as an argument with much stronger support among younger researchers than more experienced ones.

4.4 Input from researchers and data managers in Norway

As part of preparing this report, DAMVAD participated in a workshop organized by the Research Council on sharing and archiving of research data in Norway in October 2013. Interviews and participation in this workshop provided certain insights and allowed for detailed discussion of the practice of sharing and archiving in Norway.

In addition, informants amongst researchers and data managers in Norway offered further insights into the barriers to sharing and archiving in the Norwegian context. Many of the barriers (such as issues relating to privacy, lack of credit and time) identified in the former studies were confirmed.

Data managers are also typically concerned about the technical aspects, describing the Norwegian data management infrastructure as fragmented and overlapping. The informants point to a lack of central coordination, a lack of established standards for data formats and metadata17, and a lack of professional data curators responsible for facilitating sharing and archiving on behalf of researchers.

A list of informants is included in the appendix.

4.5 Hypotheses

The international literature, workshop, and preliminary interviews served as a basis for the formation of hypotheses to explore through the survey. We present the hypotheses in Table 4.1.

Together, the different hypotheses allow for a detailed analysis of the practice of sharing and archiving of research data in Norway, what the main barriers for sharing and archiving are, and how these barriers can be reduced.
16 A survey was conducted by Kvale as part of her Master's thesis on data sharing in the life sciences among researchers at the Norwegian University of Life Sciences in 2012. The questions in the survey were largely similar to the questions included in the Parse.Insight survey in 2009. Of the 650 researchers and PhD students at the Norwegian University of Life Sciences (UMB) selected as a sample population for the questionnaire, 147 respondents (or 23 percent) replied.
17 Metadata is "data about data", i.e. information or content of data.
TABLE 4.1
Hypotheses to be tested in the survey
Researchers see the benefit of accessing other researchers' data, but want to retain control of their own data.
There are various barriers to sharing research data (legal, technical, ethical and financial). Some data cannot be shared; nonetheless, a lack of incentives, time and infrastructure remain central obstacles to sharing research data.
Research data is archived for later reanalysis and validation.
Sharing and archiving activities are financed as a part of research project funding.
The barriers differ significantly between sector, discipline and age.
Younger researchers are more negative about sharing research data than older scientists.
Researchers in the institute sector are more concerned with future revenue, whereas researchers in the university sector are more concerned with losing scientific edge.
Researchers in disciplines using numerical data are more experienced with sharing data.
Internationally-oriented researchers are more open to sharing data than those that primarily work alone.
Management supports the sharing and archiving of data.
Work to increase sharing and archiving of research data needs to take place on many levels: policy level (guidelines, standards etc.), infrastructure/data management level, institutional level and research level.
Source: DAMVAD
5 Methodology

This chapter describes the methodology of the survey: definitions used, how we selected the survey population, and analysis of respondents.

5.1 Conceptual clarifications

This study contributes to a field of growing interest from researchers, research organizations, governments and civil society. Several studies have sought to investigate the field from a range of angles using different methodologies, concepts and terms.

We have largely used the terms and definitions offered in accordance with the OECD guidelines and completed the necessary delineations to make the study relevant for the work of the Research Council of Norway.

5.1.1 Scope

The researchers relevant to the study included researchers working at research institutes, universities, university colleges and health trusts (Helseforetak) in Norway. Researchers outside such institutions (e.g., researchers employed in private companies) are not included in the survey. This delineation ensured that the study focused on the activities of those researchers one might expect to be publicly financed.

5.1.2 Financing

The study will serve as input to the Council's work on drawing up guidelines for publicly funded research data. The survey has also sought to focus on publicly funded research data but not data that have been gathered for other reasons (such as for commercialization). This is in line with OECD guidelines.

As an introduction to the survey, we informed the respondents about the definition of publicly funded research data:

Publicly funded research data is defined as the use and generation of research data that is publicly funded (e.g., fully or partly funded by the Research Council of Norway, hospital trusts, universities and colleges, ministries and other public entities). Research that is fully funded by private or international organizations is not included in this survey.

5.1.3 Research data

Various definitions of research data can be found in literature on the topic. This study uses the term in accordance with the OECD guidelines. As part of the introduction to the survey, we informed the respondents about the definition of research data:

Research data are defined in accordance with the OECD guidelines for open access to research data, in which research data comprises factual records (numerical scores, textual records, images and sounds) used as primary sources for scientific research and which are commonly accepted in the scientific community as necessary to validate research findings. A research data set constitutes a systematic, partial representation of the subject being investigated.

This term does not cover the following: laboratory notebooks, preliminary analyses and drafts of scientific papers, plans for future research, peer reviews, or personal communications with colleagues or physical objects.

The OECD guidelines are primarily aimed at research data in digital, computer-readable format. It is in this format that the greatest potential lies for
improvements in the efficient distribution of data and its application to research, largely because of the low marginal costs of transmitting data through the Internet. However, it could also apply to analogue research data in situations where the marginal cost of giving access to such data can be kept reasonably low.

PARSE.Insight (2010) used the term digital research data for all output in research. In practical terms, raw data, processed data, publications and post-publication materials are all covered by the same term.

We have not used the term ‘digital’, as we would like to cover the entire range of research data. Moreover, we did not wish the respondents to make subjective valuations as to what type of data the survey covers. One can imagine research data that has not been made digital but which can be digitalized in the future.

It is common to use several data sources in research. It is useful to delineate between source data and output data. Source data is data that already exist independently of the research to be undertaken. This may be information that is collected for a different purpose (e.g., administrative data or clinical data) or physical or digitized collections of objects and texts (such as libraries, text corpuses and other scientific collections).

Output data is data generated through research. This can be data generated through new analysis or a compilation of existing data sources, but it can also be completely new data generated through new data collection. Typically, such data will be data from experiments, simulations, field work or interviews. However, the distinction between primary (output) and secondary (source) data can sometimes be subjective and contextual. As such, we have not applied this distinction between types of research data in the survey.

One of the research questions posed includes the term ‘metadata’. Metadata can be understood as structured data and information about data, of any sort in any media, which imposes order on a disordered information universe. Typically, metadata comprises index files and data dictionaries that store administrative information.

5.1.4 Archiving

‘Storage’, ‘archiving’ and ‘preservation’ are all terms used to describe how access to data at some later point in time is ensured. Although no clear distinction between the three terms can be made, storage might be understood as the saving of data during a project, archiving as the medium- to long-term saving of data after a project, and preservation as professional saving for even longer periods.

This study focuses on the viewpoints of researchers and how they deal with their research data; in this study, we have used the term ‘data archiving’ to denote storage beyond the lifetime of a project.

As an introduction to the survey, we informed the respondents as to how archiving is defined:

Data archiving refers to the long-term storage of scientific data and methods. Typically, data are archived at the end of a research project or else after a scientific publication or report has been prepared.

Parse.Insight (2010) used the term ‘digital preservation’ to refer to a set of processes and activities that ensure continued access to information in digital form. It denotes the process of storing digital information in such a way that it remains accessible, understandable and usable over the long-term (usually five, 10 or 50 or more years). The survey explored
several related activities, such as taking into account environmental changes (preservation watch), preservation planning (what needs to be done and when) and preservation actions (e.g., migration and emulation).

We have chosen not to use the term digital preservation in the survey, as it can be understood as an activity for professional data managers rather than as something that researchers undertake in their everyday research activities.

5.1.5 Open access to research data

We have used the term ‘open access’ to research data in line with OECD guidelines, which state that ‘openness’ refers to access on equal terms for the international research community at the lowest possible cost, preferably at no more than the marginal cost of dissemination.

The OECD guidelines also state that open access to research data from public funding should be easy, timely, user-friendly and, preferably, Internet-based.

The latter part of these guidelines can be seen as a normative judgement rather than a definition of the term; therefore, to avoid misunderstandings and differences in interpretation, we have not included this definition in the survey.

Instead, we have used the following definition, of which the respondents were also informed as an introduction to the survey:

Open access to research data is the practice of providing access on equal terms at the lowest possible cost, preferably at no more than the marginal cost of dissemination.

The term ‘open access’ is in some instances used only for open access to research publications and not to data. Therefore, we have specified throughout the survey that we are dealing with open access to research data.

5.2 Selecting the population

To ensure robustness of survey results, it was important to obtain a representative number of completed answers from each sub-population (i.e., the university sector, research institutes and health trusts). With representative sub-samples, we are able to compare different groups of respondents.

We sampled our population by randomly selecting researchers from CRIStin.18 In addition to the mentioned sub-populations, we sought representativeness within research disciplines in research institutes, universities and university colleges. All the sub-populations had a representative number of completed surveys once the survey had ended, with the exception of the humanities within research institutes.
18 The Research Information System CRIStin is a tool aimed at the recording and promotion of publication data, projects, units and competency profiles. The system is also used to report publication points.
5.3 Survey process

In-depth understanding and knowledge reduce the risk of misinterpreting questions and ensure that a survey can cover all areas of the topic in question. Prior to designing the survey, we conducted extensive desk research including a literature review. Based on this, we formed a survey grounded on a proper understanding of the obstacles and barriers to open access for research data.

Getting the researchers’ views also helped to define the questions and their response alternatives. As such, we conducted explorative and in-depth interviews and participated in a workshop on data management organized by the Research Council of Norway. With the information provided, we were able to confirm the hypotheses and gain a better understanding of the status of the archiving of and open access to publicly funded research data in Norway.

An initial draft of the survey was developed in Enalyzer Survey Solution. We tested the draft extensively, both internally and in collaboration with the Research Council of Norway. These tests helped to ensure that the survey addressed the central hypotheses. Further, it was important that the questions asked should be unambiguous and easy for the respondent to understand. Finally, it was of particular importance that the survey should draw a clear distinction between what information was needed and what information would be useful to have. Thus, we did not want a survey that was too long or contained irrelevant information.
TABLE 5.1
Population, invites, response rates and the degree of representativeness

                                      Population   Invites   Response rate   Degree of representativeness
Universities and university colleges
  Humanities                               2,360       876           22.9%                         114.9%
  Agriculture and fishery                    699       576           24.1%                          93.3%
  Mathematics and natural science          1,599       599           31.1%                         110.1%
  Medical science                          3,779       716           28.2%                         112.2%
  Social science                           2,488       746           28.6%                         121.0%
  Technology                               1,767       557           28.7%                          93.6%
Health trusts
  Medical science                          1,867       501           28.9%                          84.3%
Research institutes
  Humanities                                 101        83           38.6%                          50.0%
  Agriculture and fishery                  1,334       438           41.8%                         110.2%
  Mathematics and natural science            555       407           33.9%                          97.9%
  Medical science                            588       411           37.2%                         107.0%
  Social science                             564       386           41.2%                         112.0%
  Technology                               1,162       486           34.4%                         102.5%
Total                                     18,863     6,782           30.6%

Source: DAMVAD
Note: The degree of representativeness indicates how close the survey is to being representative of each sub-population, allowing for a 6 percent margin of error at a 90 percent confidence level. This means that, within a 6 percent margin, one can be 90 percent confident that the sample is representative of the population.
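The report does not state how the degree of representativeness was calculated. As an illustration only, the sketch below assumes the standard sample-size formula for a proportion with a finite population correction (z = 1.645 for 90 percent confidence, a 6 percent margin of error, p = 0.5) and divides the number of responses implied by Table 5.1 (invites times response rate) by that required sample size. The function names and the choice of formula are our assumptions. For the humanities row under research institutes, this gives a value close to, but not exactly equal to, the 50.0 percent in the table.

```python
def required_sample_size(population, margin=0.06, z=1.645, p=0.5):
    """Sample size needed for a given margin of error (assumed formula).

    Uses the usual formula for a proportion, then applies a
    finite population correction for small populations.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # infinite-population size
    return n0 / (1 + (n0 - 1) / population)     # finite population correction


def degree_of_representativeness(population, invites, response_rate):
    """Responses implied by the table divided by the required sample size."""
    responses = invites * response_rate
    return responses / required_sample_size(population)


# Research institutes, humanities (values taken from Table 5.1):
# population 101, 83 invites, 38.6 percent response rate.
ratio = degree_of_representativeness(101, 83, 0.386)
print(f"Estimated degree of representativeness: {ratio:.1%}")
```

A ratio above 100 percent would mean that more responses were received than this assumed formula requires; a ratio below 100 percent would mean the sub-sample falls short.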
After developing the survey internally, DAMVAD invited 68 randomly drawn researchers from CRIStin to test the survey electronically using Enalyzer Survey Solution. Ten researchers completed the pilot, which provided us with good feedback.

After adjusting the pilot survey, the final survey was launched by email through Enalyzer Survey Solution on the 18th of December 2013 (week 51).

5.4 Response rate

DAMVAD invited 9,262 researchers to participate in the survey, of which 2,480 email addresses were no longer working. This left us with 6,782 active respondents. 1,474 researchers completed the survey while 604 actively chose not to participate.

The response rate for the population as a whole was 30.6 percent, while it varied between 23 percent and 42 percent within different sub-populations. Table 5.1 includes a complete overview of the number of invites, the response rates and the population size of our sample.

5.5 A significant proportion of the researchers actively chose not to participate

Approximately 600 researchers actively chose not to participate in the survey. Health trusts saw the highest share of researchers not willing to participate, as shown in Figure 5.1.

The share of respondents that did not want to participate in the survey was higher than what we have experienced in other surveys. There are variations across sectors and research disciplines. 16.6 percent of those working in health trusts did not wish to participate in the survey. Likewise, 14.6 percent working in the research institute sector in agriculture and fishery did not wish to participate.

Researchers at universities were keener on participating. An average of six percent did not wish to participate, with the lowest share being in the research disciplines of mathematics and the natural sciences, where five percent did not wish to participate.
One of the main objectives of the survey was to ensure representation in all the relevant sub-populations. The representative number of completed surveys varied according to the size of the total population.
As the population size increases, the number of
completed surveys needed for a representative
sample as a percentage of the population will fall.
That is, for small populations, a large portion of the
actual population needs to complete the survey in
order to generate a representative sample. The degree of representation is smaller for medical science
performed at health trusts (84.2 percent) in comparison to medical science performed at research institutes (107 percent).
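The relationship described in section 5.4 between population size and the number of completed surveys needed can be illustrated with a short calculation. The report does not state which formula it used; the sketch below assumes Cochran's sample-size formula with a finite population correction, a z-value of 1.645 for 90 percent confidence, and maximum variability (p = 0.5). The function name and these parameter choices are illustrative assumptions, so the results need not match the targets behind Table 5.1 exactly.

```python
import math


def required_sample_size(population, margin=0.06, z=1.645, p=0.5):
    """Completed surveys needed for a given margin of error.

    Assumption: Cochran's formula with finite population correction.
    The report does not say which formula it applied.
    """
    # Sample size for an infinitely large population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Correct for the finite size of the sub-population.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


# Population sizes taken from Table 5.1.
for population in (101, 1867, 3779):
    needed = required_sample_size(population)
    print(population, needed, f"{needed / population:.1%}")
```

Under these assumptions, a sub-population of 101 researchers needs completed surveys from roughly two thirds of its members, while a sub-population of 1,867 needs fewer than one in ten, which is the pattern the text describes.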
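FIGURE 5.1
A significant share of researchers actively chose not to participate.
[Bar charts: share of invited researchers who actively chose not to participate. By sector: health trusts 17%, research institutes 12%, universities and university colleges 6%. By discipline within sector: health trusts 16.6%; research institutes 14.6%, 14.4%, 13.3%, 11.3%, 9.8% and 9.3%; universities and university colleges 8.3%, 7.6%, 5.9%, 5.9%, 5.2% and 5.0%. The discipline labels for the individual bars are not recoverable from the transcription.]
Source: DAMVAD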
6
Descriptive statistics
This chapter presents the characteristics of the survey respondents. Information about the researchers that have completed the survey is useful both to assess the robustness of findings and to illustrate the complexity of the researcher population.

A sufficient number of respondents within each category is important for later discussions and comparisons of findings and differences between sectors, research disciplines and scientific experience.

Table 6.1 shows the distribution of the respondents across sectors. Research institutes and universities together cover 85 percent of the respondents. Eleven percent are in hospital trusts, while the last four percent comprise others, covering inter alia companies.

TABLE 6.1
Participation across sector (“At what type of institution is your main occupation?”)

Sector                               Freq.    Pct.
Research institute                     649   44.0%
University and university college      607   41.2%
Hospital trust                         158   10.7%
Other                                   60    4.1%

Source: DAMVAD
Note: Other covers private organisations, non-profit and foundations

The distribution between research institutes and universities is relatively even, which allows for comparisons between the two respondent groups. The number of respondents from hospital trusts is lower than for the two other sectors.

One concern is the number of respondents within hospital trusts. From Table 5.1 we saw that we only reached an 84.3 percent representative level. Though 84.3 percent is relatively high, it still does not qualify as fully statistically representative. With 158 observations, we still find that we can use the category when comparing with other sectors. Nevertheless, we will keep in mind the limitations of this category.

Table 6.2 shows the gender distribution of the respondents. Male respondents are over-represented, at about 60 percent, yet the respondents still form a representative sample of both genders.

TABLE 6.2
Participation across gender (“What is your gender?”)

Gender    Freq.    Pct.
Female      602   40.8%
Male        872   59.2%
Total     1,474    100%

Source: DAMVAD

Although the survey allows for analysis based on gender, gender is not used extensively to compare the results. This dimension is interesting, but less relevant to specific policies and strategies going forward, where most efforts will need to cut across gender.

Table 6.3 shows the distribution of respondents across research disciplines. Social sciences and health sciences are the disciplines with the highest number of responses, whereas humanities have the fewest.

TABLE 6.3
Participation across research discipline (“Which is your primary research discipline?”)

Field of research          Freq.    Pct.
Social science               323   21.9%
Health science               319   21.6%
Mathematics and science      271   18.4%
Technology                   178   12.1%
Farming and fishery          159   10.8%
Humanities                   122    8.3%
Other                        102    6.9%
Total                      1,474    100%

Source: DAMVAD
Note: “Other” typically covers multi-disciplinary research

In total, we estimate that the different categories are well represented, enabling robust analysis and comparisons across different research disciplines.

Finally, Table 6.4 shows the distribution of respondents by scientific experience. We measure scientific experience in terms of the number of years the respondents have been conducting research (i.e., the number of years since and including their PhD).

The respondents are well distributed on this dimension as well. One fourth have conducted research for 11 to 20 years, and almost the same share have conducted research for more than 20 years.

TABLE 6.4
Participation across scientific experience (“For how many years has research constituted a major part of your work (including PhD or similar)?”)

Scientific experience    Freq.    Pct.
Less than 3 years          123    9.5%
3 - 6 years                298   23.1%
7 - 10 years               233   18.1%
11 - 20 years              330   25.6%
More than 20 years         306   23.7%
Total                    1,290    100%

Source: DAMVAD
7
Researchers use data generated by other researchers
This section includes the findings related to how researchers generate and use data. This constitutes an important context for understanding the possibilities and limitations for archiving, but especially for sharing.

We have applied a definition of research data in line with the OECD guidelines, which makes the distinction between numerical records, textual records, sounds, images, videos and graphics.

The questions about data formats are included in the survey for two reasons. In particular, they offer an interesting perspective as to what kind of research data are most commonly used. Further, they also allow for the investigation of whether researchers’ views on the sharing and archiving of data differ across data formats.

7.1
Data formats vary across research disciplines
Most respondents generate numerical data (e.g., quantitative data, data models, data series, statistics, etc.). Health trusts in particular use numerical data in their research: about three-quarters of the respondents there mainly generate numerical data (Figure 7.1). Of all the respondents, almost 60 percent stated that they mainly generate numerical data. This is especially true for agriculture and fishing, as well as in mathematics, the natural sciences and medicine. A total of 23 percent generate textual records and 5 percent most frequently generate images, sounds, videos and so forth.

In humanities, numerical data is rare. Researchers in humanities typically base their research on textual records (qualitative data, field reports, interviews, social studies, etc.), images, sound and the like, or else they do not generate data at all. Only 14 percent of the respondents within the humanities stated that they mainly generate numerical data. This is not shown in the figure.

Compared with health trusts, only half of the respondents working in universities generated numerical scores. Researchers at universities use textual records more often: 28 percent stated that they mainly generated textual records. This is shown in Figure 7.1.

Textual records are more common within social sciences and humanities. Almost 50 percent of the responding researchers in these fields stated that they mainly generate textual records, in contrast to mathematics and natural sciences where very few (7 percent) primarily generated textual records.

TABLE 7.1
Type of data generated (“What is the main format of your research data?”)

                                         Freq.    Pct.
Numerical scores                           865   58.7%
Textual records                            337   22.9%
Images, sounds, videos and graphics         72    4.9%
I do not generate any research data         62    4.2%
Other                                      138    9.4%
Total                                    1,474    100%

Source: DAMVAD

Some researchers report that they do not generate data at all. This is true for 6 percent of the respondents at universities, 3 percent at research institutes and 2 percent at health trusts. There are differences between research disciplines as to who does not generate data. For example, no data is reported by 1 percent within agriculture and fishing, but 7 percent within technology.

The distribution of the respondents also shows that approximately 10 percent within each group have answered ‘other’. Going through the survey, this most often implies that they generate both numerical and textual data.

We found little evidence of differences in terms of the type of data generated by experience, which means that we will not present or comment upon the types of data generated by researchers with different levels of experience.

7.2
Numerical data are easier to restore
As illustrated in Figure 7.2, numerical data are easier to restore. Fifteen percent answered that the numerical data could be restored very easily, and almost 50 percent stated that they could restore their numerical data with the same effort as they used when producing the data.

Textual records are the type of data that is hardest to restore. Almost 50 percent answered that textual records are either impossible or at least difficult to restore.
FIGURE 7.1
Data format, by institution (“What is the main format of the research data you generate?”)
[Bar chart. Categories: numerical scores; textual records; images, sounds, videos and graphics; do not generate data. Series: universities and university colleges, research institutes, health trusts (hospitals). Numerical scores: 50%, 63% and 76%; textual records: 28%, 22% and 8%; do not generate data: 6%, 3% and 2%, respectively.]
Source: DAMVAD
FIGURE 7.2
Numerical data is easily restored (“If your research data gets lost, how easily can you recreate it?”)
[Bar chart. Categories: numerical data; textual records; images, sounds, videos and graphics. Series: not possible, hardly, with same effort, very easily, I don’t know. The individual bar values cannot be assigned reliably from the transcription.]
Source: DAMVAD
7.3
Researchers frequently use other researchers’ data
The survey also asked researchers about the extent to which they use other researchers’ data in their work, and the extent to which they share their own data with other researchers.

Many researchers have utilized research data of other researchers. Almost two thirds of the responding researchers had utilized research data provided by other researchers within the past three years.

TABLE 7.2
Use of other researchers’ data (“Have you within the last three years used research data gathered by other researchers?”)

         Freq.    Pct.
No         508   36.0%
Yes        904   64.0%
Total    1,412    100%

Source: DAMVAD
Across affiliations, most researchers use data gathered by other researchers. Figure 7.3 shows that 68 percent of the respondents within research institutes have used research data gathered by other researchers within the last three years. This is a little less common within health trusts. Yet 52 percent within health trusts have used data gathered by other researchers within the last three years.

FIGURE 7.3
Researchers use other researchers’ data, by sector (“Have you within the last three years used research data gathered by other researchers?”)
[Bar chart: share answering “yes”. Universities and university colleges 62%, research institutes 68%, health trusts (hospitals) 52%.]
Source: DAMVAD
Note: The figure only includes those that have answered “yes” to the question: “Have you within the last three years used research data gathered by other researchers?”

Differences are more important across research disciplines. The use of other researchers’ data seems more commonplace within mathematics and natural sciences and less so within humanities and medical science. This corresponds to our hypothesis and international studies across disciplines (Tenopir, 2011).

Specifically, 50 percent of the respondents within humanities and 44 percent in medical science report not having used research data gathered by other researchers within the last three years. In comparison, the share is 24 percent within mathematics and the natural sciences, and 32 percent within agriculture and fishery.

FIGURE 7.4
Researchers’ use of other researchers’ data, across disciplines (“Have you within the last three years used research data gathered by other researchers?”)
[Bar chart: shares answering “yes” and “no” by research discipline (humanities, agriculture and fishery, mathematics and natural science, medical science, social science, technology). Share answering “no”: humanities 50%, medical science 44%, agriculture and fishery 32%, mathematics and natural science 24%.]
Source: DAMVAD
7.4
Researchers mainly use data produced by other researchers from the same institution
Researchers do not travel far in their search for research data. Two thirds of the researchers used research data produced by other researchers from the same institution.

However, many respondents also utilize data gathered by researchers from international institutions. Among these respondents, 56 percent stated that they used data from other researchers at international institutions.

TABLE 7.3
Researchers use data produced by other researchers at their institute (“Whose research data have you used the most within the past 3 years?” Multiple answers allowed)

                                                                     Freq.    Pct.
Research data from other researchers at my institution                605   67.4%
Research data from other researchers at national institutions         435   48.4%
Research data from other researchers at international institutions    503   56.0%
Other                                                                   25    2.8%
Total                                                                1,568    175%

Source: DAMVAD

7.5
Researchers would like even better access to other researchers’ data
As illustrated in Table 7.2, 36 percent of researchers have not used data gathered by other researchers. Table 7.4 contrasts this finding with their reported interest in using data.

Almost three-quarters (71.5 percent) of these respondents would like to make use of other researchers’ data. In other words, there is a substantial unmet demand for research data generated by other researchers. As few as 144 respondents (10 percent of the total respondent group) have not used research data generated by other researchers for the past three years, and do not wish to use data generated by others.

Nine out of ten respondents either want, or are already using, research data gathered by other researchers.

TABLE 7.4
Researchers that have not used other researchers’ data, but would like to do so (“If «no» to the above question: Would you like to make use of research data gathered by other researchers or institutions?”)

         Freq.    Pct.
No         144   28.5%
Yes        362   71.5%
Total      506    100%

Source: DAMVAD
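The “nine out of ten” statement in section 7.5 combines the frequencies in Tables 7.2 and 7.4. A minimal sketch of that arithmetic follows; the variable names are illustrative, and using the 1,412 respondents of Table 7.2 as the base is an assumption about how the report derived the share.

```python
# Frequencies from Table 7.2 (used other researchers' data in the last three years).
used_yes = 904
used_no = 508
answered = used_yes + used_no  # 1,412 respondents

# Frequencies from Table 7.4 (asked of those who answered "no" above;
# 506 of the 508 answered this follow-up question).
want_yes = 362
want_no = 144

# Assumption: shares are taken of everyone who answered the first question.
share_using_or_wanting = (used_yes + want_yes) / answered
share_neither = want_no / answered

print(f"Using or wanting others' data: {share_using_or_wanting:.1%}")
print(f"Neither using nor wanting:     {share_neither:.1%}")
```

The two shares do not add up to exactly 100 percent because two respondents who answered “no” in Table 7.2 did not answer the follow-up question.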
8
Research data is rarely archived in data centres
Data archiving refers to the long-term storage of scientific data and methods, that is, data which are archived at the end of a research project or else after a scientific publication or report has been prepared.

Archiving of data is an important prerequisite for validation of research findings. Infrastructure for archiving data can also be an important enabling factor for sharing data.

This chapter presents the findings related to researchers’ practice concerning archiving of research data.

8.1
Most data is archived on portable storage units or institutional servers
Various systems for data archiving exist. One can easily imagine that researchers use a variety of data archiving solutions. Sometimes data are archived at the institutional server, other times at a national data archive centre.

When asked about the most common way of archiving data, the vast majority of respondents reported that their research data is stored locally, either on researchers’ own personal computers, USB or CD/DVD/floppy disks, or on local servers at their institutes. More than 80 percent archive data locally (Table 8.1).

One out of ten stored their data at central data archive centres, either at their organizations or at national centres. Finally, less than two percent used archive solutions outside of Norway.

These findings are both surprising and cause for concern. The major concern relates to data security. If sensitive data is stored on CD/DVDs or personal computers, they are vulnerable to Internet-based intrusions. Institutional servers are better at keeping intruders out, but they are still not as good, or as secure, as more professional data archive centres (either local or national) which specialize in taking care of sensitive data.

TABLE 8.1
Data archiving (“What is the most common way of archiving your research data after results are ready or beyond the life of a project?”)

Where do you mainly store the data you generate?                Freq.    Pct.
Portable storage unit                                             235   18.2%
Institutional server                                              850   65.6%
Data is submitted to digital archive centre in my organisation    108    8.3%
Data is submitted to a national digital archive centre             23    1.8%
Data is submitted to an international digital archive centre       22    1.7%
Do not archive                                                     34    2.6%
Other                                                              23    1.8%
Total                                                           1,295    100%

Source: DAMVAD
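The shares quoted in section 8.1 can be recomputed from the frequencies in Table 8.1. The sketch below assumes that “locally” means portable storage units plus institutional servers, and that “central data archive centres” means organisational plus national centres; these groupings, like the dictionary keys, are an interpretation of the text rather than something the report defines.

```python
# Frequencies from Table 8.1.
archiving = {
    "portable": 235,
    "institutional_server": 850,
    "organisational_centre": 108,
    "national_centre": 23,
    "international_centre": 22,
    "do_not_archive": 34,
    "other": 23,
}
total = sum(archiving.values())  # 1,295 respondents

# Assumed groupings (see the lead-in above).
local = (archiving["portable"] + archiving["institutional_server"]) / total
central = (archiving["organisational_centre"] + archiving["national_centre"]) / total
international = archiving["international_centre"] / total

print(f"Stored locally:          {local:.1%}")
print(f"Central archive centres: {central:.1%}")
print(f"Outside Norway:          {international:.1%}")
```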
Further, there is an issue concerning the restoration of data if they are lost. If a USB is lost or if a personal computer crashes, it can be rather difficult to restore the data, and months of work can be lost.

There are some differences between types of affiliations. Storing locally on portable storage units is more common at the universities compared to the institute sector. Researchers at research institutes more often use institutional servers to store their data.

But in general, Figure 8.1 confirms that 85 percent of respondents mainly store their data on a portable storage unit or at the institutional server. The figure is 82 percent for universities and university colleges, 84 percent for research institutes and 88 percent for hospital trusts.

That in turn leaves a rather limited share of respondents that archive their data at archiving centres, either nationally or internationally. At universities and university colleges, 3 percent mainly store their data at national archiving centres. The same figure for research institutes is 1 percent, whereas 2 percent within health trusts mainly store their data at national archiving centres.

FIGURE 8.1
Data archiving across institution (“What is the most common way of archiving your research data after results are ready or beyond the life of a project?”)
[Bar chart. Categories: portable storage unit; institutional server; organizational digital archive/data center; national digital archive/data center; international digital archive/data center; do not archive data. Series: universities and university colleges, research institutes, health trusts (hospitals). Portable storage unit: 28%, 10% and 17%; institutional server: 54%, 74% and 71%; national digital archive/data center: 3%, 1% and 2%, respectively.]
Source: DAMVAD
These differences in how researchers store data on local portable units or institutional servers are largely explained by differences between research disciplines. Humanities, where portable storage is more common, are strongly represented in the university sector, while agriculture and fishery are strongly represented in the research institutes and report a higher share of centralized storage.

Within humanities, 34 percent store their data on a portable storage unit. In agriculture and fishing, 81 percent of researchers store their data on institutional servers. This is illustrated in Figure 8.2.

FIGURE 8.2
Data archiving by research discipline (“What is the most common way of archiving your research data after results are ready or beyond the life of a project?”)
[Bar chart. Categories: portable storage unit; institutional server; organizational digital archive/data center; national digital archive/data center; international digital archive/data center; do not archive data. Series: humanities, agriculture and fishery, mathematics and natural science, medical science, social science, technology. Apart from the values quoted in the text, the individual bar values cannot be assigned reliably from the transcription.]
Source: DAMVAD

8.2
Storage reflects costs of recreation
The implications of losing data are particularly significant for data that would be costly or impossible to regenerate.

Data that can be regenerated with the same effort as its initial creation is more commonly stored on portable storage units or institutional servers than by other storage methods. Only 9 percent of data stored on a portable unit can be restored very easily, whereas 27 percent of data stored at international archives and data centres can easily be restored.

On the other hand, we see that 29 percent of the respondents who do not archive their data do not have the opportunity to restore data. That is only the case for 9 percent of those storing their data at national archiving centres.

In total, we see that those researchers that do not archive their data also find it harder or even impossible to restore data. Almost 60 percent of those not archiving their data cannot restore their data without extensive efforts. This should be compared to those archiving at national or international archiving centres, where 50 percent or more can restore their data with the same effort or even less effort.
FIGURE 8.3
Data archiving by data regeneration (“What is the most common way of archiving your research data after results are ready or beyond the life of a project?”)
[Bar chart. Categories: not possible, hardly, with same effort, very easily. Series: portable storage unit; institutional server; organizational digital archive/data center; national digital archive/data center; international digital archive/data center; do not archive data. The individual bar values cannot be assigned reliably from the transcription.]
Source: DAMVAD
8.3
Most researchers are satisfied with their current archiving solution
Most respondents seem to be satisfied with their current solution for the archiving of data. Two thirds stated that they were satisfied with their current solutions, while 15 percent did not know.

TABLE 8.2
Satisfaction with current archiving solutions (“Are you satisfied with your archiving solution?”)

                Freq.    Pct.
Yes               841   66.7%
No                235   18.6%
I don’t know      185   14.7%
Total           1,261    100%

Source: DAMVAD

8.4
Those who are not satisfied point to security risks
One-fifth report that they are not satisfied. Of these, half point to a lack of security as a problem. Others pointed out that archiving is too complicated, that there is not enough capacity, and even that there are too many possible solutions.

TABLE 8.3
Satisfaction with current archiving solutions (“If “not”, why are you not satisfied with your archiving solution?” Multiple answers allowed)

                                 Freq.    Pct.
Too complicated to use              41   17.5%
Too expensive                        3    1.3%
Too little capacity                 51   21.8%
Too many archiving solutions        64   27.4%
Not secure enough                  115   49.1%
Other                               64   27.4%
Total                              338    144%

Source: DAMVAD

These are all important barriers to the active use of archiving solutions as a part of sharing data. Many researchers deal with sensitive information, and hence, security is essential. The presence of too many archiving solutions means that it can be difficult for researchers to know where to archive their data and that it can be time-consuming for researchers to archive their data. Further, it is noticeable that some respondents find such systems too complicated to use, which again means that researchers have to spend valuable time archiving their data.

Interestingly, more than 25 percent answered ‘other’ to the question about satisfaction. When answering ‘other’, the respondents were able to add comments describing what they meant by this.

The frequently used arguments are categorized and included in Table 8.5.
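Because Table 8.3 allows multiple answers, its percentages are calculated per respondent rather than per answer, which is why they add up to more than 100 percent. The sketch below reproduces the column under the assumption that 234 dissatisfied respondents answered the question. That base is not stated in the report; it is inferred here because it reproduces the published percentages, and it should be read as an illustration only.

```python
# Frequencies from Table 8.3 (multiple answers allowed).
reasons = {
    "Too complicated to use": 41,
    "Too expensive": 3,
    "Too little capacity": 51,
    "Too many archiving solutions": 64,
    "Not secure enough": 115,
    "Other": 64,
}

# Assumption: number of respondents who answered the question.
# Inferred from the published percentages, not stated in the report.
respondents = 234

# Each percentage uses the number of respondents as its base.
percentages = {
    reason: round(100 * count / respondents, 1)
    for reason, count in reasons.items()
}
total_answers = sum(reasons.values())
total_pct = round(100 * total_answers / respondents)

for reason, pct in percentages.items():
    print(f"{reason}: {pct}%")
print(f"Total: {total_answers} answers, {total_pct}%")
```

The same reading applies to Table 7.3, where multiple answers were also allowed and the total is 175 percent.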
8.5
Archiving activities are financed as a part of project- and institutional funding
Archiving activities are financed mainly as a part of institutional funding. Further, many such activities are financed on a project-by-project basis.

Nearly 11 percent respond “other” to this question. Many of these argue that archiving activities are not funded or that they do not know how they are funded. Some even say that they have paid for archiving solutions themselves.

TABLE 8.4
How archiving activities are financed (“How are your archiving activities financed?”)

                                                        Freq.     Pct.
Part of research projects                                 435    34.4%
Part of funding for research-based operative tasks         70     5.5%
Part of institutional funding                             626    49.5%
Other                                                     134    10.6%
Total                                                   1,265   100.0%

Source: DAMVAD

TABLE 8.5
Frequently used arguments in the open answers on why researchers are not satisfied with their existing archiving solution

Argument posed

The archiving solution does not enable sharing data with others
  Not easily accessible by other researchers

Too complicated and time consuming to use archiving systems, quoting the respondents
  Too time consuming to do all the back-up solutions are non-standard ad-hoc
  No common procedure for archiving makes it difficult and often not properly done.
  Lack of routines about how to store raw data.

Lack of security and stability of the archiving systems
  Damage to hard drives pose a risk
  The back-up regime is not reliable
  Data has been lost due to change in storing technology, e.g. magnetic tapes were discarded without transfer of content to a new media.
  We do not trust the back up and use our own external hard disk
  Data can be lost at system upgrades etc.

Source: DAMVAD
9
Most researchers share research data
Research is cumulative in the sense that research often builds on previous research. Similarly, researchers often use other researchers’ research data, in line with the principles of the OECD guidelines.

9.1
Researchers are positive to the principle of open access
Researchers clearly see the benefits of the sharing and archiving of research data. About 80 percent of the respondents agree that open access to research data enhances research.

In addition, 79 percent agree that it is an ethical obligation of research to make research data available for validation. These are the two reasons for open access to research data that most researchers agree to. Only 6.5 percent agreed that open access to research data would lead to less interesting research.

Further, 77 percent agree that open access to research data facilitates the education of students and new researchers, and 74 percent agree that open access to research data stimulates research collaboration.

Below is a comment underpinning a positive attitude towards sharing data:

“As a matter of principle, generated data of a certain magnitude (small-scale surveys exempted) on publicly funded projects should be shared with the rest of the research community. A data set can in most cases be used and analysed for diverse purposes. In my view this is a matter of research ethics and should be included in the guidelines of the National Committee for Research Ethics in the Social Sciences and the Humanities (NESH).”

9.2
Many researchers are left undecided
Although most researchers agree on the benefits of sharing data, many are undecided as to whether publicly funded research data should be open or whether it should be considered public property. Table 9.1 illustrates this.

Of the 20 percent who do not agree either that open access to research data will enhance research or that it is an ethical obligation of research to make research data available for validation, between 15 and 16 percent are undecided and 4-5 percent disagree.

Similarly, 53 percent agree that publicly funded research data should be public property, 31 percent are undecided and 15 percent disagree. In both cases, the share of undecided is higher than the share of disagreeing.

The relatively high number of researchers who did not want to participate in the survey can also be seen as an indication of the complexity of the issue.

The views of both those who support and those who disagree with the overall principle of open access to research data are elaborated below:

“I don't see any challenges. Free access to everything.”

“Researchers should not hoard their data, especially if publicly funded. After publishing their work - the data should ideally be available to others for robustness testing, replication, and the exploration of new hypotheses.”

“Researchers will often dislike open access because they are afraid of having made mistakes that might be revealed or having missed important patterns in the data that others might get good publications from. More important however, is that science is a social process where checking and challenging each other is what moves us forward - even if the process itself can be painful for the participants.”

“I see this as a hyped up issue. As long as interpretations are needed to make sense of the data, there is no way those data are useful for others unless the original researchers are also part of a new study involving the data.”

9.3
Health trusts are positive towards the effects of sharing data on research
There are only small differences across sectors, research disciplines and professional experience.

When looking across sectors, researchers at health trusts are a bit more positive towards the effects of sharing data.

Figure 9.1 illustrates this. The figure shows respondents’ positions in relation to the question on whether open access to research data would enhance research.
TABLE 9.1
Attitudes towards open access to research data (“Please indicate if you agree to the following statements related to open access to research data:”)
                                              Agree           Undecided       Disagree
Statement                                     Freq.   Pct.    Freq.   Pct.    Freq.   Pct.
Open access to research data will enhance     1 098   80.2%   209     15.3%   62      4.5%
research
Open access to research data will stimulate   1 012   73.9%   264     19.3%   93      6.8%
more research collaborations
Open access to research data will make        89      6.5%    207     15.1%   1 073   78.4%
research less interesting
Open access to research data will facilitate  1 050   76.7%   255     18.6%   64      4.7%
education of students and new researchers
Publicly funded research data should not be   213     15.6%   428     31.3%   728     53.2%
public property
Lack of open access to research data has      290     21.2%   482     35.2%   597     43.6%
restricted my ability to answer scientific
questions
It is a research-ethical obligation to make   1 084   79.2%   217     15.9%   68      5.0%
data available
Source: DAMVAD
Note: We have collapsed the positive statements in the survey “I strongly agree” and “I agree” and called it “Agree” in the table. Likewise we have collapsed “I strongly disagree” and “I disagree” and called it “Disagree” in the table.
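The collapsing described in the note to Table 9.1 can be sketched in a few lines of Python. This is a minimal illustration and not necessarily the procedure used for the report: the function names `collapse` and `shares` are hypothetical, the wording of the middle answer option is assumed from the table header, and the five-level counts are toy numbers rather than survey data. Only the last call uses figures from the first row of Table 9.1.

```python
# Mapping from the five answer options to the three categories reported in Table 9.1.
# The wording of the middle option ("Undecided") is an assumption based on the table header.
COLLAPSE = {
    "I strongly agree": "Agree",
    "I agree": "Agree",
    "Undecided": "Undecided",
    "I disagree": "Disagree",
    "I strongly disagree": "Disagree",
}


def collapse(counts):
    """Sum five-level answer counts into Agree / Undecided / Disagree."""
    collapsed = {"Agree": 0, "Undecided": 0, "Disagree": 0}
    for answer, n in counts.items():
        collapsed[COLLAPSE[answer]] += n
    return collapsed


def shares(counts):
    """Return each category's share of all answers, in percent with one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Toy counts for illustration only (not survey data).
toy = {
    "I strongly agree": 2,
    "I agree": 3,
    "Undecided": 1,
    "I disagree": 1,
    "I strongly disagree": 1,
}
toy_collapsed = collapse(toy)

# Frequencies from the first row of Table 9.1.
row_1 = shares({"Agree": 1098, "Undecided": 209, "Disagree": 62})
```

Applied to the frequencies in the first row of Table 9.1, `shares` gives the percentages shown in that row, which suggests that the percentages are calculated on the respondents who answered the statement.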
FIGURE 9.1
Attitude towards open access to research data (“Open access to research data will enhance research”)
[Bar chart: share of respondents answering “Agree”, by sector (Universities and university colleges; Research institutes; Hospital trusts (hospitals)).]
Source: DAMVAD
Note: We have collapsed the positive statements in the survey “I strongly agree” and “I agree” and called it “Agree”.
When looking across disciplines, respondents in agriculture and fishing are less positive and respondents within humanities are most positive. As illustrated in figure 9.2, only 2 percent within humanities disagreed with the statement that “open access to research data will stimulate more research collaborations”, while the number is 14 percent within agriculture and fishing.
Respondents from agriculture and fishing, alongside those within social sciences, were also the most undecided. As many as 27 percent within social sciences and 24 percent within agriculture and fishing declared themselves undecided as to the same statement.
FIGURE 9.2
Attitude towards open access to research data (“Open access to research data will stimulate more research collaborations”)
[Bar chart: share of respondents answering “Agree”, “Undecided” and “Disagree”, by research discipline (Humanities; Agriculture and fishery; Mathematics and natural science; Medical science; Social science; Technology).]
Source: DAMVAD
Note: We have collapsed the positive statements in the survey “I strongly agree” and “I agree” and called it “Agree” in the figure. Likewise we have collapsed “I strongly disagree” and “I disagree” and called it “Disagree” in the figure.
Differences across professional experience are negligible and thus not reported here.
9.4 Researchers share their research data, but upon request
Many researchers use research data generated by others. Even more researchers support the idea of sharing. Logically, one would expect that many researchers also share research data with other researchers.
The survey confirms that most researchers share data with other researchers. Only 16 percent of the respondents stated that most of their research data is not available to other researchers.
Further, 16 percent of the respondents state that most of their research data is available to everyone, while 12 percent state that it is only available to other researchers.
About half of the respondents state that their data is available to other researchers, but only upon request or under certain conditions. Researchers typically prefer to keep track of who is accessing their data and for what purpose. Consequently, each researcher becomes a gatekeeper for her own data.
TABLE 9.2
Data availability (“Which of the following applies to the accessibility of most of your research data?”)
                                                                        Freq.   Pct.
Data is available to all                                                195     15.7%
Data is available to other researchers                                  148     11.9%
Available for other researchers, but only upon request                  472     37.9%
For other researchers, but under a license or non-disclosure agreement  73      5.9%
Could be made available with appropriate changes                        122     9.8%
Data is not available                                                   206     16.5%
Other                                                                   29      2.3%
Total                                                                   1 245   100%
Source: DAMVAD
There are many reasons for being more restrictive in practice about one’s own data than in principle. One is to ensure data is understood and used correctly. One researcher commented that:
“Generally, there is no big impediment against sharing my research data. I feel, however, that in most cases it is best done on a case-by-case basis upon a personal request because this allows me to give adequate explanation of the data and will best ensure that my contribution is adequately acknowledged.”
9.5 More openness within humanities
There seems to be more openness towards data sharing within humanities compared to other research disciplines.
In medical sciences and social sciences, 12 percent report that they generate data that is readily available to others. The corresponding share within humanities is one third (see figure 9.3). Indeed, one might argue that the research data and sharing possibilities are fundamentally different between the medical sciences and the humanities.
For example, and as illustrated in figure 7.1, 76 percent of data generated within the medical sciences is numerical data and 12 percent textual scores. This is in contrast to humanities, where 14 percent is numerical data and 48 percent consists of textual scores.
The comparison is perhaps more interesting between social sciences and humanities.
The two have an equal share of respondents generating textual scores, though they are somewhat different when it comes to numerical data. These two disciplines differ in their approaches to the sharing of data. Researchers in humanities are somewhat more inclined to unconditionally share data than their colleagues in social sciences.
Data that is not available to others is more common within health and social sciences, at 24 and 25 percent respectively. Only 6 percent within mathematics stated that most of their research data is not available. For agriculture and fishing, humanities and technology, the share is between 12 percent and 15 percent.
FIGURE 9.3
Data availability across research discipline (“Which of the following applies to the accessibility of most of your research data?”)
[Bar chart: share of respondents per availability category (Data is available; Data is available for other researchers; Data is available on demand; Data is not available), by research discipline (Humanities; Agriculture and fishery; Mathematics and natural science; Medical science; Social science; Technology).]
Source: DAMVAD
Note: The statement “Data is available on demand” consists of the following possible answers: “Available for other researchers, but only upon request”, “For other researchers, but under a license or non-disclosure agreement”, “Could be made available with appropriate changes”
9.6 More openness among more experienced researchers
Researchers with more experience appear more confident in sharing data. Among the respondents with more than 20 years of research experience, 19 percent report that their data is free to use. In comparison, 11 percent of the less experienced researchers make their data freely available.
FIGURE 9.4
Data availability across experience (“Which of the following applies to the accessibility of most of your research data?”)
[Bar chart: share of respondents per availability category (Data is available; Data is available for other researchers; Data is available on demand; Data is not available), by research experience (Less than 3 years; 3 - 6 years; 7 - 10 years; 11 - 20 years; More than 20 years).]
Source: DAMVAD
Note: The statement “Data is available on demand” consists of the following possible answers: “Available for other researchers, but only upon request”, “For other researchers, but under a license or non-disclosure agreement”, “Could be made available with appropriate changes”
10 Lack of time, infrastructure and incentives hamper further sharing of data
Although researchers already share data with other researchers, the study indicates that there is potential for more sharing.
Former studies and interviews point to a number of barriers for further sharing of research data. A central objective of the study is to identify the main barriers for more sharing of research data in Norway.
This chapter presents the main findings on barriers and obstacles for sharing of research data in Norway and possible ways to reduce these barriers.
10.1 Variety of barriers
As expected, the survey documents that there is a variety of obstacles to sharing research data. Some research data cannot be shared due to issues of privacy, commercial issues or shared ownership. These aspects are, however, not the most important barriers.
The time involved is a main barrier to sharing data. Almost one-third of the respondents pointed out that
TABLE 10.1
Main barriers towards sharing research data (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers). Ordered by frequency.
                                                                            Frequency  Pct.
Preparing data for open access takes away valuable time for research        386        31.4%
Lack of technical infrastructure                                            300        24.4%
Reduce possibilities of future scientific publications                      300        24.4%
I am afraid other researchers will not understand my data                   259        21.0%
I cannot give access due to sensitivity issues                              249        20.2%
I cannot give access due to shared ownership                                212        17.2%
I don't know                                                                188        15.3%
I am afraid data will be misused                                            147        11.9%
I cannot give access due to intellectual property rights                    135        11.0%
Open access to research data might have a negative economic impact for me   85         6.9%
and my institution
It would be unethical                                                       82         6.7%
I cannot give access due to commercial issues                               80         6.5%
I do not believe my research data is of interest to others                  73         5.9%
I do not believe data is secure at a data centre, journal site or alike     59         4.8%
Other                                                                       53         4.3%
Total                                                                       2,608
Source: DAMVAD
preparing data for open access takes away valuable time for research.
One-quarter of the respondents pointed out that lacking technical infrastructure is a challenge for the sharing of data. One way of reducing the time-constraint would be to improve the technical infrastructure.
Further, many researchers are concerned that sharing data would reduce their possibilities regarding future scientific publications (24 percent).
More than 20 percent of the respondents were afraid that others would not understand their data.
Only one-fifth stated that they could not share data due to sensitivity issues, and 17 percent because of shared ownership of the data.
These findings support those in other international surveys, such as Kvale (2012) and Tenopir (2011).
Given the opportunity to make additional comments if they chose the category ‘other’, the following comments were made:
“I am in the process of making my research data as public as possible. This takes a lot of time, and although I can't see any problem with it, there are little rewards except scientific/ethic satisfaction.”
“Preparing data would be very time consuming.”
“Research projects are often under financed and setting up data and metadata to enable open access take extra time usually spent in the last part of the project when the project run out of time and money.”
Many respondents focused on how open access could reduce their chances of producing scientific publications, arguing that researchers should have exclusive ownership to their data for an extended period of time:
“Data can be made available to others, but only after our institutions have had a reasonable period (3 years, for example) to analyse and publish in order to justify the high costs for data gathering. Or else, no institution will pay for data gathering.”
“In general I may want some time of non-access (say 1-3 years) giving us, the researchers carrying out the project, the possibility of presenting results/documentation first, but then, afterwards, I would be thrilled if others would apply my/our data for re-analysis or new types of analyses. / We do try to support master students when they request use of our data, and I would also try to support other researchers in case of requests. / I do not know about funding of open access activity, thus any such activity will imply problems for my/our hour list.”
Some respondents pointed out the risk of others not understanding their data, and that it would require a significant effort to set up meta-data such that others would understand them:
“Preparing data so others can easily use it in the right way takes a lot of time. Often this time is not budgeted for and therefore the necessary data preparation is not possible within the given time frame for a project without taking away research time, using additional funding or using private time. / Making data available without a sufficiently detailed description of methods and the data generation may lead to misinterpretations of data and possibly wrong use of data.”
“The limited resources and funding available for long term field experiments requires very large extra input of labour from scientists as compared to
the actual hours we are paid for. One of the few incentives to continue to do this is to collect and have unique access to the data. If anyone can use the same data without contributing to the work that is required in writing applications, designing and setting up the experiments and collecting the actual data, a large part of the incentive for research is gone. Then what is left is lots of hard labour with very low hourly wages and limited credit for the ideas or the results - who wants that? Such a situation comes through as very unfair.”
Some also comment on data sensitivity. The example below shows that it is not only a question of how sensitive your data are - it is also a question of respect for the informants:
“Most of my data could in principle be made available. Because of what I have informed my respondents about, the purpose of the study, and what the data is going to be used for, it would not be ethically defendable to share the data with others for other purposes than originally planned.”
10.2 Relatively small differences across sector
The survey does not indicate significant differences between respondent groups in terms of observed barriers. This section summarizes the observed differences.
Time constraints are a less important barrier for respondents working at health trusts than other types
FIGURE 10.1
Main barriers for increased sharing of research data (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers). Across sector. Only includes the five major obstacles.
[Bar chart: share of respondents per barrier (Making data available takes away valuable time for research; Lack of technical infrastructure; Open access would reduce possibilities of scientific publications; Concerns connected to misinterpretation of data; Cannot give access due to sensitivity issues), by sector (Universities and university colleges; Research institutes; Health trusts (hospitals)).]
Source: DAMVAD
of institutions. Researchers in health trusts are more concerned about the sensitivity of the data and the lack of infrastructure. Figure 10.1 shows the results.
Researchers at institutes and universities are on the other hand more concerned with time, but also with the risk that others might misinterpret their data.
Some differences across disciplines
When looking across disciplines, lack of technical infrastructure constitutes a particular challenge within humanities, mathematics and the natural sciences. Researchers in these disciplines are significantly more concerned about this challenge than are researchers in social sciences and technology.
Sensitivity issues were the key reason for not being able to share data within social sciences and health science. Figure 10.2 shows the results.
Within humanities and medical science, the respondents are not particularly concerned about the misinterpretation of their data.
Time is especially scarce for respondents within agriculture and fishing as well as those within mathematics and natural science.
FIGURE 10.2
Main barriers for increased sharing of research data (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers). Across research discipline. Only includes the five major obstacles.
[Bar chart: share of respondents per barrier (Making data available takes away valuable time for research; Lack of technical infrastructure; Open access would reduce possibilities of scientific publications; Concerns connected to misinterpretation of data; Cannot give access due to sensitivity issues), by research discipline (Humanities; Agriculture and fishery; Mathematics and natural science; Medical science; Social science; Technology).]
Source: DAMVAD
Younger researchers more concerned with sensitivity
In terms of professional experience, two differences are worth noting. First, the less experienced respondents did not think of time as a challenge - they might be more familiar with technology and various solutions for sharing files. The results are shown in figure 10.3.
On the other hand, the respondents who were more inexperienced were more attentive and alert to possible sensitivity issues concerning their data. This seems to be less of an issue for the more experienced respondents. This might be a result of their lack of experience with legal issues or fear of misuse.
Otherwise, the survey suggests small differences in terms of the perceived current barriers and challenges for sharing data across years of experience.
10.3 Textual records are more sensitive
This section discusses variations in responses across different data formats and the challenges the respondents foresaw.
In figure 10.4 below, we can see that textual records involve more data that are sensitive. There are
FIGURE 10.3
Main barriers for increased sharing of research data (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers). Across experience. Only includes the five major obstacles.
[Bar chart: share of respondents per barrier (Making data available takes away valuable time for research; Lack of technical infrastructure; Open access would reduce possibilities of scientific publications; Concerns connected to misinterpretation of data; Cannot give access due to sensitivity issues), by research experience (Less than 3 years; 3 - 6 years; 7 - 10 years; 11 - 20 years; More than 20 years).]
Source: DAMVAD
stronger concerns as to whether textual records will be misinterpreted. The nexus of the textual records is thus more important than that of numerical scores.
Furthermore, the respondents stated that they could not give access to textual scores due to sensitivity issues.
On the other hand, there are time issues relating to making numerical scores available. Moreover, those respondents mainly working with numerical scores stated that open access to research data would reduce their possibilities regarding future scientific publications.
10.4 Researchers see little support from management
The survey shows that there is a perceived lack of support for open access to research data from management.
FIGURE 10.4
Main barriers for more sharing of research data (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers). Across data type. Only includes the five major obstacles.
[Bar chart: share of respondents per barrier (Lack of technical infrastructure; Concerns connected to misinterpretation of data; Cannot give access due to sensitivity issues; Making data available takes away valuable time for research; Open access would reduce possibilities of scientific publications), by data type (Numerical scores; Textual records; Images, sounds, videos and graphics).]
Source: DAMVAD
There has been little focus on implementing open access to research data at the organizational level. Less than 50 percent of the respondents reported that management encouraged open access to research data.
Less than one-fifth (16 percent) of the respondents reported that their organization provided training on best practices for sharing research data. Guidelines and standards existed for 27 percent of the respondents. Finally, 30 percent stated that their organization provided tools, technical support and infrastructure facilitating open access to research data.
One of the respondents highlights the problem with the following quote:
“Institutions usually focus on measurement of performance through amount of publications and coarse counting of results, leaving aspects of collecting, investigation of, archiving and handling of data to be the responsibility of individual researchers/groups. There seem to be little focusing on long term archiving and data handling.”
10.5 Limited institutional support
Our study suggests significant differences in the way research management deals with the sharing and archiving of data.
Management at research institutes facilitates open access to research data to a larger extent than is the case at universities and health trusts. Around 56 percent of the respondents within research institutes stated that their management, either to a high or to some degree, “encourage[d] that our [the respondent’s] research data should be open.” Only 30 percent of the respondents from health trusts stated that their management encouraged open access to research data.
TABLE 10.2
Does management or the organisation support open access to research data? (“To what extent do you experience that open access to research data is implemented in your organization?”)
                                                    To a high degree  To some extent    Not at all        I do not know
Statement                                           Freq.    Pct.     Freq.    Pct.     Freq.    Pct.     Freq.    Pct.
Management encourage that our research data         152      11.9%    435      34.1%    348      27.3%    340      26.7%
should be open
My organization provides training on best practice  16       1.3%     187      14.7%    690      54.1%    382      30.0%
for open access to research data
My organization has guidelines and standards for    59       4.6%     286      22.4%    470      36.9%    460      36.1%
data format and for assigning information to data
My organization provides the necessary tools,       52       4.1%     336      26.4%    432      33.9%    455      35.7%
technical support and technical infrastructure for
open access to research data
Source: DAMVAD
FIGURE 10.5
Does management encourage open access to research data? (“Management encourage that our research data should be open”)
[Bar chart: share of respondents answering “To a high degree” and “To some extent”, by sector (Universities and university colleges; Research institutes; Hospital trusts (hospitals)).]
Source: DAMVAD
FIGURE 10.6
Does the organisation provide support and solutions for open access to research data? (“My organization provides the necessary tools, technical support and technical infrastructure for open access to research data”)
[Bar chart: share of respondents answering “To a high degree” and “To some extent”, by sector (Universities and university colleges; Research institutes; Hospital trusts (hospitals)).]
Source: DAMVAD
Research institutes have the best preconditions for sharing research data. The survey shows that 39 percent of respondents at research institutes stated that their organization, either to a high or to some extent, provided the necessary tools, technical support and technical infrastructure for open access to research data. At universities, the share is 27 percent and at health trusts it is 18 percent.
Very few answered that their organization “to a high degree” provided the necessary solutions for open access. A high share of respondents state either that they do not know whether their organization provided the necessary solution or that in fact it did not do so.
There are small or no differences across professional experience. As such, the figures are not presented in this report.
10.6 Researchers call for better infrastructure, citation systems and guidelines
The survey also asked the respondents about possible measures that would facilitate more sharing of research data. These answers constitute important inputs to recommended actions that can facilitate open access to research data.
The largest share of respondents point to better infrastructure as a solution to increased access to research data.
Further, respondents state that the implementation of a citation system would facilitate increased sharing and availability of data. A better citation system for research data would make it easier to give credit to researchers preparing and generating data.
In addition to increased funding, respondents highlighted various aspects of formal competence and technical support. The respondents called for the implementation of guidelines, standards and more training on open access to research data in order to increase sharing of data. The solutions pointed out are also in line with the challenges identified, especially those concerned with the time-constraints cited earlier.
TABLE 10.3
Solutions to facilitate increased sharing of data (What efforts would make open access to publicly funded research data more interesting for you? (maximum 3 answers))
Solutions for increased sharing of data                                 Frequency  Pct.
Better infrastructure for open access to research data                  536        41.7%
Implementation of a system for citation                                 510        39.7%
More resources allocated for open access to research data activities    324        25.2%
Implementation of guidelines                                            309        24.0%
More training on open access to research data                           281        21.9%
Implementation of standards                                             266        20.7%
Don't know                                                              208        16.2%
Make open access to research data an indicator in the funding scheme    158        12.3%
Guidelines to how long I can attain ownership to data before sharing    140        10.9%
Make it mandatory to explain how data will be made available            103        8.0%
Not allowed to share anyways                                            78         6.1%
Total                                                                   2,913      227%
Source: DAMVAD
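Because each respondent could give up to three answers, the percentages in Table 10.3 are shares of respondents rather than shares of answers, which is why they add up to 227 percent. The sketch below illustrates the arithmetic in Python. It is an illustration only: the number of respondents is not stated in the table, so it is back-calculated here from the first row, and the variable names are hypothetical.

```python
# Frequencies from Table 10.3 (each respondent could give up to three answers).
FREQUENCIES = {
    "Better infrastructure for open access to research data": 536,
    "Implementation of a system for citation": 510,
    "More resources allocated for open access to research data activities": 324,
    "Implementation of guidelines": 309,
    "More training on open access to research data": 281,
    "Implementation of standards": 266,
    "Don't know": 208,
    "Make open access to research data an indicator in the funding scheme": 158,
    "Guidelines to how long I can attain ownership to data before sharing": 140,
    "Make it mandatory to explain how data will be made available": 103,
    "Not allowed to share anyways": 78,
}

# Assumption: the number of respondents is not given in the table, so it is
# back-calculated from the first row (536 answers correspond to 41.7 percent).
n_respondents = round(536 / 0.417)

total_answers = sum(FREQUENCIES.values())

# Share of respondents choosing each solution, in percent with one decimal.
pct = {label: round(100 * n / n_respondents, 1) for label, n in FREQUENCIES.items()}

# With several answers per respondent, the shares add up to more than 100 percent.
pct_total = round(100 * total_answers / n_respondents)
answers_per_respondent = round(total_answers / n_respondents, 2)
```

Under this assumption the total of 227 percent simply reflects that respondents chose a little more than two solutions each on average.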
Few differences across sectors
As for the proposed solutions, we see few differences across sectors, with one important exception.
The respondents based at health trusts are particularly concerned with the need for guidelines. They state that guidelines for open access to research data would facilitate increased sharing of research data. This input is perhaps not surprising, given their emphasis on sensitivity of data. Guidelines could focus on how to share sensitive data and what kinds of data can be shared. As noted by some respondents, education might also involve how to handle open access to research data.
FIGURE 10.7
Solutions to facilitate increased sharing of data, across sector (What efforts would make open access to publicly funded research data more interesting for you? (maximum 3 answers))
[Bar chart: share of respondents per solution (Better infrastructure; More training; Implementation of a citation system; More resources; Implementation of guidelines for open access), by sector (Universities and university colleges; Research institutes; Health trusts (hospitals)).]
Source: DAMVAD
A blurred picture across discipline
In general, infrastructure is the dominant proposed solution. It is more important for some research disciplines than others. Technology and mathematics and natural sciences, with 46 and 45 percent respectively, point to infrastructure as the most important solution.
Agriculture and fishery have a stronger focus on developing citation systems than other disciplines. In figure 10.8, we see that whereas 46 percent within agriculture and fishery state citation as an important factor for making open access to research data more interesting for them, this is only the case for 30 percent within humanities and 33 percent within social science.
For humanities, training and to some extent guidelines would have more influence on making open access to research data more interesting. Around 30 percent within humanities point to training and guidelines. Within mathematics and natural science the corresponding shares are 16 percent and 19 percent.
FIGURE 10.8
Solutions to facilitate increased sharing of data, across research discipline (What efforts would make open access to publicly funded research data more interesting for you? (maximum 3 answers))
[Bar chart: share of respondents per solution (Better infrastructure; More training; Implementation of a citation system; Implementation of guidelines for open access; More resources), by research discipline (Humanities; Agriculture and fishery; Mathematics and natural science; Medical science; Social science; Technology).]
Source: DAMVAD
Little difference across experience
Interestingly, but not surprisingly, the less experienced researchers favour training. They want to learn how to share data. More experienced researchers are more engaged in matters concerning lack of resources.
In figure 10.9 we see that around 30 percent of the more experienced researchers point to more resources as a measure that would make open access to research data more interesting, while the share is only around 11 percent among the inexperienced researchers.
Further, we see that whereas around 20 percent of the more experienced researchers point to more training as a measure for making open access to research data more interesting, the share is about 30 percent among the more inexperienced researchers.
Across experience, it is thus clear that infrastructure and a citation system are the measures the researchers prefer most in order to make open access to research data more interesting.
FIGURE 10.9
Solutions to facilitate increased sharing of data, across experience (What efforts would make open access to publicly funded research data more interesting for you? (maximum 3 answers))
[Bar chart: share of respondents per solution (Better infrastructure; More training; Implementation of a citation system; More resources; Implementation of guidelines for open access), by research experience (Less than 3 years; 3 - 6 years; 7 - 10 years; 11 - 20 years; More than 20 years).]
Source: DAMVAD
10.7 Researchers working internationally find time to be a bigger challenge

We hypothesized that researchers working in different settings have different perceptions of the barriers to sharing data.

We asked each of the respondents to state the proportions of their working hours where they:
- Work alone
- Work in collaboration with colleagues within their institution
- Work in collaboration with researchers at international institutions

Working mainly alone was defined as the researcher working alone more than 40 percent of their time. The same threshold delimits researchers collaborating with others within their own institution. Finally, we defined researchers working internationally as those spending more than 20 percent of their time in collaboration with researchers at international institutions.
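
The grouping rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the group labels and the decision to let groups overlap are hypothetical, since the report states the thresholds but not how respondents who exceed more than one threshold were handled.

```python
# Thresholds (percent of working hours) as stated in the text.
ALONE_MIN_PCT = 40
INSTITUTION_MIN_PCT = 40
INTERNATIONAL_MIN_PCT = 20


def working_groups(alone_pct, institution_pct, international_pct):
    """Return the set of groups a respondent falls into.

    Hypothetical helper: groups are allowed to overlap, which is an
    assumption; the report does not say how overlaps were treated.
    """
    groups = set()
    if alone_pct > ALONE_MIN_PCT:
        groups.add("alone")
    if institution_pct > INSTITUTION_MIN_PCT:
        groups.add("within institution")
    if international_pct > INTERNATIONAL_MIN_PCT:
        groups.add("international")
    return groups


# A respondent working alone 45 percent of the time and internationally
# 25 percent of the time exceeds two of the three thresholds.
example = working_groups(alone_pct=45, institution_pct=30, international_pct=25)
print(sorted(example))
```

Because the thresholds are strict ("more than"), a respondent reporting exactly 40, 40 and 20 percent would fall into none of the groups in this sketch.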
FIGURE 10.10
Main barriers for increased sharing of research data, across researchers’ way of working (“Do you see any challenges in making more of your research data available for other researchers?” Maximum 3 answers).
[Bar chart. Vertical axis: share of respondents, 0% to 40%. Barriers: making data available takes away valuable time for research; lack of technical infrastructure; open access would reduce possibilities of scientific publications; concerns connected to misinterpretation of data; cannot give access due to sensitivity issues. Series: alone (> 40 pct.); collaboration within the institution (> 40 pct.); international collaboration (> 20 pct.).]
Source: DAMVAD
One might expect researchers’ attitudes towards sharing data to differ with their way of working. However, we find little evidence that this is the case.

The main difference between ways of working concerns the challenge that making data available takes away time for research. The respondents mainly working alone did not see this as a big issue. The respondents collaborating internationally, on the other hand, stated that this was the most important issue. There is an obvious explanation for this. Researchers mainly working alone share data to a lesser extent than others do, and so face fewer issues when sharing their data.

The researchers collaborating internationally, by contrast, have to deal with international standards and practice. With few existing standards, multiple archiving solutions and different legislation across borders, the time involved in sharing data is a significant challenge.
FIGURE 10.11
Solutions to facilitate increased sharing of data, on how researchers work (What efforts would make open access to publicly funded research data more interesting for you? (maximum 3 answers))
[Bar chart. Vertical axis: share of respondents, 0% to 50%. Measures: better infrastructure; more training; implementation of a citation system; more resources; implementation of guidelines for open access. Series: alone (> 40 pct.); collaboration within the institution (> 40 pct.); international collaboration (> 20 pct.).]
Source: DAMVAD
10.8 Researchers welcome data sharing as a part of publishing

We have seen that the respondents would like to use data generated by others, but that they lack the incentives to make their own data available for others. Scientific journals increasingly require that data be made available as a part of the publishing process, but only 11 percent of researchers have experienced this practice so far.

Making data available through scientific journals could lead to increased interest from other researchers. As table 10.4 shows, 50 percent believe that an increased focus on making data available as a part of scientific publications could mean that their research becomes more interesting for others to follow.

Another positive outcome would be that the research could be quality assured. This is important for every researcher, as their research aims at creating new knowledge based on solid evidence. Roughly half the respondents (54 percent) agreed with this. Slightly less than thirty percent stated that their data and publications would be cited more.

Interestingly enough, 11 percent did not see any benefits in making data available as a part of scientific publications. If we add the 9 percent who do not know and take the residual, the remaining 80 percent can be considered positive towards making data available as a part of scientific publications.
TABLE 10.4
Researchers welcome sharing data as part of scientific publications. (Do you welcome the trend of making data available as a part of scientific publications?)

Response                                                                            Frequency    Pct.
Yes, it could mean that my research could be more interesting for others to follow        665   50.8%
Yes, it is a sign that my research can be quality assured                                 703   53.7%
Yes, it could mean that my data and/or my publications will be more cited                 367   28.1%
No, I see no benefit for me                                                               144   11.0%
I do not know                                                                             112    8.6%
Other                                                                                      63    4.8%
Total                                                                                   2,054    157%

Source: DAMVAD
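
The arithmetic behind the 157 percent total and the 80 percent figure quoted in the text can be checked with a short Python sketch. The variable names are illustrative. The residual calculation assumes that the "no benefit" and "do not know" groups do not overlap and that everyone else gave at least one positive answer, which the table alone cannot confirm.

```python
# Percentages from Table 10.4 (share of respondents choosing each option).
shares_pct = {
    "Yes, more interesting for others to follow": 50.8,
    "Yes, research can be quality assured": 53.7,
    "Yes, data and/or publications will be more cited": 28.1,
    "No, I see no benefit for me": 11.0,
    "I do not know": 8.6,
    "Other": 4.8,
}

# The shares sum to more than 100 percent, which implies that
# respondents could tick more than one option.
total_pct = round(sum(shares_pct.values()), 1)

# Residual used in the text: everyone who answered neither
# "no benefit" nor "do not know" is counted as positive.
positive_pct = round(
    100.0 - shares_pct["No, I see no benefit for me"] - shares_pct["I do not know"],
    1,
)

print(total_pct, positive_pct)
```

The residual comes to 80.4 percent, which the text rounds to 80 percent.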
11 Main findings and recommendations

The objective of this study has been to gain a better understanding of researchers’ current practice on sharing and archiving of research data. In addition, we analyse the various fears and barriers involved from a researcher’s point of view and how these barriers might be overcome.

The following chapter summarizes the main findings and recommendations for the Research Council of Norway.

11.1 Main findings

Researchers share data

Our study shows that Norwegian researchers frequently use and share research data with each other.

As many as 64 percent of researchers have used research data from other researchers over the last three years.

Researchers mostly use research data generated by other researchers from their own institution, but only slightly more than data from researchers at institutions outside Norway.

Potential to increase sharing of research data

About one-third (36 percent) of the researchers have not used data gathered by other researchers. Of these, 71.5 percent reported that they would like to make use of other researchers’ data. This indicates a clear potential for increasing sharing of research data.

The numbers indicate untapped potential for increased and improved sharing of data.

Only one in ten of those who had not used research data generated by other researchers over the past three years did not wish to use this type of data.

Researchers see benefits in sharing their data

The survey confirms that researchers in Norway see benefits of sharing and archiving their research data. Around 80 percent of the respondents agreed that open access to research data enhances research and that it is an ethical obligation to make their data available for validation. These are also the two reasons for open access to research data agreed to by most researchers.

Further, 77 percent agreed that open access to research data facilitates the education of students and new researchers, and 74 percent agreed that it stimulates research collaboration.

Many researchers are undecided

Although most researchers agree as to the benefits of sharing their data, many researchers are undecided as to whether publicly funded research data should be considered public property.

Among the remaining 20 percent, who did not agree that open access to research data would enhance research, 15 percent of respondents were undecided and about 5 percent disagreed. This large proportion of undecided respondents may reflect the complexity of the issue.

Additionally, more than 600 respondents actively decided not to participate in the survey. This reluctance to participate might also be seen as an indication that questions regarding open access to research data are perceived as being irrelevant to the individual respondents (i.e., he or she is not an active researcher) or that he or she is undecided as to the issue.

These 600 non-respondents correspond to 30 percent of the actual respondents. If they are regarded as undecided respondents, the share of researchers without a clear position on the issue of open access to research data is quite significant.

The survey included an open answer option where respondents could write free text. Inputs in this section show that many researchers find the issue of open access to research data challenging and complex. Many researchers are clearly positive towards sharing, but many are also negative, as we have tried to convey in the report.

Researchers want to remain in control of their data

Most researchers share their research data with other researchers. Yet research data is generally shared under certain conditions (e.g., only upon request, under a non-disclosure agreement, in an anonymized format). Researchers want to control who gets access to their data and how they use it. With each researcher setting the terms, there is a risk that he or she becomes a gatekeeper.

Lack of proper infrastructure and incentives for sharing

When asked about the barriers to sharing more of their data, the central barriers according to the researchers are:
1. Preparing data for open access takes valuable time away from research activities.
2. Respondents do not have adequate technical infrastructure.
3. Open access to research data might reduce researchers’ possibilities for future scientific publications.

These responses indicate, inter alia, that researchers lack adequate and user-friendly infrastructure, guidelines and procedures, and certainty about immaterial rights in order to embrace the idea of sharing data.

Contrary to our hypothesis, we did not find any major differences across sectors, fields of research or years of professional experience.

Archiving data on local computers and institutional servers

The study found that 85 percent of the respondents archived their data on their own devices or at an institutional server. This figure does not vary across sectors, disciplines or professional experience, which is something of a paradox, since storing data on their own devices cannot be regarded as the most secure means of storage. This is especially apparent insofar as many of the respondents were concerned about security and the sensitivity of their data.

Researchers see few initiatives from their management

The survey responses suggest significant differences in the way in which research managers deal with the sharing and archiving of data. Consequently, researchers see a need for greater institutional support.

Only 16 percent (within the research institutions) and 6 percent (within health trusts) perceived that their management encouraged them to share data to a significant extent. Moreover, only six percent (within the research institutions) and two percent (within health trusts) perceived a significant degree of solutions and technical support for sharing their data.

Again, we find very limited differences across sectors, disciplines and professional experience.

There is a need for better infrastructure and a credit system for researchers

The survey indicates a strong relationship between the major barriers to sharing data and the researchers’ proposed solutions to overcome them.

The flip side of these barriers is a set of possible solutions. These include:
1. Better infrastructure.
2. Implementing a system for citation.
3. Implementing guidelines, training and standards for sharing data.

11.2 Recommendations

Previous studies suggest that there are multiple obstacles and, hence, no single solution to increase the sharing and archiving of research data. Yet we will present some recommendations here and in figure 11.1.

FIGURE 11.1
Problems, solutions and recommendations
Source: DAMVAD
Former studies, as well as this analysis, suggest that there is a need for work directed at several levels: researchers, research institutions, research funders, and the government/international level.

Initiatives need to take place in parallel. For example, taking action to make more researchers share data without the proper infrastructure will most likely prove counterproductive. Thus, there is a strong need for a coordinated effort.

We see that the Research Council of Norway can play a key role in promoting open access to research data in Norway.

Raising awareness

The sharing and archiving of research data entails many obstacles and questions that need to be answered. Many respondents were undecided or did not wish to participate in the survey. This might suggest that researchers consider sharing and archiving of research data a complex and difficult topic.

We would suggest that the Research Council of Norway actively work to raise awareness of the issue, covering both the benefits and pitfalls of archiving and sharing research data.

In particular, exemplifying potential opportunities and value is important, inter alia, by using best practice cases. Emphasis should be on showing that sharing and archiving is worthwhile for researchers.

In this respect, there also seems to be a need for clarity as to the differences between archiving and open access to research data. The archiving process does not necessarily imply full open access to research data for all; it should be considered a premise for sharing data. Even if the two matters differ, they are closely linked, and should be seen in relation to one another.

Most researchers use other researchers’ data. Most researchers are also willing to let others reuse the data they have generated if certain conditions and restrictions are fulfilled.

The researchers are the ones who gather and analyse the data, and who will archive and share the data in the end. Researchers want to know what happens to their research data. As such, it is important to raise awareness among researchers.

However, the study also indicates that researchers need support and that many do not see this support from their management. Thus, it is also important to raise awareness at the institutional level.

Giving credit as well as responsibility to researchers

The study indicates that a lack of incentives and credit for gathering data is a barrier to increased sharing of research data.

These findings correspond to findings in former national and international studies (e.g., Kvale, 2012).

The respondents would be more willing to share data if they received credit for their data generation work. One obvious way of crediting researchers would be to support the implementation of a citation or reference system for data. Accreditation is an important motivation for researchers.
“References can be seen as a kind of normative payment”
Ingwersen (2011), Indicators for the Data Usage Index (DUI): an incentive for publishing primary biodiversity data through global information infrastructure

There is no well-established citation system for research data in Norway, giving researchers few incentives to prioritize time for preparation of data for sharing. The lack of a well-established citation system is also an international issue. We thus see a benefit in coordinating such systems at the international level.

Ideally, the system should be easy to use and work alongside existing systems for publishing.

Tenopir et al. (2011) suggest promoting good sharing practices among researchers. For example, receiving copies of articles that use a researcher’s data is one condition that would encourage sharing and promote best practice.

The Council could also introduce some kind of requirements on researchers. Lord et al. (2006) studied large-scale data sharing in the life sciences based on ten case studies, and found that a laissez-faire approach to the collection and distribution of data results in waste, as such data will not contain sufficient information to enable re-use.

A key recommendation from Lord et al. (2006) is an insistence on a data management plan that clearly defines responsibilities and goals, and awareness of the needs and practices of data management.

The Research Council can introduce a requirement for data management plans as a part of the traditional application procedure.

It is also possible to make sharing of research data a part of the financial system for basic funding. Further, it could be a system in which the Research Council of Norway withholds funding until data is properly shared and archived.

We do not recommend implementing such stringent measures at the current stage, as it would require considerable work in terms of its design and in terms of having the proper infrastructure in place. Without proper guidelines and a sound infrastructure, such a system could be counterproductive.

In recent years, research communities have been left to establish methods and practices for sharing and archiving their research data. We are concerned that this leads to a suboptimal organization of solutions. As stated earlier, we do not find large differences across sectors or research disciplines. Hence, we cannot support arguments leading to the design of tailored solutions for each specific sector or individual research discipline. Yet the work must still be inclusive of all research communities, as they have the knowledge and will have to implement the proposed strategies and solutions.

Guidelines, rules and best practice

Our study suggests that many researchers lack knowledge as to what data to share and archive. In addition, researchers lack knowledge as to what form the data should have, and how proper information about the data should be assigned.

Thus, the study suggests a need for better guidelines, standards and education relating to sharing and archiving research data. Such guidelines and standards should be developed in close interaction with researchers, institutions and legal experts. We recommend that implementation of guidelines and standards should be inspired by work initiated internationally to avoid creating a Norwegian bureaucracy alongside international standards.
One way of promoting the use of shared data would be to create a solid and informative platform for metadata. A metadata platform can be a low-key activity, as it can be seen as a first step towards more complex infrastructure solutions. In addition, we perceive that many researchers are not aware of the possibilities of accessing data gathered by other researchers. Better metadata can overcome this issue.

We would also suggest starting to work on data selection (i.e., on defining which data are worthwhile and which are not). Even though our study does not suggest any major differences in the practices and barriers across research disciplines and sectors, the open answers indicated a strong need for better understanding and guidance as to which data to archive and share, and in which form to do so. In particular, researchers who mainly use textual data (e.g., interviews) have difficulties deciding which data to share and preserve.

Infrastructure and funding

Interviews and studies both suggest that the infrastructure for the sharing and archiving of data is fragmented, overlapping and inadequate. Many are satisfied with the current archiving solutions, yet researchers seem to archive most of their data on their own institutional servers or local storage devices. We found no differences across sectors or research disciplines on the topic of storage.

Given the large share of data stored locally, there is clearly a need for better infrastructure solutions. Better infrastructure could increase the motivation for archiving data at data archiving centres, which could provide more secure means of archiving and make data easier to restore.

Finally, it would lay the ground for increased sharing of research data. Debate on infrastructure investment should involve all relevant stakeholders while ensuring a robust infrastructure that in turn will serve the needs of the future. We are somewhat cautious as to the design and scale of such a system because it could be a matter of cost and benefit. We thus see that more information on ambitions is needed.

An ideal data infrastructure for scientific research would have a long list of technical characteristics. We refer to the wish list included in the EC white paper on scientific data, “Riding the Wave”.
TEXTBOX 11.1
A WISH LIST FOR E-INFRASTRUCTURE
- Open deposit, allowing user-community centres to store data easily
- Bit-stream preservation, ensuring that data authenticity will be guaranteed for a specified number of years
- Format and content migration, executing CPU-intensive transformations on large data sets at the command of the communities
- Persistent identification, allowing data centres to register a huge amount of markers to track the origins and characteristics of the information
- Metadata support to allow effective management, use and understanding
- Maintaining proper access rights as the basis of all trust
- A variety of access and curation services that will vary between scientific disciplines and over time
- Execution services that allow a large group of researchers to operate on the stored data
- High reliability, so researchers can count on its availability
- Regular quality assessment to ensure adherence to all agreements
- Distributed and collaborative authentication, authorisation and accounting
- A high degree of interoperability at format and semantic level
Adapted from the PARADE (Partnership for Accessing Data in Europe) White Paper (2009) [19]

[19] Partnership for Accessing Data in Europe (PARADE) is a consortium aiming to build efficient services addressing data management needs of multiple research communities. Strategy for a European Data Infrastructure (White Paper) was published in October 2009.
References
Ball, A. (2012). ‘How to License Research Data’. DCC How-to Guides. Edinburgh: Digital Curation Centre
Borgman Christine L. (2012). The Conundrum of sharing research data. Journal of the American Society for
Information Science and Technology, 64 (6): 1059-1078
Committee on Scientific Accomplishments of Earth Observations from Space, National Research Council
(2008). Earth Observations from Space: The First 50 Years of Scientific Achievements. The National Academies Press. p. 6. ISBN 0-309-11095-5. Retrieved 2010-11-24.
Creswell, J. W. (2008). Educational Research: Planning, conducting, and evaluating quantitative and qualitative research (3rd ed.). Upper Saddle River: Pearson.
EC (2012). Online survey on scientific information in the digital age, http://ec.europa.eu/research/sciencesociety/document_library/pdf_06/survey-on-scientific-information-digital-age_en.pdf
Campbell, E. G. et al. (2002). Data withholding in academic genetics: evidence from a national survey, Journal
of the American Medical Association 287, no. 4 (2002): 473–480.
E-science (2005). Large-scale data sharing in the life sciences: Data standards, incentives, barriers and funding
models (The “Joint Data Standards Study”), http://www.nesc.ac.uk/technical_papers/UKeS-2006-02.pdf
EU (2010). Riding the wave. How Europe can gain from the rising tide of scientific data, http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf
Berman, F. & Cerf, V. (2013). Who Will Pay for Public Access to Research Data? http://www.greatplains.net/download/attachments/8486930/SCIENCE2013AUGPAYINGFOROPENACCESS.pdf
Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020 Version 1.0
11 December 2013 http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
Hanson, Sugden & Alberts (2012). Making data maximally available, Science 331, no. (11 February 2011).
Hey et al. (2009). The Fourth Paradigm: Data-Intensive Scientific Discovery.
Ingwersen, P. (2011). Indicators for the Data Usage Index (DUI): an incentive for publishing primary biodiversity data through global information infrastructure.
JISC Research 3.0: driving the knowledge economy, http://www.jisc.ac.uk/whatwedo/campaigns/res3.aspx
Kowalczyk, S. & Shankar, K. (2011). Data sharing in the sciences. Ann. Rev. Info. Sci. Tech., 45: 247–294.
Kvale, L. (2012). Data Sharing in the Life Sciences - A Study of Researchers at The Norwegian University of Life
Sciences (Masters thesis) https://oda.hio.no/jspui/bitstream/10642/1269/2/Kvale_Live_Handlykken.pdf
Meld. St. 18 (2012–2013). Report to the Storting, “Long lines - knowledge provides opportunities”
Enke, N. et al. (2012). The user’s view on biodiversity data sharing - investigating facts of acceptance and requirements to realize a sustainable use of research data, Ecological Informatics 11 (September 2012): 25–33.
doi:10.1016/j.ecoinf.2012.03.004.
OECD (2002). Frascati Manual: proposed standard practice for surveys on research and experimental development, 6th edition. Retrieved 27 May 2012 from www.oecd.org/sti/frascatimanual.
OECD (2007). Principles and Guidelines for Access to Research Data from Public Funding
http://www.oecd.org/sti/sci-tech/oecdprinciplesandguidelinesforaccesstoresearchdatafrompublicfunding.htm
PARSE.Insight (2010). PARSE.insight http://www.parse-insight.eu/
Lord, P. et al. (2006). Large-scale data sharing in the life sciences: Data standards, incentives, barriers and
funding models (The “Joint Data Standards Study”).
Savage and Vickers (2009). Empirical Study of Data Sharing by Authors Publishing in PLoS Journals, PLoS ONE 4, no. 9 (2009): e7078. doi:10.1371/journal.pone.0007078.
St. Meld 30 (2008-2009). Report to the Storting, “Climate for research.”
Tenopir et al. (2012). Academic Libraries and Research Data Services Current Practices and Plans for the Future. An ACRL White Paper.
Tenopir et al. (2011). Data Sharing by Scientists: Practices and Perceptions http://www.biomedcentral.com/1471-2105/12/S15/S3
UiO (2013). Håndtering av forskningsinfrastruktur ved Universitetet i Oslo. http://www.uio.no/om/organisasjon/ledelsen/styret/moter/kart_prot2013/04.23/infrastruktur.pdf
World Data Center System (2009-09-18). "About the World Data Center System". NOAA, National Geophysical Data Center. Retrieved 2010-11-24.
Appendix

Participants at workshop on open access and data management, Research Council of Norway, October 25th, 2013

Name                      Institution
Øystein Godøy             Norwegian Meteorological Institute
Dagmar Langeggen          BI Norwegian Business School
Andreas Jaunsen           UNINETT Sigma
Koenraad De Smedt         University of Bergen
Vigdis Kvalheim           Norwegian Social Science Data Services (NSD)
Olav Hagen Sataslåtten    The National Archives' Central Office
Frode Arntsen             BIBSYS
Helge Sagen               Institute of Marine Research
Per Magnus                The Norwegian Institute of Public Health
Terje Risberg             Statistics Norway
Dag Undlien               University of Oslo and Oslo University Hospital
Jan Bjaalie               University of Oslo
Live Kvale                University of Oslo
Asbjørn Mo                Research Council of Norway
Roar Skålin               Research Council of Norway
Inngunn Sagebø            Research Council of Norway
Øystein Godøy             Research Council of Norway
Siri Lader Brun           Research Council of Norway

Additional interviews

Name                      Institution
Øystein Godøy             Norwegian Meteorological Institute
Per Magnus                The Norwegian Institute of Public Health
Gunnar Simonsen           University hospital of Tromsø
Bjarne Strøm              Norwegian University of Science and Technology
Helge Sagen               Institute of Marine Research
Badstuestræde 20
DK-1209 Copenhagen K
Tel. +45 3315 7554
Norsk adresse 123
N-2390 Oslo
Tel +47 2345 1254