User-generated data in Danish academic libraries
Transcription
User-generated data in Danish academic libraries: a survey of the experience, awareness and opinion of librarians

Nicholas Paul Atkinson
Supervisor: Helene Høyrup
Masters Thesis, Spring 2012
No. of Standard Pages: 74

Abstract

The research literature regarding user-generated data applications and concepts is copious and diverse. Despite such diversity, or perhaps because of it, the issues remain clouded in complexity. The literature does, however, appear to form a consensus that user-generated data concepts and applications are tools that must be understood and utilised by the modern academic library. This study presents an overview of the relevant literature, both theoretical and empirical, regarding the use of user-generated data in the academic library, through the prism of three historical phases and with particular focus on social tagging. The thesis also presents the results of empirical research, a survey of academic librarians in Denmark, seeking to gauge the extent to which user-generated data is employed in academic institutions across the country and the knowledge and opinions regarding user-generated data held by academic librarians. 106 respondents from 50 institutions were represented in the results. There has been little research into academic librarians' knowledge of and attitudes towards user-generated data, and even less information exists on the extent to which Danish academic libraries are making use of user-generated data related applications and concepts. As success in utilising and understanding user-generated data appears to be a prerequisite for the modern academic library's success, such information would have high value for policy makers and the individual library institutions themselves.

Contents

1 Introduction
2 Literature review
2.1 Concepts and applications
2.1.1 Web 2.0, Library 2.0 and user-generated data
2.1.2 Social tagging
2.1.3 User ratings and reviews
2.1.4 User tracking
2.1.5 Blogs, Forums and Wikis
2.1.6 MySpace, Facebook, Twitter
2.2 The academic library and user-generated data
2.2.1 Phase 1: Idealism
2.2.2 Phase 2: Pragmatism
2.2.3 Phase 3: Realism
3 Problem Statement
4 Research question
4.1 User-generated data
4.2 The academic library
5 Research design
5.1 Language
5.2 Finding the sample
5.3 Survey software
5.4 Survey design
5.4.1 Respondents
5.4.2 Situation
5.4.3 Awareness, opinion and prediction
5.4.4 Ordering and arrangement
5.5 Email
5.6 Pilot test
6 Results
6.1 Respondents
6.1.1 Institutions
6.1.2 Disciplines
6.1.3 Positions
6.1.4 Patrons
6.2 Situation
6.3 Awareness & opinion
6.4 Prediction
7 Discussion of results
7.1 Response
7.2 How user-generated data is being utilised in Danish academic libraries
7.3 The knowledge and opinions held by Danish academic librarians about user-generated data applications and concepts
7.4 Clarity of concepts and questions
7.5 Language issues
7.6 International relevancy
8 Conclusion
9 Bibliography
10 Appendices
Appendix I: Original Email in Danish
Appendix II: Original survey in Danish
Appendix III: List of research libraries taken from http://www.dfdf.dk/index.php?option=com_content&view=article&id=132&Itemid=69
Appendix IV: Libraries that responded

1 Introduction

As a student at IVA I was given the opportunity of immersing myself in a broad range of the issues and challenges currently facing library and information services, particularly those brought on by an increasingly digitised world: new literacies, smartphones, e-books, e-government, the rise of Google, etc. What surprised me most during this tour of the contemporary concerns of the library world was the overwhelming number of papers given over to the issue of social tagging, especially given the apparent (to my eyes) lack of any real-world implementation of the concept in our libraries and institutions.
While there are, of course, a series of high-profile, international, tagging-based websites, such as Flickr, LibraryThing and Delicious, and an abundance of papers on pioneering 'experiments' conducted at various libraries and universities, lasting, real-world applications, conducted at a local, institutional level, are conspicuous by their absence. Social tagging is such a neat, clean idea - simple to explain, simple to participate in - but as with so many simple ideas its applications and implications are anything but simple. Examples of social tagging successes are as different from one another as can be imagined. All have their own peculiar contextual anomalies supporting the usefulness and efficacy of the tagging concept in that given situation. Another complicating factor is that any analysis of the use of tagging is difficult without reference to a much broader family of applications and concepts. Social tagging is a form of user-generated data, which itself is an aspect of Web 2.0. It is often difficult to separate social tagging out from these larger sets and treat it as an isolated concept in any meaningful way, especially when searching for pragmatic solutions to real-world challenges as library science research tends to do. Given such complications it is perhaps not all that surprising that there should be such a body of literature on the subject, but why so much, so fast, and so diverse - and why this apparent discrepancy between research and practice? Such a glut of disconnected and disjointed data must present a confusing picture to any librarian or policy maker actively seeking workable solutions and guidelines.

The initial aim of this research was to attempt to summarise and review instances of social tagging's use in academic libraries in Denmark. The aim of such a review would be to develop a best practice model based on the examples available, and view it in the light of an international model of best practice gleaned from the research literature. However, as I have noted, separating social tagging out from its larger sets of user-generated data and Web 2.0 is problematic, and I have therefore chosen to extend the scope to include all examples of user-generated data. The reasons for doing this will be explained more fully in section 4 of this thesis, while an outline of the distinction itself can be found in section 2. Moreover, as will also be made clear in section 4, my preliminary research suggested that I would find very few examples of such use in Danish academic libraries. I chose, therefore, to move the main focus of the empirical research away from applications and implementations and instead concentrate on the academic librarians themselves, their knowledge and opinions of user-generated data and its uses. The great advantage of doing this would be that I could still build up an image of what was occurring in Denmark with regard to user-generated data projects, but whether I found any or not I would also be gathering valuable data on what librarians themselves thought of the situation. What do they know about user-generated data and what it can be used for? Do they welcome it, or are they resisting it? This study will, firstly, in section 2, present the concepts of tagging, user-generated data and Web 2.0, their origins and overlaps, describing them and defining the terms as I intend to use them throughout the study.
Then, I will give an overview of the literature, both theoretical and empirical, regarding the use of user-generated data in the academic library, through the prism of three historical phases and with particular focus on social tagging. In section 3 I outline the problem statement, and in sections 4 and 5 I describe the reasoning behind my choice of subject, my research questions and my methodology. In sections 5 and 6 I present the results of my own empirical research, a survey of academic librarians in Denmark, seeking to gauge the extent to which user-generated data is employed in academic institutions across the country and the knowledge and opinions regarding user-generated data held by library staff.

2 Literature review

2.1 Concepts and applications

2.1.1 Web 2.0, Library 2.0 and user-generated data

The term Web 2.0 emerged in 2004 and was coined in recognition of the fact that the World Wide Web was becoming a much more participatory space of multimedia collaboration than had been the case previously (O'Reilly, 2005). There was therefore no single rolling out of a physical thing called Web 2.0, as the name might suggest (sounding as it does like a new software release), nor does it denote any single new type of functionality; it was more akin to the naming of a historical epoch, such as the Age of Discovery or the Renaissance. Defining it is therefore a fuzzy business: blogging began as early as 1997 (O'Reilly, 2005), yet blogging is considered a central part of the Web 2.0 concept. So it is not that the internet was not a collaborative technology right from the start - it was - but around 2004/5 (O'Reilly, 2005) collaborative technologies were reaching the mainstream and shifting the centre of gravity of the internet, from a space where ordinary users could retrieve information from information providers to a place where information was being simultaneously created, shared and published by everyone. As Chalon, Di Pretoro and Kohn point out, any commentator's definition of Web 2.0 is likely to be coloured by their particular perspective (Chalon et al., 2008, pp. 1-2); the three authors, one with an IT background, one a social scientist and one a knowledge manager, each had differing interpretations springing from their own unique academic backgrounds.

Right from the start the Web 2.0 idea, dealing as it does with giving ordinary people the means to share, publish and retrieve information, was of obvious interest to librarians, and thus it was as early as 2005 that the term Library 2.0 was coined by Michael Casey (Maness, 2005). Maness has defined Library 2.0 as "the application of interactive, collaborative, and multi-media web-based technologies to web-based library services and collections" (Maness, 2005). Library 2.0 then, much like Web 2.0, is less a unified concept than it is a statement of a general idea that something is taking place around us on the internet and in our homes that the library world simply cannot afford to ignore. Nor would this be the last example of attaching the 2.0 appendage to a concept in the library world, as will be noted later.

So what are these so-called Web 2.0/Library 2.0 technologies? Well, perhaps it is easier to highlight first what is not Web 2.0. A static website would not fall into the Web 2.0 subset, nor would a website that was updated regularly but made no additional attempt to push those updates directly to the users.
In other words, if users have to independently search the internet to find the latest information on your website then it does not form part of the Web 2.0 family.

[Figure 1: Web 2.0 applications placed into three overlapping groups according to the nature and direction of the communication involved.]

We only start to enter the realms of Web 2.0 when we consider multimedia-based applications or applications that push information out to users more directly and in a more targeted fashion; RSS and SMS, for example, and to a certain extent Facebook, although Facebook is somewhat more than this depending on just how it is being used. I find it useful to categorise Web 2.0 applications into three groups based on the nature and direction of the communication taking place. I have tried to do this in Fig. 1, where I have also placed a selection of applications into their respective, and sometimes overlapping, groups (this positioning is tentative; applications can be utilised in innovative ways that may take them out of the groupings I place them in here). I have already outlined the first of these groups: those concerned with Library-to-User communication. A second can be categorised as applications which allow information providers to conduct a two-way dialogue with the user, but not necessarily for the purposes of building a catalogue of such discussions; the discussion is not stored and added to an archive (at least not under normal circumstances). These applications are the various instant messaging and internet-based video conferencing and telephony applications, such as Skype. The third and final group are the applications that allow the users to become directly involved in content creation, in cataloguing and reviewing. Some applications are explicit in the way they accept, store and utilize user data, such as social tagging and reviewing. Others, such as Twitter or user tracking, can be utilised in order to build up an archive of user tweets or behaviours. It is this third type that I am concerned with in this study, those that can be termed user-generated data. What follows is a brief outline of the different applications that we can find in the user-generated data subset.

2.1.2 Social tagging

Social tagging is a key aspect of the Web 2.0 concept. According to Mendes et al., social tagging can be defined as "[t]he decentralized practice and method by which individuals and groups create, manage, and share terms, names, etc. (called 'tags') to annotate and categorize digital resources in an online 'social' environment" (Mendes et al., 2008, p.32). Tagging systems have been contrasted with traditional taxonomical classification systems because of their open, non-hierarchical, non-authoritative qualities, leading some to claim that social tagging represents the future of classification (Shirky, 2005). However, social tagging has also been criticized for its uncontrolled and uncontrollable nature, inevitably leading to issues of "ambiguity, polysemy, synonymy and basic level variation" (Spiteri, 2007). While its ability to aid search and retrieval is the subject of much dispute, social tagging's ability to aid browsing, via the navigational technique of tag clouds, is less controversial (Anfinnsen et al., 2011). Tag clouds present user-submitted tags as visual word maps, with words of increasing popularity emerging larger and bolder in the midst of a cloud of words.
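To make the tag cloud mechanism concrete, the sketch below shows one common way such a cloud can be sized: each tag's font size is scaled between a minimum and a maximum according to its relative popularity. This is a minimal illustrative sketch, not code from any of the systems discussed; the tag counts and the linear scaling rule are my own assumptions.

```python
# Minimal tag-cloud sizing sketch: scale each tag's font size linearly
# between MIN_PT and MAX_PT according to how often it has been applied.
# The tag counts below are invented for illustration.

MIN_PT, MAX_PT = 10, 36  # smallest and largest font sizes, in points

tag_counts = {"fantasy": 120, "history": 45, "denmark": 30, "thesis": 5}

def font_size(count: int, lo: int, hi: int) -> int:
    """Map a tag's use count onto the [MIN_PT, MAX_PT] range."""
    if hi == lo:  # all tags are equally popular
        return (MIN_PT + MAX_PT) // 2
    return round(MIN_PT + (MAX_PT - MIN_PT) * (count - lo) / (hi - lo))

lo, hi = min(tag_counts.values()), max(tag_counts.values())
for tag, count in sorted(tag_counts.items(), key=lambda kv: -kv[1]):
    print(f"{tag}: {font_size(count, lo, hi)}pt")
```

Real tag clouds often use a logarithmic rather than linear scale, so that a handful of extremely popular tags do not dwarf everything else.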
According to Vander Wal, who coined the term "folksonomy", social tagging has its origins in the 1990s (Vander Wal, 2007), but it is through websites such as Delicious (tagging webpages), Flickr (tagging images) and YouTube (tagging videos) that it has become one of today's major successes. Within the library world, much attention has been paid to LibraryThing, a social tagging based website that allows users to upload records of their entire book collection (Santolaria, 2009). LibraryThing has developed a feature it calls LibraryThing for Libraries (LTFL), which allows individual libraries to make use of the vast amounts of user-generated data applied to the books represented in the system (Santolaria, 2009). In order to counter some of the problems tagging has with ambiguity, polysemy, etc., efforts have been made to design techniques that can automatically prune 'bad' words and construct semantic hierarchies, often called 'tag gardening' (Weller & Peters, 2008). Efforts have also been made to merge data from tagging systems with data from traditional library classification systems and form hybrid classification systems (DeZelar-Tiedman, 2008). New generation OPACs have been developed which can incorporate tagging into their traditional classification systems, so-called social OPACs (Chalon et al., 2008), the original and perhaps most famous being the SOPAC, developed by John Blyberg at the Ann Arbor District Library (Blyberg, 2007).

2.1.3 User ratings and reviews

In addition to providing classification-based information, users can also be asked to provide their opinions on the materials in the collection. The success of Amazon, and in particular the success of its comments, reviews and ratings, has demonstrated how user opinion can have a very powerful effect on the choices made by other users. Indeed, many of the tags in tagging systems themselves already consist of what we could term 'opinion' and as such represent a new facet to enter the classification strategies that libraries can use (Holst & Berg, 2008). Many libraries and library web systems have begun to incorporate the facility for users to make such additions to the catalogue. Such functionalities also exist in some new generation OPACs (Chalon et al., 2008).

2.1.4 User tracking

User tracking, the automatic monitoring of the online behaviour of users while they are browsing a website, is a rather different beast in many ways and is not so clearly a part of the Web 2.0 concept (while still a type of user-generated data, it is not necessarily voluntarily or consciously provided). A significant portion of the navigational functionalities provided by Amazon are now based on the data obtained by this technique; the "Customers Who Bought This Item Also Bought" functionality, for example, which aggregates the data obtained by their many users so that individual users can see what is popular with like-minded browsers. Such data can also be used to personalize a user's browsing experience, filtering content based on the user's past navigational choices; Amazon's "Inspired by Your Browsing History" is an example of this. Or it can be done less explicitly, by simply tailoring the data you receive by default, as Google has begun to do with its search engine. Such methods have been criticised by authors such as Pariser for delivering information to users based on fallible, automated algorithms, and setting up an insular, self-perpetuating filtering effect, screening users from anything in the world unconnected with what they already know (Pariser, 2011).
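The aggregation behind a "Customers Who Bought This Item Also Bought" or "others have borrowed" feature can be illustrated with a short sketch: for a given item, count how often other items occur in the same users' histories. The loan data and function name below are invented for illustration; production recommenders such as Amazon's, or the library systems discussed below, are considerably more sophisticated.

```python
# Minimal co-occurrence recommendation sketch: rank items by how many
# users' loan histories contain them together with a given item.
# The loan histories below are invented for illustration.
from collections import Counter

loans = {
    "user1": {"ISBN-A", "ISBN-B", "ISBN-C"},
    "user2": {"ISBN-A", "ISBN-B"},
    "user3": {"ISBN-B", "ISBN-D"},
}

def others_also_borrowed(item: str, top_n: int = 3) -> list[tuple[str, int]]:
    """Return the items most often borrowed by users who also borrowed `item`."""
    co_counts: Counter[str] = Counter()
    for history in loans.values():
        if item in history:
            co_counts.update(history - {item})
    return co_counts.most_common(top_n)

print(others_also_borrowed("ISBN-A"))  # [('ISBN-B', 2), ('ISBN-C', 1)]
```

Note that nothing here requires the user to do anything consciously: the recommendations fall out of ordinary borrowing behaviour, which is precisely why user tracking sits uneasily within the Web 2.0 concept.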
But the fact remains that users seem to react positively towards personalization, and have come to expect such functionalities from modern web services (Walker, 2010). Recommender systems such as Ex Libris' bX, which perform tracking-based functions similar to those found on Amazon, are becoming a common addition to library web services (Walker, 2010). In Denmark, the On_tracks conference (http://ontracks.dk/), which took place in November 2011 with the objective "to inspire the Danish research library community to increased use of user data in their services" (On_Tracks, 2011), and which included such speakers as Tim Spalding, the originator of LibraryThing, demonstrated the increasing importance of this tool for libraries in Denmark as well as internationally. Moreover, user tracking has been incorporated into bibliotek.dk in the form of an 'Others have borrowed' functionality (Holst & Berg, 2008).

2.1.5 Blogs, Forums and Wikis

As Cahill says, "Blogging was the first Web 2.0 tool really to hit the mainstream" (Cahill, 2009, p.3). Blogging represented the internet's move towards being a collaborative space, allowing, as it did, ordinary users to publish their own content, often taking the form of diaries or information bulletins. Forums are similar to blogs, except that where a blog tends to be an ongoing record of one person's or institution's content, forums allow a number of users to add their contributions under a specific topic. Thus, forums tend to be subject-based while blogs centre on an individual. Taking the blogging/forum idea further still is the Wiki. "A Wiki is a collaborative software tool that allows the creation of webpages which can be edited and modified by multiple users" (Cahill, 2009, pp. 7-8), the most famous collection of Wikis being the collaborative online encyclopaedia Wikipedia. Wikis have been used by libraries for such things as "sharing information, supporting association work, collecting software documentation, supporting conferences, facilitating librarian-to-faculty collaboration, creating digital repositories, managing Web content, creating Intranets, providing reference desk support, creating knowledge bases, creating subject guides, and collecting reader reviews" (Bejune, 2007, p.32). Today, if you need an answer to a question and rely on Google search for your answer, the chances are that Wikipedia will provide your first result, while endless results drawn from content contained within blogs and forums will follow swiftly after.

2.1.6 MySpace, Facebook, Twitter

Social networking sites are the runaway success story of the past few years by any standards. The rise of Facebook has reached such an extent that, in many countries, if you are not on Facebook your very existence might be called into question. This is no less true for libraries: "for libraries serving communities of at least 500,000 people, the ratio of those with a Facebook presence jumped from barely one in ten in 2008 (11%) to 4 out of 5 (80%) in 2010" (Lietzau & Helgren, 2011, p.iii). Because of this incredible rise in popularity, social networks represent a particular challenge as well as an opportunity for libraries. Unlike most other internet tools, where librarians often tend to be ahead of their users, Facebook represents one of an increasing group of tools (including Twitter) that go from unknown to ubiquitous before libraries, and indeed the academic world, have had a chance to fully absorb, analyse and assess the phenomenon.
The ups and downs, and social impact, of these applications are proving unpredictable and dramatic. MySpace, similar to Facebook in many ways and used by many libraries in the past, has seen a decline in recent years that puts its very relevance in jeopardy. As Clevenger et al. note, "[t]he subject itself is something of a moving target; social media and social networking are subject to rapid change, with new services and sites emerging each year (Twitter, Tumblr, Foursquare) and the popularity of some older ones eroding seemingly overnight (MySpace), while some never seem to get off the ground at all (Google Buzz)" (Clevenger et al., 2011, p.2). Twitter is perhaps the most recent of those applications that are still in the ascendant. It consists of messages of 140 characters or less, called tweets, that can be instantly posted to an individual's or institution's Twitter feed and thereby instantly inform all feed subscribers of the sender's message - as subscribers to a feed can number in the thousands (or over 20,000,000 in the case of popular media figures), this becomes a very powerful modern communication tool. Twitter's popularity is particularly bound up with the simultaneous rise of mobile technology and smartphones, from where such tweets can be instantly delivered as well as received. Like Facebook, this rise has been mirrored by its introduction to library services; in 2010, 68% of libraries in the US serving communities of at least 500,000 people were using Twitter (Lietzau & Helgren, 2011, p.14). According to Aharony (who reported on an analysis of Twitter in libraries in 2010), using Twitter enables libraries "to broadcast and share information about their activities, opinions, status, and professional interests [...] Twitter serves as a channel that keeps the libraries in touch with various patrons and enables them to provide their patrons with useful professional or personal information" (Aharony, 2010, p.347).

2.2 The academic library and user-generated data

The literature available concerning user-generated data in academic libraries is overwhelming. The search [("social tagging" OR folksonomy OR "user-generated data") AND ("academic library" OR "research library")] in Google Scholar returns over 400 papers from the past six years. Remove everything after the "AND" and this becomes 16,000 studies, the vast majority of which must surely stem from the field of Information Science. A search on LISA (Library and Information Science Abstracts) using the query ["social tagging" OR folksonomy OR "user-generated data"] returns 150 results from the same period. Clearly user-generated data is a subject that has been exercising the minds of librarians and library science a great deal recently. But the literature, as well as being abundant, is also very confused and diverse. More than is usual amongst information management techniques, ideas based on user-generated data, like social tagging, have implications that go far wider than their ability to manage knowledge; its marketing qualities and its ability to stimulate social inclusion being perhaps the most obvious of these (Mahmood & Richardson Jr, 2011). Thus, the very basis on which user-generated data is analysed and judged can be very different from study to study. Its very recency and dramatic success in certain spheres is another factor that adds to the confusion; discussion is often skewed towards isolated successes which may not necessarily be representative of the nature of social tagging as a whole.
Flickr, Delicious, LibraryThing and Amazon have all been subject to a good deal of generalized academic debate on tagging, yet each of these examples is very different from the next in nature, aims and use, and they are all very different from any user-generated data application that might be implemented by an independent library, academic or otherwise. The complexity and diversity of these issues make it hard to tease out any consistency and clarity in the research. Policy makers seeking to determine a Web 2.0 strategy for future library services are not helped by 400 articles all veering off wildly in different directions. In order to give a summary overview with some sense of direction I have chosen to present the literature within three generalized themes or phases. These phases form a historical progression which is somewhat artificial, as the viewpoints and empirical results I address in each can be found occurring at any point during the entire period this research covers. However, I believe these phases crystallize how the crux of academic opinion has shifted during this relatively short space of time, and if nothing else they give a more coherent narrative to an otherwise erratic body of literature.

2.2.1 Phase 1: Idealism

In 2004/5, social tagging was touted as being no less than the future of classification. Theoretical arguments were brought to bear to justify its superiority over traditional classification methods (Shirky, 2005). The tagging concept emerged, however, not out of theory but out of practice. Its success and value started to become apparent in 2003, when bookmarking sites like Delicious emerged and quickly grew in popularity and stature, but became undeniable after such major international successes as Flickr and YouTube (Vander Wal, 2007). The viewpoint of the idealists, therefore, though often cited in the academic literature, tends to be found in its original form in books and internet articles written by authors reacting to such obvious success. They were able to draw on philosophical and theoretical criticisms of traditional taxonomies in their championing of what they saw as a truly new socially-centred classification phenomenon which allowed everyone to give names to objects free from the constraints of tradition or other authoritative power structures. In 2004 Thomas Vander Wal coined the term folksonomy, a name that would help cement this idea of social tagging as a fundamentally new system; the people's taxonomy (Vander Wal, 2007). According to Vander Wal, "[t]he value in this external tagging is derived from people using their own vocabulary and adding explicit meaning, which may come from inferred understanding of the information/object. People are not so much categorizing, as providing a means to connect items (placing hooks) and provide their meaning in their own understanding" (Vander Wal, 2007).

Clay Shirky, a writer and consultant on Internet technologies, wrote a much-cited internet article entitled "Ontology is Overrated: Categories, Links, and Tags" in 2005 (Shirky, 2005). Here he built the case against traditional classification. "We are moving away from binary classification" (Shirky, 2005), Shirky claimed - by which he meant the placing of objects into mutually exclusive categories. "The signal benefit of these systems is that they don't recreate the structured, hierarchical categorization so often forced onto us by our physical systems.
Instead, we're dealing with a significant break -- by letting users tag URLs and then aggregating those tags, we're going to be able to build alternate organizational systems" (Shirky, 2005). Shirky asserted that social tagging was "a radical break with previous categorization strategies, rather than an extension of them" (Shirky, 2005), but such statements were hard to justify even then, given that social tagging inherited those very traditions being rejected, via the minds of the taggers themselves. Taggers would almost inevitably replicate in some way the taxonomic categories they had grown used to when attempting to give their own names to things.

From 2005 onwards a great many studies and experiments were performed to see how tagging really performed against existing taxonomies. In 2007 Spiteri looked at tags from the popular bookmarking sites Delicious, Furl and Technorati and matched them against the National Information Standards Organization's (NISO) guidelines for the construction of controlled vocabularies (Spiteri, 2007). She concluded that tagging suffered from many of the issues commonly associated with uncontrolled vocabularies, but that 'most' of the tags did, nevertheless, conform to the NISO guidelines. Spiteri concluded that if more attention was paid to providing guidelines to taggers, social tagging "could serve as a very powerful and flexible tool for increasing the user-friendliness and interactivity of public library catalogues and may be useful also for encouraging other activities, such as informal online communities of readers and user-driven readers' advisory services" (Spiteri, 2007). Such 'guidelines', however, were hardly in tune with the idealists' view of tagging as a free, democratic, non-authoritative system of classification.

In 2009 Heymann & Garcia-Molina compared LibraryThing tags and Library of Congress Subject Headings (Heymann & Garcia-Molina, 2009). They were interested in analysing the equivalence between keywords in both these systems, and establishing whether or not they were being used in the same way, i.e. were they being applied to the same objects. They found a significant overlap amongst the keywords themselves, 50% of the LCSH headings having identical LibraryThing tags, with the rest having somewhat "similar, though not exactly equivalent, tags" (Heymann & Garcia-Molina, 2009, p.4). Thomas et al. also compared LibraryThing tags with LCSH headings in 2009. They found that taggers primarily tagged for their own purposes and that it took "a larger number of tags [...] to clearly separate out the high-level descriptors from the noise of single use personal tags" (Thomas et al., 2009, p.430). However, they also found that a significant amount of valuable information could be found within the tagging data. In particular they point to their finding that 35% of tags were relevant terms unrecognised by LCSH. Social tagging in LibraryThing therefore had the potential to provide useful additional metadata that had been overlooked by expert classifiers.

Chan & Yi considered the possibility of using LCSH data and structures to provide a surrogate hierarchical structure for social tagging data by word matching between the two systems (Chan & Yi, 2009). In this study they chose to compare the LCSH with tags taken from Delicious. They extracted the tag histories of 4,598 pages, resulting in 388 unique tags. They found a 61% exact match of tags and LCSH headings.
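As an illustration of what such an 'exact match' comparison involves, the sketch below computes the percentage of tags that literally coincide with a subject-heading term after minimal normalisation. The sample terms and the lower-casing rule are my own assumptions for illustration, not Chan & Yi's actual procedure.

```python
# Minimal sketch of measuring exact-match overlap between user tags and
# subject headings. The sample terms are invented, and lower-casing is the
# only normalisation applied; a real study would also have to handle
# plurals, abbreviations, word order, and compound headings.

tags = {"Libraries", "ajax", "web2.0", "python"}
headings = {"Libraries", "Ajax (Web site development technology)", "Python (Computer program language)"}

def exact_match_rate(tags: set[str], headings: set[str]) -> float:
    """Fraction of tags identical to some heading, ignoring case."""
    normalised = {h.lower() for h in headings}
    matches = sum(1 for t in tags if t.lower() in normalised)
    return matches / len(tags)

print(f"{exact_match_rate(tags, headings):.0%} exact match")  # 25% on this sample
```

The gap between such raw string matching and genuine semantic equivalence is exactly where the problems listed next arise.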
Thus the matching was somewhat imperfect, mainly due to reasons which they list as: "specific technology-related tags" which did not exist in the LCSH headings; "inconsistent forms", a commonly recognised problem of tagging systems (Spiteri, 2007); and "incompatibility between word forms such as abbreviations or acronyms" (Chan & Yi, 2009, p.897). Nevertheless, they concluded that "the linking of two such resources is valuable in that it can integrate the views of both the users and systems in indexing and information organization" (Chan & Yi, 2009, p.897) and that such matching still had 'great potential' if these discrepancies could be 'normalized'.

In order to assess the potential of actually importing tags from LibraryThing into a large academic library catalogue, DeZelar-Tiedman compared the LibraryThing tags and the corresponding LCSH headings from a random sample of 383 titles taken from the University of Minnesota Library's catalogue (DeZelar-Tiedman, 2008). By looking at the overlap between the two datasets she found that the tags added little to the quality and quantity of the metadata except where the titles were recent, popular or literary works. Her ultimate conclusion, then, was that LibraryThing tags were "not a strong source of enhanced metadata for large academic libraries" (DeZelar-Tiedman, 2008). But she added that user tags reflecting 'opinion', which would never have formed a part of traditional classification systems, have some value: "Some 'personal' terms might actually be useful to a wider audience. Readers may gravitate toward a book that many have tagged as a 'favourite' or may want to read something 'funny' or discover which books were discussed on a particular television program." (DeZelar-Tiedman, 2008).

2.2.2 Phase 2: Pragmatism

The tagging vs. taxonomy debate did not end with one system supplanting the other. Social tagging did not become the new taxonomy. But what was becoming clear was that tagging comprised a new and useful tool which could be utilised, where appropriate, in addition to the existing tools information scientists already had at their disposal. Take, for example, this observation from Voss made in 2007: "The astonishing popularity of tagging led some even [to] claim that it would overcome classification systems, but it is more likely that there is no dichotomy and tagging just stresses certain aspects of subject indexing" (Voss, 2007, p.3), or this from Smith in 2008: "Three years ago some pundits suggested that folksonomies might replace IA altogether. Today we're seeing tags, taxonomies and facets intermingling to create new and valuable information structures" (Smith, 2008, p.14). While it had been shown that tagging could only imperfectly resemble a taxonomy (DeZelar-Tiedman, 2008; Chan & Yi, 2009; Thomas et al., 2009; Heymann & Garcia-Molina, 2009), it had also been shown that it could enhance existing classification systems through its dynamic ability to extract terms based on contemporary and expert knowledge (Chan & Yi, 2009; Thomas et al., 2009). Moreover, it had been shown that it could enhance existing classification systems via the addition of a new facet - 'opinion' (DeZelar-Tiedman, 2008; Holst & Berg, 2008); a valuable addition when viewed in the light of such successes as Amazon's reviews and ratings system. And tagging was not the only way opinion could enter the library data store. Other aspects of user-generated data such as ratings, comments and user tracking could also provide valuable data on user opinion.
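Ratings are the simplest of these opinion signals to aggregate. Below is a minimal sketch, with invented data, of how individual scores might be rolled up into the kind of summary figure an Amazon-style catalogue record displays.

```python
# Minimal ratings-aggregation sketch: reduce individual user scores to the
# average-and-count summary a catalogue record might display.
# The ratings below are invented for illustration.

ratings = {"rec42": [5, 4, 4, 3], "rec43": [2]}

def summarise(record_id: str) -> str:
    """Render a summary such as '4.0 out of 5 (4 ratings)'."""
    scores = ratings.get(record_id, [])
    if not scores:
        return "No ratings yet"
    return f"{sum(scores) / len(scores):.1f} out of 5 ({len(scores)} ratings)"

print(summarise("rec42"))  # 4.0 out of 5 (4 ratings)
```

Even this trivial reduction adds a facet - aggregate opinion - that traditional cataloguing never recorded.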
The popularity and ubiquity of social networks, coupled with the explosion of data, described by Borgman as the data deluge (Borgman, 2007), particularly from new media, demanded that library services seek new, more socially inclusive ways of managing their growing collections. Thus, the pragmatic approach sought to discover how user-generated data might be integrated into library web services without destabilising the integrity and the reliability of existing systems. The emphasis was on understanding these swiftly changing phenomena and understanding how to integrate them affordably and effectively.

One method whereby academic libraries can implement user-generated data into their web services is to programme a tailored system in-house. In a time of decreasing funds and increasing speed of technological change this is perhaps the option that is furthest from most libraries' reach, but those libraries that succeed could of course then go on to act as models of best practice, perhaps even to the extent of being able to package the solution and transfer it directly to institutions with a similar need. One of the first, and perhaps the most famous, examples of this was executed by John Blyberg at the Ann Arbor District Library in Michigan, US, in 2007 (Blyberg, 2007). Using Drupal, a free, open-source content management software application, Blyberg built a module on top of the existing catalogue of the Ann Arbor District Library which gave users the "ability to rate, review, comment-on, and tag items" (Blyberg, 2007). This information, stored in a separate repository, could then be made available through the same interface to all users together with existing data from the catalogue. He called his solution the SOPAC - the Social OPAC - and subsequently continued to work on the idea, keeping the project open-source and portable to other libraries who may want to incorporate the SOPAC solution with their own ILS. Though no longer with Ann Arbor Library, Blyberg continues to develop the module that is now SOPAC 2.0 (http://thesocialopac.net/). Blyberg is, however, circumspect about the successes of social tagging: "SOPAC was by-and-large a success, but its use of user-contributed tags is a failure. For the past nine months, the top ten tags have included "fantasy", "manga", "anime", "time travel", "shonen", "shonen jump", and "shape-changing". As a onetime resident of Ann Arbor, I can assure you that these are not topics that dominated the collective hive mind" (Blyberg, 2008).

In 2011 Anfinnsen et al. developed a social tagging based system with which they could conduct an experiment at Brunel University's library in the UK (Anfinnsen et al., 2011). After an implementation period that was for testing purposes only, they conducted a user survey to gauge reaction to the system. They found it to have been a great success among users. The system, which was programmed in Ajax and designed to meet a requirements specification obtained via meetings with domain experts and a focus group of students, included opportunities to see the tags applied to individual objects, as well as navigating to content via tag clouds and lists of tags. An interesting inclusion in their experiment was the use of colour-coded tags: different colours would indicate three different levels of user; students, PhD students and academic staff. The post-experimental survey was conducted via questionnaires among library stakeholders. All but one expressed a preference to see the test system fully implemented. Their experiment, they say, "has proven that there is a demand for tagging system in the library at Brunel University" (Anfinnsen et al., 2011, p.69).
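The architectural pattern SOPAC exemplifies - user contributions kept in their own repository, keyed to catalogue record IDs, and merged with the bibliographic data only at the interface - can be sketched in miniature as follows. The schema, names and sample data are my own illustrative assumptions, not SOPAC's actual design.

```python
# Minimal sketch of the 'separate repository' pattern: user-generated data
# lives in its own store and is merged with the authoritative catalogue
# only when a record is displayed. Schema and data are invented.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE user_data (record_id TEXT, kind TEXT, value TEXT)")

# Users rate, review and tag without ever touching the catalogue itself.
db.executemany(
    "INSERT INTO user_data VALUES (?, ?, ?)",
    [("rec42", "tag", "fantasy"), ("rec42", "rating", "4"), ("rec42", "review", "Great read")],
)

catalogue = {"rec42": {"title": "Some Title", "lcsh": ["Fantasy fiction"]}}  # stand-in for the ILS

def display_record(record_id: str) -> dict:
    """Merge catalogue data with user contributions at read time."""
    merged = dict(catalogue[record_id])
    rows = db.execute("SELECT kind, value FROM user_data WHERE record_id = ?", (record_id,))
    for kind, value in rows:
        merged.setdefault(kind + "s", []).append(value)
    return merged

print(display_record("rec42"))
```

Keeping the two stores separate means the user data can be moderated, pruned or discarded without ever compromising the authoritative catalogue - the same motivation that underlies the hybrid systems discussed next.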
The hybrid idea, of maintaining two individual sets of data which come together in the user interface, has been a popular way of merging user-generated data with the traditional library catalogue without allowing the one to 'pollute' the other. Penntags is a system developed by librarians at the University of Pennsylvania. It is similar to the website Delicious in that it is designed as a social bookmarking system to aid the user in bookmarking online resources. It includes many of the facets that we have come to expect from such systems, such as tag clouds. The difference between Penntags and Delicious, however, is that Penntags was specifically designed to be used in an academic library setting (Sweda, 2006), and as such is structured to better aid the academic process. As it allows users to bookmark and tag online resources, and as the part of the library catalogue that is available in an OPAC is effectively an online resource, the upshot is that Penntags acts as an extra layer of navigation on the library OPAC. Penntags also has the functionality of being able to pool resources into projects, the advantages of which in an academic setting are clear: "Users can search the open Web and library products and compile an easy to manipulate, easy-to-share collection of resources (reference guides, course materials, homework assignments)" (Sweda, 2006, p.32).

Chalon et al. made a study of what they called OPAC 2.0 systems (they do not claim to have originated the term) (Chalon et al., 2008). OPAC 2.0 is a merging of the standard OPAC systems already used by libraries for their digital catalogues with the collaborative and participatory functions of Web 2.0 applications. They made a survey of 70 unique systems which included OPAC functionalities and found 14 that included aspects allowing them to be considered OPAC 2.0 systems, namely that they provided one or more of the following: the ability for users to comment on books, the ability for users to submit ratings, the ability for users to submit tags, or the ability to make book suggestions based on user behaviour. They provide a list of these systems in their article together with the functionalities of each. In addition they tested out one such system at their own medical library, whose user group consisted of 35 'experts', 30 of whom became their subjects for the study. They found that about half of their subjects judged the comments function useful, but only 17 comments were made in a four-month period. The ratings functionality received even less acceptance and use, very few of the subjects recognising what it was for. Although two thirds of the subjects considered the tagging function useful, very few actually submitted a tag, and the response to the book suggestions function was a similar story. Their overall conclusion was that their users were "still 'Users 1.0'. They surf regularly and use information from the Web, but most of them are not publishers" (Chalon et al., 2008, p.5). Despite these findings Chalon et al. were positive about the implementation and insisted that going forward was a matter of learning from users exactly what adaptations to make to the system.
They seem to be calling for targeted implementations based on a deep local, contextual understanding of the situation when they say: "Knowing users, their needs and their skills, their interest in those functionalities is still the most important aspect to investigate before implementing new tools. Are users really willing to participate? Do they have the needed skills? Do they have enough time? Are they supported to do it?" (Chalon et al., 2008, p.6). Chalon et al. also outlined the three main ways that such implementations can technically be handled. Libraries can: use a Content Management System (CMS) that gives them the possibility of integrating the catalogue into a CMS-driven front-end interface; use a standalone OPAC 2.0 that can be implemented and used alongside an existing Integrated Library System (ILS); or, and this they see as the most ambitious option, develop the ILS itself via, for example, the integration of modules that provide the "2.0" functionalities.

One way a library can utilize the huge international data stores generated by the social tagging site LibraryThing is by implementing LibraryThing for Libraries (LTFL). Mendes et al. describe their attempt to integrate LTFL into the website of the California State University, USA (Mendes et al., 2009). LTFL, as Mendes et al. explain, is "a service that displays user-generated metadata from the "social-cataloging site" LibraryThing (www.librarything.com/) in an individual library's catalog (LTFL, 2008). The LTFL data consists of three types: tags, recommendations, and links to other editions and translations of works" (Mendes et al., 2009, p.32). At the time of writing (the article, though published in 2009, was actually written in 2008) there were 68 libraries using LTFL, 25 of which were academic libraries; in 2009 this had risen to 139, 53 of which were academic libraries (Santolaria, 2009, p.9). The total number of libraries now (in 2012) stands at 385 (LibraryThing, 2012), though unfortunately I have no details on how many of these are academic libraries. Mendes et al. describe the method whereby a library can implement LTFL: by exporting the ISBN records of its collection and uploading these to an LTFL account page; on the account page the library can then configure how the LTFL metadata should be displayed and export a piece of code from LibraryThing that it can incorporate into its own site. ISBN data then has to be periodically exported and uploaded over time to keep the merged data up to date. In essence, once this integration procedure has been performed the library has a self-contained module that can be added anywhere on its website, allowing users to browse the catalogue using LTFL instead of browsing via the standard catalogue system.

In the test implementation at California State University they found that LibraryThing could match 46% of their 471,885 ISBN records. They placed the front-end navigation module at the bottom of their website. Over a period of 170 days they then tracked the use of the LTFL module compared with the use of the standard catalogue facilities. They also compared tags from a sample of 21 non-fiction books. The results of their study were largely negative. Use of the LTFL data turned out to be very low. However, they point to a number of potential extenuating circumstances that could have contributed to such a low statistic: "many users were not initially aware of this new feature in the catalog; no formal promotion campaign for LTFL was conducted.
The time period analyzed also included the traditionally lower-use periods of spring break, semester breaks, and summer terms (17 May-24 August), and ends with data covering the three-day Labor Day holiday weekend". Moreover, they point out that the positioning of the functionality may have made it difficult or even impossible to see without scrolling down the page. As for their tag/LCSH data comparison, they found that for every one book a user discovers via LCSH headings they will discover four using LTFL, but, as they point out, they have no information about the relevancy of these discoveries. Nevertheless, they say, user-generated data does enhance resource discovery, especially for books with fewer, or no, subject headings, such as fiction books; tags also helped to place resources in genres, as well as reflecting the language of the users themselves.

2.2.3 Phase 3: Realism

One of the main themes that emerges from many of the aforementioned studies is the extent to which the user response to user-generated data implementations cannot simply be taken for granted. This has been an enduring issue with Web 2.0 tools throughout their history; a two-year-old forum on your website that has no posts looks far worse than having no forum at all. But for library services a lack of user interest can mean a lot more than simply reflecting badly on the website; it can mean that users are no longer engaging with their services at all. Thus, much attention has been placed on encouraging the use of user-generated data implementations. The realist phase recognizes that user-generated data applications will inevitably form a part of the future library service - but asks questions such as: which applications are relevant to users, and how long will they remain relevant?

In 2009 Xu et al. made a survey of the websites of 81 academic libraries in New York State, US, to establish which Web 2.0 tools were being utilised in each (Xu et al., 2009). They found that 58% of libraries had not yet introduced any Web 2.0 tools at all. Of those that had, Instant Messaging was by far the most used application, being used by 34 institutions, while blogs were the next most popular at 20. Tagging was only used by 6 libraries. They found no examples of users being asked to submit tags to the OPACs; instead tags were being used as, for example, an extension of the library blog, allowing users to tag blog entries. They found an equally small number of libraries using Wikis. They cite one library as integrating Wikis with the blog as part of a student learning platform. Most, however, were using the application on their websites with no attempt to facilitate, advance or direct its use. They found only four libraries using Facebook and MySpace. They conclude with a model for what they call the Academic Library 2.0 and suggest that the near future will see the concept grow in importance and variety of technology. They do, however, concede that utilisation of these technologies is currently patchy and varied, and that users are not as quick to embrace the new functionalities as the librarians themselves. "The lag in user participation in the Library 2.0 movement perhaps will be minimized or eliminated when users become more Web 2.0 savvy and the Library 2.0 platform matures" (Xu et al., 2009, p.328). A similar study was performed by Mahmood & Richardson Jr in 2011 (Mahmood & Richardson Jr, 2011).
They surveyed the websites of 100 academic libraries in the USA taken from the Association of Research Libraries' (ARL) membership list. Each website was surveyed against an extensive list of Web 2.0 applications. To briefly summarise some of the most relevant results: RSS was the most popularly employed Web 2.0 application, the second most popular being Instant Messaging; 89 libraries used Facebook (a significant jump from the previous study by Xu et al.); there was no mention of MySpace this time, except perhaps in that they say "[a] few also used other social sites" (Mahmood & Richardson Jr, 2011, p.369); 86 were using blogging software; 85 libraries used Twitter "to share news and announcements" (Mahmood & Richardson Jr, 2011, p.369); 55 used social bookmarking or some other form of tagging (some libraries actually incorporating tagging into their OPACs); 'a few' used Delicious; 47 used Flickr "for sharing pictures of events" (Mahmood & Richardson Jr, 2011, p.370); 17 used Slideshare (a site for sharing PowerPoint presentations); 10 had a presence in the virtual world of Second Life; and 40 were using Wiki applications, though mainly for managing resources amongst staff, such as committee minutes, procedures, etc. According to the authors the results demonstrated an "overwhelming acceptance of various Web 2.0 tools in large academic libraries of the United States" (Mahmood & Richardson Jr, 2011, p.372), which would seem to validate the projections made by the aforementioned study only two years previously.

For their Masters thesis, Holst and Berg examined the use of user-generated data in two Danish OPACs, bibliotek.dk and Rex (Holst & Berg, 2008). Bibliotek.dk is a national internet service in Denmark, hosted by the state-run Dansk BiblioteksCenter (DBC), which provides a single shared catalogue for all Danish public libraries. Rex is the online database of the Royal Danish Library, providing access to catalogues and digital resources including paintings and photographs. They analysed the tags, comments, ratings and user tracking functionalities of these systems, as well as conducting three semi-structured interviews with experts and focus group discussions with users. In Rex they found that 78% of the data submitted by users consisted of new information not found in the metadata obtained via traditional indexing, while in bibliotek.dk it was 64%. But for search purposes this data proved to be largely unusable, as there was as yet no way of integrating the tags and comments with the search terms submitted by experts. They also found that bibliotek.dk's "Other users have borrowed..." functionality currently lacked enough data to render it significant. The expression of 'opinion', which they judged could categorise the overall majority of the data in bibliotek.dk (though virtually none in Rex), was a dimension that did not already exist in the database and was therefore an enrichment of traditional indexing. Participants in the focus group interview suggested that tagging semester literature would be a valuable additional functionality. However, they also cite their focus group as being sceptical of data submitted by other users, preferring the security of "serious, objective, qualitative data" (Holst & Berg, 2008, p.117). Examination of the data showed that users were, nevertheless, serious and constructive when submitting data; they found no attempts at vandalism.
The authors concluded that there were three ways in which user-generated data could enrich the OPAC: 1) it could enhance and enlarge the existing indexing; 2) it could give users the possibility of personal information management (PIM); and 3) it could encourage collaborative, social data exchange (Holst and Berg, 2008, p.116). They point out that even if few users submit data, that data can still have great value for the individual users using the system.

One of the features of the realist phase is an increasing interest in user tracking. In 2012 an article in the Danish journal REVY asked the question: “How can the Danish Research Libraries stimulate Web 2.0 activity?” The article reported on some examples taking place in the UK, examples which seemed to suggest the answer was largely a matter of making use of ‘activity data’, i.e. user tracking (Skovgaard Jensen, 2012).

One of the ways user tracking can be combined with a more visibly active encouragement of the users is through gamification. Gamification is a Web 2.0-related concept that involves applying mechanisms more commonly found in games to non-game activities. A very basic example of gamification can be found when forum members are given increasingly lofty-sounding titles relative to the number of forum posts they have made, going, for example, from Newbie to something along the lines of Extreme Super Power User. The idea is that gaining these titles encourages users to make posts, as well as encouraging them to ‘behave’ and make valid contributions, lest their title be stripped from them again.

An example of gamification being tested in a pioneering way in the academic library can be found, once again, at the University of Michigan in Ann Arbor. Markey et al. conducted a study there to assess whether “undergraduate students will play games to learn how to conduct library research” (Markey et al., 2009, p.303). Their overall conclusion was that, yes, students could be encouraged to play games, and that this process would indeed be beneficial to the research process and the overall aims of their course. They had reservations, however, about the particular game they had designed for the experiment, and, should they move on from the test phase, they expressed an intention to design a different game that, unlike the test game, was “a pervasive and unobtrusive presence beside the online tools and collections students use to research, write and document a writing assignment” (Markey et al., 2009, p.311). Again, this result highlights how such techniques suggest beneficial uses but are very difficult to get right in practice, and are often highly context-dependent.

Another example of gamification can be found at Huddersfield University Library in the UK, whose staff are involved in a number of experiments to discover new ways of utilising activity data, as noted in the aforementioned REVY article (Skovgaard Jensen, 2012). In one such example, Andrew Walsh, an academic librarian at the library, looked at data kept by the university and the library and found a strong correlation between library use and academic achievement. He found that students who got a 1st class degree accessed the resources of the library on average three times more often than those that got a 3rd class degree (Walsh, 2011).
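Mechanically, reward schemes of this kind usually reduce to a simple mapping from logged activity counts to an ordered ladder of titles or badges. As a minimal illustrative sketch in Python (the titles, thresholds and function name below are invented for illustration, not taken from any of the systems discussed in this thesis):

```python
# Hypothetical gamification ladder: activity counts are mapped to titles.
# All titles and thresholds are invented for illustration only.
TITLE_LADDER = [
    (0, "Newbie"),
    (10, "Regular"),
    (100, "Power User"),
    (1000, "Extreme Super Power User"),
]

def title_for(post_count: int) -> str:
    """Return the highest title whose threshold the user has reached."""
    earned = TITLE_LADDER[0][1]
    for threshold, title in TITLE_LADDER:
        if post_count >= threshold:
            earned = title
    return earned

# Because the title is recomputed from the current count, a user whose
# posts are deleted (e.g. moderated away) can have a title 'stripped'
# again - the behavioural incentive described above.
assert title_for(9) == "Newbie"
assert title_for(150) == "Power User"
```

The point is not the code itself but the incentive structure: because status is derived entirely from recorded behaviour, the same pattern extends naturally to library activity data of the kind described next.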
To encourage students to use the library and its resources more, Walsh therefore set up the game Lemon Tree in collaboration with ‘Running in the Hills’, who are behind a project called librarygame (http://www.librarygame.co.uk/). With Lemon Tree, once users have registered a connection, the system detects all their library-related activity and awards points and badges based on certain behaviours; members get points for leaving reviews, for taking books out, for coming into the library at the same time as friends, for logging on to the electronic resources, etc. There are also plans for more targeted functionalities, such as awarding points for clicking on every link of a particular course reading list. In addition, they have managed to integrate the application with Facebook, increasing its reach and binding it more closely to the social network spaces already frequented by the students.

3 Problem Statement

The literature review has shown how copious and diverse the literature regarding user-generated data applications and concepts is. Despite such diversity, or perhaps because of it, the issues remain clouded in complexity. The literature does, however, appear to form a consensus that user-generated data concepts and applications are tools that must be understood and utilised by the modern academic library. Anecdotal evidence appears to show that such utilisation is not common in Danish libraries. There has been little research into academic librarians’ knowledge and opinions of user-generated data, and even less information exists on the extent to which Danish academic libraries are making use of user-generated data related applications and concepts. As success in utilising and understanding user-generated data appears to be a prerequisite for the modern academic library’s success, such information would have high value for policy makers and for the individual library institutions themselves.

4 Research question

Apart from the review of the literature, the preliminary research for this study included: browsing the internet sites of Danish academic institutions and other related organisations, such as DEFF, Denmark’s Electronic Research Library (Danmarks Elektroniske Fag- og Forskningsbibliotek); telephone discussions and email correspondence with key academic librarians well placed to have better-than-average knowledge of the status of user-generated data use in Danish academic libraries; and a face-to-face interview with one of these key experts. During this preliminary research it became clear that it might be difficult to find many thoroughgoing examples of the use of user-generated data in Danish academic libraries. For this reason I decided to split the research into two parts. Instead of making a detailed review of user-generated data use in Denmark, as had been my original plan, I would first test the hypothesis that there were only very few examples, and then assess in a more generalised way the level of use and the nature of those implementations which did exist. The second aspect of the study would then concentrate on the academic librarians themselves and the knowledge and opinions they held. In this way I hoped that, even if there really did turn out to be a dearth of implementation examples, I would still obtain data that could tell us something of value and interest about the Danish experience. The research questions for this study are therefore:

1. How is user-generated data being utilised in Danish academic libraries?
What examples exist nationally?

2. What knowledge and opinions are held by Danish academic librarians about user-generated data applications and concepts?

Answering both these questions should provide a useful indication of how the reality on the ground in Danish research libraries matches the level of diversity, interest and enthusiasm that exists in the literature. Useful, because if the theory and the examples cited in research are not being reflected in practice, then it is important to recognise that this is the case and to analyse why. The knowledge supplied by those closest to the work of the Danish academic libraries, the academic librarians themselves, should be an excellent starting point for making broader conclusions about the Danish experience. The second question should go some way towards providing a richer, more contextual basis for assessing just why and how the Danish experience manifests itself in the way it does than a simple round-up of applications and projects would be capable of doing.

4.1 User-generated data

As can be seen in the literature review, the majority of studies within this field tend to fall into the categories of Web 2.0 studies or social tagging studies. I find it much more problematic to separate social tagging from such things as Wikis, reviews and ratings (or even user tracking and the involuntary, de facto rating that a user trail confers on the material being browsed) than to separate it from those aspects of the Web 2.0 toolbox which do not result in the creation, storage and subsequent utilisation of data. Those aspects which are more a matter of information being pushed out by the institution to the user, as depicted in Figure 1, such as via SMS, RSS and websites, or of a direct two-way communication process, such as instant messaging or Skype, seem to me to be quite different in nature. It is for this reason that I choose to focus this study not on Web 2.0 applications but on user-generated data. Of course any such separation is artificial, and any user-generated data strategy pursued by a library is likely to form part of a broader Web 2.0 strategy, so I cannot ignore the wider picture altogether; user-generated data will, however, provide the focus of attention.

It is necessary to be careful when defining the term user-generated data. The definition must include the use of data submitted by users other than the library's own patrons, as would be the case when, for example, a library makes use of international tagging repositories such as LTFL. It must also include the use of data that has not necessarily been freely given, for example user tracking data. Thus, in the research questions and the resulting survey, I take user-generated data to mean: any data utilised by the library that is submitted via the internet by non-librarians, including the applications, interfaces and functionalities that allow users and librarians to make use of such data.

4.2 The academic library

I view academic libraries as forming a large national body of libraries that is a distinct and coherent enough unit to separate it from the public lending libraries. One or other of these groups, ‘the academic library’ or ‘the public lending library’, was therefore going to be the focus of my study. I chose academic libraries because the aspects of user-generated data that interest me most are those concerned with its ability to enrich the library catalogue. The aspect I was least concerned to analyse was its ability to promote the library services.
(I should declare that this interest may well form the basis of a bias in precisely which questions I chose to ask in the survey. It has almost certainly reduced the attention given to some associated Web 2.0 applications.) I took the stance that public lending libraries might be more disposed than academic libraries to employ user-generated data applications because of their promotional qualities, given the greater need for general libraries to attract new users from a more diverse population. This is a presumption, of course, and not one that was necessarily borne out by my results; to truly test the assumption I would have had to survey both groups.

For this reason, you might say, why not narrow the research even further and only look at academic libraries that conduct their own research, or that serve a single discipline, or that have a distinct, reasonably small community of students? Such institutions would have been of great interest to me. However, limiting myself to such narrow definitions would have caused great problems when determining my national sample: I would have had to be much more aware of precisely what each of these institutions did and how they were placed in the national picture. In the end, given that I could not build up such an image of Danish academic libraries in the short space of time available, I decided to broaden the set to include any academic library.

But what, then, is an academic library, and how is it distinct from (or similar to) a research library? I have used the terms academic library and research library in a very broad and interchangeable sense throughout the planning, literature search, survey and now writing stages of this project, so the following is an attempt to justify that lack of clarity. When the NCES surveyed academic libraries in 2003, they defined the academic library as "an entity in a postsecondary institution that provides all of the following: An organized collection of printed or other materials, or a combination thereof. A staff trained to provide and interpret such materials as required to meet the informational, cultural, recreational, or educational needs of clientele. An established schedule in which services of the staff are available to clientele. The physical facilities necessary to support such a collection, staff, and schedule. This definition includes libraries that are part of learning resource centers” (Tabs, 2003). This is quite a broad definition. Webster’s online dictionary, by contrast, defines a research library simply as “a library which contains an in-depth collection of material on one or several subjects. A research library will generally include primary sources and related material and can be contrasted with a lending library.” There is very little that clearly separates these two definitions, other than that Webster’s appears somewhat narrower. Webster also offers a secondary definition, cited as being specifically relevant to the discipline of art (although I cannot ascertain why, as the definition seems to work fine for any discipline): "A library containing a comprehensive collection of materials in a specific field, academic discipline, or group of disciplines, including primary and secondary sources, selected to meet the information needs of serious researchers [...] The primary emphasis in research libraries is on the accumulation of materials and the provision of access services to scholars qualified to make use of them.
In most large research libraries, access to collections containing rare books and manuscripts is restricted", which seems to broaden the possibilities out again. In an article highlighting the difficulties involved in trying to pin down exactly what is meant by a research library, Stam describes the mountain of correspondence amassed after someone sought an answer to the question of a library’s eligibility as a member of the Association of Research Libraries in the US (Stam, 1992). At one point a correspondent writes, “My private opinion is there’s no such thing as a research library;… If you will compare the definitions of the word ‘research’ in Murray’s ‘Oxford Dictionary’ and Webster’s ‘New International’, you can even make out a case for your own office reference shelf as a research collection” (Stam, 1992, p.3). After much more toing and froing, another correspondent writes, “I referred the matter to [...] a member of our library staff at Harvard, who is interested in library terminology. Here is a copy of his report [...] Attached were five pages of manuscript notes, starting with a referral to the editor of the dictionary of library terminology [...], continuing with an admission that American library terminology is vague, and ending with a comparison to the German concept of the ‘wissenschaftliche Bibliothek’” (Stam, 1992, p.4).

I therefore propose that, while research library means different things to different commentators, it can at least usually be considered a subset of the academic library. Because of this I prefer to use the term academic library in this thesis, where possible, as my own definition is also a broadly inclusive one, which could be defined here as encompassing: any non-commercial library that has a specifically educational raison d’être and serves a post-16 population. However, my sense was that Danes tend to talk of forskningsbiblioteker rather than akademiske biblioteker (forskning meaning, literally, research). For this reason I refer to forskningsbiblioteker in the actual Danish survey, and consequently, where I have translated the questions and discussed them for this thesis, I have translated forskningsbiblioteker as research library; hence the mixture of terms throughout the thesis. I have to accept, though, that my interchangeable definition is not necessarily everyone else’s, and thus I cannot dismiss the possibility that my respondents’ understanding of the term forskningsbiblioteker was different from mine. If this was the case, it does present a problem for the validity of the resulting data.

5 Research design

To answer the research questions outlined in the previous section I decided to pursue a survey questionnaire research design. My aim was to gain the broadest possible picture of user-generated data use across Denmark, rather than to analyse individual implementations in detail, and a survey approach seemed the one most likely to deliver this in the short time frame that was available. Being a UK citizen who had only recently become part of the Danish educational system, I felt that I lacked an insider’s generalised knowledge of the make-up and distribution of the Danish academic libraries. The risk of selecting certain areas of study while completely ignoring other potentially useful avenues was high. Thus, a broad sweep of all potential institutions was the best way of making sure my lack of insider knowledge did not lead me down niche and perhaps less relevant paths.
I could have chosen to conduct the survey as a structured face-to-face interview, or to use it as the basis for a series of telephone interviews; doing so would have allowed me to explore the open-ended questions in more detail. I ruled out the possibility of face-to-face interviews, however, as I needed a national picture, and the travel that interviews would have made necessary was not possible. A second reason, and the reason I chose not to do telephone (or Skype) interviews either, was my level of proficiency in the Danish language. While I can legitimately call myself fluent in Danish, I often miss the details and nuances that a more proficient or native speaker would catch. In the course of the preliminary research I had a number of discussions on the telephone, as well as face-to-face discussions, and during such exchanges I was often aware that I was listening, and asking questions, in broad sledgehammer terms, waiting for confirmations and denials rather than delving into the detail. Detail is difficult when you are simultaneously trying to keep on top of the language and conduct an interview. Of course conversations can be taped, transcribed and analysed more deeply at a later date, but incisive questioning is a direct response to the dynamic of the conversation at the time. Thus it seemed to me that the level of benefit I would gain by conducting the survey in person was outweighed by the control, and reach, I would obtain by conducting the survey remotely, on paper or computer screen. It also meant that I was able to reach far more respondents with the time and resources at my disposal than would otherwise have been the case.

5.1 Language

I chose to conduct this survey in Danish. I should declare that it was also my original intention to conduct and write the entire thesis in Danish. However, as noted above, during the course of the research it became clear to me that language issues were reducing my ability to conduct sound research. This was not just an issue with my spoken Danish; it also turned out to affect the written components of the study. Not being fully confident in my Danish meant that I could never be certain that I had worded the survey in an appropriate way. By the point of sending out the questionnaires I was satisfied that all the questions were straightforward and understandable. Yet I found that 11% of my respondents were confused by some of the questions, and one even noted that a ‘proof-read might have been helpful’. I found myself wondering whether 1) I had simply made erroneous presumptions about how clear these types of questions would be to this type of audience, or 2) I was making errors of judgement that were language-based, or 3) it was just natural to have this level of confusion among so many respondents (as will be noted later, the response rate was very good). I had of course had proof-readers, but a proof-reader, with the best will in the world, is not likely to be able to point out whether you are saying what you think you are saying, only whether you are not making any sense at all, or have made clear grammatical and spelling errors. There were some questions that received an extraordinarily high number of ‘don’t know’ responses. As an example, one question asked (in Danish, of course): “The research library should be wary of utilising user-generated data services because ...” This was then followed by a series of possible reasons to be wary, along with a rating scale.
The final three options were: the ethical implications of tracking user behaviour are worrying; the ethical implications of asking users to supply voluntary information are worrying; the legal complications are an obstacle. 50% of respondents opted for ‘don’t know’ as their answer to the last option, while only 5% said ‘don’t know’ to the previous option. The Danish version of the last option read “De juridiske komplikationer er forhindrende” (roughly, ‘the legal complications are a hindrance’). Is this a badly worded Danish sentence? Or was the idea that there might be legal implications so much stranger than the idea that there might be ethical ones? I cannot know the answer without contacting all of the respondents again and asking them personally. Such areas of doubt are problematic in a survey like this. You can never be sure why a person has answered the way they have, or whether their answer truly reflects their opinion on the question you are trying to answer, but the fewer variables at play casting potential doubts, the better. For these reasons, I chose to write the actual thesis in English, thus giving myself far more control over the eventual meaning and clarity of the messages (I hope). This of course leads to a slightly confusing mixture of the original survey and data being in Danish, my own second-hand translation of this Danish into English for the purposes of the thesis, and the first-hand English written for the rest of the thesis, for which I can only apologise.

5.2 Finding the sample

The first problem I encountered in determining and finding the sample for this study was, of course, finding the actual institutions that should be included. If a genuinely national picture was to be gained, it was important that as many libraries as possible were approached, without bias towards any particular region, subject area or type of educational establishment. I therefore based my initial list on one provided by the DFBF, the Danish Research Library Association, via the webpage http://www.dfdf.dk/index.php?option=com_content&view=article&id=132&Itemid=69. This professes to be a list of the member institutions of the Danish Research Library Association as of April 2011. The complete list can be found in the appendices. Going through this list of 93 names, I removed the names that were clearly not academic libraries, or that appeared to be sub-departments of other names on the list, and then began to build up a list of web addresses by feeding each name into Google and noting the most relevant web presence. This process left me with a total of 69 web addresses. Based on this list I then began a second internet trawl, going through each name in turn and, where possible, finding the names and contact details of any library staff and entering the details of relevant staff members into an Excel sheet. I was attempting to find the librarians for each institution, or the people who might have a direct connection to the web services through which the libraries’ users navigate their collections.
Owing to the vastly varying sizes and natures of the institutions on the list, the way of doing this varied from site to site, but a general method that worked well for many was to take the web address, strip it back to its basic root and then put it into Google as [site:{web address} bibliotekar] (bibliotekar being the Danish for librarian), or [site:{web address} biblioteks*], which could often find library assistants (biblioteksassistenter), or [site:{web address} personale*] or [site:{web address} kontakt*], which in most cases took me through to the relevant people and their contact details. If no individuals emerged I would switch to general browsing, and if this produced nothing I would take the generic contact email for the institution, if such an email existed. Where the staff base of any particular library seemed very large, I attempted to select only those names whose titles made them appear more likely to have tasks related to the library’s web services. In a few cases, however, the individual positions held were not supplied; the large Danish research institution Statsbiblioteket was one example, where only the general department was available rather than the individual positions. Using this technique I built up an Excel-based list of 1030 names, emails, job titles, telephone numbers, libraries, geographic locations and web addresses of potential library staff from 57 separate institutions (at least, separate by name).
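These query patterns are mechanical enough that they could also be generated programmatically for each root address. As a minimal sketch in Python, assuming the search terms listed above (the function name and the example address are hypothetical inventions of mine):

```python
from urllib.parse import urlparse

# Danish search terms used in the manual method described above.
SEARCH_TERMS = ["bibliotekar", "biblioteks*", "personale*", "kontakt*"]

def site_queries(web_address: str) -> list[str]:
    """Strip a web address back to its root and build Google 'site:' queries."""
    netloc = urlparse(web_address).netloc or web_address
    root = netloc.removeprefix("www.")
    return [f"site:{root} {term}" for term in SEARCH_TERMS]

# Hypothetical example address:
for query in site_queries("http://www.examplebibliotek.dk/om/kontakt"):
    print(query)
# Prints, e.g.:
#   site:examplebibliotek.dk bibliotekar
#   site:examplebibliotek.dk biblioteks*
#   ...
```

In practice each query was of course entered by hand; the sketch merely shows how little is involved in the transformation from root address to query.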
5.3 Survey software

The next step was to work out how the survey itself should be presented and delivered. In order to ensure the integrity of both the questionnaire format and the answers I received back, I chose to place the survey on the internet rather than email a version of it to potential respondents. An internet page would allow me to make certain questions compulsory and guarantee that all replies would be standardised by format. I designed an HTML version of the survey and attempted to find a host that would allow me either 1) to use an emailing functionality that could email me the submitted contents of the filled-out pages, or 2) to set up an SQL-based database that could store the answers so that I could access them there at a later date. However, while I feel sure it must be possible to find a free service that does either of these, none that I came across made the process easy enough for someone with a limited knowledge of DNS server settings and SQL programming to set up. While researching the possibilities, however, I came across a web-based application called Surveymonkey. Surveymonkey (http://www.surveymonkey.net) allowed me to build up a survey questionnaire quite easily, in more or less exactly the way I wanted it to look and work. I felt it gave the finished survey a very professional look and made changing, adapting and rearranging the questions surprisingly easy. That it should look good was of course important, as a good-looking, well-functioning webpage can inspire trust in the user (Fogg, 2003) and might well result in a larger response rate. Surveymonkey can be used for free, but there were certain limitations that I could not accept, so I paid a small amount to have more control over the look and feel, and over the number of questions I was allowed to ask.

The only drawback I found with the entire process was that if I wanted my survey on one single page I could not have conditional questions that fired off further questions. For example, on questions where the selection of possibilities included an ‘Other’ option, I would have liked the selection of ‘Other’ to trigger a compulsory text box, forcing the user to say something about what that ‘other’ might be. With Surveymonkey the only way to do this would be to end the current page on the question containing the ‘Other’ selection and have a ‘continue’ button leading to different pages depending on which options had been selected on the previous page. The survey thus becomes a complex set of individual pages navigated through a set of ‘if answer = X then go to page Y’ rules (sketched below). This was no good for me. I wanted to keep the survey on one page so that users could easily get an idea of the entire amount of work they were being asked to participate in by scrolling up and down the page. For this reason, the parts of my survey that request a further response when the respondent selects an ‘other’ option had to be simple, voluntary text fields that did not force an answer from the respondent. This was the only drawback I found in a system which in every other way seemed to second-guess almost everything I might want to do with my question set-up and the resulting data.
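For illustration, that page-routing model can be expressed as a small rule table; the question IDs, answer values and page names below are hypothetical, invented for this sketch rather than taken from Surveymonkey itself:

```python
# Hypothetical skip logic of the 'if answer = X then go to page Y' kind.
# All question IDs, answer values and page names are invented.
SKIP_RULES = {
    ("q7_applications", "Other"): "page_q7_other_details",
}
DEFAULT_NEXT = {
    "page_main": "page_final",
    "page_q7_other_details": "page_final",
}

def next_page(current_page: str, answers: dict[str, str]) -> str:
    """Route to a follow-up page when an answer triggers a skip rule."""
    if current_page == "page_main":
        for (question, trigger), target in SKIP_RULES.items():
            if answers.get(question) == trigger:
                return target
    return DEFAULT_NEXT[current_page]

# Choosing 'Other' detours the respondent to a page that can make the
# free-text follow-up compulsory; any other answer goes straight on.
assert next_page("page_main", {"q7_applications": "Other"}) == "page_q7_other_details"
assert next_page("page_main", {"q7_applications": "Facebook"}) == "page_final"
```

Every conditional question adds another page and another routing rule, which is why the one-page layout forced the compromise described above.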
5.4 Survey design

5.4.1 Respondents

I considered making the survey completely anonymous. Respondents might be more likely to fill out the questionnaire if they did not have to put their names to it. They might be more likely to be completely honest. Not to mention the fact that there are simply fewer questions to answer if they do not have to go through the whole business of typing out their personal details. I decided, however, that the benefits of anonymity were outweighed by the disadvantage of having large numbers of respondents whose identity I could not verify, given that the emails could be forwarded and the links used by someone who, for all I knew, might not be a librarian or even work in a library. The nature of the questions I had to ask was unlikely to be controversial, I thought, so the honesty afforded by anonymity was not likely to be such an issue. I made a compromise and decided to ask for their names and the names of their institutions, but not to make the answers compulsory. Their job title, however, was compulsory; thus, if I could not ascertain exactly who they were, I would at least know how they might fit into my overall sample. In addition I asked questions which sought to establish which subject areas and how many users the library served, and how much control the library had over its web services. In this way I could build up an image of just what kind of library was being represented even if I did not have its name. During the course of the survey I received two emails asking about the anonymity of the answers. I gave assurances that I would not be using the names of individuals, but that I might well use the name of the institution, where given and where relevant. Thus the actual names of the respondents are for my own purposes and will not be revealed in this thesis or its appendices.

5.4.2 Situation

A group of questions sought to establish the answer to the first research question: what user-generated data applications are being employed in Danish academic libraries? I asked whether the library had ever made use of user-generated data, what this had consisted of, whether any such projects were still in use, and whether there were plans to make use of user-generated data in the future. To pin some responses down to specific applications that could be compared across all respondents, I also made an extensive list of the most common, internationally recognised applications and asked respondents to quantify their experience with each. The more general concepts of tagging and user tracking were also added to this list. Finally, I asked a question concerning how they felt about the level of training and support they received with regard to innovations in this area.

5.4.3 Awareness, opinion and prediction

The second research question dealt with the knowledge and opinion amongst academic librarians of user-generated data applications and concepts. The question described above (listing applications) was configured in such a way that I could establish not only whether my respondents had used each application as a librarian, but also gauge their level of knowledge and their opinions on how relevant they felt each application was to the work of the academic librarian. I broke down what I saw as the chief types of user-generated data into: user tagging of books, user tagging of newer media, user ratings, user reviews and user tracking, and asked them to rate their attitude towards each of these, as well as asking about their attitude to Web 2.0 applications in general.

Given the tagging vs. traditional classification debate, and the proposed ability of tagging to extract domain-based information that I have discussed in the literature review, I was interested to gauge how my respondents felt about user-generated data’s ability to enrich, or pollute, the catalogue of the research library. I broke the question down and asked how they felt about tagging when it was submitted by different user groups, ranging from the general public to domain experts. The idea here was to see 1) how they felt about tagging in general and its effect on the catalogue, but also 2) whether it made any difference to them that the taggers were specialists in the subjects covered, or that the tags were subject to approval after they had been supplied by the original users.

To assess their attitude to some of the issues involved, I listed a number of reasons, taken from the literature, why user-generated data might be a problem for the academic library, and a number of reasons why it might be of benefit, and asked them to rate these reasons. The reasons ‘for’ being: 1) to attract users who might not normally use the services; 2) to encourage existing users to use the services more; 3) to help the library provide better service; 4) to help manage the increasing proliferation of data; and 5) to help in managing new media. And the reasons ‘against’ being that: 1) it would lower the users' perception of the reliability of the library service; 2) it would lower the actual reliability of the library service; 3) it would be left ignored and unused by the users; 4) distributing the library services across various applications would dilute them and confuse users; 5) any solutions would be outdated and irrelevant even before they were up and running; 6) the ethical implications of tracking user behaviour were worrying; 7) the ethical implications of asking users to supply voluntary information were worrying; and finally 8) the legal complications were an obstacle.
I also asked questions concerning how they saw the role of the research librarian in general, as I felt this might influence their attitude to user-generated data, and thus could be used to cross-tabulate against their responses to the other questions. Finally, I asked questions that sought to gauge their expectations about the future for user-generated data in the Danish academic library.

5.4.4 Ordering and arrangement

I tried to strike a balance between 1) varying the nature and format of the questions and 2) making each question so wildly different that it required too great an investment of concentration and time from the reader. Thus, if two questions came in an obvious pairing and used a classic Likert-type scale along the lines of (Strongly disagree, disagree, neutral, agree, strongly agree), I might reverse the sense of the questioning and make one positive and one negative, so as to decrease the likelihood of the user simply ticking all agrees, for example. If there were too many questions based on (Strongly disagree, disagree, neutral, agree, strongly agree) in a row, I would try to break them up with questions requiring the user to give a priority rating or a free-text answer, for example. I felt that this kind of structuring would help make the questionnaire more of a pleasure to fill out than simply a chore.

One of the key sections of this survey was the question listing 11 user-generated data based applications, designed to gauge a number of details about the respondents’ relationship to each. Respondents were asked for two separate pieces of information for each: how much they knew about the application, and how relevant they felt it was to the work of the research library. This resulted in a list of eleven applications with two adjoining columns of drop-down boxes. On paper this would have been a relatively easy question for the user to tackle (although it would have taken up considerably more space, paper not having the luxury of drop-down menus). Once the respondent had gone through the first application, the remaining ten would just be a case of putting a tick in the relevant boxes all the way down the page. On screen, however, this question becomes a very different affair: making a selection in a drop-down box is a little harder than ticking a box with a pencil, and selecting the required answer over and over again in a matrix of 22 boxes is a lot harder than its paper-based equivalent. When testing the questionnaire myself, I was not even halfway through this question before it occurred to me that the process was really going to annoy my respondents. The questionnaire could not be submitted unless all drop-down selections had been activated. There was, however, no way I could do without it; it was central to both my research questions. This caused me to arrange the questions in a way that I would never have done on paper.

The beauty of this question was that it served a dual purpose. It alerted the respondents to the kind of applications and concepts the entire survey wanted them to think about. User-generated data is not a very specific concept, and a sense of what I meant by it would obviously aid an understanding of what the questions were asking of them. The very nature of the applications/concepts list question gave my respondents an instant grasp of what my use of the term user-generated data consisted of.
Thus, it was a good starter question for the entire questionnaire, and in my original draft it followed immediately after the questions concerning the respondent's identity. I told my respondents, before they began the survey, that there would be 25 questions in all. Given my own response to filling out this question, it occurred to me that the reader might get to this early question and think, “Oh no, I’ve got another 25 versions of this!”, and give up there and then. Thus, in an effort to raise my response rate, I decided to move the question further down the page. This was not to cheat my respondents in any way; it was simply so that they could be sure that most of the questions in the survey were far easier to fill out than this one turned out to be. Given the good response rate I feel this decision was justified, but the fact that 11% expressed confusion is disappointing, and I cannot help wondering how placing this question at the head of the questionnaire might have altered that result.

5.5 Email

In my view it was important not to deter potential respondents by writing lengthy discussions of the concepts involved. As this survey was also partially about awareness, I felt I might bias the respondents’ awareness least, and help the response rate most, by keeping the email request as short as possible. Of course the questions in the survey itself introduce issues of bias, purely by the questions I choose to ask and the way I choose to ask them, but this level of influence is an inevitable consequence of the study. It was not, however, necessary to add to it in the email; thus the email I sent to potential respondents stated only the title of the survey, who I am and that I would very much like their response.

___________________________________________________________________________________

Subject: User-generated data and user tracking in research libraries

Hi [name],

I am a Masters student at IVA, currently doing a project on user-generated data in Danish research libraries. The goal is to build up a picture of what projects are currently underway in Danish research libraries, and what Danish research librarians themselves think/know/hope/predict about this subject. For this purpose I have made a questionnaire (https://www.surveymonkey.com/s/forskningsbiblioteker) which I would be very grateful if you could devote 5-10 minutes of your time to complete. Thank you in advance.

Nicholas Paul Atkinson

____________________________________________________________________________________

At the top of the survey page itself it was necessary to give more instruction, but again the text was kept as formal, and as untouched by the actual subject matter of the survey, as possible.

____________________________________________________________________________________

Study into user-generated data and user tracking in research libraries

It will take between 5 and 10 minutes to complete this questionnaire and there are 25 questions in total. If you would like to be informed of the survey’s results, please leave a message in the comments box at the end of the questionnaire.

____________________________________________________________________________________

5.6 Pilot test

A pilot test of the survey was sent out to twenty of the names from the Excel list. Out of that twenty I received only one reply, from a respondent who pointed out the lack of any ‘don’t know’ options on many of the questions.
Based on the poor response to this preliminary attempt, I decided to rearrange the questions (as explained above). I also introduced a ‘don’t know’ option to most of the questions. These twenty names were removed from the 1030; they were not contacted during the actual survey, and the one reply did not form part of the survey result. Thus the actual number of people contacted for the survey was 1010.

6 Results

6.1 Respondents

As this was an internet survey, direct control over who replied was not possible. However, as the web address of the survey was only discoverable via a direct link emailed to a select group, I can assume that my respondents were made up only of those who received the personalised emails, or of those who had the email passed on to them because the original recipient felt a colleague was better placed, or equally well placed, to respond. It should be noted that I did not, in the original email, suggest that anybody pass their emails on in this way. This technique would have had the potential to produce a chain-mail effect, generating respondents exponentially, and would perhaps have netted me far more respondents, but I felt that it was more important to know who the survey subjects were than simply to maximise the number of filled-out questionnaires. Nor, however, did I advise against passing the email on, as I certainly did not want to miss the opportunity of having my survey passed to more relevant people than I had been able to discover during my internet trawl; it could be that my original information as to the current librarian staff at any given institution was erroneous or out of date. So my position was to neither encourage nor discourage this kind of passing on of emails. I did not post the link on any web destination frequented by librarians or to any generic listserv addresses, but relied instead solely on my Excel-based contact list of names. As noted above, the survey was to remain anonymous, but actual names were requested on a voluntary basis in order that a sound knowledge of who the respondents actually were could be maintained.

Based on my rigorous, if somewhat unscientific, method of establishing the population, I found there to be 1030 individuals in total. As I received 106 responses, my data could therefore, by an extremely rough estimation, be said to represent 10% of the total population. 12 of the respondents expressed difficulty with the formulation of the questions. Such a large percentage (11%) means that this must be taken into account in any final analysis of the results; in such cases, either my presumptions about what librarians needed to know in order to answer the questions adequately had been misjudged, or language issues may have been responsible. 97 respondents chose to reveal their names and 100 revealed the name of the library in which they were employed. As so many chose to answer the voluntary identification questions, some of the other questions aimed at ascertaining precisely what type of library the respondent was based at became perhaps less relevant than they might have been. It should be noted that I was taken by surprise by the final response rate in general. I had been led to expect, from all that I had read and heard, from the experience of colleagues conducting similar projects, and from my own pilot test, that getting surveys filled out was going to be a disappointing and difficult task.
All the decisions I took during the preparation for this survey (in choosing just who and how many individuals to contact, in the wording of the questions, in choosing which and how many questions to ask, etc.) were directly influenced by the blanket assumption that a response was highly unlikely. I had even accepted the possibility of running up a large phone bill from having to contact a large proportion of the email recipients in order to introduce myself, remind them of the email and ask for their help personally. That I then went on to receive 106 replies was, therefore, a great surprise. I hope this is partially a reflection of some of the decisions I took, but I think it must also say something about the particular population towards which my survey was aimed. I was lucky to pick a group of people so willing to take the time to contribute to a project of this nature.

6.1.1 Institutions

The method of sampling the institutions and their individual members has already been described. Given the varying natures and sizes of the libraries involved, and of their respective institutions (small hospital libraries, libraries serving a single academic discipline, multi-disciplinary university libraries, individual college libraries distributed geographically rather than by subject, etc.), it is a little arbitrary to give an actual number for how many separate institutions were contacted, as it was not always possible to be 100% sure that a location represented a distinct and independent unit. I provide a list of all contacted institutions and replying institutions in the appendices; here, though, in order that a general idea of the response can be gleaned, I estimate that just over 50 separate institutions were contacted and just over 50 replied. I cannot simply say that every institution contacted replied, as the named institutions/departments in my original list and the actual institutions/departments from which I received a reply were not a one-to-one match in every case. Nevertheless, it is clear that if the original list from the Danish Research Library Association is representative of all academic libraries in Denmark, then my responding sample can certainly be said to be a representative one.

Q13. Does your department have control over the design of your library’s web services?

Full control: 20 (18.9%)
Partial control: 65 (61.3%)
No control: 17 (16.0%)
Don’t know: 4 (3.8%)

In question 13 the respondents were asked about the level of control that their department had over their own web service. From the figures above we can see that 80% responded that they had either some, or complete, control. We can safely say, then, that a large majority of the departments involved had some degree of control. This is important because it shows that the respondents felt their department had a direct stake in the web service and its function, and that it was not something controlled by an outside institution and simply forced upon them. However, it should also be seen in the light of Q15, which showed that the respondents’ personal degree of control over their web services presented a much more complex picture.

Q15. With regard to the design, development and implementation of your library’s web services, how would you describe your own personal role?
It is primarily my decision: 1 (0.9%)
I am a part of the team that determines these questions: 28 (26.4%)
I am involved throughout the design and planning stages of any new development: 3 (2.8%)
I am in a position to voice my opinion: 49 (46.2%)
I have no influence: 13 (12.3%)
Other (please specify): 12 (11.3%)

The above figures reflect a somewhat mixed perspective on how much control the individual respondents feel they have over the web services. Only one (a Library Director) responded that it was primarily their decision. The fact that 46% said they were only “in a position to voice an opinion”, and 12% said they had “no influence” at all, gives us grounds for saying that over half of the respondents felt they had relatively little control over the design, development and implementation of their web services. Many of the respondents who selected the ‘Other’ option described a situation of varying degrees of responsibility depending on the particular project.

6.1.2 Disciplines

As the overall response was such a healthy one, I can be reasonably confident that all common subject areas were represented in this study.

Q17. Are your library’s web services structured to cater for a specific discipline or are they generically organised?

Generic: 45 (42.5%)
Subject-based: 61 (57.5%)

From the figures above we can see that 57.5% responded that their institution served a specific subject area, while 42.5% reported that they served a generic subject base. This distinction is important if we are to ascertain whether librarians whose libraries served a population that could be said to form a coherent domain (based on subject area) varied, with respect to their employment of and attitude towards user-generated data, from those librarians whose patrons were more generically spread. This difference will therefore be used in cross-comparisons below. It should be noted, however, that in this question I failed to supply a ‘don’t know’ option, and in the comments section one person did query what I had meant by the word generic (or rather generisk, in Danish). This was one of those instances where it was difficult to know whether I had simply made a poor choice of terms or whether my knowledge of Danish added to the confusion.

6.1.3 Positions

As can be seen in the table below, the individual library staff who responded can be overwhelmingly classified as librarians, with only about 6% of respondents really being on the periphery of that classification (student helpers, academic staff members and a project coordinator). I can be confident, then, that the data I have is representative of the opinions of academic librarians in Denmark, rather than simply of people connected to the library services.
Librarian: 53 (50.00%)
Library/department/function head: 19 (17.92%)
Library consultant: 13 (12.26%)
Information specialist: 10 (9.43%)
Assistant librarian: 4 (3.77%)
Student helper: 4 (3.77%)
Academic staff: 1 (0.94%)
Documentalist: 1 (0.94%)
Project coordinator: 1 (0.94%)
Total: 106

6.1.4 Patrons

The question “Approximately how many users have access to your library’s web services today?” was asked in order to gauge the extent to which the answers might differ between those serving small, domain-based “communities” and those serving a much larger and more generalised population. However, as 50% of respondents did not know, and of those that did return a number only 19 reported having fewer than 10,000 users, and only 10 of them fewer than 1000, it is difficult to make any assumptions about how the respondents’ answers might depend on the size of their user base. To truly make use of this line of questioning, a much more nuanced series of questions should have been asked: who their users were, whether they break down into groups that the library has some knowledge or control over, whether different services are provided to different groups, and so on.

6.2 Situation

Q18a. How would you describe your knowledge of the following [*]?

Used as a librarian:
MySpace: 0.9% (1)
SOPACs: 1.9% (2)
LTFL (LibraryThing For Libraries): 1.9% (2)
User tracking: 2.8% (3)
LibraryThing: 4.7% (5)
Delicious.com: 10.4% (11)
Tagging: 10.4% (11)
Flickr: 12.3% (13)
Blogging: 26.4% (28)
Facebook: 29.2% (31)
Wikis: 34.9% (37)

The above figures, which have been sorted in order of increasing use rather than in the order the selection was originally presented, show how many respondents answered “Used as a librarian” to the first part of question 18. (*The question has two separate parts; the version cited here is therefore a paraphrasing.) Interesting points to note here are that, at 35%, Wikis are clearly the most commonly adopted form in this list, with Facebook and Blogging receiving the next highest attention at 29% and 26% respectively. The relative age of the concepts of blogging and Wikis, the fact that they have been around for some time compared to some of the other items on the list, plus the sheer ubiquity of Facebook in all our lives, may partially explain these figures.

Tagging has been used by only 10% of the sample population, which is perhaps surprising. Tagging can of course be used without knowing what it is, but as the more specific applications of Flickr, Delicious.com and LibraryThing also lie around a similar, or smaller, figure, it seems safe to assume that the real figure does lie somewhere around this level. Interestingly, each one of these 11 respondents was based at a separate library, and cross-tabulating them against the subject-based vs. generic library breakdown resulted in an almost 50/50 split (6 generic and 5 specific). Additionally, if we remove those who have used LibraryThing, Delicious and Flickr from these 11, there are still 5 respondents left. Conversely, many of those that had used these applications did not claim to have made use of tagging.
This highlights the problem of mixing actual applications with concepts such as blogging and tagging, and of not fully defining beforehand the concepts or the level of use. You might say that it is difficult to see how a librarian who has used LibraryThing or Delicious has not also ‘made use’ of tagging. But reaping the benefits of tagging by navigating the content of such applications is one thing; actively implementing a functionality that requests tags, or inputting tags oneself, is perhaps a level of use worthy of the distinction. It is therefore not possible to draw any concrete conclusions about the nature and extent of these respondents’ direct experiences of tagging.

It is interesting that the cited experience of user tracking was particularly low. But much as we can say that the use of blogging might be partially explained by its longevity, the low score of user tracking might similarly be down to its very recency, particularly with regard to its place in the library world. Of the large, international, tagging-based applications, Flickr scored the highest use, with 13 respondents claiming to have used the service in their capacity as librarians; interestingly, only 4 of those 13 libraries served a user domain that was art and design related. Only one person cited having used MySpace in their capacity as a librarian, suggesting that MySpace has had little to no impact on the work of the academic library in Denmark.

Q19. Are there any other user-generated data services you have used as a librarian?

After the list from the previous question, respondents were asked to supply any additional application(s) they had used. Only 15 respondents had anything to add to this list, the most common of these additions being LinkedIn and Twitter. One respondent said, “We have used Web polls on our site, where each month we would ask users questions about the library's services - for example, is it easy to find books? We have subsequently used such data as an indicator of whether we should have better signposting, etc.”

Others listed were: Yahoo! Pipes, which, according to Wikipedia, “is a web application from Yahoo! that provides a graphical user interface for building data mashups that aggregate web feeds, web pages, and other services, creating Web-based apps from various sources, and publishing those apps” (Wikipedia); the use of Yahoo! Pipes by a number of academic libraries was noted in 2010 by Redden in “Social Bookmarking in Academic Libraries: Trends and Applications” (Redden, 2010). Mendeley was mentioned by two respondents and is, again according to Wikipedia, “a desktop and web program for managing and sharing research papers, discovering research data and collaborating online” (Wikipedia). Zotero (http://www.zotero.org), according to Ritterbush, “helps users collect and organize research sources within the Firefox browser. [...] Zotero merges the best features of other citation management programs with those of popular Web 2.0 services. Users can store full citation information and sort, tag, annotate, and search these citations” (Ritterbush, 2007). Three libraries made use of Ex Libris applications, either their bX or SFX user tracking systems; Ex Libris’ own website, http://www.exlibrisgroup.com/, describes the company as “a leading provider of library automation solutions, offering the only comprehensive product suite for the discovery, management, and distribution of all materials—print, electronic, and digital.”
a Danish project developed in part by Denmark's Electronic Research Library (DEFF), which aims to experiment with the use of social media in an academic context, focusing specifically on using social media as inspiration in the initial stages of task execution. According to Skovgaard Jensen, it "provides a search application for students searching through social services. The project aims to give users the possibility of making use of one another's knowledge as a supplement to the knowledge that the library itself provides access to" (Skovgaard Jensen, 2012).

Diigo, the use of which was also recognised by Redden in her survey of applications. She describes Diigo as "a social bookmarking website which allows signed-up users to bookmark and tag webpages. Additionally, it allows users to highlight any part of a webpage and attach sticky notes to specific highlights or to a whole page" (Redden, 2010).

And finally there were individual mentions for videnskab.dk, Open Access 'communities' created by researchers, http://re-ad.dk/, http://historiskatlas.dk/, http://barha.dk/ and AsiaPortal.

Q20. Has your institution ever made use of web-based user-generated data as a part of its web services?
As question 20 followed question 18, which provided a broad list of user-generated data applications, I feel it is safer to assume that the respondent had a clearer idea of what I mean by user-generated data by this point. Thus, in this case, perhaps the high percentage of 'don't knows' in this particular question really does reflect a lack of knowledge of which user-generated data applications had been used, rather than simply a reaction to not understanding what I was asking. Of course, this is an assumption I can only make tentatively; the vagaries of language and the way things are interpreted can throw up all kinds of anomalies. In this case it does not help that I erroneously overcomplicated the question by saying "Has your institution ever made use of web-based user-generated data as a part of its web services?" Adding the words "web-based" was an oversight and completely unnecessary in the context of the complete sentence; such inconsistency can only add to confusion in the respondent. Bearing all that in mind, the breakdown of the results still suggests that a substantial number of academic libraries are not making use of user-generated data. It might have been instructive to see how those that said "No" compared in their answers to other questions about the usefulness of user-generated data in general. Having cross-tabulated the results against many of the other questions, however, I found no significant variations.

Q21. If yes, please specify what, and if it is still in operation?
Where respondents had cited that their library had made use of user-generated data, they were then asked to specify what this had consisted of. There were 41 responses to this question, which included:
17 Facebook (1 a Facebook game concerning image-tagging)
8 user tracking / recommender systems (3 specifically bX and 2 specifically SFX)
5 blogging
4 user reviews & comments
5 user tagging (2 specified image-tagging and 1 the tagging of course material)
2 LTFL
2 Twitter
1 various crowd-sourcing projects
1 data-mining project
1 Flickr
1 Delicious
1 user ratings
1 Web polls
1 Second Life
1 cited a structured anthropological study and greater statistical knowledge gathering
1 combined geotagging and user tracking
1 referred to the setting up of a biographical encyclopaedia of famous Danish women, http://www.kvinfo.dk/side/170/, "where it has been possible to enrich the work with information from the users", said the respondent
2 patron-driven acquisition (but only one implementation, as both respondents were from the same institution)

Anderson describes patron-driven acquisition (PDA) as a model whereby it is possible to "let library users find and identify desired documents prior to the library's purchase of them, and for the library to pay only for what its patrons find and actually use. When a patron's use of an eBook or journal article passes a certain agreed-upon threshold (a certain number of eBook pages read, for example, or the download of a complete article) the library is charged, the document acquired, and the patron never knows that the document was not part of the 'collection' to begin with" (Anderson, 2011).
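To make the threshold mechanism in Anderson's description concrete, the following is a minimal illustrative sketch; the field names, event types and threshold value are hypothetical and not drawn from any actual PDA system:

```python
# Illustrative sketch only: a PDA purchase trigger of the kind Anderson describes.
# Field names, event types and the threshold value are hypothetical.

PAGES_READ_THRESHOLD = 25  # agreed-upon usage threshold (hypothetical value)

def record_usage(document, event):
    """Accumulate patron usage; acquire the document once a threshold is passed."""
    if document["acquired"]:
        return  # already purchased; invisible to the patron either way
    if event["type"] == "pages_read":
        document["pages_read"] += event["count"]
    # Purchase triggers: a full-document download, or enough pages read in total.
    if event["type"] == "full_download" or document["pages_read"] >= PAGES_READ_THRESHOLD:
        document["acquired"] = True  # at this point the library is charged

# A patron reads an eBook in two sessions; the second session crosses the
# threshold and triggers acquisition without the patron ever noticing.
ebook = {"title": "Some eBook", "pages_read": 0, "acquired": False}
record_usage(ebook, {"type": "pages_read", "count": 10})
record_usage(ebook, {"type": "pages_read", "count": 20})
assert ebook["acquired"]
```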
Revealingly, over half of those that responded to this question pointed out that the application was no longer in operation.

Q22. Does your institution have any plans to employ user-generated data in your library's web services in the future?

[Chart: responses to Q22]

From the above chart we can see that at least 49% of library institutions across the country are either considering employing user-generated data in the future or have concrete plans to do so. Looked at another way, however, we can say that only 25.5% of librarians are certain about their library's future intentions with regard to user-generated data, while the remaining 74.5% of respondents either don't know or have not yet decided what exactly their library will be doing. Again, the high don't know result here mirrors the high percentage of don't knows on Q20 and could, therefore, signify a lack of understanding of the question, but it may also display a genuine lack of knowledge by individuals of what their libraries are intending, or able, to do with user-generated data.

Q12. I receive all the training and support I require to remain abreast of the relevant innovations in this area.
The results for the above question do not point in any strong direction. It seems, therefore, that there is mixed opinion throughout the country as to whether or not librarians feel they are receiving the training and support they really need on this issue. However, if we add the agrees together to make an overall positive index, and the disagrees together to make an overall negative index, we arrive at 20% and 33% respectively. We can say, therefore, that 33% of respondents do not feel that they are currently getting the support they need.

6.3 Awareness & opinion
Q18. How would you describe your knowledge of the following, and which of them do you think has relevance to the work of the research libraries in Denmark?

Knowledge of: No knowledge / Limited knowledge / Have used / Have used as a librarian / Rating avg.
SOPACs 74.5% (79) 21.7% (23) 1.9% (2) 1.9% (2) 1.31
LTFL (LibraryThing For Libraries) 64.2% (68) 31.1% (33) 2.8% (3) 1.9% (2) 1.42
LibraryThing 33.0% (35) 42.5% (45) 19.8% (21) 4.7% (5) 1.96
MySpace 26.4% (28) 50.0% (53) 22.6% (24) 0.9% (1) 1.98
User tracking 25.5% (27) 52.8% (56) 18.9% (20) 2.8% (3) 1.99
Delicious.com 31.1% (33) 30.2% (32) 28.3% (30) 10.4% (11) 2.18
Tagging 11.3% (12) 38.7% (41) 39.6% (42) 10.4% (11) 2.49
Flickr 12.3% (13) 36.8% (39) 38.7% (41) 12.3% (13) 2.51
Wikis 3.8% (4) 18.9% (20) 42.5% (45) 34.9% (37) 3.08
Facebook 3.8% (4) 14.2% (15) 52.8% (56) 29.2% (31) 3.08
Blogging 5.7% (6) 25.5% (27) 42.5% (45) 26.4% (28) 3.24

The results of the "Have used as a librarian" aspect of this question have already been presented in section 6.2. In the above table, however, we can see the full details of the selections regarding the level of knowledge each respondent had of each application. The list has been arranged according to the rating average. This ranking is worked out by giving the 4 possible options a score of 1 (no knowledge) through to 4 (have used as a librarian); each frequency is then multiplied by its score, and the sum is divided by the total number of respondents (106) to find the mean. Thus the minimum value here is 1 and the maximum is 4. As such, there is quite some difference between the applications at the two extremes of the list, going from 1.31 to 3.24.
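Expressed as a formula (a restatement of the calculation just described, with f_i denoting the frequency of the option scored i):

$$\text{rating avg.} = \frac{\sum_{i=1}^{4} i \cdot f_i}{106}$$

For SOPACs, for example, this gives $(1 \cdot 79 + 2 \cdot 23 + 3 \cdot 2 + 4 \cdot 2)/106 = 139/106 \approx 1.31$, matching the value in the table.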
That 74.5% and 64% had never heard of SOPACs and LTFL, respectively, was a surprise. Yet again, however, wording may well be an issue here. In the context of the complete range of possible selections I would expect 'No knowledge' to be equivalent to never having heard anything about the item whatsoever; that there are 4 respondents who describe themselves as having 'No knowledge' of Facebook is something of a surprise. LibraryThing receives a much lower rating than its fellow international tagging-based applications Flickr and Delicious, which is perhaps a surprising result among librarians. In many ways the ranking here reflects the results of the "Have used as a librarian" aspect already presented: Wikis, Facebook and blogging, for example, are the applications librarians cite most knowledge of, although it is interesting to note that the order of those three is reversed. Can this indicate that Wikis are more specifically relevant to the work of the libraries than the other two, which are perhaps considered more generally useful in contexts outside of librarianship? Combining the two 'have used' columns, 50% of librarians have made use of tagging, while user tracking has only been used by around 20%.

Relevance: Don't know / No relevance / Some relevance / Important / Rating avg.
MySpace 42.5% (45) 43.4% (46) 13.2% (14) 0.9% (1) 1.26
LibraryThing 55.7% (59) 15.1% (16) 25.5% (27) 3.8% (4) 1.74
Delicious.com 55.7% (59) 11.3% (12) 31.1% (33) 1.9% (2) 1.78
Flickr 40.6% (43) 17.9% (19) 33.0% (35) 8.5% (9) 1.84
Blogging 12.3% (13) 21.7% (23) 57.5% (61) 8.5% (9) 1.85
SOPACs 82.1% (87) 4.7% (5) 10.4% (11) 2.8% (3) 1.89
LTFL (LibraryThing For Libraries) 82.1% (87) 4.7% (5) 9.4% (10) 3.8% (4) 1.94
Facebook 12.3% (13) 14.2% (15) 55.7% (59) 17.9% (19) 2.04
Wikis 17.9% (19) 7.5% (8) 52.8% (56) 21.7% (23) 2.17
Tagging 17.9% (19) 4.7% (5) 56.6% (60) 20.8% (22) 2.20
User tracking 34.9% (37) 2.8% (3) 39.6% (42) 22.6% (24) 2.30

This table presents the data from the second half of the question, dealing with how relevant respondents felt each application was to the work of the research library. Again, the table is sorted by a ranking based on the rating average. The rating was worked out in a very similar way to the above, except that here there are only three items in the scale, and the resulting sum is divided by the total number of respondents minus the number of respondents who selected 'don't know'. Thus the potential range of the rating goes from 1 to 3, and, as the overall number of don't knows was so high, the rating is based on a much reduced set of respondents.
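As a worked example of this don't-know-excluded average, take the MySpace row, with scores 1 = No relevance, 2 = Some relevance and 3 = Important:

$$\text{rating avg.} = \frac{\sum_{i=1}^{3} i \cdot f_i}{106 - f_{\text{don't know}}} = \frac{1 \cdot 46 + 2 \cdot 14 + 3 \cdot 1}{106 - 45} = \frac{77}{61} \approx 1.26$$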
As so few people had heard of SOPACs and LTFL, it is hardly surprising that 82% selected don't know for these applications. For those people that did supply a judgement on these two applications, and including those who supplied a judgement on LibraryThing, Delicious and Flickr, we can say that the majority thought there was 'some' relevance in these applications. However, it is also true that the overall judgement inclined more towards No relevance than towards Important, which indicates a lack of acceptance of the idea of introducing any of the popular international tagging data repositories into the work of the academic libraries in Denmark.

Facebook and Wikis, as before, gain a high rating when we look at relevance, but what is particularly interesting here is the sudden appearance of user tracking and tagging at the head of the list. In the previous section librarians described their experience and knowledge of tagging and user tracking as being low, with user tracking scoring particularly badly, even with respect to tagging. Here, by contrast, respondents clearly consider user tracking and tagging to have more relevance than any of the other applications, and, perhaps even more interestingly, user tracking gains a higher rating index than tagging.

Q5. I have a positive view of ...

Strongly disagree / Disagree / Neutral / Agree / Strongly agree / Don't know / Rating avg.
User reviews 0.0% (0) 10.4% (11) 25.5% (27) 45.3% (48) 17.9% (19) 0.9% (1) 3.89
User tagging of books 0.9% (1) 6.6% (7) 26.4% (28) 43.4% (46) 17.0% (18) 5.7% (6) 3.91
User tagging of newer media 0.9% (1) 3.8% (4) 26.4% (28) 46.2% (49) 17.0% (18) 5.7% (6) 3.97
User ratings 0.9% (1) 8.5% (9) 23.6% (25) 43.4% (46) 21.7% (23) 1.9% (2) 4.00
The tracking of user behaviour 1.9% (2) 8.5% (9) 17.9% (19) 43.4% (46) 21.7% (23) 6.6% (7) 4.03
Social media and Web 2.0 services in general 0.9% (1) 2.8% (3) 17.0% (18) 47.2% (50) 27.4% (29) 4.7% (5) 4.31

The interesting thing to note in the table above is how heavily the overall figures for all aspects are weighted positively. In fact, if we combine the two positive figures and the two negative figures, we can see that the percentage of people viewing these different aspects of user-generated data positively ranges from 60-74%, while the percentage with a negative view ranges from 3-10%; this is a substantial difference. No one aspect can really be singled out here as being substantially more or less popular than the others, but the rating average gives us a slim indication of preferences (the rows are presented in ranked order; thus user reviews received the least positive response and Web 2.0 in general the most positive, but only by a slim margin). We can say, therefore, that respondents are positively disposed to a broad range of user-generated functionalities, with no particular favourites and no particular dislikes.

For this and all the 5-point Likert-style (strongly disagree-strongly agree) questions, the rating average was worked out by giving each option a score of 1-5, multiplying that score by the frequency, and then dividing the sum of those score-by-frequencies by the total number of respondents minus the number of don't knows.

Q6. User-generated data has the potential to enrich the catalogue of the research library, if submitted by ...

(1) No potential / (2) / (3) / (4) / (5) Great potential / Don't know / Rating avg.
General Public 10.4% (11) 19.8% (21) 31.1% (33) 17.0% (18) 8.5% (9) 13.2% (14) 2.92
Domain students 4.7% (5) 9.4% (10) 19.8% (21) 38.7% (41) 11.3% (12) 16.0% (17) 3.51
General Public, and approved by librarians 7.5% (8) 12.3% (13) 15.1% (16) 31.1% (33) 20.8% (22) 13.2% (14) 3.52
General Public, and approved by domain experts 6.6% (7) 11.3% (12) 11.3% (12) 37.7% (40) 17.9% (19) 15.1% (16) 3.58
Domain students, and approved by domain experts 5.7% (6) 4.7% (5) 12.3% (13) 33.0% (35) 27.4% (29) 17.0% (18) 3.86
Domain students, and approved by librarians 4.7% (5) 5.7% (6) 12.3% (13) 33.0% (35) 28.3% (30) 16.0% (17) 3.89
Domain experts 2.8% (3) 4.7% (5) 12.3% (13) 33.0% (35) 34.9% (37) 12.3% (13) 4.05
Domain experts, and approved by librarians 2.8% (3) 3.8% (4) 8.5% (9) 21.7% (23) 50.0% (53) 13.2% (14) 4.29

The above table shows the results of a question which listed a number of different user groups/taggers and asked the respondents to rate the potential that tagging submitted by each of these groups had to enrich the catalogue of the research library. Here I have multiplied each frequency by its scale rating and divided the sum by the total number of respondents minus the number of don't knows to arrive at the eventual rating average; the table is sorted by this rating average. The chart also presents this rating average but, unlike the table, remains in the order in which the options were presented in the survey. The rating average scores show that overall respondents believe most potential exists in data submitted by domain experts and approved by librarians, although data simply submitted by domain experts with no subsequent approval also scores highly. In contrast, unapproved data submitted by the general public receives the lowest score, with unapproved data from domain students also scoring low. Given that any score above 1 represents some potential, the fact that only 10% believe there to be no potential in tagging submitted by the general public suggests a very positive attitude in general towards tagging. This positive attitude is further illustrated by the way all the frequency figures are weighted closer to 5 than to 1, resulting in a mean rating average across all groups of 3.7.
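The same calculation recurs for every rated question that follows. As a cross-check, a minimal sketch of it (the function name is mine, not taken from the survey tooling; the numbers reproduce the 'General Public' row above):

```python
def rating_average(freqs, dont_know, total=106):
    """Score-weighted mean: option i (1-based) contributes i * freqs[i-1];
    'don't know' respondents are excluded from the divisor."""
    weighted = sum(i * f for i, f in enumerate(freqs, start=1))
    return weighted / (total - dont_know)

# The 'General Public' row of Q6: frequencies for scores 1-5, with 14 don't knows.
print(round(rating_average([11, 21, 33, 18, 9], dont_know=14), 2))  # prints 2.92
```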
Q7. The research library should utilize user-generated data, because it can ...

Strongly agree / Agree / Neutral / Disagree / Strongly disagree / Don't know / Rating avg.
Help the library provide better service 23.6% (25) 54.7% (58) 8.5% (9) 8.5% (9) 0.9% (1) 3.8% (4) 2.05
Encourage existing users to use the services more 9.4% (10) 56.6% (60) 17.0% (18) 7.5% (8) 2.8% (3) 6.6% (7) 2.33
Help in managing new media 11.3% (12) 40.6% (43) 29.2% (31) 9.4% (10) 1.9% (2) 7.5% (8) 2.46
Help to manage the increasing proliferation of data 14.2% (15) 37.7% (40) 19.8% (21) 15.1% (16) 5.7% (6) 7.5% (8) 2.57
Attract users who might not normally use the services 7.5% (8) 26.4% (28) 32.1% (34) 20.8% (22) 3.8% (4) 9.4% (10) 2.85

The above table presents the frequency scores for how respondents rate the different reasons why the research library should utilize user-generated data. The rating average is worked out as previously described for the 5-point Likert-type scale, and the table presents each reason sorted by this rating average. Again, the chart displays this rating average arranged in the order of the original survey. Here the results seem to be fairly evenly spread: the rating goes from 1 to 5, yet all the rating averages fall between 2 and 3. In this case, the higher the rating, the less the respondents agree with the statement. Thus the power of user-generated data to attract users who might not normally use the service was the least important aspect for the respondents, while its ability to help the library provide a better service was the most important. However, given that there is so little variation between the rating averages, the safest statement to make about the results is that all the reasons are viewed as good reasons by a majority of respondents, but no reason is singled out as being particularly important or unimportant. Could there be other reasons that would have rated highly? A follow-up question could have asked respondents to provide any additional reasons that I may have neglected.

Q8. The research library should be wary of utilizing user-generated data services because ...

Strongly disagree / Disagree / Neutral / Agree / Strongly agree / Don't know / Rating avg.
The legal complications are an obstacle 1.9% (2) 12.3% (13) 21.7% (23) 13.2% (14) 0.9% (1) 50.0% (53) 2.98
The ethical implications of tracking user behaviour are worrying 5.7% (6) 32.1% (34) 28.3% (30) 22.6% (24) 6.6% (7) 4.7% (5) 2.92
Solutions would be outdated and irrelevant, even before they are up and running 5.7% (6) 27.4% (29) 31.1% (33) 18.9% (20) 0.9% (1) 16.0% (17) 2.79
It would be left ignored and unused by the users 6.6% (7) 36.8% (39) 21.7% (23) 17.9% (19) 4.7% (5) 12.3% (13) 2.74
It would lower the users' perception of the reliability of the library service 7.5% (8) 34.0% (36) 32.1% (34) 18.9% (20) 1.9% (2) 5.7% (6) 2.72
It would lower the actual reliability of the library service 8.5% (9) 42.5% (45) 22.6% (24) 17.9% (19) 2.8% (3) 5.7% (6) 2.62
Distributing the library services across various applications will dilute them and confuse users 12.3% (13) 41.5% (44) 18.9% (20) 8.5% (9) 4.7% (5) 14.2% (15) 2.44
The ethical implications of asking users to supply voluntary information are worrying 12.3% (13) 53.8% (57) 17.0% (18) 11.3% (12) 0.9% (1) 4.7% (5) 2.32

The potentially negative reasons are the subject of question 8. The table above is arranged, and the rating average worked out, as for the previous question. It should be noted, however, that because of the way the question is worded, a high rating here represents more agreement with the statement, and that this agreement in effect demonstrates a negative attitude towards user-generated data.
In other words, a low rating means that the reason should not prevent research libraries from employing user-generated data. Again, the rating averages all fell between 2 and 3. Thus the majority of respondents were unconvinced that any of these reasons represents a good reason for being wary of user-generated data, while very few respondents were ready to declare any reason particularly relevant or irrelevant. The anomaly to point out in this question is the exceptionally large number of don't knows with regard to the legal complications being an obstacle to the research libraries' utilization of user-generated data. This has been discussed earlier; here, I will just add that it could of course signify that respondents are genuinely in some doubt about the issue, but it is equally possible that it represents confusion with the question itself.

Q9. Please prioritize these aspects of the research librarian's role.

(1) Lowest / (2) / (3) / (4) / (5) Highest / Don't know / Rating avg.
The selection and categorization of material 8.5% (9) 15.1% (16) 21.7% (23) 30.2% (32) 24.5% (26) 0.0% (0) 3.47
To make all material easy to search and retrieve for users 4.7% (5) 11.3% (12) 16.0% (17) 25.5% (27) 42.5% (45) 0.0% (0) 3.90
To make selected material easy to search and retrieve for users 3.8% (4) 3.8% (4) 17.0% (18) 28.3% (30) 47.2% (50) 0.0% (0) 4.11
To discover what it is users want and make it accessible to them 1.9% (2) 3.8% (4) 16.0% (17) 31.1% (33) 47.2% (50) 0.0% (0) 4.18
To enhance the user's own information literacy skills 3.8% (4) 3.8% (4) 8.5% (9) 22.6% (24) 59.4% (63) 1.9% (2) 4.33

Question 9 dealt with how the respondents viewed the role of the research librarian. Respondents were asked to prioritize a series of different roles. The table above presents the results together with a rating average worked out as described for question 6. In this instance, by chance, it emerges that the order in which the options were presented in the survey matched the order in which the rating averages ranked them; thus both table and chart are ordered according to the original survey presentation as well as by ascending rating average. It is interesting to note the direction the ratings take. The respondents give increasing priority to roles that place the 'user' at the centre rather than the 'collection' - ranging from the selection and categorization of material, which does not necessarily imply any involvement of the user, up to enhancing the user's own information literacy skills, which does not necessarily imply any commitment to the collection. There are, however, no substantial differences, so this conclusion may overstate how much these results can tell us; the indication is nevertheless interesting.

6.4 Prediction
Q10. User-generated data will not be a part of the future research library web services.

[Chart: frequency of each response]

Strongly agree 1.9% (2)
Agree 9.4% (10)
Neutral 10.4% (11)
Disagree 42.5% (45)
Strongly disagree 24.5% (26)
Don't know 11.3% (12)

Question 10 makes the statement "User-generated data will not be a part of the future research library web services" and asks how far the respondents agree. The table above presents the frequencies and the percentages of total respondents; the chart displays the same data, the x-axis being the frequency. 11% of respondents agreed with the statement in some way, while 67% disagreed.
The result allows me to say that the vast majority of respondents believe that user-generated data will form an aspect of the future research library's web services.

Q11. The inclusion of more user-generated data in the research library's web services is an inevitability.

[Chart: frequency of each response]

Strongly agree 15.1% (16)
Agree 44.3% (47)
Neutral 19.8% (21)
Disagree 7.5% (8)
Strongly disagree 1.9% (2)
Don't know 11.3% (12)

Question 11 is very similar to question 10; it does, however, reverse the logical sense of the statement. The table, as before, presents the frequencies and the percentages of total respondents, while the chart displays only the frequency. This time respondents were asked if the inclusion of more user-generated data in the research library's web services is an inevitability. 59% of respondents agreed in some way that it was, with only 9% this time disagreeing. We can say, then, that the overwhelming majority of respondents in my sample believe that user-generated data is inevitably going to form part of the research library's services.

7 Discussion of results
7.1 Response
This research covered 50 separate academic institutions. Roughly speaking, all institutions contacted responded. These institutions comprised all relevant members of the Danish Research Library Association and as such should reflect, if not totally cover, academic libraries across Denmark. All common disciplines are represented, while the reach of each institution was split fairly evenly between single-discipline libraries and generically spread libraries. 94% of respondents can be described as librarians. I believe this response gives me good grounds for suggesting that the responses are a fair and balanced reflection of all academic librarians in Denmark.

7.2 How user-generated data is being utilised in Danish academic libraries
Unlike in the Web 2.0 analyses performed by Xu et al. and Mahmood & Richardson Jr (Xu et al., 2009; Mahmood & Richardson Jr, 2011), the user-generated data application that most respondents reported having used in a professional capacity was the Wiki. Wikis are perhaps more specifically library-service orientated than Facebook or blogging, which may explain their high use among Danish academic librarians. It would be interesting to make some direct international comparisons to see if there really is a difference between the Danish enthusiasm for the tool and that of other countries; it would also be interesting to see how the use of Wikis increases or decreases over time in Denmark relative to such things as Facebook. Facebook and blogging followed closely after Wikis in their level of use, which is more in line with the findings of Mahmood & Richardson Jr; these two applications scored consistently high in all questions related to use and relevancy. It seems likely that a partial explanation for blogging's prevalence is its longevity, while Facebook's ubiquity in and outside the library world can partially explain its high result. The demise of MySpace noted in the literature review, and mirrored in Mahmood & Richardson Jr, appears also to be reflected in Denmark, with only 1 respondent citing having used MySpace in a professional capacity. It seems likely that MySpace no longer features as part of the Danish academic library's web services, if indeed it ever did. 50% of respondents had made use of tagging, while only 10% had used it in a professional capacity.
This is an improvement on Xu et al.'s 2009 study, but Mahmood & Richardson Jr's study from 2011 found 55% of the US institutions surveyed using tagging. It should be noted, however, that the two numbers are not directly comparable, particularly as Mahmood & Richardson Jr's figure refers to institutions while the result in this study refers to individual librarians. Nevertheless, the findings do suggest that tagging usage may be lower in the Danish academic library than it is in the US.

User tracking has only been used by 20% of respondents, while only 2.8% had used it in a professional capacity. This small percentage of use is perhaps surprising when viewed against 1) the higher use of tagging and 2) the high relevancy statistic that user tracking received (discussed below). This, however, may reflect the fact that user tracking functionalities may exist on a library's website without being used directly by the librarians themselves. By its very nature, user tracking is unlikely to involve any direct manipulation by the librarian, unless of course they also happen to be the programmer of the system doing the tracking. What, in fact, does 'use' mean in the context of this question? Again, concepts needed defining, and more nuanced questions needed asking, before a true picture could emerge. Nevertheless, it would be interesting to see how the ratio between the use of user tracking and user tagging might change in the coming years, especially given the historical progression discussed in the literature review.

41% of respondents said their institutions had made use of user-generated data, while 30% said their institution had not. If we also attempt to incorporate the 29% of don't knows, the potential percentage of respondents whose institutions have never used user-generated data ranges from 30-59%. In light of Mahmood & Richardson Jr's study, it seems the lack of utilisation of user-generated data in Denmark is relatively high, although, again, it must be noted that the results reflect figures from individuals rather than libraries, and Mahmood & Richardson Jr give us no overall statistics for user-generated data applications, only Web 2.0 applications (RSS and IM are so much more common than their user-generated data counterparts that they can skew the figures dramatically). 74.5% of respondents either didn't know, or their institution had not yet decided, what plans regarding user-generated data projects existed at their library. This result points to a very high degree of uncertainty, especially in the light of the general acceptance of, and enthusiasm for, user-generated data discussed below.

When asked to list other user-generated data applications that respondents had themselves used, the resulting list proved far smaller than when they were asked to list the user-generated data applications that they were aware of their library having made use of. This suggests that many of the respondents as individuals were not directly involved or connected with the user-generated data applications being utilised by their institutions. Such a result is perhaps natural enough given that, logically, the percentage of librarians who have used user-generated data themselves must be lower than the percentage whose library has made use of it. Nevertheless, the contrast between the two numbers was large enough to be of interest. 41 respondents supplied descriptions of the user-generated data projects that their institution had instigated.
There were some interesting examples which had not formed part of the literature review, such as patron-driven acquisition, Ex Libris user tracking systems, and a user-enriched biographical encyclopaedia (which is not quite the same as a Wikipedia-type arrangement, as the base information is provided by the institution and then enriched with the aid of user-generated data). Gamification did not feature highly in this list of projects: one respondent listed a Facebook game concerning image-tagging and other crowd-sourcing projects, and another cited a combined geotagging and user tracking project which may well have had gamification overtones, but this was only two responses out of 106. Roughly half of the respondents who supplied information on projects noted that some were no longer in operation. This paints a fairly poor picture of current user-generated data use across the country. While many academic libraries had attempted some kind of foray into the user-generated data tool set, many had also experienced such a lack of interest from users that it had been necessary to close the service down.

33% of respondents did not feel they received enough training and support to keep them abreast of user-generated data developments. This statistic warrants some concern and attention. Given the importance of the concept to the modern academic library (borne out in the literature review as well as in the results of this survey), and given the speed of change in the area and the diversity of applications and ideas, if 33% of the Danish academic librarian population feel uncertain of the changes and developments taking place, this suggests a potential problem for the future library services.

7.3 The knowledge and opinions held by Danish academic librarians about user-generated data applications and concepts
Respondents expressed a clear lack of knowledge of LibraryThing, LTFL, Flickr, Delicious and SOPACs (i.e. the large international tagging-based repositories) when viewed against the knowledge they expressed of Wikis, Facebook and blogging. Indeed, blogging is by far the application they seem most acquainted with. Moreover, the respondents found little relevance in these international tagging-based repositories: all such applications had far fewer than 10% of respondents declaring them important to the work of the Danish academic library, while Facebook, Wikis, user tracking and tagging all scored around 20% on the same measure. Given the degree of attention paid to these sites by the academic library community described in the research literature, this result is perhaps surprising. The low level of knowledge could partially explain the low relevancy level, but as the average relevancy ratings were worked out without including the don't know responses, it is possible that the relevancy result is a true indication of opinion. It could be that the research described in the literature has failed to convince academic librarians in Denmark that such tools can aid their institutions. Perhaps the fact that they are internationally driven, US-centric and primarily in the English language has a bearing on these results. It is also interesting that blogging received such a small relevancy rating, given its high prevalence of use. Perhaps this contrast between relevancy rating and usage rating reflects a mood away from blogging as an important tool, given that other tools, such as Facebook and Twitter, now incorporate many of the same functionalities.
Tracking and tagging had the greatest relevance to the work of the academic library, according to respondents. The particularly high user tracking relevancy score suggests that the changing emphasis detected in the literature review may be reflected in the Danish experience. If this survey had been conducted only two years ago, would user tracking have scored so high?

A valuable opportunity was missed in the design of the survey relating to Q18. A follow-up question could have asked respondents to list any user-generated data applications (not included in Q18 itself) that they considered to have relevancy for the academic library. Given the diversity of the answers to Q19 (asking for examples of applications the library had actually used), such a question might have obtained information not only about which other applications were being used by the institutions, but also about which applications were not being used but that the respondents themselves felt to be relevant.

Respondents were overwhelmingly positive in their attitude towards the generic aspects of user-generated data. The percentage of respondents with a negative attitude towards the listed aspects was 10% or under, compared to 60% or over who viewed them positively. This is a clear indication that academic librarians in Denmark are persuaded of the benefits that user-generated data can 'potentially' bring to the work of the library.

Respondents also seem to have a great deal of trust in the ability of users to enrich the catalogue through tagging. This was especially the case with respect to domain experts; 50% of respondents declared that tagging submitted by experts and approved by librarians would have great potential to enrich the library catalogue. But even for tagging submitted by the general public without librarian approval, only 10% of respondents were prepared to declare that it had no potential. Respondents have a great deal of trust in the ability of domain experts not only to submit but also to approve the data, although it is also true that, for each different user group, data approved by librarians scored more highly than data approved by domain experts. So, much in the same way that Q5 demonstrated how respondents were overwhelmingly in favour of multiple aspects of user-generated data, these results demonstrate that respondents are very positive towards the idea that the catalogue itself can be enriched by social tagging.

An overwhelming majority of respondents believed that an increasing use of user-generated data in the Danish academic library was an inevitability. Given this result, it seems that increasing knowledge of, and research into, this subject is critical, especially given the evidence that close to half of respondents do not think their institution is currently employing user-generated data, and that those institutions that are have in many cases seen projects dropped through lack of user support.

The reasons listed in the survey for why the academic library in Denmark should utilise user-generated data were broadly accepted as all being good reasons, while the reasons against were not accepted as warranting too much concern. Once again this result demonstrates an overwhelmingly positive stance by academic librarians in Denmark towards the idea of using user-generated data.
When respondents were asked if they agreed that they should be wary of utilising user-generated data because of (x), the fact that the responses fell mainly into the 'disagree' and 'neutral' categories, rather than into strong disagreement or any kind of agreement, may well suggest that for academic librarians in Denmark these reasons are noted as being relevant, but not strong enough that they should prevent the library from pursuing user-generated data projects.

The respondents found all the potential roles of the academic librarian listed in the survey to be important. In hindsight this result, that all were prioritised highly and fairly equally, may have been somewhat predictable. When I look at the question, I too would be hard pressed to rate one of these aspects above any of the others, and I too would have rated them all highly - perhaps everybody would. The list should perhaps have been more extensive and contained elements more open to debate, which could have acted as a balance to the roles that were clearly important. As noted earlier, one interesting aspect of the result is that the ranking of the results (though they differ from each other only by small amounts) suggests that the respondents prioritise aspects of the librarian's role that centre on the user more highly than ones that centre on the collection itself.

As with Q18, I feel that the above three questions conceal a missed opportunity in this study. If I could do the survey again, I would, after questions 7, 8 and 9, have asked respondents to suggest further examples of reasons to use user-generated data, reasons to be wary, and important roles for the librarian; such rich data, if supplied, would have been of great value to the main aims of the study.

7.4 Clarity of concepts and questions
There are problems that must be recognised with this survey. 11% expressed difficulty with the questions, and certain concepts and terms could have been defined before respondents were asked to begin the survey. What is user-generated data? What is user tracking? What is a research library? What are domain students or domain experts? What does it mean to say that the library serves a generic subject base? What does it mean to have used an application or concept? Definitions of all these ideas could have been made more fully and succinctly. However, my reasons for not doing so were twofold: one was to avoid biasing the pre-knowledge of my respondents; the other was to avoid deterring them from filling out the survey. In hindsight, I consider the latter judgement to have been misguided; bias or not, there is no point in asking questions which are not understood. As to the first reason, however, I think my decision may have been justified. While such explanations might have made the responses more reliable and informative, they could also have decreased the response rate: if lengthy explanations of concepts had been given, the amount of reading necessary to fill out the survey would have been extended significantly. In the light of the actual response rate this may seem to have been a flippant decision, but prior to sending the survey out I could have no idea that I would receive any response at all. Furthermore, some of these concepts could not have been explored adequately without vastly increasing the amount, and detail, of the questioning; this level of detail and accuracy is perhaps for further, more targeted, studies than the current one, which was largely exploratory in its intentions.
Certain questions could, nevertheless, have been more effectively worded: adding the words 'web-based' to Q20 was unnecessary and untidy, and not including a 'don't know' option on Q17 was an unfortunate oversight.

7.5 Language issues
The language issue was undoubtedly a problem. I cannot dismiss the possibility that, because of my Danish language proficiency, certain questions might have been worded badly or the nuance directed in a way that I had not quite intended. This is of course always a possibility in any survey; all sentences are open to misinterpretation, and they may not even be read at all; a survey cannot easily reveal the level of attention, thought or understanding that a respondent has given it. I believe that the impact of such problems has been contained to an extent by the controlled nature of the survey format; everybody was given the same set of questions, worded in an identical way, so if there were errors, then everyone experienced the same errors. However, had language not been such an issue, interviews and focus group meetings might have been additional avenues open to me, which could have aided the triangulation of the data and thus overall confidence in the reliability of the results.

7.6 International relevancy
It is difficult to say how far these results can be said to reflect the international situation. There are certain peculiarities of the Danish experience which make comparisons between the focus and priorities of Danish academic librarians and those of their international counterparts problematic. The first and most obvious of these is language. As most of the studies in the literature review were based in the US or UK, it may be simply logical that they would be more likely to have an interest in the larger international tagging sites such as LibraryThing. One of the aspects of tagging that makes these international tagging sites successful is the sheer number of people they can draw on and mobilise to use their services (Spalding, 2007); these same numbers are not available to academic institutions to the same extent, and they certainly are not available to Danish academic institutions. But neither can it be said that the peculiarities of the Danish situation are all negative in this respect. My experience in Denmark is that the community spirit of Danes is much stronger than can be said of many other countries, and the circumstances and experiences of students are more homogeneous and unified. It could be, therefore, that Danish institutions might have more success than US institutions in mobilizing their user base, albeit on a smaller scale. It should also be noted that much of Danish academic life is already very English-language-centric (depending on the field of study): the reading lists are heavily English-orientated, and the main data repositories of journals and other academic texts used by Danish academic institutions are also largely English-language-based. Danish students have become used to this, and the level of English proficiency amongst Danish students is very high. Nevertheless, if users were asked to contribute their 'opinion' to the catalogue, such data would almost undoubtedly be in Danish, and if they were asked to contribute English keywords to the catalogue they might understandably be more uncertain about the validity of their contributions. There are, thus, a series of complexities at play in the Danish experience that, while we cannot say they are necessarily negative or positive, do at least make comparisons difficult.
Another cause for concern when trying to make comparisons between the Danish situation and that painted by the English-language-centric research literature is the nature of the public sector in Denmark with respect to that of the US. The public sector in the US is far more decentralized than it is in Denmark (Andersen & Kraemer, 1994), as is the UK's, although not quite to the same extent. This leads to a situation where large projects tend to be centrally driven in Denmark, whereas the examples discussed in the literature review were independent projects and experiments conducted by individual libraries with little national direction or support. Perhaps the level of decentralization apparent in the US and UK is beneficial in the world of Web 2.0. The ability to respond dynamically to swiftly changing technologies, as well as the ability to tailor a solution to the contextual needs of a demanding and localised user base, are two things that nationally driven initiatives are likely to have trouble with. When I first came to Denmark, 12 years ago, I was amazed by the effectiveness and efficiency of the national internet library service, bibliotek.dk, a national project that unifies all public libraries digitally and enables users to order material easily from any library in this nationalised network through their local library portal. It seemed streets ahead of anything I was used to, and suggested that the Danish socio-political setup might be particularly well suited to procuring the maximum benefit from the technological developments of the time. But perhaps the current pace of change is now rendering such large-scale initiatives and projects less and less viable and meaningful. It will be interesting to see, in the Web 2.0 world, whether the propensity towards centralized solutions that exists in the Danish system proves an advantage or an obstacle. The evidence of this survey points to the latter; at the least, the results suggest that the Danish system is not currently able to implement user-generated data solutions that match individual Danish academic librarians' own aspirations for the concept.

8 Conclusion
This research has provided a snapshot of Danish academic librarians' use, knowledge and opinion of user-generated data applications and concepts in 2012. There have been some shortcomings in the survey design, related to the definition of some of the concepts under scrutiny, and additional language issues. Such issues make the study harder to replicate and thus lower the validity, and the comparative ability, of its findings. Nevertheless, this survey was intended primarily to be exploratory in nature, and despite the problems listed, I believe the findings to be of substantial value in this regard. I know of no other attempt to assess academic libraries' use of user-generated data (or indeed Web 2.0) in Denmark at a national level, nor do I know of any attempt to gauge academic librarian knowledge and opinion on the issue, either in Denmark or internationally, at least not one with this scale and inclusivity. Given that 106 responses were registered, I propose that the findings are a viable reflection of the broad experience nationally.

In the literature review I attempted to show how the literature suggested a historical progression of sorts: from the development of the concept, to an idealistic and somewhat unrealistic assessment of its potential, even going as far as predicting the death of traditional classification.
The literature then moved on to a more pragmatic era, in which library services attempted to understand these user-generated data applications and what they were capable of, as well as to figure out how to utilise and integrate their techniques without negatively affecting existing services. Finally came a more realistic era, in which academic libraries were forced to reassess the usefulness of the applications and solutions being employed and to recognise that the same solutions did not work equally well in all contexts. But it was also a realistic era in that user-generated data had proved itself a necessary and critical tool in the efforts made by library services to stay relevant in the digital age. Thus the issue was no longer whether user-generated data was useful; the issue had become how do we use it, and how do we encourage our users to use it?

The survey sought to establish Danish academic librarians' experience of user-generated data. The findings showed that a significant portion of Danish academic librarians, though far from all, had made use of some user-generated data applications. Most of that use, however, consisted of such things as having a profile on Facebook, or using wikis and blogs, which, by contrast, were not perceived as being highly relevant by the librarians themselves. Moreover, many of the user-generated data projects that Danish libraries had pursued had either come to a natural end or been removed because of a lack of user engagement. This overall picture contrasted sharply with the librarians' enthusiasm for the concept. A particularly sharp contrast was found in that the aspects which Danish academic librarians consider most relevant, tagging and user tracking, did not seem to be currently utilised by the majority of institutions. The sheer positivity expressed by the majority of respondents towards all aspects of user-generated data (except in the case of the tagging repositories of the large international websites) suggests a situation where librarians are ready for locally and contextually relevant solutions that incorporate tagging and user tracking functionalities. These systems do not, however, exist at the moment. It will be very interesting to see what emerges in the coming years. At the very least, it seems that any library, however large or small, if it wishes to live up to its own aspirations, needs to employ programmers/applications developers/media specialists as well as information specialists/librarians. The larger academic institutions do this already, of course, but it is hard to see how even the smaller libraries can afford not to have one in-house applications developer if they want to stay relevant to their increasingly information-literate users. To be able to adapt in an era of ephemeral technologies, decreasing funds, exploding digital collections and vanishing physical collections requires an ability to respond quickly, dynamically and contextually, without waiting for national initiatives. The best national strategy might be one that provides the guidelines, tools and ongoing tuition and training needed to allow individual institutions to design their own solutions.

9 Bibliography
Aharony, N. (2010). Twitter Use in Libraries: An Exploratory Analysis. Journal of Web Librarianship, Vol. 4, Iss. 4.
Andersen, K.V. & Kraemer, K.L. (1994). Information technology and transitions in the public service: A comparison of Scandinavia and the United States. Scandinavian Journal of Information Systems, 6(1), 3-24.
Anderson, R. (2011).
What Patron-Driven Acquisition (PDA) Does and Doesn't Mean: An FAQ. Blog entry [online]. Retrieved on 30 June 2012 from http://scholarlykitchen.sspnet.org/2011/05/31/what-patron-driven-acquisition-pda-does-and-doesnt-mean-an-faq/.
Anfinnsen, S., Ghinea, G. & de Cesare, S. (2011). Web 2.0 and folksonomies in a library context. International Journal of Information Management, 31, 63-70.
Bejune, M.M. (2007). Wikis in Libraries. Information Technology and Libraries, September 2007.
Blyberg, J. (2007). AADL.org Goes Social. Blog entry [online]. Retrieved on 30 June 2012 from http://www.blyberg.net/2007/01/21/aadlorg-goes-social/.
Blyberg, J. (2008). Library 2.0 Debased. Blog entry [online]. Retrieved on 30 June 2012 from http://www.blyberg.net/2008/01/17/library-20-debased/.
Borgman, C.L. (2007). Scholarship in the Digital Age: Information, Infrastructure, and the Internet. MIT Press.
Cahill, K. (2009). User-generated content and its impact on web-based library services. Oxford: Chandos Publishing.
Chalon, P.X., Di Pretoro, E. & Kohn, L. (2008). OPAC 2.0: Opportunities, development and analysis. 11th European Conference of Medical and Health Libraries, June 2008, Helsinki, Finland.
Clevenger, A., Kanning, S. & Stuart, K. (2011). Navigating the Changing Landscape of Social Media within Public and Academic Libraries [online]. Retrieved on 30 June 2012 from http://www.kmstuart87.me/documents/810LiteratureReview.pdf.
DeZelar-Tiedman, C. (2008). Doing the LibraryThing in an Academic Library Catalog [online]. Available at http://dc2008.de/wp-content/uploads/2008/10/11_dezelar_poster.pdf.
Fogg, B.J. (2003). Persuasive Technology: Using Computers to Change What We Think and Do. Amsterdam: Morgan Kaufmann.
Heymann, P. & Garcia-Molina, H. (2009). Contrasting Controlled Vocabulary and Tagging: Do Experts Choose the Right Names to Label the Wrong Things? WSDM '09, Barcelona, Spain.
Holst, N. & Berg, T.E. (2008). Brugerskabte data i OPACs. Kandidatspeciale, Danmarks Biblioteksskole [online]. Retrieved on 30 June 2012 from http://pure.iva.dk/files/30774738/Speciale_brugerskabte_data_i_OPACs.pdf.
LibraryThing (2012). LTFL: Libraries using LibraryThing for Libraries [online]. Retrieved on 30 June 2012 from http://www.librarything.com/wiki/index.php/LTFL:Libraries_using_LibraryThing_for_Libraries.
Lietzau, Z. & Helgren, J. (2011). U.S. Public Libraries and the Use of Web Technologies, 2010. Denver, CO: Colorado State Library, Library Research Service.
Mahmood, K. & Richardson Jr, J.V. (2011). Adoption of Web 2.0 in US academic libraries: a survey of ARL library websites. Program: electronic library and information systems, Vol. 45, Iss. 4, pp. 365-375.
Maness, J.M. (2006). Library 2.0 Theory: Web 2.0 and Its Implications for Libraries [online]. Retrieved on 30 June 2012 from http://www.webology.org/2006/v3n2/a25.html.
Mendes, L.H., Quinonez-Skinner, J. & Skaggs, D. (2009). Subjecting the catalog to tagging. Library Hi Tech, Vol. 27, No. 1, pp. 30-41.
On_tracks (2011). [online]. Retrieved on 30 June 2012 from http://ontracks.dk/background/.
O'Reilly, T. (2005). What is Web 2.0: design patterns and business models for the next generation of software [online]. Retrieved on 30 June 2012 from http://www.oreillynet.com/lpt/a/6228.
Redden, C.S. (2010). Social Bookmarking in Academic Libraries: Trends and Applications. The Journal of Academic Librarianship, Vol. 36, No. 3, pp. 219-227.
Ritterbush, J. (2007). Supporting Library Research with LibX and Zotero. Journal of Web Librarianship, 1(3), 111-122.
Santolaria, A.M. (2009). LibraryThing as a library service. Assessment report [online]. Retrieved on 30 June 2012 from http://e-collection.library.ethz.ch/eserv/eth:777/eth-777-01.pdf.
Shirky, C. (2005). Ontology is overrated [online]. Retrieved on 30 June 2012 from http://shirky.com/writings/ontology_overrated.html.
Skovgaard Jensen, T. (2012). Excuse me, can I see your web 2.0, please? REVY, Vol. 35, No. 1, pp. 4-5.
Smith, G. (2008). Tagging: emerging trends. Bulletin of the American Society for Information Science and Technology, Vol. 34, No. 6.
Spalding, T. (2007). When tags work and when they don't: Amazon and LibraryThing [online]. Retrieved on 30 June 2012 from http://www.librarything.com/blogs/thingology/2007/02/when-tags-work-and-when-they-dont-amazon-and-librarything/.
Spiteri, L.F. (2007). Structure and form of folksonomy tags: The road to the public library catalogue [online]. Retrieved on 30 June 2012 from http://www.webology.org/2007/v4n2/a41.html.
Stam, D.H. (1992). Plus Ça Change... Sixty Years of the Association of Research Libraries. Washington, D.C.: Association of Research Libraries [online]. Retrieved on 30 June 2012 from http://www.arl.org/bm~doc/pluscachange.pdf.
Sweda, J.E. (2006). Using social bookmarks in an academic setting: PennTags. 17th Annual ASIS&T SIG/CR Classification Research Workshop, 31-32.
Tabs, E.D. (2003). Academic Libraries: 2000. National Center for Education Statistics [online]. Retrieved on 30 June 2012 from https://www.ala.org/ala/research/librarystats/academic/nces/2000_ED_Tbs.pdf.
Thomas, M., Caudle, D.M. & Schmitz, C.M. (2009). To tag or not to tag? Library Hi Tech, Vol. 27, No. 3, pp. 411-434.
Vander Wal, T. (2007). Folksonomy [online]. Retrieved on 30 June 2012 from http://vanderwal.net/folksonomy.html.
Voss, J. (2007). Tagging, folksonomy & co: Renaissance of manual indexing? In: The 10th International Symposium for Information Science, Cologne.
Walker, J. (2010). Leveraging the power of aggregation to achieve an enhanced research environment. Stellenbosch University Annual Library Symposium / IFLA Presidential Meeting 2010.
Walsh, A. (2011). Gamifying the University Library. In: Online Information Conference 2011, 29 November - 1 December 2011, London [online]. Retrieved on 30 June 2012 from http://eprints.hud.ac.uk/11938/.
Weller, K. & Peters, I. (2008). Seeding, weeding, fertilizing: Different tag gardening activities for folksonomy maintenance and enrichment.
Xu, C., Ouyang, F. & Chu, H. (2009). The Academic Library Meets Web 2.0: Applications and Implications. The Journal of Academic Librarianship, 35(4).
Yi, K. & Chan, L.M. (2009). Linking folksonomy to Library of Congress subject headings: an exploratory study. Journal of Documentation, Vol. 65, No. 6, pp. 872-900.

10 Appendices

Appendix I: Original Email in Danish

Emne: Brugerskabte data og brugerspor i forskningsbiblioteker

Hej [Firstname]

Jeg er kandidatstuderende på IVA, og er i gang med et projekt om brugerskabte data i danske forskningsbiblioteker. Målet er at opbygge et billede af, hvilke projekter er undervejs i danske forskningsbiblioteker, og hvad danske forskningsbibliotekarer selv synes/ved/håber/forudsige om dette emne. Til dette formål har jeg lavede et spørgeskema (https://www.surveymonkey.com/s/forskningsbiblioteker) som jeg ville være meget taknemmelig, hvis du vil afsætte 5-10 minutter af din tid til at udfylde. På forhånd tak.
Nicholas Paul Atkinson

Appendix II: Original survey in Danish

Appendix III: List of research libraries taken from http://www.dfdf.dk/index.php?option=com_content&view=article&id=132&Itemid=69

Amternes og Kommunernes Forskningsinstitut. Bibliotek
Arbejderbevægelsens Bibliotek og Arkiv
Arkitektskolen i Århus
ASB Bibliotek. Handelshøjskolen. Aarhus Universitet
Bibl. for Matematiske Fag
Bibliotekarforbundet
Biblioteket for Sprog, Litteratur og Kultur
BIVIL Lunds Universitet
CBS Bibliotek
Center for Idræt, Biblioteket
Danmarks Designskoles Bibliotek
Danmarks Jordbrugsforskning
Danmarks Kunstbibliotek
Danmarks Meteorologiske Institut
Danmarks Nationalbank
Danmarks Pædagogiske Bibliotek
Danmarks Statistiks Bibliotek
Danmarks Tekniske Informationscenter
Dansk BiblioteksCenter A/S
Det Administrative Bibliotek
Det Danske Filmmuseum Bibliotek
Det Informationsvidenskabelige Akademi
Det Jydske Musikkonservatorium
Det Kongelige Bibliotek
Det Kongelige Bibliotek, Københavns Universitetsbibliotek
Det Kongelige Danske Musikkonservatorium
Det Kongelige Teaters Bibliotek
Det Teologiske Fakultets Bibliotek
DSI - Institut for Sundhedsvæsen, Biblioteket
DTU Aqua
Erhvervsakademi Århus
Erhvervsakademiet f. Byggeri & Produktion, Biblioteket
Fagbiblioteket
Folketingets Bibliotek
Forsvarets Bibliotek
Forsvarets Materieltjeneste
Fujitsu
Gentoftebibliotekerne
Glostrup Hospital
Herlev Hospital
HIH Bibliotek, Aarhus Universitet
Hillerød Hospital, Sundhedsvidenskabeligt Bibliotek
Historisk Institut, Århus Universitet
IRCT International Documentation Centre
IT-biblioteket Katrinebjerg
Kemisk Institut. Århus Universitet
Kriminalforsorgens Uddannelsescenter
Kunstakademiets Arkitektskoles Bibliotek
Kunstindustrimuseets Bibliotek
KVINFO
Københavns Tekniske Bibliotek
Københavns Universitet
Mediateket, Herning Gymnasium
Medicinsk Bibliotek
Ministeriet for Sundhed og Forebyggelse
Moesgårdbiblioteket
Museologisk Bibliotek
Musikhistorisk Museum & Carl Claudius' Samling
NIAS Library and Information Centre
Nordsøcentret
NOTA
Novo Nordisk A/S
Nunatta Atuagaateqarfia
OASIS - Behandling og Rådgivning for Flygtninge
Odense Tekniske Bibliotek
Odense Universitetshospital
Professionshøjskolen Metropol
Professionshøjskolen UCC
Professionshøjskolen University College Lillebælt
Professionshøjskolen University College Vest
Psykiatrien Region Sjælland, Psykiatrisk Videncenter
Region Hovedstaden Psykiatri, Psyk. Center Sct. Hans
Regionshospitalet Viborg
Rigshospitalet
Risø Bibliotek
Roskilde Handelsskole
Roskilde Universitetsbibliotek
SCCISM
Servicestyrelsens Bibliotek
Socialforskningsinstituttet
Statsbiblioteket
Styrelsen for Bibliotek & Medier
Sundhedsstyrelsen. Biblioteket
Syddansk Universitetsbibliotek
UCSJ
University College Nordjylland, Biblioteket
Vendsyssel Historiske Museum
VIA Bibliotekerne
Æstetikbiblioteket
Aalborg Sygehus. Medicinsk Bibliotek
Aalborg Universitetsbibliotek
Århus Tekniske Bibliotek

Appendix IV: Libraries that responded

ASB Bibliotek, Aarhus Universitet
Bibliotek for Matematiske Fag, Aarhus Universitet
Biblioteket Metropol
CBS Bibliotek
Danmarks Kunstbibliotek
Danmarks Pædagogiske Bibliotek / Aarhus University Library
Danmarks Statistiks Bibliotek
Danmarks Tekniske Informationscenter
Det Administrative Bibliotek
Det Danske Filminstituts Bibliotek
Det Juridiske Fakultetsbibliotek
Det Kgl. Bibliotek. Det Humanistiske Fakultetsbibliotek
Det Kgl. Danske Musikkonservatorium, Biblioteket
Det Kongelige Bibliotek
Det Kongelige Bibliotek (Orientalsk Samling og Judaistisk Samling)
Det Kongelige Bibliotek, Kort- og Billedafdelingen
Det Natur- og Sundhedsvidenskabelige Fakultetsbibliotek
Det Samfundsvidenskabelige Fakultetsbibliotek, KB
Det Samfundsvidenskabelige Fakultetsbibliotek, Aarhus Universitet
DSI-Biblioteket
DTU Bibliotek
Fagbiblioteket Regionshospitalet Viborg
Fakultetsbibliotek for Natur- og Sundhedsvidenskab ved KU
Forsvarets Bibliotek
Institut for Kemi, Biblioteket
Institut for Statskundskab (closed 1 April 2012)
IT-biblioteket Katrinebjerg
IVA Biblioteket
KADK
Kongelige Bibliotek
KUB Samfundsvidenskab
KVINFOs Bibliotek
Københavns Universitetsbibliotek Nord
Mediateket
Moesgårdbiblioteket
NIAS
Nobelbiblioteket
Nordisk Institut for Asienstudier
Nota
Psykiatrisk Forskningsbibliotek
Pædagogisk Sundhedsfagligt Bibliotek
Samfundsvidenskabeligt Fakultetsbibliotek
SDUB
Socialstyrelsens Bibliotek
Statsbiblioteket
Syddansk Universitetsbibliotek
Teologisk Fakultetsbibliotek
UCN Biblioteket
UCSJ Biblioteket
University College Lillebælt
VIA Bibliotekerne
Aalborg Universitetsbibliotek