Implementing a `Del.icio.us` like system in an academic library
Implementing a 'Del.icio.us' like system in an academic library discovery environment

Ede Girish Renu Goud
s1050984
University of Edinburgh
Master of Science
Computer Science
School of Informatics
University of Edinburgh
2011

Abstract

The world of library search systems is undergoing a change. Recently, a lot of focus has been placed on extending the default search functionality provided by resource discovery tools and services. Moreover, new library search systems like Summon claim to be faster than their predecessors and expose APIs to allow development of custom tools. The Digital Library at the University of Edinburgh is in the process of evaluating such new systems and is exploring the features provided by them. In this process, Summon by Serials Solutions and EDS by EBSCO have been rated better than other systems. This project explores the challenges of extending Summon's features and exploits its public PHP-based API to construct an application providing annotation and tagging based find services to digital library users in an academic library discovery environment, thereby providing new means of findability and personalization of search. Further, challenges in designing the user interface for such an application are also discussed, with suggested solutions, to serve as guidelines for future developers of such applications in the Digital Library. Finally, the results of usability tests based on nine end-user testing scenarios are presented, along with the expectations test users hold of this service, to serve as inputs for the decision making committee at the Digital Library.

Acknowledgements

I would like to thank Mr. Colin Watt for his strong vision, kind advice and ongoing support throughout the project. His organizational skills have heavily influenced the development of this new service in the Digital Library, with features going well beyond the original concepts planned in the proposal.

I would like to thank Mr. Mark McGillivray, our Informatics Research Proposal group tutor, for his sharp insights into the technical challenges that I was going to face over the summer, offered several times beyond the allotted class duration. Each of his inputs on the development of Tagus was worth trying out in developing the final product.

I would like to thank Ms. Angela Laurins, our usability expert, who kindly took time out of her busy schedule and helped organize and run the usability testing of Tagus. Her encouraging words are an inspiration and have urged me to push my work beyond what was originally planned.

I would like to thank Ms. Ianthe Hind for her technical help during the development of Tagus, in helping achieve work standards like continuous technical testing, small releases, daily deployment, etc. as practiced in the Digital Library Development Team.

I would like to thank Ms. Claire Knowles for her inputs on the technical standards being followed by the Digital Library websites and on cross-browser and cross-platform portability, for putting up with my requests for deployment on several late nights, and for her help in fixing several bugs that came up during the course of this project.

I would like to thank all the participants who responded to our invitation, took time out to participate in the usability testing, and gave us precious feedback as well as suggestions for prospective features to be developed in future.
Declaration

I declare that this thesis was composed by myself, that the work contained herein is my own except where explicitly stated otherwise in the text, and that this work has not been submitted for any other degree or professional qualification except as specified.

Girish Renu Goud Ede
s1050984

To Everyone who is the one and the one who is Everyone

Table Of Contents

i. Title
ii. Abstract
iii. Acknowledgements
iv. Declaration
v. Dedication
vi. Table of contents
vii. List of figures
viii. List of tables
ix. List of graphs

1. Chapter 1. Introduction
   1.1. Motivation for this Chapter
   1.2. Introduction
   1.3. Definitions
   1.4. Purpose
   1.5. Overview of bigger plans and goals driving this project
   1.6. More Background and Related Work
   1.7. Aims
   1.8. Thesis Outline
2. Chapter 2. Initial Work
   2.1. Motivation for this Chapter
   2.2. Pre-Requirements-Collection-Stage
   2.3. A Brief Study Of University of Edinburgh's Summon Instance
   2.4. Requirements Collection
   2.5. Functional Requirements
   2.6. Non-functional Requirements
      2.6.1. Implications and Constraints while choosing technologies for interaction with Summon
      2.6.2. Constraints due to standalone mode
      2.6.3. Usability requirements
         2.6.3.1. Definitions of Usability, Kinds of Usability Criteria Available
         2.6.3.2. Choices Made From Available Usability Criteria
         2.6.3.3. Quantitative Evaluation, What Criteria Fit Into Quantitative Evaluation?
         2.6.3.4. Qualitative Evaluation, What Criteria Fit Into Qualitative Evaluation?
         2.6.3.5. Why Not Other Criteria? Limitations, Constraints, Overlaps
   2.7. Evaluation Plan
      2.7.1. Constraints on automated testing
   2.8. Guidelines to future developers
3. Chapter 3. Design
   3.1. Motivation for this Chapter
   3.2. Architecture
   3.3. Design Of Core Services
   3.4. Design Of Standalone API
   3.5. Final Class Diagrams
   3.6. Final Deployment Diagrams
   3.7. Design of the UI
      3.7.1. Design Rationale
      3.7.2. Expert usability inputs
      3.7.3. Two column layout versus single column layout
      3.7.4. Searchable annotations versus non-searchable annotations
      3.7.5. Tags List versus Tags Cloud
   3.8. Guidelines to future developers
4. Chapter 4. Implementation
   4.1. Motivation for this Chapter
   4.2. Methodology
   4.3. Technologies
      4.3.1. Criteria
      4.3.2. Constraints
   4.4. Further Technical Details
   4.5. Workflow Chart of development
   4.6. Guidelines to future developers
5. Chapter 5. Evaluation
   5.1. Motivation for this Chapter
   5.2. Methods
      5.2.1. Preliminary Steps
      5.2.2. The Interviewing Protocol Setup
      5.2.3. The Participant Selection and Contacting Process
      5.2.4. The Data Collection Method
      5.2.5. My preparation for the testing sessions
      5.2.6. Test Scenarios
   5.3. Overview of Results Sections
   5.4. Reports For Future Developers
      5.4.1. Report on working with Summon API
      5.4.2. Report on working with Elastic Search
   5.5. Reports For Digital Library's Decision Makers
      5.5.1. speed of service in retrieving tags and speed of interaction
         5.5.1.1. Quantitative Evaluation
         5.5.1.2. Qualitative Evaluation
      5.5.2. usability of user interface
         5.5.2.1. Tasks & Questions
         5.5.2.2. Qualitative Evaluation
      5.5.3. ease of learning :: first vs second time usage
         5.5.3.1. Qualitative Evaluation
      5.5.4. findability :: Summon [does not have tagging] vs Tagus [uses Summon]
         5.5.4.1. comparative evaluation :: Summon only vs Summon + Tagus
         5.5.4.2. Report on accuracy of public tagging affecting findability
      5.5.5. ease of creating reading lists :: manual vs Tagus
         5.5.5.1. Cooperative Evaluation
6. Chapter 6. Conclusions & Future Work
   6.1. Motivation for this Chapter
   6.2. Limitations
   6.3. Future Work
7. References
8. Appendix
   8.1. Tagus Screenshots
   8.2. Test Script
   8.3. Test Plan
   8.4. Access Details To Location Containing Actual Data collected during evaluation

List of figures

Figure 3.1. Architecture Of Tagus
Figure 3.2. Modules of Tagus
Figure 3.3. Modules of Tagus Standalone API
Figure 3.4. Complete Class Diagram Of Tagus' Server
Figure 3.5. Complete Class Diagram Of Tagus' Client
Figure 3.6. Deployment Diagrams of Tagus
Figure 3.7. Two Column Layout in UI
Figure 3.8. One Column Layout in UI using "Search Summon"
Figure 3.9. One Column Layout in UI using "Public tags"
Figure 3.10. Searchable hyperlinked personal annotations
Figure 3.11. Non-Searchable plain-text personal annotations
Figure 3.12. Tags list for a user
Figure 3.13. Tags cloud for a user
Figure 4.1. Bootup of an Elastic Search instance
Figure 4.2. Iterative software development workflow
Figure 5.1. Timing of calls to Elastic Search's instance in browser one
Figure 5.2. Timing of calls to Elastic Search's instance in browser two

List of tables

Table 2.1. Table of sources
Table 2.2. Table of Functional requirements
Table 2.3. Table of Usability criteria chosen for Tagus with justification
Table 2.4. Criteria qualifying for quantitative evaluation
Table 2.5. Criteria qualifying for qualitative evaluation
Table 3.1. Table of core services
Table 4.1. Table of technologies

List of graphs

Graph 5.1. Ease of learning as measured for two specific tasks
Graph 5.2. Ease of learning as measured across all tasks
Graph 5.3. Ease of learning as measured across all tasks using "satisfaction with results"
Graph 5.4. Findability: How EASIER is finding a resource in Tagus vs in Summon

1. Chapter 1. Introduction

1.1. Motivation for this Chapter

In this chapter we introduce our project; we describe its purpose, the concepts involved, background, aims and related work. We start with an introduction to the world of digital library systems and define some features characterizing such systems. We then proceed to describe the purpose of this project in the light of ongoing work at the Digital Library, University of Edinburgh. The next section presents the background work leading to the proposal and the execution of this project. The subsequent section outlines the aims of this project in a fourfold manner. Also, related work going on in other digital library systems is concisely described in the penultimate section. Finally, we present the thesis outline.

1.2. Introduction

Resources in a digital library include e-journals, e-books, databases, research publications and other digitized materials. Every year, millions of searches and downloads take place in the current infrastructure. Resource discovery in the digital library is often characterized by several factors, viz., speed, availability, findability and personalization. Speed refers to how quickly results are fetched once a user searches for a resource. While availability implies the presence of a particular digital resource in the digital library, findability means the ability to find a currently present resource through searches and other methods.
In other words, findability is the tendency of an existing resource to come up in the results of searches. Thus, availability is defined for a digital resource, while findability can be defined for a digital resource, for a digital library user or for a digital library search service. Finally, personalization is defined for a digital library search service as the set of capabilities of the system to provide means of storing personal [3] and relevant information for existing resources, so that future searches made by the same user give out results augmented by personally relevant information. According to [13], "digital libraries can be cold, isolated, impersonal places that are inaccessible to both machines and people". Thus, when such personally relevant information is shared publicly across users, personalization itself becomes a measure of relevancy.

1.3. Definitions

For the purpose of this thesis, the following terms have been used in this document in the sense of the definitions below.

1. Federated searching is the process of searching multiple database sources simultaneously via a single search interface [19].
2. A resource discovery system is a tool or service, used by librarians and end-users to manage and find either a) information about a resource or b) a resource itself or c) both.
3. Personalization is the process of incorporating user generated contents or user inputs before presenting information to a user.

1.4. Purpose

Available metrics have demonstrated very high use of digital library resources at the University of Edinburgh [1]. Anecdotal evidence from various sources has suggested that federated search using Webfeat [11] has not delivered an optimum discovery service [2]. The library commits a significant proportion of its materials budget towards increasing the success of the search system, increasing its usage and reducing the cost-per-use of the e-resources [1]. With this goal in mind, the Digital Library Section in Information Services at the University of Edinburgh is in the process of evaluating two "next generation" discovery services, viz., Summon by Serials Solutions [5] and EDS by EBSCO*. Summon is labeled as a "single search box, show results, filter results by metadata" system, while EDS is "a unified, customized index of an institution's information resources, and an easy, yet powerful means of accessing all of that content from a single search box" [5]. Summon, through its API [4], provides means of developing custom applications, while EDS does not provide such an API.

*The results of surveys and other evaluation procedures carried out on implementations of Summon in some Higher Education Institutions or HEIs are publicly available. For more details and examples, please refer to [2] and [3].

As part of this evaluation, with an aim of introducing new means of findability and personalization in library search systems, with the initiative of Mr. Simon Bains, the earlier Head of Digital Library, University of Edinburgh, and with the consent of the Library Committee at the University of Edinburgh, this project was undertaken to construct and assess the application of annotation and tagging based find services in an academic library discovery environment, and thereby create a tool that, based on the content of the Digital Library, can provide users with added value in discovering that content.

1.5. Overview of bigger plans and goals driving this project

[With inputs from an initial interview with the supervisor]
1.5.1. Overview of background work leading to this project's proposal

The University of Edinburgh selected two resource discovery systems for testing over a minimum period of twelve months. This decision was made in order to reach a better level of understanding about a number of issues which could not be resolved in the time period permitted by the procurement process. This decision was endorsed by the Project Board, but with a firm steer that "we must not cause user confusion during the evaluation period. It was anticipated that we would go live with one of the selected systems early in 2011". The system selected for launch under the "Searcher" service label was EDS from EBSCO (the higher scoring system in the procurement). "It was felt that moving Summon to live, either as a replacement for EDS or in addition to it, during the 12 month evaluation period would also cause confusion, so Summon was not planned for general public release during the evaluation stage of the project". Instead, access was intended to be provided in a managed way to target user groups as part of the "User Engagement" project work package. This has meant that "we have not been able to compare usage metrics of two competing systems, but the value of this data is outweighed by the need to avoid user confusion and irritation by requiring them to make a choice".

One of the distinguishing features of Summon over EDS (at the time of the procurement) was the availability of an API to allow solutions to be built locally utilising the Summon functionality and integrating results and features into other local tools and systems; "it was felt that this MSc project proposal could explore this area and help inform the project team's decision making, viz. whether the Summon API could enable the choice of this product over EDS, or if this additional flexibility did not provide sufficient anticipated added value over EDS".

1.5.2. The importance of the timing of carrying out this project

The timing of this work is important due to the amount of testing work needed to make a valid evaluation of the two products and the availability of Digital Library staff to the project. "The MSc project allows us to explore some of the tasks in the "Usability" project work package in more depth than otherwise possible in the time available. It is important that we are able to make a valid business case to either continue with both products for a defined period, or select one over the other, and this work plays a vital part in that decision-making process". It is expected that recommendations will be made before November 2011 on the future direction of the discovery service(s).

1.5.3. The prospective value of this project in the overall scheme

"We see value in this work exploring different approaches to help assess user behaviour in ways that it may not otherwise have been possible to pursue, so the approaches used in this project still offer valuable inputs".
The landscape of "library systems" is changing. Over the next few years, the boundaries between what were previously viewed as separate "systems" for managing traditional library activities and workflows, providing access to traditional catalogues, an ever-increasing number of electronic resources, and open-access agendas are expected to blur, with a move towards more comprehensive library services based on commercial proprietary or open-source technologies; "these will ideally facilitate the management of workflow, incorporate enhanced discovery tools, and be interoperable with other systems inside and outside the library". With increasing commoditisation of IT service components, there is likely to be an increasing move to outsourcing and migration of system and service components to 3rd parties using variations on the cloud model, putting a focus on libraries to add value in the areas where they retain specialist expertise; "this MSc project and its contribution towards the current resource discovery project will provide outputs which are useful in exploring the aspects of user behaviour for input to library strategy over the next 2 years".

1.5.4. The stakeholders

In addition to the users of the Digital Library at the University of Edinburgh, this work would be of interest to the global library research community, as resource discovery and the emergence of solutions like Summon are presently of high strategic interest in this sector. The scope of this proposed service could potentially include the currently growing list of organizations and institutions moving towards Summon-backed academic library environments [2, 3]. "This work has a potential impact on the nature of services provided by the Digital Library section of UoE's Information Services. By implication this means staff and students of the University, its partners, and potential new students worldwide. The ultimate aim is to provide a "one-stop-shop" for discovery of all electronic resources, and tools to support better sharing of those resources in learning, teaching, and research activities. Examples of other relevant work happening in UoE include development of mobile based applications for staff and students; it would be interesting to consider some of the ideas explored on this project into scoping of future phases of that work".

1.6. More Background and Related Work

A user's resource discovery process in an academic research library presently relies on individual discovery, supported by reading lists and subject guides provided by academic and support staff. The LibQUAL results, reported to the Library Committee, showed satisfaction with the collections to be low. As [1] points out, one very important reason may be that the items which users want are just not available in Edinburgh's Digital Library. Another reason, however, is that the findability of resources could be improved, as indicated clearly by Simon Bains in, "while many users do find the resources they require, others find this difficult" [1]. Difficulty in improving satisfaction can be attributed to factors such as speed, findability and personalization. Summon claims to meet the speed challenge with its unique multitiered indexing service [5]. However, there is an opportunity to allow users to share information on the system about resources which were particularly useful, whether on reading lists or found through their own search processes, thus leading to this proposed work in personalization of resource discovery [3].
This work proposes to meet the other significant challenge, of findability, too. Currently, Summon uses its custom data mining methods to index digital library resources, and the same data is used to return results to users based on their queries. New library discovery systems like Summon offer APIs that support customized tool development [4], which makes it easy to introduce additional functionality that would previously have been impossible without substantial vendor support. This project constructs a tagging and annotation index which is used in conjunction with Summon, so that retrieved items are accompanied by annotations and tags. By annotations, we mean personal statements added by a user, solely meant for his / her personal viewing. Tags, on the other hand, can be seen by everyone and allow users to browse to other resources carrying the same tag. [14] labels this feature as "collaborative tagging" and [15] introduces the notion of "induced tagging to refer to social bookmarking with two key characteristics: (1) a well-defined group of participants are knowledgeable on the available resources and the background of the user community; and (2) tagging is required as part of their regular responsibilities as a reference team.". In this light, the prospective well-defined group of participants would be students, researchers, librarians and professors from the university.

From [2], "Libraries often fail to make their resources discoverable and this may in turn affect the perceived value of the library". Following a review of the state of its search facilities, the University of Huddersfield became the first UK commercial adopter of Summon in 2009 [2]. During this process, several issues were found and a customized instance of Summon was delivered to fix the problems. The University of Huddersfield is not the only university to evaluate a new age system like Summon. The case for a single "one-stop-shop" approach to resource discovery is argued for in [20], which points out that "the variety of systems in place [at a university] were not always as interoperable" as expected. It further highlights that "federated search can be slow and in many cases the users find it complicated to use". Finally, it points out that customization of a resource discovery system should be made based on data available from "log analysis" and "usability testing". The list of universities which use a Summon based system includes Michigan's Grand Valley State University, Arizona State University, The University of Sydney, Penn State University, University of Adelaide, etc. [21, 22, 23, 24, 25].

While recent focus in the literature seems to be on evaluating the usability of new age resource discovery systems, the University of Edinburgh is keen to evaluate how extendible these new systems are, how much functionality is offered in their public APIs and how easy or difficult developers find it to develop new tools and services using them. Finally, existing search systems in academic environments do not directly support the development of extended applications. Therefore, in the light of this limitation, it is important to develop this service to be also available on a standalone basis, like Delicious [12], so that new applications can be developed based on it in future; e.g., an increasingly rich collection of records could be made available to all students, organised by relevant course, year, etc. as more and more resources get tagged in the Digital Library.
This makes use of the knowledge that tags in this application are essentially user generated content, and that the onus of maintaining the tags for resources lies with the user community as a whole. Thus, personalization, in this context, is applicable not only at a single user level, but also at a community level. In other words, a single tag like "reading-list-msc-2011" for a resource such as "informatics introduction text book" makes the resource personalized for a whole community, viz., all MSc students studying at the university in 2011.

1.7. Aims

The aims of this project are fourfold: 1) gather requirements for the application to be implemented; 2) perform a background study of the existing Summon API and other tools deemed necessary for the implementation; 3) perform design-implement-review-redesign cycles for the implementation of a service offering tagging services for library resources at two levels, viz., code/API level and user interface level; and 4) deploy the system on a publicly accessible server, gather the challenges in the development of this service and gather the results of usability testing of this service. These involved working closely, throughout the project, with several members of staff of the Digital Library office at the university, who are working in developer and usability expert roles. Therefore, this involved taking the initiative in gathering required materials, making certain semi-supervised decisions, offering available choices for the course of work to follow at weekly team meetings, and following up with the supervisor and staff to arrive at this synchronized piece of work called Tagus.

1.8. Thesis Outline

This work is organized as follows. In this chapter we introduced our work, outlining the purpose and background. In chapter 2 we present work performed to gather requirements for our project, including a background study of Summon's instance for our university. In chapter 3 we describe the architecture of the application developed, what constraints came into the picture during design and what design decisions have been made to encompass them. In chapter 4, we describe the implementation of the design, integrating the various developed components, attempts to make them cross-browser compatible and challenges faced during technical test driven software development. In chapter 5, we describe the evaluation methods used in this study and the results gathered from usability tests. The results have been presented in two categories, viz., one in the form of technical reports about using Summon's public API and other technologies and the other in the form of pilot user testing findings. Finally, in chapter 6, we summarize the results and challenges and propose some guidelines for future developers. We also present the limitations of this work and propose prospective extensions, customizations and new applications of these technologies for future work. Chapter 7 lists the references and chapter 8 is a summarized set of actual data gathered in the field and screenshots of the actual service, as deployed and in working order, for reference.

Please note that while the work itself was done using an iterative software development methodology, it is being presented section-wise in the chapters to follow.

2. Chapter 2. Initial Work

2.1. Motivation for this Chapter

In this chapter we describe the whole of the background study, involving all the work done before we started designing the system.
This also includes the requirements gathering process for our project, along with the challenges encountered during the process. Please note that this phase was carried out throughout the project, as development happened in weekly cycles. Also, kindly note that the gathering of requirements has itself been an integral part of the work description for this project, as there was no existing framework to just start working with. This was envisioned by the supervisor in the previous semester and we had accordingly planned for it. We start with a description of the information available before work on the project was started. We then present a brief study on the Summon instance for the University of Edinburgh. This background study helped build the scope of this project's outcomes. The subsequent section outlines the methods used in gathering the actual requirements. This is important in the light of the fact that the resources on hand at the start of the project were just 1) a single PHP-based API file to make calls to Summon and 2) the above mentioned Summon instance. The next couple of sections list the gathered and approved requirements, segregated into two types, viz., functional and non-functional. Finally, we present and highlight usability requirements, so as to arrive at a suitable set of usability tests for such a service, to be run at the end of its development. Since the development of such a customized service using new generation resource discovery systems is novel in academic library environments, as were the cases for the requirements and the code, there was no existing framework to base our usability tests on. In this light, it is worth mentioning the time and effort spent by the supervisor and the student over several meetings in fine-tuning and arriving at what has been presented below.

2.2. Pre-Requirements-Collection-Stage

Before beginning the study of the university's Summon instance, an agreement was made with Serials Solutions to obtain a unique API ID and Key pair. This involved the signing of Summon's terms and conditions by the supervisor. One of the key restrictions of this agreement was, in effect, not to run automated queries with their API. This was followed by studying the existing online documentation [25]. The API is broadly divided into three categories, viz., authentication, availability and search. The authentication category deals with generating headers for query requests made using the availability and search API categories. The availability category deals with generating requests and fetching results in various supported formats [XML, JSON, Streaming JSON]. Finally, the search API category provides means to query for resources present in the Digital Library, using query strings, unique resource identifiers, etc. Please note that the whole of this API category is too large to be described in its entirety and to fit within the scope of this document. Broadly, it supports extended or advanced searches based on concepts like commands, parameters, fields and tokens. The important takeaway from this section is that these advanced features help exploit the power of Summon in performing server-resource-intensive search operations and support advanced user interface functions like pagination of query results, filtering a set of results using several criteria at once to generate a subset, etc.
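To give a feel for how the authentication and search categories fit together, the listing below is a minimal sketch of signing and issuing one paginated search query from PHP. The host, path, header names, query parameter names and the HMAC-SHA1/Base64 digest recipe follow our reading of the Summon documentation [4] and should be verified against it before use; the credentials are the API ID and Key pair mentioned above.

<?php
// Minimal sketch, assuming the documented signing scheme: build an ID
// string from the request details, digest it with the API key, and send
// the digest in an Authorization header alongside the search query.
$apiId  = 'your-api-id';   // issued by Serials Solutions
$apiKey = 'your-api-key';
$host   = 'api.summon.serialssolutions.com';
$path   = '/search';
$params = array('s.q' => 'informatics', 's.ps' => '10', 's.pn' => '1'); // query words, page size, page number
ksort($params);                        // the ID string expects the parameters in sorted order
$query  = http_build_query($params);
$accept = 'application/json';
$date   = gmdate('D, d M Y H:i:s \G\M\T');
$idString = implode("\n", array($accept, $date, $host, $path, urldecode($query))) . "\n";
$digest = base64_encode(hash_hmac('sha1', $idString, $apiKey, true));
$ch = curl_init('http://' . $host . $path . '?' . $query);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Accept: ' . $accept,
    'x-summon-date: ' . $date,
    'Authorization: Summon ' . $apiId . ';' . $digest,
));
$results = json_decode(curl_exec($ch), true); // decoded JSON result set
curl_close($ch);

The page size and page number parameters in this sketch are what make the server-side pagination discussed later in this chapter possible.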
2.3. A Brief Study Of University of Edinburgh's Summon Instance

The university's Summon instance is hosted at http://ed.summon.serialssolutions.com/, on a server managed externally by Serials Solutions. The actual data displayed on searching via Summon is provided by the university. All the metadata, schema, storing and retrieval of actual data, indexing it, the search process mechanism, etc. are handled by Summon in a proprietary manner. The search interface via the browser shows two choices, viz., Basic Search and Advanced Search. The Basic Search feature is just like a modern day search interface, with a single search box and a search button provided. The Advanced Search feature provides a big form of several input boxes, one for each kind of constraint that the user wants to apply on the search, along with the words to be searched for. Using either choice, once a search has been made, the page navigates to a search results page. This results page contains a set of results and a set of options for further constraining or filtering the already displayed search results. Each result represents one resource in the Digital Library collection. For each result, several pieces of information are presented in the form of attribute=value pairs. Such displayed pairs do not represent all possible pairs. On hovering over a "full view" icon, all retrievable information is displayed in a mini popup pane. The results themselves are split over several pages [with page numbers starting from 1 up to n, where n is sufficient to accommodate all the results of the output of the search]. Clicking each page number, in turn, fetches the corresponding segment of the search results. While the results are displayed in a vertical list in the middle of the browser screen, the constraining options are displayed unobtrusively on the left side. Each constraining option is called a facet. Changing the value of a facet at any time applies the corresponding constraint on the displayed search results and redisplays the results, using a new pagination of the freshly generated results list. Where applicable and available, an icon associated with the resource, like a snapshot of the cover of a book, is displayed. A user can bookmark several resources by clicking the appropriate search result's "save item" button. This list is saved only for the current session of a non-logged-in user. The saved set of items can be accessed at the bottom of the screen and printed. Owing to a lack of permission from Summon, its screenshots are not being provided in this document. However, it is gently suggested to the reader that a visit to the above URL [to directly experience Summon's capabilities exposed via their user interface] would be helpful for relating to the details described in this document.

2.4. Requirements Collection

The primary requirements for the development of Tagus have been collected, under the supervision of Mr. Colin Watt, over a period of 6 weeks from June 01, 2011 to July 14, 2011. During this period, while "Del.icio.us" provided the base idea for the interaction model of this system, the other models, viz., storage, user interface, portability and authentication for this software, were extracted from various sources. These have been listed exhaustively below:

Table 2.1. Table of sources
1. Regular weekly meetings with Mr. Colin Watt
2. Specific usability expert review meetings with Ms. Angela Laurins
3. Previous communication sequences with Ms. Morag Watson, Mr. Simon Bains
4. Weekly developer to developer meetings with Ms. Ianthe Hind, Ms. Claire Knowles
5. Incremental iterative development and corresponding review, taking in other parallel projects [both running and deployed] of the IS section of the Digital Library, University of Edinburgh.

The next couple of sections list the requirements themselves in detail, followed by the evaluation plan derivable directly from these requirements. Kindly note that any constraints and special conditions have also been duly noted in the appropriate sections. Also, under non-functional requirements, a specific subsection has been devoted to usability requirements, in accordance with the importance the Digital Library gives to usability. Together, the following sections give an estimate of the size of the project, spanning the organization, management, development, deployment and testing areas.

2.5. Functional Requirements

For each functional requirement, its classification, its origin and a justification of the purpose it achieves have been given along with the description.

Table 2.2. Table of Functional requirements

Legend
SST: Supervisor, Student [Team]
DXT: Student, Digital Library Developers [Team]
UXT: Student, Usability Expert [Team]

1. Users should be able to login and logout of the service using individual login and password combinations.
Classification: authentication
Origin: SST
Justification: The service should be protected with an authentication procedure. This is to prevent anonymous usage of the service.

2. Users should be able to search for resources using plain words, starting from a single search box accepting input. This search functionality should perform similarly to the University of Edinburgh's Summon instance. It should display the list of all resources as returned by Summon.
Classification: summon's public api
Origin: SST
Clarification: Replicating the entire user interface, professionally developed over several years by full time employees of Serials Solutions, including support for advanced features like all facets, commands, filters, etc., is counter-productive in two ways.
Justification: Firstly, this project, Tagus, is about the concept of applying tagging in an academic library environment and not about redeveloping a new age library resource discovery system. Secondly, replicating all the user interface features would itself take several months to years and would be beyond the scope of the duration of this project.

3. A user should be able to add a personal annotation to a resource.

4. A personal annotation for a particular resource is displayed every time that resource comes up in the user's future searches. The annotation for the resource is visible only to the user who created it.

5. A user should be able to delete an existing personal annotation on a resource.
Classification: tagging
Origin: SST
Justification: Personal annotations should be visible only to the user who creates them. This allows for incorporating personalization with privacy in place for the user.

6. A user should be able to add a public tag to a resource.

7. A public tag for a particular resource is displayed every time that resource comes up in any user's future searches. The tag for the resource is visible to all users of the service.

8. A public tag, whenever and wherever it is displayed next to a particular resource, should be clickable. On clicking, a new search functionality should display the list of all the resources with the same public tag.
9. A user should be able to delete an existing public tag on a resource, provided he / she created that public tag for that resource.
Classification: tagging
Origin: SST
Justification: Public tags should be visible to all users of the service. This extends personalization to the community level and provides a new means of findability.

10. The data returned from Summon's API for each library resource should be displayed in the user interface.
Classification: user interface navigation
Origin: UXT
Clarification: One aim for Tagus is to have less clutter in the user interface while displaying results.

Beyond The Proposed Requirements

The following requirements have been gathered after the fifth and the sixth weekly cycles of iterative software development of Tagus and have been implemented, going well beyond the planned requirements set. These required additional work and have been successfully implemented.

11. A user should be able to view a list of all the public tags that he / she has created to date. This functionality would be referred to as the tag cloud.
Classification: user interface navigation
Origin: Student, Approved By UXT
Justification: Multiple routes to start a user's search for library resources using tags would help achieve user satisfaction.

12. A user should be able to search for resources marked with a particular public tag, by supplying the tag as input to a search box.
Classification: user interface navigation
Origin: Student, Approved By DXT
Justification: Starting with a search using plain words would result in a list of resources being displayed. Next to each resource, a list of public tags would be displayed. A user could click on any of these tags and this would result in a new search for all resources marked by the same tag. While this functionality definitely suffices for the originally proposed requirements, a use case has been visualized in which a user could want to generate a new search using a tag word which is not in the list of results.

13. Every time a search functionality gets executed, the results of the search should be displayed in a fixed format, split across several pages, showing a limited number of resources per page.
Classification: usability guidelines, user interface navigation
Origin: Student, Approved By DXT, UXT, SST
Justification: Summon's API supports server-side pagination. As a result, the user interface rendering logic and the usability of the UI are simplified by rendering a fixed number of results per page and displaying a list of pages. This is in line with the usability guidelines of modern day search engine user interfaces.
Clarification: The customized tagging API, which supports both standalone queries and integration with Summon's public API, should support a similar pagination feature. In case such a simulation is not feasible on the server, the client side should take care of the UI rendering in such a way that the user is oblivious to how the displayed results were produced. In other words, the search results should always be paginated and well segmented, irrespective of the method of searching for a resource, viz., search Summon using words, search public tags using words or search by clicking on an existing public tag.

14. Once a search functionality has been performed, a user should be able to export the top results of the search as a downloadable document file, with the results in it formatted in a fixed manner.
Classification: extra feature exportability
Origin: Student, Approved By SST
Justification: This functionality was originally reserved as a requirement of a prospective new tool, which would use the Tagus Standalone API [i.e., independent of Summon] to generate a list of resource identifiers of digital library resources. Being able to develop such a new tool would provide solid proof-of-concept evidence that the tagging service is capable of performing well even without Summon.
Clarification: Instead of waiting for another developer to use the newly developed Tagus Standalone API, this was suggested by SST for extra credit and has now become an extended feature available to users. However, for completeness, since just an exported list of resource ids would not be really usable for a user, the Summon API is invoked to generate a complete description of each resource in an exported list. In other words, exporting a page of displayed results generates a downloadable file for the user consisting of information fetched from Summon.

15. There should be a standalone API to this service, running as an independent service.
Classification: standalone api
Origin: Student, Approved By SST
Clarification: The tagging part of the service should be independent of the methods and technologies used to fetch data from Summon.
Justification: The tagging API should also work standalone as a subservice.

The next section lists the non-functional requirements, which were derived by analyzing the functional requirements.

2.6. Non-functional Requirements And Further Discussion

As referred to in the introduction section, resource discovery in an academic library environment is characterized by the following factors:
1. Speed
2. Availability
3. Findability
4. Personalization as a means of relevancy.

As discussed above, Summon claims to achieve the speed factor with its multitiered indexing approaches [5]. Over the course of this project, this has been verified to be true. Summon is very fast in its response times when responding to user queries via its public API. For future developers who would like to further increase the speed of Summon, please note that since it is hosted by Serials Solutions at their vendor site, improving speed is not achievable without access to the source. Availability of resources in the Digital Library is outside the scope of this project, as it involves the procurement and maintenance of resources and/or their required licenses on a periodical basis in the library.

The primary means of finding a resource in Summon is via the "single search box", further aided by a set of mechanisms for fine-tuning the results [5, 6]. There is no direct means of finding other related results from each output in the result set. Similarly, with the current system, there is no means of adding personalized relevant information, like a user's comments [relevant to only that user], to search results. This project provides a way of personalization to end users by introducing the concept of tagging in an academic library environment and exploits this personalization to provide a new means of findability. This project is called Tagus, and it achieves both these factors through the concept of community-owned-collective-responsibility tagging. This is elaborated below.

A tag, within the scope of this project, is defined as a word or a phrase describing a digital resource. A user tags a resource with one or more tag-words or tag-phrases, which would be stored in the system. As an illustration, the record stored for one such tagging action might look like the sketch below.
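The following is a minimal sketch of such a record, assuming an illustrative schema; the field names are not the actual Tagus schema, and the resource identifier stands in for whatever unique identifier Summon returns for the record.

<?php
// Illustrative tag record: one user attaching one tag-phrase to one
// library resource. 'type' distinguishes public tags from private
// annotations; all field names here are assumptions for illustration.
$tagRecord = array(
    'resource_id' => 'summon-record-id-12345', // unique identifier from Summon (hypothetical value)
    'user'        => 's1050984',
    'type'        => 'public',                 // or 'private' for a personal annotation
    'value'       => 'reading-list-msc-2011',  // the tag-word or tag-phrase
    'created'     => '2011-08-01T10:15:00Z',
);
echo json_encode($tagRecord); // JSON is the exchange format used by both Summon and ElasticSearch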
If a tagged resource is output as the result of a future search, the tags are displayed along with each resource's regular information [6]. The user could use a tag to find new resources, e.g., typically by clicking a tag, which results in the system outputting the set of resources tagged [previously by users] with the same tag-word or tag-phrase. As indicated in [13], "applications that allow users to add personal metadata, notes, and keywords (simple labels or "tags") to help manage, navigate, and share their personal collections help to improve digital libraries in three main ways: personalization, socialization, and integration". This project extends the scope of tagging [as popularized by Delicious [12]] by including the concepts of private/personal and public tags. A private/personal tagset for a particular resource by a particular user would be visible only to that user in future searches. This method of including privacy provides a means of annotating, to the user. The public tags, on the other hand, would be globally visible across users and across resources. The significance of tagging has been highlighted clearly in [13], viz., "The ability to share data and metadata in this way is becoming increasingly important as more and more science is done by larger and more distributed teams rather than by individuals. Such social bookmarking is already available on the Web site of publications such as the Proceedings of the National Academy of Sciences and the journals published by Oxford University Press.".

2.6.1. Implications and Constraints while choosing technologies for interaction with Summon

The requirements for tagging imply the storage of tags and their retrieval. Storage typically requires a database, and fast retrieval requires an index on the data in the database. Also, "Serials Solutions, the developer of Summon, claims to have over 500,000,000 items indexed from over 94,000 journal and periodical titles." [5]. Therefore, availability could be rated as "huge" in terms of projected data size, as potentially, the proposed application for tagging needs to deal with tags for such numbers of digital library resources. Since Summon's API is restricted to providing only read-based queries [4] to its own index, i.e., custom data cannot be written into Summon's database, an external database of tags is required. This follows from existing literature, e.g., from [13], "Like Alexandria, most digital libraries are currently read-only, allowing users to search and browse information, but not to write new information nor add personal knowledge". Also, since an instance of Summon is itself a vendor hosted solution [6], this database needs to be based on a fast system at the University of Edinburgh, external [w.r.t. Summon's remote hosting location] to it. Finally, tagging is an incremental activity w.r.t. users and searches. As users adopt Summon and search, more search results are tagged over time. This incremental nature should be considered as an important factor for deciding on the indexing technology to be used. Apache Lucene [9] provides the best open source indexing solution, with its own internal means of maintaining data. Though this was considered initially, ElasticSearch [7], an open source wrapper around Apache Lucene, has been chosen because it meets the "incremental nature" factor described above:

1. ElasticSearch is "fast", due to its being based on Apache Lucene
2. it provides searching with "free search schema" indexing of data
3. it uses "JSON over HTTP", also used by Summon and its API
4. it allows scaling by "starting with one machine and scale to hundreds" [7].
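The third point is what makes the tag store easy to drive from PHP: documents go in and queries come back purely as JSON over HTTP. The listing below is a hedged sketch that indexes one tag document and then searches it back from a local ElasticSearch instance; the index name, type name and document fields are illustrative assumptions, not the actual Tagus schema.

<?php
// Sketch: writing a tag document into ElasticSearch and querying it
// back, purely over JSON/HTTP. Assumes an instance on localhost:9200;
// 'tagus' / 'tag' and the fields are illustrative names.
$doc = json_encode(array('resource_id' => 'summon-record-id-12345',
                         'user'        => 's1050984',
                         'value'       => 'reading-list-msc-2011'));
$ch = curl_init('http://localhost:9200/tagus/tag/1');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'PUT'); // create/replace the document with id 1
curl_setopt($ch, CURLOPT_POSTFIELDS, $doc);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);
curl_close($ch);

// Find every resource carrying a given tag. With the default analyzer
// this matches on the tag's tokens; for exact whole-tag matching the
// 'value' field would be mapped as not_analyzed.
$query = json_encode(array('query' => array(
    'query_string' => array('query' => 'value:"reading-list-msc-2011"'))));
$ch = curl_init('http://localhost:9200/tagus/tag/_search');
curl_setopt($ch, CURLOPT_POSTFIELDS, $query); // sent as the POST body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$hits = json_decode(curl_exec($ch), true);
curl_close($ch);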
ElasticSearch provides means of configuring an index over a distributed set of machines, whose number could grow as the size of the tagged data grows. Summon itself is known to rely on Apache Lucene [9] at its core; thus it is reliable [5, 6]. An alternative has been identified in Apache Solr [8], also based on Lucene; but due to its additional technological dependencies on Java servlets and container software like Tomcat, it would serve as a backup, in case ElasticSearch's instances don't scale well enough in future.

2.6.2. Constraints due to standalone mode

As the tagging API should be available as a standalone service, the choice of technologies should be similar to what Serials Solutions itself uses to maintain Summon. In other words, the indexing component should be 1) dynamic [i.e., non-statically invoked] and 2) high performance, in terms of uptime. This implies that even if Summon's servers handling calls via its own public API are not accessible, a developer wanting to use the tagging API should be able to do so without waiting for Summon's servers to become accessible. Finally, please note that though this was meant as a pilot project, due to these constraints, choosing technologies such as ElasticSearch / Lucene / Solr provides a solid technology framework, making the standalone API very robust. Thus, the quality of the technology stack has been very high throughout the project.

2.6.3. Usability requirements

In this section, we look at the concept of usability as applicable to library resource discovery systems. We start with existing definitions of usability, and proceed to list the available options for usability criteria while justifying our choices. Also, in planning to evaluate the system after development, we list the types of evaluation possible and arrive at the reason why certain kinds of evaluation suit this project better than others.

The Digital Library stresses the importance of usability testing for every product / system it creates / deploys. This is true even for the existing resource discovery systems at the university, like Searcher and the catalogue search facility. In some cases, usability professionals are engaged in the process. For example, UserVision [http://www.uservision.co.uk] was employed to perform an exhaustive study on usability problems of the existing Searcher system. Due to a lack of permission, the results of this study are not being referenced in this document. However, the study is available on request [from the Digital Library] and highlights certain measures of usability while making suggestions to improve it.

2.6.3.1. Definitions of Usability, Kinds of Usability Criteria Available

The definition of usability varies according to the subject under investigation and is often adapted to fit the context of the problem under investigation. However, in the area of Human-Computer-Interaction [HCI], some criteria are broadly accepted for what could be viewed as usability.

ISO 9241 Part 11 [28]
ISO 9241 Part 11 defines usability as below [28]: "The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use".

Please note that the course website for HCI at the University of Edinburgh [35] also describes usability in the following manner, quoting the list of heuristics from Jakob Nielsen and the principles from the book authored by Alan Dix et al.

Nielsen's 10 Usability Heuristics [29]
These heuristics are listed below [29]:
"Visibility of system status
User control and freedom
Error prevention
Flexibility and efficiency of use
Help and documentation
Match between system and the real world
Consistency and standards
Recognition rather than recall
Aesthetic and minimalist design
Help users recognize, diagnose and recover from errors".

Usability Principles from HCI Textbook [30]
These are listed below [30]:
"Learnability
Flexibility
Robustness".

2.6.3.2. Choices Made From Available Usability Criteria

Within the scope of this project, we adopt the following definition of usability, while keeping it in line with the above standards [please note that, wherever applicable, keywords have been underlined in the original to indicate their association with the keywords in the above three standard descriptions of usability]:

Table 2.3. Table of Usability criteria chosen for Tagus with justification
Usability for Tagus comprises the following criteria:
1. speed of service, interaction — chosen to represent efficiency of implementation and the power of the underlying technology stack in bringing a fast user experience
2. ease of use, being intuitive to new users — chosen to represent user control and freedom, recognition rather than recall, standards
Please note that the course website for HCI at the University of Edinburgh [35] also describes usability in the following manner, quoting lists of heuristics from Jakob Nielsen and principles from the book authored by Alan Dix et al. Nielsen’s 10 Usability Heuristics [29] These heuristics are listed below [29]: “Visibility of system status User control and freedom Error prevention Flexibility and efficiency of use Help and documentation Match between system and the real world Consistency and standards Recognition rather than recall Aesthetic and minimalist design Help users recognize, diagnose and recover from errors”. Usability Principles from HCI Textbook [30] These are listed below [30]: “Learnability Flexibility 2.6.3.2. Robustness”. Choices Made From Available Usability Criteria Within the scope of this project, we adopt this definition of usability, while keeping it in line with the above standards: [please note that wherever applicable, keywords have been underlined to indicate their association with the keywords in the above three standard descriptions of usability]: Table 2.3. Table of Usability criteria chosen for Tagus with justification Usability for Tagus, comprises of the following criteria 1. speed of service, interaction has been chosen to represent efficiency of implementation and the power of underlying technology stack in bringing a fast user experience 2. ease of use, being intuitive to new users has been chosen to represent user control and freedom, recognition 35 rather than recall, standards 3. ease of learning has been chosen to represent visibility of system status, consistency, recovery from errors, learnability 4. findability of what a user is intending to find has been chosen to represent consistency and standards, effectiveness 5. minimal UI has been chosen to represent learnability, ease in repeating an already learnt task. These criteria are themselves, in line with the non-functional requirements described in the previous section. In this subsection, we have seen why certain criteria were chosen to represent usability for this project. While criteria provide a strong theoretical framework, they need to be planned for and implemented / measured in evaluation methods to arrive at a measure of usability of a system. In the next two subsections, we will look at two kinds of evaluation of approaches and see why certain criteria fit one evaluation approach and don’t fit the other. 2.6.3.3. Quantitative Evaluation, What Criteria Fit Into Quantitative Evaluation? Quantitative evaluation is an approach to evaluation of criteria in which the primary mode of measurements is using numbers and calculations, often directly obtained from the system under observation. From the above table, speed of service, interaction is a candidate for measurements using quantitative evaluation. This is because, the end system could be timed and durations of each activity / event / request-response-cycle / start-to-end of a user action could be obtained with a direct measurement. Table 2.4. Criteria qualifying for quantitative evaluation Criteria [Abstract concept] Transformation into evaluation [Concrete, measurable entity] speed of service, interaction speed of service in retrieving data and in displaying data 36 The rest of the criteria, viz., ease of use, ease of learning, findability and minimal UI are not really measurable by a directly observing the system and noting down numbers. Rather, each of them indicates that a feedback from the user is needed to measure them. 
2.6.3.4. Qualitative Evaluation, What Criteria Fit Into Qualitative Evaluation?

Qualitative evaluation is an approach to the evaluation of criteria in which the primary mode of measurement is answers to questions aimed specifically at a task at hand. These questions may be put to the user in the form of choosing one among many categories, user ratings, interpretation of answers using further questions, etc. The Digital Library already follows a set of guidelines in carrying out such evaluations for systems developed or used in the university. On discussing with our Usability Expert, Ms. Angela Laurins, the following criteria have been found to fit well with a qualitative approach to evaluation, thus lending further strength to our decision to apply qualitative evaluation.

Table 2.5. Criteria qualifying for qualitative evaluation
Criteria [Abstract concept] — Transformation into evaluation [Concrete, measurable entity]
1. ease of use, being intuitive to new users — Ratings for intuitiveness and ease of use of the UI
2. ease of learning — Ratings from users, and as measured from observation of user activity between first and second time usage
3. findability of what a user is intending to find — Ratings for the findability of a resource that a user intends to find in the library
4. minimal UI — Ratings for ease of creating reading lists. Though the system, and hence its UI, are designed to be used to find resources, this will help determine whether a user finds the UI small enough to quickly navigate it and find out how to perform secondary actions.

2.6.3.5. Why Not Other Criteria? Limitations, Constraints, Overlaps

Though we have arrived at a concrete method of evaluating each usability criterion, and we have justified these decisions / methods of obtaining measurements, we still have to justify why some other criteria were not enlisted in our scope of usability. Please note that, while they have been taken into account while developing the system, they are not being specifically evaluated, for the following reasons:

The primary reason is the original scope of the project: a pilot tool, a set of test data and a pilot usability testing report. The secondary reason is the time constraint of the project: work has been done full time every day throughout the duration of the project, including weekends. However, this project did not have a pre-existing, working piece of software as a starting point. At the beginning, it was not even known whether the development of such a tool was feasible in practice, starting with just one resource, the Summon API. The third reason is that carrying out a full end-to-end usability evaluation requires the full time work of an expert professional and often spans months of effort [known from inputs from the Digital Library Team about prior experience of working with similar systems at the Digital Library]. The first aim of this project is to study the feasibility of developing such a novel tool in academic library environments; the second aim is to proceed to develop the tool, if found feasible. Thus the limited time is a critical constraint in this study. Finally, the following text gives specific reasons why certain criteria were not taken into consideration:
1. Match between system and real world has not been included as a separate criterion, because it overlaps with intuitiveness and ease of learning within the scope of Tagus
2. Flexibility has not been included as a separate criterion, because it overlaps with user control and freedom within the scope of Tagus
3. Robustness has not been included, because it does not fit the concept of a pilot system. Robustness is ideally measured once a system has been deployed for real or simulated use and its performance has been monitored over a reasonably long period of time. Similarly, within the scope of Tagus, usability testing is really not about measuring how robust the system is to hacks, exceptions, denial of service attacks and the like.

2.7. Evaluation Plan
An evaluation plan was formulated for Tagus under the supervision of our usability expert, after three sessions of reviews, to be executed once its requirements had been implemented and preliminary testing activities had been completed. The complete set of documents, viz., the usability testing script and its plan [sheet] for each user [data from each test participant has been included in the repository], have been included in the appendix.

2.7.1. Constraints on automated testing
We conclude this chapter with a small note on why automated testing was not employed for Tagus. Automated generation of test data, and automated testing, would have been a well-suited method for this system, since there were no data and no users to start with. Automated queries to Summon could have helped create dummy tags with simple scripts exploiting the API nature of the new system. However, the terms and conditions signed with Summon actively prohibit such automated query generation and retrieval of metadata / data. This is understandable from the vendor's perspective: the server is externally hosted and maintained by them, and they would like to prevent unnecessary load and prohibit any prospective abuse of their system.

2.8. Guidelines to future developers
Requirements collection forms an important phase for novel pilot projects in academic library environments, especially when there are no existing systems with similar functionality, as in the case of Tagus. The guidelines for future developers from this section are to:
1. Start early
2. Revise, question the eligibility of each requirement, justify
3. Evaluate feasibility of implementation w.r.t. time and resources
4. Finalize the scope of each requirement, clearly declaring any assumptions, in a manner similar to the approach [described above for Tagus].

3. Chapter 3. Design
3.1. Motivation for this Chapter
In the previous chapters, we have described the background of this project, reviewed existing literature about new age library systems and the purpose of developing this service, and elaborated why and how requirements were collected, along with the justification for each decision made and approved. In this chapter, we describe the overall architecture of the system [top level view], followed by a list of designed structures [mid level view]. The source code for the system itself has been delivered separately in the Digital Library's SVN repository [low level view]. Please note that, where appropriate, design decisions have been justified with detailed descriptions of the constraints leading to those decisions. We wind up this chapter with a section dedicated to the design of the user interface of Tagus and, once again, a list of reasons for why some UI elements were chosen over others.
3.2. Architecture
The architecture diagram for Tagus is given below. Primarily, it can be viewed as multiple sets of servers and clients working together to provide a service to the end user. As described earlier, the university's Summon instance is hosted by Serials Solutions on an external vendor server; it is a black box for the developers of systems like Tagus. Referring to the diagram, everything below this black box is under the purview of Tagus. There are three broad components within this architecture, labeled in the diagram as 1, 2 and 3:
1. WEB SERVER
2. INDEXING SERVER
3. BROWSER / CLIENT.
Please note that 1 and 2 do not communicate directly with each other. This is a central aspect of the architecture and design of the components of Tagus. The roles of all three components are described below.

Figure 3.1. Architecture Of Tagus [diagram: the external server hosted by the vendor, a black box to the rest of the system, is reached only through the Summon public API via synchronous calls from the web server (1), which hosts the dynamic resources: the Summon interactor (Summon searcher, results fetcher and results encoder), the Tagus API, the list generator and authentication. The indexing server (2) hosts the Indexer-Tagus-Notation-Convertor with the tags adder, tags deleter, tags searcher and tags fetcher. The browser (3) loads external browser libraries, styles and images, and runs a main thread, a Tagus thread (tags fetcher, adder, searcher, deleter and renderer) and a Summon thread (encoders, decoders, utilities, session verifier, mini-parsers, events registration, events controller and handlers, Summon results decoder and renderer); the browser's calls to 1 and 2 are asynchronous.]

Further Discussion On Components Of The Architecture And Design Decisions

1. THE WEB SERVER
The web server hosts all the server components and applications needed for delivering data to the browser. How it achieves each functionality expected by the browser is private to itself; the only constraint on the web server is that the data exchange format between itself and the browser component is fixed. It acts as a layer between the vendor's external server and the user's browser, which it achieves via the public API provided by Summon. The communication between the external server and the web server is synchronous. Similarly, it supports certain functionalities, like the generation of user lists, once requested by the browser. A browser by itself cannot generate a format other than the markup language it supports; hence the decision was made to move the functionality to create lists in any required format to the web server. Finally, this component is also responsible for creating, maintaining, verifying and destroying user sessions, which is standard practice in user authentication techniques. The communication between the web server and the user's browser is asynchronous. This decision was made in the light of recent developments in browser technologies: technologies like AJAX [Asynchronous Javascript And Xml] enable developers to provide desktop-like interactivity and speed to end users. Thus, this component is highly cohesive in its functionality and is deployable [please refer to the deployment section below] independently of the other two components.
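To make the asynchronous browser-to-web-server communication above concrete, the following is a minimal sketch [in Javascript, using jQuery, which Tagus already loads] of how a client-side module might request a generated reading list from the web server without blocking the UI. The endpoint name listgenerator.php and the parameter and field names are hypothetical placeholders, not the actual Tagus URLs.

    // Minimal sketch: an asynchronous (AJAX) request from the browser to the
    // web server component. Endpoint and parameter names are hypothetical.
    function requestReadingList(tagName, onReady) {
        $.ajax({
            url: '/tagus/listgenerator.php',   // hypothetical web server endpoint
            type: 'POST',
            data: { tag: tagName, format: 'pdf' },
            dataType: 'json',
            // The browser is not blocked while the server works; this callback
            // fires once the list file has been generated.
            success: function (response) { onReady(response.downloadUrl); },
            error: function () { alert('List generation failed.'); }
        });
    }

Because the call is asynchronous, the user can continue searching while the server prepares the file; this is the desktop-like interactivity referred to above.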
2. THE INDEXING SERVER
The indexing server hosts all the data required for providing tag-related functionalities to any client that requests them. This includes fetching the tags for each library resource, as well as tags specific to a particular user, etc. This component has two subcomponents, the data maintainer and the data indexer. Owing to the high amount of study and reasoning put in during the proposal stage and the background work for this dissertation, the choice of technologies for implementing this "indexing server" component had already been reduced to a small set of three, viz., Apache's Solr, Apache's Lucene and Elastic Search. After a further refinement of the requirements, and looking forward to the scalability and maintainability of this service in future, Elastic Search scored over the other two in terms of its support for distributed systems. If, in future, the data serving load needs to be split across multiple servers while the clients requesting the data need not change their methods for issuing requests, then Elastic Search is a perfect choice for the situation: it provides means of "scaling from one machine to hundreds" [7].

Please note that a straightforward method for applications such as Tagus would be to split this component into two parts, viz., a database holding tags and related data, and a separate indexer which runs periodically, accumulating the changes and reindexing them. This method requires
a) a schema for the database
b) a separate server for the database
c) a separate indexer
d) an interoperability module for triggering the indexer, either on every change or once every day, etc.
e) a database administrator ready to monitor the system
f) a high maintenance cost, for example if, in future, one wants to add more data to a particular table with a fixed schema
g) database system upgrades, downtime, other management, etc.
h) a mapping between the user authentication system of Tagus and the database authentication system
i) finally, a mapping of user requests into SQL queries, and the usage of libraries to perform data format conversions between the formats supported by a database system and a browser.

Elastic Search solves a lot of the problems arising out of these activities and reduces maintenance costs by doing the following:
a) it itself stores the data
b) it performs an indexing activity for every change
c) the data to be stored and indexed is itself schemaless, that is, there is no fixed set of fields to be defined before starting to store data
d) it is fast, as per our experience in developing and testing Tagus
e) it [optionally] auto-generates unique identifiers for every block of data inserted
f) it provides an HTTP REST [26] interface, which implies that a developer who knows what an HTTP URL looks like and knows JSON [27] can easily turn complex requests into easy calls
g) we can utilize the authentication system already in Tagus for indexing
h) finally, JSON is supported by all modern browsers, which implies that there is no need for format convertors for Elastic Search to understand a browser's request.

Finally, since an indexer technology by itself [Elastic Search, in this case] is unaware of what a "tag" means and of how to interpret custom requests for retrieving data from its indexes, a subcomponent has been created specifically for this purpose, called the "Indexer-Tagus-Notation-Convertor". Just like the communication between the web server and the browser, the communication between the indexing server and the browser is also asynchronous; once again, this decision was made to provide a fast experience to the user. Thus, this component is highly cohesive in its functionality and is deployable [please refer to the deployment section below] independently of the other two components.
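As an illustration of points f) and h) above, the following minimal sketch shows a browser talking directly to an Elastic Search instance with jQuery: a schemaless JSON document is stored with one POST, and read back with one GET. The host, port and index names follow the Tagus notation described in chapter 4, but the function bodies here are simplified sketches, not the actual Tagus code.

    // Minimal sketch: storing and fetching a tag document directly against
    // Elastic Search over HTTP REST. Host, port and index names are assumed.
    var ES_BASE = 'http://localhost:9200';

    // Store a tag: no table or schema has to be defined beforehand, and
    // POSTing (rather than PUTting) lets ES auto-generate the unique id.
    function addTag(user, resourceId, tag, done) {
        $.ajax({
            url: ES_BASE + '/tagus_public/' + user,
            type: 'POST',
            data: JSON.stringify({ resource: resourceId, tag: tag }),
            success: done
        });
    }

    // Fetch documents indexed under this user (first page of hits); each
    // hit carries the stored JSON in its _source attribute.
    function getTags(user, done) {
        $.getJSON(ES_BASE + '/tagus_public/' + user + '/_search',
            function (result) { done(result.hits.hits); });
    }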
3. THE BROWSER
The browser component is where the majority of the control of the application takes place. In other words, every request starts with a user action in the browser; similarly, every request, after completing its several request-response cycles, finally culminates in an update to the UI in the browser. Thus, this component is highly cohesive in its functionality, and it is the only component which is coupled [loosely] with both of the other two components. Please note, in summary, that 1 communicates with 3 and 2 communicates with 3; 3 orchestrates the whole application and coordinates user events, actions and requests to be asynchronously mapped to either 1's server functionality, or 2's server functionality, or both. Finally, all the above design decisions make the components very loosely coupled. This makes the whole architecture extensible and maintainable in future; an example of such an extension is described in the future work section of chapter 6.

3.3. Design Of Core Services
We define "Core Services" for Tagus as the set of functionalities providing everything except the user interface modules, including

Table 3.1. Table of core services
1. the ability to fetch information on library resources from Summon
2. the ability to add or delete a tag / annotation
3. the ability to search the local data for listing resources marked by a certain tag
4. the ability to generate lists from results pages
5. the coordinating capability for utilizing 1, 2, 3 and 4 to drive a request.

These modules are split across the three components of the architecture described above. Thus, functionality wise, the view of the whole system can be pictured as below.

Figure 3.2. Modules of Tagus [diagram: in the browser, a combined coordination capability loads external libraries and configuration and issues requests, with authentication, to the combined coordination capability on the servers, which in turn invokes the summon interactor, the tagger and the list generator across the web server and the indexing server.]

3.4. Design Of Standalone API
From the table of core services, the subset of services provided by the standalone API is given by 2, 3 and 4: the ability to add or delete a tag / annotation, the ability to search the local data for listing resources marked by a certain tag, and the ability to generate lists from results pages. Since the modules for the core services were designed with low coupling, with each component invoked only when necessary, no separate design for the standalone API was necessary. Since these services represent only a subset of the above table, the view of the standalone version of the system becomes a subset of the above diagram.

Figure 3.3. Modules of Tagus Standalone API [diagram: a browser / API invoker with a combined coordination capability invokes the tagger and the list generator on the indexing server.]

3.5. Final Class Diagrams
1. PHP Classes

Summon [public API of Summon, external class]
Fields: $debug, $client, $host, $apiKey, $apiId, $sessionId
Methods: Summon(), getRecord($id), query($query, $filterList, $start, $limit, $sortBy, $facets), call($params, $service, $method, $raw) [private], _process($result), hmacsha1($key, $data)

SummonService [proxy class to the above class, carrying the University of Edinburgh's unique id and key]
Methods: SummonService(), searchFor($queryString, $pageNumber), getResourceFromId($resourceId), fetchResults(), getResults()

DataServiceWithApiKey [interfacing class which intercepts requests and issues responses]
Methods: Main()

Utilities [utilities class providing constants and common functions]
Methods: getCurrentURL(), getSessionManagerURL(), getUserHomePageURL(), getLoginPageURL(), getSummonServiceURL()

fpdf / PDF [external class from FPDF.ORG [34] to generate PDF files; adapted from an example class provided in [34]]
Methods: LoadData($array), BasicTable($header, $data), ImprovedTable($header, $data), FancyTable($header, $data)

ListGenerator [wrapper class to the above class, which abstracts out a lot of low level functionality; interfacing class which intercepts requests and issues responses]
Methods: Main()

AuthenticationDatabaseConnection [connects to a database and verifies the presence of a given username and password in it]
Methods: AuthenticationDatabaseConnection(), VerifyUsernameAndPassword($u, $hashofp)

AuthenticationService [a simple authentication service class which can be plugged into any kind of authentication mechanism; here we plug it in to work with the above database based authentication verification class]
Methods: AuthenticationService()

SessionManager [interfacing class which intercepts requests and issues responses]
Methods: SessionManager(), tryToCreateOrMaintainSession() [private], setSessionSpecificValue($value) [private], getSessionSpecificValue() [private], checkSession() [private], login($un, $pwd) [private], logout() [private]

Relationships [as in the figure]: AuthenticationDatabaseConnection is associated with AuthenticationService, Summon with SummonService, SummonService with DataServiceWithApiKey, AuthenticationService with SessionManager, SessionManager with Utilities, fpdf with PDF, and PDF with ListGenerator; the associations are mostly one-to-one, with SessionManager and ListGenerator on the many side of their associations.

Figure 3.4. Complete Class Diagram Of Tagus' Server
2. Javascript Files [Treated Equivalent To Classes]

Session
Methods: DecideIfLoginIsPossibleAndAct(data), DecideIfSessionIsStillValidAndAct(data), DoLogoutAndAct(data), SessionVerifier(), EnableLoginForElements(), EnableLogoutForElement()

SummonFieldsConfiguration
Methods: OpenUrlRenderFunction(dataArray), DefaultRenderFunction(dataArray), RenderSingleSummonResourceAsSummary(doc, element), RenderSingleSummonResourceAsFull(event, doc)

Main
Methods: main()

Summon
Methods: RenderJSONAsVerticalList(element, pageNumber, json), RenderResultsOfPage(resultsElement, query, pageNumber, dataToDisplay), PaginateAndRender(data, pagesElement, resultsElement, query), EnableSummonSearchForElement()

SummonRendererBasedOnIds
Methods: RenderJSONAsVerticalListBasedOnIds(element, pageNumber, json), RenderResultsOfPageBasedOnIds(resultsElement, pageNumber), PaginateBasedOnIds(arrayOfResourceIds, pagesElement, resultsElement), GetResourcesBasedOnIds(arrayOfResourceIds)

TaggerCore
Methods: ValidateAccessibility(accessibility), ValidateUser(user), ValidateResource(resource), ValidateTag(tag), AddTag(accessibility, user, resource, tag, callback), DeleteTag(accessibility, user, resourcetag_combo_id, callback), GetTags(accessibility, user, resource, elementToLoadTagsIn), GetAllTagsForUser(accessibility, user, elementToLoadTagsIn), SearchForResourcesMarkedWithPublicTag(tag, callback), SearchForResourcesMarkedByUser(accessibility, user), ListAllUsersResourcesAndTags(accessibility)

TaggerRenderer
Methods: RenderSingleTagWithoutDeleteTagButton(tagName, sizeToRenderIn, countOfTag, accessibility, listElementToRenderIn), RenderSingleTagAlongWithDeleteTagButton(tagName, tagId, accessibility, userName, listElementToRenderIn), RenderTagsAsList(setOfTags, accessibility, userName, resourceId, listElementToRenderIn), RenderAddTagButton(accessibility, userName, resourceId, elementToRenderIn, associatedTagsListElement), GetAndRenderAllTypesOfTags(element, resourceId), EnableTagusSearchForElement(), EnablePublicTagsForElement()

JqueryExtensions
Methods: main()

SafetyPrecautions
Methods: main()

UIElements
Methods: ExportListButton(), ButtonResultsTable(), ResultsTable(), PagesElement(), ResultsHeading(), ResourcesForm(), HelpFAQButton(), HelpVideoButton(), PublicTagsButton(), PublicTagsBox(), TagusSearchbutton(), TagusQueryField(), SummonSearchbutton(), SummonQueryField(), UserTagsBox(), UserTagsButton(), Main(), LogoutButton(), DisplayGuidelines(), DisplayGuidelinesButton(), AuthenticationLogoutForm(), LoginButton(), PasswordField(), LogMessage(), Loader(), HelpMessage(), Status(), UserNameField(), AuthenticationLoginForm()

UIElementsManipulations
Methods: HideLoader(), ShowLoader(), SetResultsHeading(message), SetLogMessage(message, cssclasslevel), ShowPreview(event, message), LoadUrlInPopup(url), ToggleGuidelines(), SetDisplayGuidelines(), HideElementsForNonLoggedInUser(), ShowElementsForLoggedInUser(), EnableHelpForElement(), SetExportFunctionalityOnElement(), InitializeTabsInSearchForms(), InitializeVisibilities(), InitializeHomeReset()

Utilities
Methods: getTAGUS_MAX_RESULTS_PER_SEARCH(), getTaggerProxyURL(), getTagusCommonSuffixURL(), getCurrentURL(), getSessionManagerURL(), getSummonServiceURL(), getSearchForResourcesUsingPublicTagURL(), getFetchTagsForResourceAndUserURL(), getAddTagsForResourceAndUserURL(), getListGeneratorURL(), getRemoveTagsForResourceAndUserURL(), getListDownloadableFileBaseURL(), setUserName(), invalidateUserName(), getUserName(), minimum(), removeGarbage(), addGarbage(), validateSummonData()
Relationships [as in the figure]: Main is associated one-to-one with TaggerRenderer, Summon, TaggerCore, SummonRendererBasedOnIds and SummonFieldsConfiguration, which in turn use UIElementsManipulations, UIElements, JqueryExtensions, Utilities, Session and SafetyPrecautions.

Figure 3.5. Complete Class Diagram Of Tagus' Client

3.6. Final Deployment Diagrams
The following diagram shows the current state of deployment in the Digital Library office.

Current Deployment Diagram: TAGUS-TEST.LIB.ED.AC.UK, hosting both the WEB SERVER and the INDEXING SERVER.

For the scope of this pilot project, a single instance of the indexing server suffices. However, if in future there is a need to scale the indexing server, then, because of the support in Elastic Search for distributed indexing, Tagus could be deployed as follows. The extra machines, as compared to the above diagram, need to run Elastic Search instances configured with the same unique identifier string before starting up [7].

Future Deployment Diagram [Prospective, Scaled]: TAGUS-TEST.LIB.ED.AC.UK, hosting the WEB SERVER and INDEXING SERVER 1, plus other systems within the same LAN as INDEXING SERVER 1, hosting INDEXING SERVER 2 and INDEXING SERVER 3.

Figure 3.6. Deployment Diagrams of Tagus

3.7. Design of the UI
3.7.1. Design Rationale
The design of the user interface itself deserves a separate subsection because of the number of challenges faced and the effort put into overcoming them. Further, the user interface needed to provide a fast user experience; it therefore required careful planning and testing of the available alternatives. Several design decisions were recorded in the process, the most challenging of which are presented below with their final solutions. The design of the UI for Tagus was governed by:
1. the Data Protection Policy [32] in force at the university
a. no explicit "Terms and Conditions" for new users
b. but no personal data has been collected from any user which could be used to identify a particular user
2. a subset of the Web Accessibility Guidelines [33]
a. we are aware of the full guidelines for accessibility in websites
b. however, please note that implementing the complete set of these guidelines in the limited time available for a pilot project was deemed unfeasible
3. W3C standards like XHTML and CSS.

3.7.2. Expert usability inputs
The process for developing the UI consisted of six design-implement-review-redesign cycles, spread evenly across the beginning and the end of the duration of this project. The reviews typically occurred within the Digital Library office, and inputs were received from almost everyone in the SST, UXT and DXT teams [Ref: Section 2.5]. The final approval for each UI decision was given by either the supervisor or the usability expert, with adequate time for implementation and technical testing. The reader is gently suggested to have a look at Appendix 8.1 [UI screenshots] before proceeding to the next section. We now present three challenging UI problems, the alternatives evaluated and the decisions made, along with their justifications.

3.7.3. Two column layout versus single column layout for results
The first challenging problem that we faced was in presenting the results to a user. Searches could be generated by user actions in two ways, viz., by searching Summon explicitly through a search box, and by clicking a public tag. If the user wanted to have access to both sets of results at the same time, through these two means, then a two column layout became a good candidate for a solution.
The search results pane was then divided into two panes: the left one would show the results of direct Summon textbox based searches, while the right one would show the results generated by clicking a public tag [that is, by searching for all resources marked with the same public tag].

Figure 3.7. Two Column Layout in UI

However, the drawbacks of this layout quickly became apparent with some preliminary usability testing, done daily throughout the development. They are:
1. The two views of search results need to be kept synchronized
a. for example, if a user adds a new tag for a particular library resource in the left pane and that resource exists in the currently viewed page of the right pane, then the right pane should get updated as soon as the left one does, and vice versa
b. the same example holds for deleting a tag, and for adding or deleting an annotation
2. If a new means of generating a search were added to the two existing means, a third search results pane would potentially have to be displayed, for the same reason
3. Finally, dividing the search results pane into multiple columns, with each column displaying results for its own differently generated search, divides the available width of the visible screen into multiple parts. As a result, the results' details begin to appear crammed when displayed, which brings up another usability problem while trying to solve the first.

Thus, this decision needed to be changed; however, the ease of access to results generated by different means still had to be preserved for the user. The final solution, after experimentation, thought and testing, came out in the form of tabs. Since each means of generating a search represents an option or an alternative to the user, an "OR STATE UI state holder"** element was necessary [** an OR STATE means a choice to be made from multiple options, with only one choice active at any one time; dropdown lists, radio buttons, tabs and multiple windows are all examples of such UI elements]. If such an element could be provided for the means of search, then there could be a single results pane [single column layout] displaying the results of the last search, irrespective of how it was generated. The OR STATE UI state holder element chosen was a "set of tabs"; another alternative could have been a radio button group, which would also allow only one choice to be active at a time. If a user wanted to go back to another means of search, simply clicking a tab other than the current one would take the user to a different means of generating a search. For example, if a user first clicks the "Search Summon" tab, types some words [e.g., "history of island"] in the input box and clicks the search button, a set of results is displayed below. If the user then clicks the "Search All Public Tags" tab and searches for all resources marked with a particular tag [e.g., "History101"], this action generates a new search. If the user now wants to come back to the earlier tab, he / she can just click it, and their last search criteria remain intact ["history of island"] in the input box; clicking the search button is the only action necessary. Going once again to the "Search All Public Tags" tab retains its last search's criteria ["History101"]; once again, a single click suffices to regenerate what the user was viewing earlier through the same means.

Figure 3.8. One Column Layout in UI using "Search Summon"
Figure 3.9. One Column Layout in UI using "Public tags"
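The following minimal sketch [in Javascript, with jQuery] illustrates how a set of tabs can act as such an OR STATE holder while remembering each search mode's last query, as described above. The element ids and function names are hypothetical placeholders, not the actual Tagus code.

    // Minimal sketch: tabs as an OR STATE holder. Only one search mode is
    // active at a time, and each tab remembers its last query, so a single
    // click on "search" regenerates what the user was viewing earlier.
    var lastQuery = { summon: '', publicTags: '' };
    var activeTab = 'summon';

    function switchTab(tabName) {
        // Save the query typed under the tab we are leaving...
        lastQuery[activeTab] = $('#query-input').val();
        activeTab = tabName;
        // ...and restore the last query of the tab we are entering.
        $('#query-input').val(lastQuery[tabName]);
    }

    $('#tab-summon').click(function () { switchTab('summon'); });
    $('#tab-public-tags').click(function () { switchTab('publicTags'); });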
3.7.4. Searchable annotations versus non-searchable annotations
The concept of a personal annotation was introduced in Tagus as a "personal note added by a user for a library resource", visible only to the user who creates that annotation. This annotation appears next to the resource every time that resource turns up in the same user's future searches. Its semantics were strictly limited to a "personal note" to the user, thereby making sure that annotations are not viewed by other users of Tagus.

Figure 3.10. Searchable hyperlinked personal annotations

Though it was technically feasible to make personal annotations searchable as well, doing so would challenge a core concept of a community-based-tagging system: the sharing of data across multiple users. The original sense in which tagging was introduced was that "if a resource is worth tagging with a particular word for one user, it is probably worth letting other users know that this resource could be associated with that word", thus implying that "if one individual tags a resource R with a tag T, then a whole community benefits in knowing about R through T". Making personal annotations searchable implies that a user could annotate a resource R with a word T and nobody in the community would benefit from the annotation, because nobody is aware of it except the user who created it. Thus, the final decision was to drop the idea of making "personal annotations" searchable.

Figure 3.11. Non-Searchable plain-text personal annotations

3.7.5. Tags List versus Tags Cloud
The concept of displaying the list of all public tags generated by a user for various resources over time was not in the original list of requirements for Tagus. It required additional thought and justification of its utility, and consequently implementation and sufficient testing, to ensure that the speed factor of the user experience was not compromised. Displaying all of a user's tags is a quick way to provide access to resources already accessed by the user previously. It is popular in modern web applications and captures a different view of the system from the user's perspective. Finally, it provides a third means of generating a search for the user, in addition to "Search Summon" and "Search all public tags". The original UI design for this concept was as given in the diagram below.

Figure 3.12. Tags list for a user

However, the introduction of this concept brought a few new problems to be solved. Though this meant going beyond the scope of a pilot project being developed by a single individual, adequate thought was given, and things were discussed and planned, before approval was given. First, screen space had to be allocated for displaying the list itself. Since screen space is a precious commodity, and this concept competes for it with the "Search Results" pane [irrespective of whether the results pane itself follows a single or a two column layout], the list had to be displayed to the right of the screen. Secondly, if a tag is long enough to cause wrapping of text, or is displayed in a non-wrapping style, it causes new usability problems for the user, with either "readability taking a hit" or "scrollbars appearing and presentability taking a hit". If the list were displayed to the left of the screen, similar problems would appear. Finally, displaying the tags in different sizes, according to the order of their frequency [a tag X used more often than a tag Y is rendered with a larger font-size than Y], affects the vertical list display once again, with either "increased vertical list width causing the results pane to reduce in width due to limited screen space" or "scrollbars beginning to appear". The solution was to use a horizontal tag cloud, with a horizontal scrollbar appearing as necessary. This removed the horizontal alignment problems caused by vertical lists and enabled the addition of features like varying font-sizes and tag-counts for the tags in the list. The list is now officially called the tag cloud of that particular user.

Figure 3.13. Tags cloud for a user
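The following minimal sketch shows one way such a horizontal tag cloud could be rendered with jQuery, with the font-size scaled by how often each tag has been used and the tag-count shown next to each tag. The size range and markup are assumptions for illustration, not the actual Tagus rendering code.

    // Minimal sketch: render a horizontal tag cloud in which a more
    // frequently used tag is drawn in a larger font.
    // tagCounts is e.g. { History101: 7, maps: 2 }.
    function renderTagCloud(tagCounts, container) {
        var counts = [], name;
        for (name in tagCounts) { counts.push(tagCounts[name]); }
        var max = Math.max.apply(null, counts);
        for (name in tagCounts) {
            var size = 10 + Math.round(10 * tagCounts[name] / max); // 10px to 20px
            $('<span></span>')
                .text(name + ' [' + tagCounts[name] + ']')          // tag with its count
                .css({ 'font-size': size + 'px',
                       'margin-right': '8px',
                       'white-space': 'nowrap' })                   // flow horizontally
                .appendTo(container);
        }
    }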
3.8. Guidelines to future developers
Before beginning to design the architecture for novel pilot projects in academic library environments, especially when there are no existing systems with similar functionality, as in the case of Tagus, the guidelines for future developers from this section are to:
1. Study existing systems with functionality as "close enough" as possible
2. Employ the rich experience of the existing developers available in such environments
3. Evaluate alternatives with the stakeholders [here, the supervisor and the usability expert]
4. Apply constraints and shortlist technologies in a manner similar to the approach [described above for Tagus]
5. Most importantly, iterate to arrive at a stable design
6. Redesign to allow for "high cohesion" and "low coupling" as much as possible, as with the way the web server and the indexing server do not interact directly with each other, but only with the browser. This helps maintain the system in future.

4. Chapter 4. Implementation
4.1. Motivation for this Chapter
In the previous chapters, we have seen how requirements were collected, analysed and filtered, and how the design was reviewed and justified, with solutions proposed for each problem faced. This chapter describes how these solutions were implemented in the final system.

4.2. Methodology
The methodology followed for the development of Tagus, as highlighted earlier, was iterative software development, with weekly cycles culminating in reviews of the work done each week.

4.3. Technologies
The final list of technologies used is as follows:

Table 4.1. Table of technologies
Web Browser UI: XHTML
Visual Styling: CSS
Dynamic Language For Client Side Processing: Javascript
Asynchronous Calls To Server Functionalities: Ajax
External Libraries Used: jQuery
Data Exchange Format Between Server And Client: JSON
Server Side Technologies: PHP
Protocol For Communication Between Server And Client: HTTP REST
Data Indexing And Data Management Technology: Elastic Search
Standalone API Invocation Tool: Curl
Web Server: Apache
Browser Platforms Supported: Mozilla Firefox 4.0.1 to 5.0, Google Chrome 12 to 13
Development Platforms Used: GVim, Eclipse with PHP plugin
Development Web Platform: XAMPP
Development OS: Windows XP
Documentation: Standard Comments In Source Code
Deployment Server: http://tagus-test.lib.ed.ac.uk
Deployment OS: Linux
Javascript Debugger Tool: Firebug

All source code is available in the Digital Library's repository at https://svn.ecdf.ed.ac.uk/repo/is/digitallibrary/Summon/tagus.

4.3.1. Criteria
Firstly, the technologies that have been used to implement the design described in the previous chapter are primarily open source.
Secondly, they have been chosen because the primary target platform for the usage of Tagus is the web browser. Web technologies have grown in leaps and bounds in the last decade and offer highly detailed fine-tuning capabilities which help developers implement their designs with considerable speed. Thirdly, to ensure that Tagus can be maintained by somebody other than the developer [the student, in this case], we needed to ensure that the technologies chosen for this project are either within the broad list of skills of the Digital Library Software Development Team or could be easily learnt. Some technologies were chosen due to requirements analysis and design constraints, as described in the previous couple of chapters.

4.3.2. Constraints
Please note that we are aware that Internet Explorer is supported, as standard, by other projects at the university. However, we learnt during the development of Tagus that while Mozilla Firefox and Google Chrome support asynchronous calls to Elastic Search, Internet Explorer blocks such calls. While the rest of the UI functions appropriately, the tagging service hence does not work in Internet Explorer. One potential reason identified is that while the rest of the services, like fetching results from Summon, list generation, etc., run on the default HTTP port 80, Elastic Search's services are accessible on port 9200 and above, as configured. Other issues were identified too, but those were solved after technical debugging sessions. Kindly note that the development of a tool such as Tagus, with just a single Summon API file as a starting point, is by itself a large undertaking, especially when there are no other tools to base our work upon and there were no requirements to start with. Supporting specific browsers [Internet Explorer as a platform] was added as a requirement request quite late in the cycle [when usability testing approaches were being planned]. Due to time constraints, a prospective solution has been thought about but could not be verified; it is described below. One workaround for Internet Explorer could be to write server side proxy classes which act through the default HTTP port 80, while internally forwarding the requests, along with all the passed parameters, to port 9200. While work on this is already in progress, we believe it is important to notify the reader about this current limitation. The website where Tagus has been deployed will issue a message regarding support for Internet Explorer as soon as it is made available. Please refer to chapter 6 for more discussion about support for Internet Explorer.

4.4. Further Technical Details
We describe, in a small note, how an Elastic Search instance was customized to fit the role of an indexing server.

Figure 4.1. Bootup of an Elastic Search instance

Elastic Search by itself is schemaless. It operates using JSON as a data exchange format [for both input and output]. Querying an instance is done through the PUT, GET, POST and DELETE requests supported by the HTTP REST standard. Elastic Search supports "a full query DSL [Domain Specific Language]" [7] and several advanced features. Here, however, we just describe how we interpret data in a schemaless JSON format so that it conforms to a particular schema. A sample query is given below.

http://localhost:9200/tagus_public/userid/<auto generated unique id>

Syntactically, this structure is interpreted as follows:

<protocol>://<hostmachine>:<portnumber>/tagus_<accessibility>/<userid>/<auto generated unique id>

Executing such a query results in data being sent to the requester in the following format:

{ resource : <FETCH-url-ed library resource unique id>, tag : abcde }

We apply the following constraints on each of these attributes:
1. accessibility = private or public; only one of these two values is allowed
2. user = a unique user id, as allowed by the authentication module of Tagus
3. resource = a resource id as retrieved from Summon; should not have any spaces
4. tag = any single word; can be alphanumeric; should not have any spaces in it.

For the purposes of tagging, we identify a combination of (accessibility, userid, resourceid, tag) as a unique set [equivalent to a row in a database]. All operations to insert, delete and search within the index of tagged resources are interpreted in this way.
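Based on this notation, the following minimal sketch shows what the community search looks like from the browser: finding every resource that any user has publicly marked with a given tag [compare SearchForResourcesMarkedWithPublicTag in the client class diagram]. The host and port are assumed, and the body is a simplified sketch rather than the actual Tagus implementation.

    // Minimal sketch: search the whole tagus_public index, across all users,
    // for documents whose tag attribute matches the given single word.
    function searchForResourcesMarkedWithPublicTag(tag, done) {
        $.getJSON('http://localhost:9200/tagus_public/_search?q=tag:'
                  + encodeURIComponent(tag),
            function (result) {
                var ids = [];
                $.each(result.hits.hits, function (i, hit) {
                    ids.push(hit._source.resource);  // the stored resource id
                });
                done(ids);  // these ids can then be resolved against Summon
            });
    }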
4.5. Workflow Chart of development [shown for one weekly cycle]

Figure 4.2. Iterative software development workflow [diagram: one weekly cycle runs from Background Study and Further Study through Requirements Collection, Further Requirements and Requirements Reviews, alongside a Technologies Broadlist, Evaluate Technologies and a Technologies Shortlist, into Design Decisions, UI Design, a Design Review and Re-Design Decisions, then Implementation, Technical Testing and Deployment towards the Final System, with "Discuss Constraints" feedback steps throughout.]

Please note that "Background Study" was performed only in the first two weekly cycles [it was already in progress before this weekly iterative mode was started]. Kindly also note that the details of technical testing have not been mentioned separately, as testing was done in parallel with implementation and reviews; a separate step has been shown in the above diagram for the sake of completeness in depicting the amount of work completed in this project.

4.6. Guidelines to future developers
Development of a system using web technologies is a very complicated task. Both server specific technologies and client specific ones have become quite powerful in the last decade. It is important to understand this complexity, to distribute the tasks of providing functionality between a server technology and a client technology in this light, and finally to provide a smooth user experience. Adding specific usability requirements to the mix makes the whole task very difficult. However, proper planning, iterative design and taking the help of a usability expert while getting the scope of a project approved make such difficult tasks manageable. The bulk of the effort for Tagus lay in the background study and design stages. Taking the help of the developers at the Digital Library to perform technical testing was invaluable, as was getting the scope of the project re-evaluated and approved by the supervisor. Finally, choosing Elastic Search over Lucene / Solr was a critical decision, and it proved successful in my early feasibility study of technologies; a well justified choice of technologies can very much speed up the implementation. Also, the practice of reusing code, using public libraries such as jQuery and its plugins like jQuery.cookie and jQuery.json, helped save time and let us achieve more functionality than originally planned for Tagus.

5. Chapter 5. Evaluation
5.1. Motivation for this Chapter
In the previous chapters, we have seen how and which decisions led to the final design of the new system. Appendix 8.1 gives the screenshots of the developed system, as viewed in a browser.
This chapter lists the procedures that were followed to perform usability testing with users, including the methods, the setup, the process of the interviews, who was recruited for the testing, and the outcomes of the evaluation.

5.2. Methods
5.2.1. Preliminary Steps
We have seen in section 2.6 how we derived our usability criteria from the non-functional requirements, matching them with the standards' definitions of usability. From abstract criteria, we proceeded to analyse and derive concrete, measurable criteria. We then created our evaluation plan by choosing quantitative evaluation for "speed of service, interaction" and qualitative evaluation for the rest, viz., ease of use, ease of learning, findability and minimal UI. However, in order to conduct usability testing with users, we still needed to convert the evaluation plan into a plan of action: in other words, user scenarios needed to be generated to be presented to users. This is in line with the standard practice followed by the Digital Library Usability Team. Generating possible real world scenarios and getting them reviewed by our usability expert formed the preliminary stage of usability testing.

5.2.2. The Interviewing Protocol Setup
The interviewing protocol was explained by Ms. Angela Laurins, our usability expert during this project. The complete protocol setup was customized to suit the needs of a pilot usability testing session. The protocol itself consists of briefing each participant with the background of the project, describing our purpose in conducting the session, and what is to be expected in the test scenarios. It also includes answering intermediate questions and clarifying doubts, to ensure the smooth progress of the session without crossing the time limit of one hour per user.

5.2.3. The Participant Selection and Contacting Process
Before the start of the project, in the first couple of meetings during our planning sessions with the supervisor, it was decided that participants would be contacted and selected from the team members of the Digital Library / Information Services, University of Edinburgh. Nielsen, a renowned usability expert, says in his guidelines, "Using insiders as test users works in pilot testing, and pilot testing only" [36]. But the underlying purpose in such a case, according to him, is "to improve the tests themselves" [36]. Therefore, instead of choosing our participants only from the developers' team, the participants were split across multiple teams. The prospective users were contacted via email, with the concept explained briefly. The test sessions were then timed according to slots, depending on the availability of each user on a particular day. The users were then invited to the test location. All tests took place in Meeting Room S7, Digital Library Office, 2 Buccleuch Place, University of Edinburgh's George Square Campus.

5.2.4. The Data Collection Method
As each testing session progressed, questions were put to the users, collecting data relevant to the items below [please note that these reference table 2.5 in section 2.6.3.4]:
Ratings for intuitiveness and ease of use of the UI
Ratings from users, and as measured from observation of user activity between first and second time usage
Ratings for the findability of a resource that a user intends to find in the library
Ratings for ease of creating reading lists.
Though the system, and hence its UI, are designed to be used to find resources, this will help determine whether a user finds the UI small enough to quickly navigate it and find out how to perform secondary actions like generating reading lists.

5.2.5. My preparation for the testing sessions
I used a standard test script, reviewed by our usability expert for the project, in line with standard practice at the Digital Library. Please find the "Test Script", as is, in the appendix. It was adapted from Steve Krug's "Rocket Surgery Made Easy" [31].

5.2.6. Test Scenarios
I used a set of test scenarios which reflect prospective real world uses of the system in future. These, once again, were used only after a couple of test runs and two complete reviews by our usability expert. Please find the details of access to the "Test Plan" documents, as executed and recorded for each of the four participants, in the appendix. Please note that during the sessions, users were allowed to visit both the websites http://tagus-test.lib.ed.ac.uk [deployed Tagus] and http://ed.summon.serialssolutions.com [University of Edinburgh's Summon instance] as necessary, without being given any tips or hints to complete their tasks other than what is already present in the test plan. The contents of our test plan are summarized below.

Ratings are taken for:*
1. speed of service in retrieving tags
2. usability of the user interface
3. ease of learning.

Comparative evaluation is done for:*
4. user ratings for findability w.r.t. Summon with tagging [Tagus website] vs Summon without tagging [Summon instance website]
5. ease of creating reading lists based on public tags by users using this service vs manual creation of such lists.

*All ratings are on a scale of 1 to 5. For difficulty levels found by users, 1 = very difficult ... 5 = very easy. Alternatively, for user satisfaction levels, 1 = very unsatisfied ... 5 = very satisfied.

5.3. Overview of Results Sections
I have organized the results of working on this project as described in this section. The reports are divided into two sections: those aimed at future developers working on similar projects or with similar / the same technologies, and those on usability, meant for the Digital Library. These could serve as inputs for taking up work on similar projects in future.

5.4. Reports For Future Developers
5.4.1. Report on working with Summon API
An API to invoke the functionalities of Summon is available. According to its documentation, available online at [18], "The API is an HTTP-based service and supports requests via the HTTP GET and POST methods. Currently there are two available response formats: XML and JSON". In keeping with modern web service development standards, this is good for two reasons. Firstly, it is an HTTP-based service: most modern web programming languages, libraries and tools have support for working with HTTP-based services, so a developer need not write new code to work with the service and can reuse code. Secondly, a web developer often works both on source code relevant to the server side and on that relevant to the client side, often a web browser; web developers therefore tend to be aware of the software skills needed to work on both sides. With the availability of new client side libraries like jQuery [16], which support the HTTP GET and POST methods, it becomes easy for the developer to invoke Summon's functionality and quickly check the feasibility or the working capability of a new idea.
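As a concrete example of such a quick feasibility check, the following Javascript sketch sends a user's query to a server-side script and logs the raw JSON response for inspection. Since Summon API requests must be signed on the server with the institution's key, the sketch goes through a local proxy endpoint whose name, summonservice.php, is a hypothetical placeholder, not the actual Tagus URL.

    // Minimal sketch: probe the Summon API via a (hypothetical) local
    // server-side proxy that signs and forwards the request, then inspect
    // the JSON response in the browser console (e.g. with Firebug).
    function trySummonSearch(queryString) {
        $.getJSON('/tagus/summonservice.php',
            { q: queryString, page: 1 },
            function (response) {
                console.log(response);  // explore the fields returned per document
            });
    }

    trySummonSearch('history of island');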
Thus, the API is conducive to rapid prototype development when used along with modern client side libraries. Serials Solutions, the company which develops and maintains Summon, also provides PHP and Ruby versions of the API, as listed at http://api.summon.serialssolutions.com/help/api/code. This helps simplify the development of extensions and new tools for developers who are already familiar with at least one of these languages; both are quite popular at the time of writing this document. Even if the chosen technologies incorporate neither PHP nor Ruby, developing a wrapper library which abstracts the mechanism [working with HTTP-based calls] behind a functionality-based API is feasible. For example, the API page http://api.summon.serialssolutions.com/help/api/authentication itself gives an example of Java code wrapping the code that works with ids and digests. Thus, porting and extending the API itself when moving from one technology to another is feasible.

There are quite a lot of advanced search features, like commands, facets, filters, re-pagination, etc. Each of them is described in detail in its respective section in the online documentation. For projects wishing to incorporate a direct "search the library with Summon" feature in their user interface, there are several challenges to be faced. Firstly, replicating all aspects of the user interface shown on the university's Summon instance website is not feasible in a short amount of time. Secondly, it is not practical to replicate all the user interface features, like "search boxes" and checkbox lists, needed to expose all the advanced search features. If a subset of the user interface still needs to be exposed, it requires careful planning by the stakeholders of the project. Thus the API is great to work with in the background [non-user-interface modules], as expected of most modern day web services, but the web developer still needs to write new code to support, in the UI, each advanced feature provided by the API.

The PHP version of the API was used for Tagus. It supports
1. retrieval of a resource's information given its unique id, and
2. searching the Summon index for all resources matching a search query [a set of words].
While this sufficed for the implementation of Tagus' requirements, a future project requiring the advanced search features [ignoring the additional UI code that might need to be developed] may find the PHP version falling short of expectations. In such cases, the developer might need to extend it with custom functions to tap into the advanced features.

Another feature of the API is its excellent error handling of incoming requests. There is a clear classification of the possible errors, as given at http://api.summon.serialssolutions.com/help/api/search/errors. This helps new developers quickly gain insight into the origin of a problem. In summary, for functionality, the API is very good to use when working with code at the level of raw HTTP requests. It is good to use when working at a functionality level [PHP version], but with limited support for the advanced features, which the developer can extend. Any UI feature linked to the advanced search features requires additional work.

5.4.2. Report on working with Elastic Search
Quoted from its own documentation, "ElasticSearch is a highly available and distributed search engine. Each index is broken down into shards, and each shard can have one or more replica.
By default, an index is created with 5 shards and 1 replica per shard (5/1). There are many topologies that can be used, including 1/10 (improve search performance), or 20/1 (improve indexing performance, with search executed in a map reduce fashion across shards)". More details are available online at https://github.com/elasticsearch/elasticsearch/blob/master/README.textile.

ElasticSearch [ES] has been around for nearly a couple of years at the time of writing this document, and is very clean and stable. It is built on top of Apache Lucene. It provides HTTP REST based services for indexing data and then searching the indexed data. The best feature of ES is that the data to be indexed need not conform to a particular schema. Even if a schema is required in a project, due to other requirements and constraints, it need not be defined before inserting the first data into the system. And even if a schema is required and there is a later need to extend, update or change it, ES itself need not be informed. The next best feature is that all communication with an ES instance takes the form of an HTTP URL followed by the data to be sent in JSON format. JSON [Java Script Object Notation] is very easy for a web developer to learn and use. Debugging source code is possible within browsers with tools like Microsoft Script Editor for Internet Explorer and Firebug for Mozilla Firefox and Google Chrome; development with code making calls to an ES instance therefore becomes easier, with early bug detection. Also, ES supports a query Domain Specific Language [DSL], which has several advanced features for searching against the indexer. Search queries also follow the JSON format, thus preventing the need to learn another query language with custom keywords and syntax. Finally, ES instances can be launched on many machines in a LAN, appropriately configured to provide the same service. When machines go down, ES continues functioning as long as at least one machine remains; its support for distributed indexing thus helps maintain the uptime of a deployed service. Other advantages include scaling to multiple instances dynamically when the data to be indexed grows beyond the capability of the current set of machines.

While ES has all these advantages, the available documentation is limited. For example, if a developer working with advanced search queries faces a problem, it can take up a lot of his / her time to figure out ways of creating the query; this was faced even in the development of Tagus. Please note that ES has been around for a very short time, with a very small number of developers working on it. There are APIs supported in multiple programming languages, on both the server side and the client side of web development. In summary, we have found ES excellent to work with, limited only by the documentation provided. Future developers are advised to refer to Clinton Gormley's excellent introduction to ES [presented at the Yet-Another-Perl-Conference, European Union, 2010], available online at http://clintongormley.github.com/ElasticSearch.pm/ElasticSearch_YAPC-EU_2010/.
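For illustration, the following is a minimal sketch of what a query in the JSON DSL mentioned above looks like: the search request itself is a JSON document POSTed to the _search endpoint. The index name and host follow the Tagus notation used earlier, the term value is only an example, and this is a sketch rather than Tagus' actual query code.

    // Minimal sketch: a term query expressed in ES's JSON query DSL.
    var query = {
        query: { term: { tag: 'history101' } },  // exact match on the tag field
        size: 50                                  // return up to 50 hits
    };
    $.ajax({
        url: 'http://localhost:9200/tagus_public/_search',
        type: 'POST',
        data: JSON.stringify(query),
        dataType: 'json',
        success: function (result) {
            console.log(result.hits.total + ' tagged resources found');
        }
    });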
5.5. Reports on usability For Digital Library's Decision Makers
5.5.1. speed of service in retrieving tags and speed of interaction
5.5.1.1. Quantitative Evaluation
From section 2.6.3.3, speed of retrieving tags fits quantitative evaluation. Please refer to the images given below for the actual data recorded in measuring the durations of various calls from the web browser to the indexer. The tool used to capture this information was Firebug 1.8.1. The average time for the retrieval of tags across various resources was
a) Mozilla Firefox: 159.65 milliseconds
b) Google Chrome: 384.20 milliseconds.
These times, well under half a second, are observed as "very fast".

5.5.1.1.1. Durations of request-response cycles in Mozilla Firefox
Figure 5.1. Timing of calls to Elastic Search's instance in browser one

5.5.1.1.2. Durations of request-response cycles in Google Chrome
Figure 5.2. Timing of calls to Elastic Search's instance in browser two

5.5.1.2. Qualitative Evaluation
From section 2.6.3.3, speed of interaction is the criterion which fits qualitative evaluation. Please refer to the appendix for the actual data recorded during our usability testing sessions. All the participants found the user interface "very fast" to use.

5.5.2. usability of user interface
5.5.2.1. Tasks & Questions
Please refer to the appendix for the actual data recorded during the user sessions. The tasks themselves, and the accompanying questions, can be summarized as below:
Task :: Add a tag / an annotation
Task :: Find a resource
Task :: Remove a tag / an annotation
Task :: Add a tag; tell a friend about it [to let the friend find it with this tag]; remove it
Given the tasks, the data recorded was: What helped? What did not help? What could have helped [expectations]?

5.5.2.2. Qualitative Evaluation
Overall, the users found the user interface between "usable" and "very usable", with scope for improvement. I have listed the problems that were noted during the sessions. Firstly, we found that users who were familiar with popular tagging systems like Facebook, Flickr, Delicious, etc. found the overall system "usable". Some users expected advanced features, like auto-suggest for resources tagged by people of the same group [such as all students taking the same course, or those in the same undergraduate year]. Secondly, one user specifically did not want to use the "public"-ness of tags and wanted to use tags only as "personal" information, not to be shared with others. Though this is functionality-related feedback, we found such expectations affecting their ratings on usability [bringing the level down from "very usable" to "usable"]. Finally, we found that if users were exposed to the Summon instance first and then introduced to Tagus, they rated the user interface "very usable".

Positive feedback: quick access via tabs is convenient; "less cluttered" than other search systems they were aware of; tag counts in the tag cloud help "immensely"; case-insensitive search of tags is supported; the image under "Home" gives a quick idea of what Tagus is about; students will find the system very easy to use; a demo video is not required, as "it is easy to figure out".

Negative feedback: the image under "Home" is not intuitive enough; pure keyboard navigation of the system [without a mouse] is not supported; users sometimes overlooked which input box they were typing into [annotations vs tags]; advanced search features like filtering options and auto-suggest are not available.

Neutral feedback: two users can add the same public tag to the same resource; when such a resource comes up in users' searches, the tag is displayed twice. Users found this a bit "different" and "awkward", but found the concept "useful", because it gives them a way of knowing that a particular tag is more relevant to the resource if it is present more times than the other tags.
Further, users' perception of usability increased with the amount of tagging functionality provided. For example, one user commented that it "can be useful to librarians to predict which resources would be in demand if they could know which ones are being tagged the most". Thus, overall, for a proof-of-concept tool [Tagus] built upon a resource discovery system [Summon], we believe that the provided user interface was between "usable" and "very usable", while noting that users familiar with popular "tagging" based websites found it more usable than users who were not, and that the provided functionality affected the "perceived usability" of the concept of tagging itself, which in turn led to their perceived usability of the user interface.

5.5.3. Ease of Learning :: First vs Second Time Usage

5.5.3.1. Qualitative Evaluation

The core procedure for evaluating ease of learning is to give a task to the user, make him / her repeat it in a different form, and observe the ratings given for the completion of the two attempts. The data to be observed is the difference between the rating given the first time and the rating given the second time; this difference should be either zero or positive. For example, if a user U gives a task X a rating of 3: Moderate the first time, then U has learnt the task in between the two attempts if, the second time, U gives X a rating of 3: Moderate, 4: Easy or 5: Very Easy. Tasks 4 and 5 were specifically designed to measure ease of learning.

[Graph 5.1. Ease of learning as measured for two specific tasks (Tasks 4 and 5), showing each test user's ratings*]

Ease of learning was also measured across the first six tasks, the commonality being "add / delete / search via" tags.

[Graph 5.2. Ease of learning as measured across all tasks (Tasks 1 to 6), showing each test user's ratings*]

Interpretation of the above graph: the complexity of the tasks increased from task 1 to task 6. This could be a factor in causing the dips, especially for task 3 and task 5, and is good feedback to incorporate when revising the tests themselves in future. In other words, to measure ease of learning, when a user performs a task for the second time there should not be any added complexity which could cause fluctuations in the measurements. To compensate for this, we also recorded how satisfied the users were with the results they obtained, which should give a better measure of ease of learning [the changing expectations of the user] as they spent more time with [learnt more about] the system. Task 1, being the first task, has no recorded data for the user to rate "how satisfied he / she was with the results". This is because the results of the first task set the user's expectations in practice, so the real measurements start from the second task.

[Graph 5.3. Ease of learning as measured across all tasks using "satisfaction with results" (Tasks 2 to 6), showing each test user's ratings*]

As we can observe, with the exception of the rating for task 5 by TestUser2, the users were able to learn the system. Also, the average rating for all tasks was 4.35 across all users.

*All ratings are on a scale of 1 to 5. For difficulty levels found by users, 1 = very difficult … 5 = very easy. Alternatively, for user satisfaction levels, 1 = very unsatisfied … 5 = very satisfied.
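The computation behind these graphs is deliberately simple; the sketch below restates it in PHP with hypothetical ratings (the recorded data is in the appendix). A non-negative difference between the repeat rating and the first rating is taken as evidence of learning.

    <?php
    // Sketch of the ease-of-learning measure: rating on the repeated task
    // minus rating on the first attempt should be zero or positive.
    // Ratings below are hypothetical, not the recorded data.
    $ratings = array(
        // user => array(first attempt, second attempt), on the 1-5 scale
        'UserA' => array(3, 4),
        'UserB' => array(4, 4),
        'UserC' => array(4, 2),
    );
    foreach ($ratings as $user => $pair) {
        $delta = $pair[1] - $pair[0];
        printf("%s: delta %+d -> %s\n", $user, $delta,
               $delta >= 0 ? 'learning indicated'
                           : 'added task complexity suspected');
    }

    // The overall figure quoted above is a plain mean of all
    // satisfaction ratings across users and tasks.
    $satisfaction = array(5, 4, 5, 4, 4, 5);   // hypothetical values
    printf("average: %.2f\n", array_sum($satisfaction) / count($satisfaction));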
This average indicates that users were highly satisfied with the results of performing the tasks; note that the minimum possible rating was 1 and the maximum was 5. Also please note the limitations of this measurement. In a real world scenario, a user is not limited to the test duration of one hour; he / she would have more time to explore the system. We should also consider the factor of community learning in Tagus, which maintains data collectively owned by all the users. If an experienced user of Tagus introduces the system to a new user, the new user's ease of learning could improve much more than if the new user explored Tagus by himself / herself for the first time.

5.5.4. Findability :: Summon [does not have tagging] vs Tagus [uses Summon]

For measuring the findability of a resource in Tagus as compared to the findability of that resource in Summon, we designed a specific task, Task 7; please find its details in the appendix. It involved finding a resource in Summon and also finding it in Tagus, in either order. However, when in Tagus, the user tags the resource with a public tag relevant to him / her. A time gap is then introduced: the user test plan sheet is taken away and the user is engaged in a general chat about the system for around five minutes. The task ends with the user trying to find the same resource again in both systems. The sheet is returned to the user to rate how findable the resource was in Tagus as compared to finding it in Summon the second time. Please note that Tagus is like an extension to Summon and works off its API, so this evaluation checks whether tagging would help improve findability if available alongside the regular functionality of Summon.

5.5.4.1. Comparative Evaluation :: Summon vs Tagus

[Graph 5.4. Findability: how much easier finding a resource is in Tagus vs in Summon (Task 7), showing each test user's ratings on the same 1 to 5 scale as above]

5.5.4.2. Report on the accuracy of public tagging affecting findability

Though one of the intentions of applying the concept of tagging in an academic library environment is to improve findability, tagging could adversely affect the findability of a resource. For example, assume a public tag called "book" is applied to many resources by a user. Such a highly general tag does not really convey anything about the resource to the user community. Yet another example would be to publicly tag a particular resource on the topic of history with, say, a tag called "geography". In the latter example, the goal of the user applying the irrelevant public tag might be to hinder other users from finding a book rather than to help them find it. However, this problem is not unique to academic library environments. It applies equally to other applications of tagging; say, a user tags a photo of a blank sheet of paper with "operatingsystem" or just "photo". Thus, it is a standard problem in the world of tagging based applications. One possible approach is the use of moderation: there could be a set of "moderators", users with super rights to override or remove inappropriate tags applied to resources.
Another approach is to have automatic bots monitoring the system for a known list of blocked words. Yet another approach is to let the user community moderate tags itself: a user can tag a library resource with a general tag called "book", and another user who finds this tag inapplicable to that resource could simply delete it. However, this last approach could cause new problems in cases where one user deletes genuinely relevant tags added by another user. Our suggestion is to take the filtered approach for Tagus, where a newly added tag is checked against a known list of blocked tags. This again depends on the concept of relevancy, but further discussion of relevancy per user per resource is out of the scope of this document.

5.5.5. Ease of Creating Reading Lists :: Manual vs Tagus

5.5.5.1. Cooperative Evaluation

Tasks 8 and 9 [appendix] were created exclusively with the aim of checking whether the user interface is small / minimal enough to be quickly navigable, as explained in section 2.6.3.4. The user needs to find a means of achieving the goals of the two tasks, viz., creating a direct reading list out of search results and creating a customized reading list. The two scenarios involve two role plays :: professor and student. Participants were instructed to think aloud during the testing sessions. All the participants found task 8 very easy to do and agreed upon the usefulness of the feature [exporting reading lists]. Task 9 required a bit of thinking and problem solving on the part of the users, as the solution was neither direct nor were any hints given. However, all the users managed to think of a way to apply tagging in a way new to them [users were asked whether they had faced such a situation before, e.g. on social networking websites where tagging was used].

This concludes the chapter on the evaluation of Tagus. We divided our reports into two appropriate sections; in each section we described our results, critically analysed their details, observed the limitations of both the tests as created and the test conditions as compared to real world situations, and suggested solutions wherever applicable.

Chapter 6. Conclusions & Future Work

6.1. Motivation for this Chapter

Implementing a 'Del.icio.us' like system in an academic library discovery environment is about each of the following:

1. taking the concept of "community owned personal tagging" and applying it in a new context
2. checking the feasibility of working with the public API of a new age library resource discovery system to create a new tool
3. gathering the requirements needed to create a tool that exploits tagging
4. figuring out a list of technologies that works within all the constraints
5. designing the architecture and all the modules, as there is no existing software in such an environment to serve as a standard model
6. exploring many possibilities at every stage and getting reviewed every week
7. designing the user interface, weighing options and justifying design decisions
8. designing usability test case scenarios and conducting usability test sessions
9. recording, critically analysing and reporting the results.

During this project, we also performed a literature review of recent publications and a technical review of working with the technologies involved / chosen, identified problems facing the development of such new applications, and suggested solutions wherever applicable.
This project involved a lot of work, primarily because its scope grew from a proof of concept into a strongly modularized framework, functional by itself but also extensible into further customized applications in future.

6.2. Limitations

The undertaking of this project would not have worked without the inputs of all the individuals on the acknowledgements page, and we strongly believe that this project was successful considering the short duration available. However, the project is still a pilot tool, as there is scope for improvement in terms of both functionality and usability. Google's search mechanism and Delicious's easy to use tagging system have set high standards of speed and usability for public resources on the internet. As a result, users are very conscious of speed and usability in similar applications elsewhere. However, to keep things in perspective and within the scope of this project :: as proposed, this is a pilot project, intended to aid in the evaluation of DigLib's candidate new search services. In the end, reports on speed, usability and service integration would be provided to the Digital Library and are expected to help in this evaluation.

Finally, this project can be customized in several ways, as suggested by members of DigLib, UoE, provided:

1. DigLib decides to go ahead with Summon
2. this project comes out of pilot / test mode and is deployed as a full time service [starting with an alpha / beta release and slowly becoming a public release]
3. more testing is conducted
4. the UI is improved
5. the development of more applications based on the standalone API / the web based system is undertaken. For example:
   a. All tags by all the students under a particular professor could be aggregated and made available as a web service to librarians and budget allocators to predict, say, which journal subscriptions to allocate more budget to, etc.
   b. Automated notification of course-code-tagged resources to course-subscribed students via mailing lists could be made available as a web service integration, etc.
   c. Resources used by previous academic years' students could be auto-suggested to current academic year's students based on tag-based theme similarity, etc.

6.3. Future Work

As observed in the feedback from the participants of our testing sessions, the concept of tagging is useful to users. The guidelines to future developers, in conjunction with the whole approach to the development of Tagus as described in earlier chapters of this document, serve as 1. a model to base new work on, 2. inputs to improve future test cases and 3. inputs to improve the criteria for usability itself. Looking forward to continuing my work at the Digital Library Office at the University of Edinburgh, the immediate work that I would be taking up is as follows:

1. Evaluating the proposed ideas for providing portability to Internet Explorer
2. Implementing suggestions from test participants / users to improve the user interface
3. Creating Help, Demo Video and FAQ sections on the website
4. Attempting to link the authentication module of Tagus with EASE [10]
5. Improving the format of generated reading lists

Although the results of the usability testing of Tagus provide good feedback, after a couple more iterations in the iterative development cycle, with all the above done, one direction for future work is to involve a greater number of users across a wide range of roles in the university.
This would be possible after the system has been stress / load tested and found robust enough to take up users' request load appropriately. A second direction for further work would be to focus on evaluating the feasibility of new ideas by quickly developing prototypes based on the framework delivered by this project; some ideas have been given as examples in the previous section. A third direction would be to create an alternative version of the system to work with other new age library resource discovery systems alongside Summon. This would require that the public APIs of these systems conform to a uniform interface, thereby providing / exposing functionality similar to Summon's.

Chapter 7. References

1. Bains [2010], The paper that went to Library Committee proposing a resource discovery procurement [Access Link: http://www.lib.ed.ac.uk/about/libcom/PapersFeb10/paperE100210.pdf] [Cited 18 August 2011] [Internet Source]
2. Stone [2010], Searching Life, the Universe and Everything? The Implementation of Summon at the University of Huddersfield [Access Link: http://liber.library.uu.nl/publish/articles/000489/article.pdf] [Cited 18 August 2011] [Internet Source]
3. The Summon Beta Evaluation Team [2009], An Evaluation of Serials Solutions Summon As a Discovery Service for the Dartmouth College Library [Access Link: http://www.dartmouth.edu/~library/admin/docs/Summon_Report.pdf] [Cited 18 August 2011] [Internet Source]
4. Klein [2010], Hacking Summon [Access Link: http://journal.code4lib.org/articles/3655] [Cited 18 August 2011] [Internet Source]
5. Summon by Serials Solutions [Access Link: http://www.serialssolutions.com/discovery/summon/] [Cited 18 August 2011] [Internet Source]
6. University of Edinburgh's instance of the Summon service [Access Link: http://ed.summon.serialssolutions.com/] [Cited 18 August 2011] [Internet Source]
7. ElasticSearch [Access Link: http://www.elasticsearch.org/] [Cited 18 August 2011] [Internet Source]
8. Apache Solr [Access Link: http://lucene.apache.org/solr/] [Cited 18 August 2011] [Internet Source]
9. Apache Lucene [Access Link: http://lucene.apache.org/] [Cited 18 August 2011] [Internet Source]
10. University of Edinburgh, EASE system for web based university wide authentication [Access Link: http://www.ed.ac.uk/schools-departments/information-services/services/computing/computing-infrastructure/authentication-authorisation/ease/overview] [Cited 18 August 2011] [Internet Source]
11. Webfeat [Access Link: http://www.webfeat.org/] [Cited 18 August 2011] [Internet Source]
12. Delicious [Access Link: http://www.delicious.com/] [Cited 18 August 2011] [Internet Source]
13. Hull D, Pettifer SR, Kell DB [2008], Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web. PLoS Comput Biol 4(10): e1000204. doi:10.1371/journal.pcbi.1000204 [Access Link: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000204]
14. George Macgregor, Emma McCulloch [2006], "Collaborative tagging as a knowledge organisation and resource discovery tool", Library Review, Vol. 55, Iss: 5, pp. 291-300 [Access Link: http://dx.doi.org/10.1108/00242530610667558, http://www.emeraldinsight.com/journals.htm?issn=00242535&volume=55&issue=5&articleid=1554177&show=pdf]
15. J. Alfredo Sánchez, Adriana Arzamendi-Pétriz, and Omar Valdiviezo [2007], Induced tagging: promoting resource discovery and recommendation in digital libraries. In Proceedings of the 7th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '07). ACM, New York, NY, USA, pp. 396-397.
DOI=10.1145/1255175.1255252 [Access Link: http://doi.acm.org/10.1145/1255175.1255252, http://portal.acm.org/ft_gateway.cfm?id=1255252&type=pdf&CFID=15726141&CFTOKEN=28450628]
16. jQuery [Access Link: http://jquery.com/] [Cited 18 August 2011] [Internet Source]
17. Curl [Access Link: http://curl.haxx.se/] [Cited 18 August 2011] [Internet Source]
18. Summon API [Access Link: http://api.summon.serialssolutions.com/help/api/] [Cited 18 August 2011] [Internet Source]
19. Lynn D. Lampert, Katherine S. Dabbour [2008], Librarian Perspectives on Teaching Metasearch and Federated Search Technologies, Internet Reference Services Quarterly, Vol. 12, Iss. 3-4. DOI: 10.1300/J136v12n03_02 [Access Link: http://www.tandfonline.com/doi/abs/10.1300/J136v12n03_02]
20. Lauridsen, Helle and Stone, Graham [2009], The 21st century library: a whole new ball game? Serials, 22 (2), pp. 141-145. ISSN 0953-0460. DOI: 10.1629/22141 [Access Link: http://dx.doi.org/10.1629/22141, http://eprints.hud.ac.uk/5156/]
21. The Success of Web-Scale Discovery in Returning Net-Gen Users to the Library: The Summon™ Service in Academic Libraries [Access Link: http://www.libraryjournal.com/lj/tools/webcast/883883388/the_success_of_web-scale_discovery.html.csp] [Cited 18 August 2011] [Internet Source]
22. Help with Summon [Access Link: http://www.library.usyd.edu.au/catalogue/summon/summonfaq.html] [Cited 18 August 2011] [Internet Source]
23. An article on Summon at PSU [Access Link: http://www.libraries.psu.edu/psul/itech/services/summon.html] [Cited 12 August 2011] [Internet Source]
24. What is Summon? [Access Link: http://www.adelaide.edu.au/library/help/summonabout.html] [Cited 12 August 2011] [Internet Source]
25. Summon API [Access Link: http://api.summon.serialssolutions.com/help/api/] [Cited 12 August 2011] [Internet Source]
26. RESTful Web services: The basics [Access Link: https://www.ibm.com/developerworks/webservices/library/ws-restful/] [Cited 12 August 2011] [Internet Source]
27. JSON [Access Link: http://www.json.org/] [Cited 12 August 2011] [Internet Source]
28. ISO 9241-11:1998, Ergonomic requirements for office work with visual display terminals (VDTs) -- Part 11: Guidance on usability [Access Link: http://www.userfocus.co.uk/resources/iso9241/part11.html] [Cited 12 August 2011] [Internet Source]
29. Jakob Nielsen's website on usability [Access Link: http://www.useit.com/] [Cited 12 August 2011] [Internet Source]
30. Alan Dix, Janet Finlay, Gregory Abowd, and Russell Beale [1997], Human-Computer Interaction. Prentice-Hall, Inc., Upper Saddle River, NJ, USA [Access Link: http://dl.acm.org/citation.cfm?id=249491; HCI textbook at the University of Edinburgh: http://www.hcibook.com/] [Cited 12 August 2011] [Internet Source]
31. Steve Krug, Rocket Surgery Made Easy [Access Link: http://www.sensible.com/rsme.html] [Cited 12 August 2011] [Internet Source]
32. University of Edinburgh, Data Protection Policy [Access Link: http://www.recordsmanagement.ed.ac.uk/InfoStaff/DPstaff/UoEDPPolicy.htm] [Cited 12 August 2011] [Internet Source]
33. University of Edinburgh, Web Accessibility Guidelines [Access Link: http://www.projects.ed.ac.uk/methodologies/Standards/Accessibility/AccessGuide.htm] [Cited 12 August 2011] [Internet Source]
34. FPDF class for PHP applications [Access Link: http://www.fpdf.org] [Cited 12 August 2011] [Internet Source]
35. University of Edinburgh, Human-Computer Interaction, course website [Access Link: http://www.inf.ed.ac.uk/teaching/courses/hci/] [Cited 12 August 2011] [Internet Source]
36. Jakob Nielsen, Try to Be a Test User Sometime [Access Link: http://www.useit.com/alertbox/being-a-test-user.html] [Cited 12 August 2011] [Internet Source]

Chapter 8. Appendix
8.1. Tagus Screenshots

[Screenshots of the Tagus user interface]

8.2. Test Script

MSc Project :: Tagus :: Implementing a 'Del.icio.us' like system in an academic library discovery environment
Tagus Testing :: User Testing :: Single Test
August 2011
Test Script (adapted from Rocket Surgery Made Easy © 2010 Steve Krug)

Clear the browsing history!!! Web browser minimised.

Hi, ___________. My name is Girish Ede and I'm going to be walking you through this testing session today. Before we begin, I have some information for you, and I'm going to read it to make sure that I cover everything. You probably already have a good idea of why you were asked here, but let me go over it again briefly. We're asking people to use Tagus to find out if it works as intended and to find out what you think about using it. The session will take no more than an hour.

The first thing I want to make clear right away is that we're testing Tagus, not you. You can't do anything wrong here. Please don't worry about making mistakes. There are no right or wrong answers. This is a proof of concept website :: it is not perfect, but your feedback is really going to help us.

As you use Tagus, I'm going to ask you to try to think out loud as much as possible. I'd like you to say what you're looking at, what you're trying to do, and what you're thinking; why you've decided to click on something. Tell me when you are confused or pleased that something has worked. This will be a big help to us. Also, please don't worry that you're going to say something you shouldn't. We're doing this testing to evaluate Tagus and help improve the Library services we offer, so we need to hear your honest reactions. If you have any questions when we're done, I'll try to answer them then. And if you need to take a break at any point, just let me know.

If you would, I'm going to ask you to sign a simple permission form for me. It just says that I have your permission to use the feedback you provide from the testing, which will be anonymised, and that the results will only be seen by the people working on the project.

Tell the participant that his / her data will be anonymised in your report.

Do you have any questions so far?

OK. Before we look at Tagus, I'd like to ask you just a few quick questions.

Do you use social networking websites? If yes, which ones do you use from the list below?
Facebook :: Facebook's photo tagging feature
Delicious :: Delicious's bookmark tagging feature
Flickr :: Flickr's photo tagging feature
Others :: any tagging feature you might have used earlier

What Library online resources do you use? (e.g., Searcher, the catalogue, databases, etc. at the University of Edinburgh) Yes / No. If no, are you familiar with any other similar tools?

I will now take you to a resource discovery service called Summon from Serials Solutions. Please take a couple of minutes to browse around the website. Feel free to click around and see what it does. What do you expect to do using Summon?

OK, great. We're done with the questions, and we can start looking at Tagus.

Maximize browser window: http://tagus-test.lib.ed.ac.uk/ (Tagus homepage).

First, I'm going to ask you to look at this page and tell me what you make of it: does anything stand out? What do you think you can do here? Tell me what you're thinking when you're looking around the page. You can move the mouse around if you want to, but don't click on anything yet.

Allow this to continue for three or four minutes, at most.

Thanks.
Now I'm going to ask you to try to complete some specific tasks using Tagus. I'm going to read each one out loud and give you a printed copy. There is no right or wrong answer: the task is complete when you are satisfied with your answer. It will help me if you can try to think out loud as much as possible as you go along. Tell me what you're thinking, what you think will happen when you click on links, what you like, what you dislike, if you expected something to happen, if you're pleased, displeased…. After each task I'll ask you to rate how satisfied you are with your result and how easy or difficult you found the task. This will help us in our evaluation.

Hand the participant the test plan sheet. Hand the participant the first scenario, and read it aloud. Allow the user to proceed until you don't feel like it's producing any value or the user becomes very frustrated. Repeat for each task or until time runs out. If time, ask for comment on the results screen.

Thanks, that was really helpful. Do you have any questions for me, now that we're done?

Thank them and show them out.

8.3. Test Plan

MSc Project :: Tagus :: Implementing a 'Del.icio.us' like system in an academic library discovery environment
Tagus Testing :: User Testing :: Single Test
Date: August 2011
Tester's copy
Participant ID:

Task 1: Adding a tag, after finding a resource

Part 1
Assume you are an undergraduate student at the university. Please login using the login / password combination given to you. Can you find the book "Computer Graphics" by "Francis S Hill"? Your professor had recommended this book for your course. Please add a public tag "Computerbook" to this book. Please add a personal annotation "to_read" to this book.

Part 2
Please repeat the activity by finding the following resources and adding the same tag and annotation to each of them:
Title: Fraunhofer Institute: building on a decade of computer graphics research, Author: Earnshaw
Title: 3-D Computer Animation, Author: Vince John

Did the participant complete this task successfully? Yes / No
On a scale of 1-5 where 1 is very difficult and 5 is very easy, please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy
Observations / comments:

Task 2: Finding a resource, given a public tag

You and your friend John had taken a course, "Software Engineering". However, being unwell, you could not attend today's class. The professor for the course has given a set of books and journals to read for the next assignment. Your friend searches for all the required books / journals in the website. He tags them publicly with "SEAssignment1". He then sends you this tag by SMS. Can you find the set of resources that the professor had suggested as reading material for the assignment?

Did the participant complete this task successfully? Yes / No
On a scale of 1-5, how satisfied are you with your result for this task?
Very unsatisfied 1 2 3 4 5 Very satisfied
On a scale of 1-5, please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy
Comments / observations:

Task 3: Removing a public tag / personal annotation

Part 1
You have a month to complete the first assignment for your course, "Software Engineering". Your professor has given a list of THREE books as required reading for the assignment and has marked THESE THREE with the public tag "Computerbook". Find them.

Part 2
Assume you have completed reading TWO books out of the THREE in the required reading list.
Can you remove your personal annotations from those books, which you had earlier marked as "to_read"?

Did the participant complete this task successfully? Yes / No
On a scale of 1-5, how satisfied are you with your result for this task?
Very unsatisfied 1 2 3 4 5 Very satisfied
On a scale of 1-5, please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy

**Please logout of Tagus now**

Observations / comments:

Task 4: Ease of learning, first usage vs second usage

**Please log back in to Tagus.**
Can you find the list of all the public tags you have added until now? Tip: one of the tags used previously was "computerbook".

Did the participant complete this task successfully? Yes / No
Please rate how satisfied you are with your results:
Very unsatisfied 1 2 3 4 5 Very satisfied
Please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy
Comments / observations:

Task 5: Ease of learning, community maintained tags

Part 1
You get a text message saying that some of your friends taking the same courses as you have also tagged some resources in the Digital Library. You've already used the tag "Computerbook" to tag useful resources. Assume that your friends also tagged some of their interesting findings with "Computerbook". Can you find a list of the books you or your friends tagged with "Computerbook"?

Part 2
Can you look around Tagus and find out how many resources YOU had tagged publicly as "Computerbook"?

Did the participant complete this task successfully? Yes / No
On a scale of 1-5, how satisfied are you with your results?
Very unsatisfied 1 2 3 4 5 Very satisfied
On a scale of 1-5, please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy
Observations / comments:

Task 6: Find resources based on others' tags and add your own tags to those resources

Part 1
Your friends, on the same course, have tagged some of their interesting findings with a public tag called "interesting_book". Can you find a list of all books that were tagged with "interesting_book"?

Part 2
You think that the "interesting_book" tag is a bit vague for all those resources. Can you now add your own public tag "Computerbook" to each of the results? You now think that these resources are more relevant to you as "computer books" rather than as "interesting books"!

Did the participant complete this task successfully? Yes / No
Please rate how satisfied you are with your results:
Very unsatisfied 1 2 3 4 5 Very satisfied
Please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy
Comments / observations:

Task 7: Summon Only vs Summon + Tagus

Open a new browser window. Please visit the University of Edinburgh's installation of Summon :: http://ed.summon.serialssolutions.com/. You are a graduate student who is studying "Computer Graphics". Your professor for the course had asked you to refer to "3d And Multimedia On The Information Superhighway" by "Earnshaw". Please find this book using Summon.

Go back to Tagus. Please find this book using Tagus. Please tag it using an appropriate public tag in Tagus.

**Take away this sheet of paper from the user**

Assuming that you are now two months into the future, having forgotten the name of the book your professor suggested you read, how would you:
**Give the user the URL for Summon again**
a. find this book using Summon?
**Give the user the URL for Tagus again**
b. find this book using Tagus?
**Return this sheet of paper to the user**

As compared to the previous approach, do you find the second approach to be:
Very difficult 1 2 3 4 5 Very easy
Comments / observations:

Task 8: Reading Lists

Come back to Tagus. Now imagine you are the professor of the course "Computer Graphics". You want to compile a reading list for your course and then e-mail this list to your students. Members of Library staff have already tagged all the resources with a public tag called "CG101". Can you compile a reading list of these resources and save the list to your computer?

Did the participant complete this task successfully? Yes / No
Do you think you would use the exporting feature to send relevant materials to a class of students? Yes / No
Comments / observations:

Task 9: Customized lists for each user

You are a graduate student at the university. Your friends and you had, over the last semester, tagged several books you found in your searches with "for_holidays". Your friends want this list to be trimmed to a subset of around 10 books. Can you customize this list, export it as a file and send it to your friends?

Imagine you are a student: do you think this feature would be useful? Yes / No
Did the participant complete this task successfully? Yes / No
Comments / observations:

What features and cues in the user interface did you find useful in doing these tasks? What helped you do your task? What could have been better?
Did you think that the user interface was intuitive and easy to navigate? What confused you the most? Which task troubled you the most?
What did not seem obvious to you at first, but you understood as you started doing these tasks? Did you feel comfortable with Tagus, once you understood it?
Do you think this "tagging and annotating" feature is useful? Would you use it?
Would you prefer searching for a resource using its title and author every time you need to find it, OR searching for a resource once, tagging it, and then finding it using your tag?
Any other comments and feedback. Please list them here.

8.4. Access Details for the Location Containing the Actual Data Collected During Evaluation

I have already included the data collected, including feedback from users, as graphs and plain text in chapter 5 under the appropriate sections. Raw data sheets, collected "as they are", have been assembled at :: https://svn.ecdf.ed.ac.uk/repo/is/digitallibrary/Summon/tagus/RawDataSheets/. The main source code itself is available at :: https://svn.ecdf.ed.ac.uk/repo/is/digitallibrary/Summon/tagus/. All files and folders except RawDataSheets under the main "tagus" folder are part of the source code of the project. These can be downloaded to your local machine and imported as an Eclipse project [PHP, JavaScript]. The folder "RawDataSheets" has been added here for the sake of the student's dissertation. PLEASE do not remove this folder from here. Its size is only around 3.2 MB.

For access to this repository location, please contact the following people:
1. The supervisor of the project, Mr. Colin Watt
2. The usability expert of the project, Ms. Angela Laurins
3. Ms. Claire Knowles, Information Systems Developer at the Digital Library Office

Further contact details are available at :: http://www.ed.ac.uk/schools-departments/information-services/about/organisation/library-and-collections/who-we-are/staff-list.