Implementing a `Del.icio.us` like system in an academic library
Implementing a 'Del.icio.us' like system in an academic library discovery environment

Ede Girish Renu Goud
s1050984
University of Edinburgh
Master of Science
Computer Science
School of Informatics
University of Edinburgh
2011

Abstract

The world of library search systems is undergoing a change. Recently, a lot of focus has been placed on extending the default search functionality provided by resource discovery tools and services. Moreover, new library search systems like Summon claim to be faster than their predecessors and expose APIs to allow development of custom tools. The Digital Library at the University of Edinburgh is in the process of evaluating such new systems and is exploring the features provided by them. In this process, Summon by Serials Solutions and EDS by EBSCO have been rated better than other systems. This project explores the challenges of extending Summon's features and exploits its public PHP-based API to construct an application providing annotation and tagging based find services to digital library users in an academic library discovery environment, thereby providing new means of findability and personalization of search. Further, challenges in designing the user interface for such an application are also discussed, with suggested solutions, to serve as guidelines for future developers of such applications in the Digital Library. Finally, the results of usability tests based on nine end-user testing scenarios are presented, along with the expectations test users hold of this service, to serve as inputs for the decision making committee at the Digital Library.

Acknowledgements

I would like to thank Mr. Colin Watt for his strong vision, kind advice and ongoing support throughout the project. His organizational skills have heavily influenced the development of this new service in the Digital Library, with features going well beyond the original concepts planned in the proposal.

I would like to thank Mr. Mark McGillivray, our Informatics Research Proposal group tutor, for his sharp insights into the technical challenges that I was going to face over the summer, offered several times beyond the allotted class duration. Each of his inputs on the development of Tagus was worth trying out in developing the final product.

I would like to thank Ms. Angela Laurins, our usability expert, who kindly took time out of her busy schedule and helped organize and run the usability testing of Tagus. Her encouraging words are an inspiration and have urged me to push my work beyond what was originally planned.

I would like to thank Ms. Ianthe Hind for her technical help during the development of Tagus, in helping achieve work standards like continuous technical testing, small releases, daily deployment, etc. as practiced in the Digital Library Development Team.

I would like to thank Ms. Claire Knowles for her inputs on the technical standards being followed by the Digital Library websites and on cross-browser and cross-platform portability, for putting up with my requests for deployment on several late nights, and for her help in fixing several bugs that came up during the course of this project.

I would like to thank all the participants who responded to our invitation, took time out to participate in the usability testing, and gave us precious feedback as well as suggestions for prospective features to be developed in future.
Declaration

I declare that this thesis was composed by myself, that the work contained herein is my own except where explicitly stated otherwise in the text, and that this work has not been submitted for any other degree or professional qualification except as specified.

Girish Renu Goud Ede
s1050984

To Everyone who is the one and the one who is Everyone

Table Of Contents

i. Title
ii. Abstract
iii. Acknowledgements
iv. Declaration
v. Dedication
vi. Table of contents
vii. List of figures
viii. List of tables
ix. List of graphs

1. Chapter 1. Introduction
   1.1. Motivation for this Chapter
   1.2. Introduction
   1.3. Definitions
   1.4. Purpose
   1.5. Overview of bigger plans and goals driving this project
   1.6. More Background and Related Work
   1.7. Aims
   1.8. Thesis Outline
2. Chapter 2. Initial Work
   2.1. Motivation for this Chapter
   2.2. Pre-Requirements-Collection-Stage
   2.3. A Brief Study Of University of Edinburgh's Summon Instance
   2.4. Requirements Collection
   2.5. Functional Requirements
   2.6. Non-functional Requirements
      2.6.1. Implications and Constraints while choosing technologies for interaction with Summon
      2.6.2. Constraints due to standalone mode
      2.6.3. Usability requirements
         2.6.3.1. Definitions of Usability, Kinds of Usability Criteria Available
         2.6.3.2. Choices Made From Available Usability Criteria
         2.6.3.3. Quantitative Evaluation, What Criteria Fit Into Quantitative Evaluation?
         2.6.3.4. Qualitative Evaluation, What Criteria Fit Into Qualitative Evaluation?
         2.6.3.5. Why Not Other Criteria? Limitations, Constraints, Overlaps
   2.7. Evaluation Plan
      2.7.1. Constraints on automated testing
   2.8. Guidelines to future developers
3. Chapter 3. Design
   3.1. Motivation for this Chapter
   3.2. Architecture
   3.3. Design Of Core Services
   3.4. Design Of Standalone API
   3.5. Final Class Diagrams
   3.6. Final Deployment Diagrams
   3.7. Design of the UI
      3.7.1. Design Rationale
      3.7.2. Expert usability inputs
      3.7.3. Two column layout versus single column layout
      3.7.4. Searchable annotations versus non-searchable annotations
      3.7.5. Tags List versus Tags Cloud
   3.8. Guidelines to future developers
4. Chapter 4. Implementation
   4.1. Motivation for this Chapter
   4.2. Methodology
   4.3. Technologies
      4.3.1. Criteria
      4.3.2. Constraints
   4.4. Further Technical Details
   4.5. Workflow Chart of development
   4.6. Guidelines to future developers
5. Chapter 5. Evaluation
   5.1. Motivation for this Chapter
   5.2. Methods
      5.2.1. Preliminary Steps
      5.2.2. The Interviewing Protocol Setup
      5.2.3. The Participant Selection and Contacting Process
      5.2.4. The Data Collection Method
      5.2.5. My preparation for the testing sessions
      5.2.6. Test Scenarios
   5.3. Overview of Results Sections
   5.4. Reports For Future Developers
      5.4.1. Report on working with Summon API
      5.4.2. Report on working with Elastic Search
   5.5. Reports For Digital Library's Decision Makers
      5.5.1. speed of service in retrieving tags and speed of interaction
         5.5.1.1. Quantitative Evaluation
         5.5.1.2. Qualitative Evaluation
      5.5.2. usability of user interface
         5.5.2.1. Tasks & Questions
         5.5.2.2. Qualitative Evaluation
      5.5.3. ease of learning :: first vs second time usage
         5.5.3.1. Qualitative Evaluation
      5.5.4. findability :: Summon [does not have tagging] vs Tagus [uses Summon]
         5.5.4.1. comparative evaluation :: Summon only vs Summon + Tagus
         5.5.4.2. Report on accuracy of public tagging affecting findability
      5.5.5. ease of creating reading lists :: manual vs Tagus
         5.5.5.1. Cooperative Evaluation
6. Chapter 6. Conclusions & Future Work
   6.1. Motivation for this Chapter
   6.2. Limitations
   6.3. Future Work
7. References
8. Appendix
   8.1. Tagus Screenshots
   8.2. Test Script
   8.3. Test Plan
   8.4. Access Details To Location Containing Actual Data collected during evaluation

List of figures

Figure 3.1. Architecture Of Tagus
Figure 3.2. Modules of Tagus
Figure 3.3. Modules of Tagus Standalone API
Figure 3.4. Complete Class Diagram Of Tagus' Server
Figure 3.5. Complete Class Diagram Of Tagus' Client
Figure 3.6. Deployment Diagrams of Tagus
Figure 3.7. Two Column Layout in UI
Figure 3.8. One Column Layout in UI using "Search Summon"
Figure 3.9. One Column Layout in UI using "Public tags"
Figure 3.10. Searchable hyperlinked personal annotations
Figure 3.11. Non-Searchable plain-text personal annotations
Figure 3.12. Tags list for a user
Figure 3.13. Tags cloud for a user
Figure 4.1. Bootup of an Elastic Search instance
Figure 4.2. Iterative software development workflow
Figure 5.1. Timing of calls to Elastic Search's instance in browser one
Figure 5.2. Timing of calls to Elastic Search's instance in browser two

List of tables

Table 2.1. Table of sources
Table 2.2. Table of Functional requirements
Table 2.3. Table of Usability criteria chosen for Tagus with justification
Table 2.4. Criteria qualifying for quantitative evaluation
Table 2.5. Criteria qualifying for qualitative evaluation
Table 3.1. Table of core services
Table 4.1. Table of technologies

List of graphs

Graph 5.1. Ease of learning as measured for two specific tasks
Graph 5.2. Ease of learning as measured across all tasks
Graph 5.3. Ease of learning as measured across all tasks using "satisfaction with results"
Graph 5.4. Findability: How EASIER is finding a resource in Tagus vs in Summon

1. Chapter 1. Introduction

1.1. Motivation for this Chapter

In this chapter we introduce our project; we describe its purpose, the concepts involved, background, aims and related work. We start with an introduction to the world of digital library systems and define some features characterizing such systems. We then proceed to describe the purpose of this project in the light of ongoing work at the Digital Library, University of Edinburgh. The next section presents the background work leading to the proposal and the execution of this project. The subsequent section outlines the aims of this project in a fourfold manner. Also, related work going on in other digital library systems is concisely described in the penultimate section. Finally, we present the thesis outline.

1.2. Introduction

Resources in a digital library include e-journals, e-books, databases, research publications and other digitized materials. Every year, millions of searches and downloads take place in the current infrastructure. Resource discovery in the digital library is often characterized by several factors, viz., speed, availability, findability and personalization. Speed refers to how quickly results are fetched once a user searches for a resource. While availability implies the presence of a particular digital resource in the digital library, findability means the ability to find a currently present resource through searches and other methods.
In other words, findability is the tendency of an existing resource to come up in the results of searches. Thus, availability is defined for a digital resource, while findability can be defined for a digital resource, for a digital library user or for a digital library search service. Finally, personalization is defined for a digital library search service as the set of capabilities of the system to provide means of storing personal [3] and relevant information for existing resources, so that future searches made by the same user give out results augmented by personally relevant information. According to [13], "digital libraries can be cold, isolated, impersonal places that are inaccessible to both machines and people". Thus, when such personally relevant information is shared publicly across users, personalization itself becomes a measure of relevancy.

1.3. Definitions

For the purpose of this thesis, the following terms have been used in this document in the sense of the definitions below.

1. Federated searching is the process of searching multiple database sources simultaneously via a single search interface [19].
2. A resource discovery system is a tool or service, used by librarians and end-users to manage and find either a) information about a resource or b) a resource itself or c) both.
3. Personalization is the process of incorporating user generated contents or user inputs before presenting information to a user.

1.4. Purpose

Available metrics have demonstrated very high use of digital library resources at the University of Edinburgh [1]. Anecdotal evidence from various sources has suggested that federated search using Webfeat [11] has not delivered an optimum discovery service [2]. The library commits a significant proportion of its materials budget towards increasing the success of the search system, increasing its usage and reducing the cost-per-use of the e-resources [1]. With this goal in mind, the Digital Library Section in Information Services at the University of Edinburgh is in the process of evaluating two "next generation" discovery services, viz., Summon by Serials Solutions [5] and EDS by EBSCO*. Summon is labeled as a "single search box, show results, filter results by metadata" system, while EDS is "a unified, customized index of an institution's information resources, and an easy, yet powerful means of accessing all of that content from a single search box" [5]. Summon, through its API [4], provides means of developing custom applications, while EDS does not provide such an API.

*The results of surveys and other evaluation procedures carried out on implementations of Summon in some Higher Education Institutions or HEIs are publicly available. For more details and examples, please refer to [2] and [3].

As part of this evaluation, with an aim of introducing new means of findability and personalization in library search systems, with the initiative of Mr. Simon Bains, the earlier Head of Digital Library, University of Edinburgh, and with the consent of the Library Committee at the University of Edinburgh, this project was undertaken to construct and assess the application of annotation and tagging based find services in an academic library discovery environment, and thereby create a tool that, based on the content of the Digital Library, can provide users with added value in discovering that content.

1.5. Overview of bigger plans and goals driving this project

[With inputs from an initial interview with the supervisor]
1.5.1. Overview of background work leading to this project's proposal

The University of Edinburgh selected two resource discovery systems for testing over a minimum period of twelve months. This decision was made in order to reach a better level of understanding about a number of issues which could not be resolved in the time period permitted by the procurement process. This decision was endorsed by the Project Board, but with a firm steer that "we must not cause user confusion during the evaluation period. It was anticipated that we would go live with one of the selected systems early in 2011". The system selected for launch under the "Searcher" service label was EDS from EBSCO (the higher scoring system in the procurement). "It was felt that moving Summon to live, either as a replacement for EDS or in addition to it, during the 12 month evaluation period would also cause confusion, so Summon was not planned for general public release during the evaluation stage of the project". Instead, access was intended to be provided in a managed way to target user groups as part of the "User Engagement" project work package. This has meant that "we have not been able to compare usage metrics of two competing systems, but the value of this data is outweighed by the need to avoid user confusion and irritation by requiring them to make a choice".

One of the distinguishing features of Summon over EDS (at the time of the procurement) was the availability of an API to allow solutions to be built locally utilising the Summon functionality and integrating results and features into other local tools and systems; "it was felt that this MSc project proposal could explore this area and help inform the project team's decision making, viz. whether the Summon API could enable the choice of this product over EDS, or if this additional flexibility did not provide sufficient anticipated added value over EDS".

1.5.2. The importance of the timing of carrying out this project

The timing of this work is important due to the amount of testing work needed to make a valid evaluation of the two products and the availability of Digital Library staff to the project. "The MSc project allows us to explore some of the tasks in the "Usability" project work package in more depth than otherwise possible in the time available. It is important that we are able to make a valid business case to either continue with both products for a defined period, or select one over the other, and this work plays a vital part in that decision-making process". It is expected that recommendations will be made before November 2011 on the future direction of the discovery service(s).

1.5.3. The prospective value of this project in the overall scheme

"We see value in this work exploring different approaches to help assess user behaviour in ways that it may not otherwise have been possible to pursue, so the approaches used in this project still offer valuable inputs".
The landscape of "library systems" is changing. Over the next few years, the boundaries between what were previously viewed as separate "systems" for managing traditional library activities and workflows, providing access to traditional catalogues, an ever-increasing number of electronic resources, and open-access agendas are expected to blur, with a move towards more comprehensive library services based on commercial proprietary or open-source technologies; "these will ideally facilitate the management of workflow, incorporate enhanced discovery tools, and be interoperable with other systems inside and outside the library". With increasing commoditisation of IT service components, there is likely to be an increasing move to outsourcing and migration of system and service components to 3rd parties using variations on the cloud model, putting a focus on libraries to add value in the areas where they retain specialist expertise; "this MSc project and its contribution towards the current resource discovery project will provide outputs which are useful in exploring the aspects of user behaviour for input to library strategy over the next 2 years".

1.5.4. The stakeholders

In addition to the users of the Digital Library at the University of Edinburgh, this work would be of interest to the global library research community, as resource discovery and the emergence of solutions like Summon are presently of high strategic interest in this sector. The scope of this proposed service could potentially include the currently growing list of organizations and institutions moving towards Summon-backed academic library environments [2, 3]. "This work has a potential impact on the nature of services provided by the Digital Library section of UoE's Information Services. By implication this means staff and students of the University, its partners, and potential new students worldwide. The ultimate aim is to provide a "one-stop-shop" for discovery of all electronic resources, and tools to support better sharing of those resources in learning, teaching, and research activities. Examples of other relevant work happening in UoE include development of mobile based applications for staff and students; it would be interesting to consider some of the ideas explored on this project into scoping of future phases of that work".

1.6. More Background and Related Work

A user's resource discovery process in an academic research library presently relies on individual discovery, supported by reading lists and subject guides provided by academic and support staff. The LibQUAL results, reported to the Library Committee, showed satisfaction with the collections to be low. As [1] points out, one very important reason may be that the items which users want are just not available in Edinburgh's Digital Library. Another reason, however, is that the findability of resources could be improved, as indicated clearly by Simon Bains in, "while many users do find the resources they require, others find this difficult" [1]. Difficulty in improving satisfaction can be attributed to factors such as speed, findability and personalization. Summon claims to meet the speed challenge with its unique multitiered indexing service [5]. However, there is an opportunity to allow users to share information on the system about resources which were particularly useful, whether on reading lists or found through their own search processes, thus leading to this proposed work in personalization of resource discovery [3].
This work proposes to meet the other significant challenge, of findability, too. Currently, Summon uses its custom data mining methods to index digital library resources, and the same data is used to return results to users based on their queries. New library discovery systems like Summon offer APIs that support customized tool development [4], which makes it easy to introduce additional functionality that would previously have been impossible without substantial vendor support. This project constructs a tagging and annotation index which is used in conjunction with Summon, so that retrieved items are accompanied by annotations and tags. By annotations, we mean personal statements added by a user, solely meant for his / her personal viewing. Tags, on the other hand, can be seen by everyone and allow users to browse to other resources carrying the same tag. [14] labels this feature as "collaborative tagging" and [15] introduces the notion of "induced tagging to refer to social bookmarking with two key characteristics: (1) a well-defined group of participants are knowledgeable on the available resources and the background of the user community; and (2) tagging is required as part of their regular responsibilities as a reference team.". In this light, the prospective well-defined group of participants would be students, researchers, librarians and professors from the university.

From [2], "Libraries often fail to make their resources discoverable and this may in turn affect the perceived value of the library". Following a review of the state of its search facilities, the University of Huddersfield became the first UK commercial adopter of Summon in 2009 [2]. During this process, several issues were found and a customized instance of Summon was delivered to fix the problems. The University of Huddersfield is not the only university to evaluate a new age system like Summon. The case for a single "one-stop-shop" approach to resource discovery is argued for in [20], which points out that "the variety of systems in place [at a university] were not always as interoperable" as expected. It further highlights that "federated search can be slow and in many cases the users find it complicated to use". Finally, it points out that customization of a resource discovery system should be made based on data available from "log analysis" and "usability testing". The list of universities which use a Summon based system includes Michigan's Grand Valley State University, Arizona State University, The University of Sydney, Penn State University, University of Adelaide, etc. [21, 22, 23, 24, 25].

While recent focus in the literature seems to be on evaluating the usability of new age resource discovery systems, the University of Edinburgh is keen to evaluate how extendible these new systems are, how much functionality is offered in their public APIs and how easy or difficult developers find it to develop new tools and services using them. Finally, existing search systems in academic environments do not directly support the development of extended applications. Therefore, in the light of this limitation, it is important to develop this service to be also available on a standalone basis, like Delicious [12], so that new applications can be developed based on it in future; e.g., an increasingly rich collection of records could be made available to all students, organised by relevant course, year, etc. as more and more resources get tagged in the Digital Library.
This makes use of the knowledge that tags in this application are essentially user generated content, and that the onus of maintaining the tags for resources lies with the user community as a whole. Thus, personalization, in this context, is applicable not only at a single user level, but also at a community level. In other words, a single tag like "reading-list-msc-2011" for a resource such as "informatics introduction text book" makes the resource personalized for a whole community, viz., all MSc students studying at the university in 2011.

1.7. Aims

The aims of this project are fourfold: 1) gather requirements for the application to be implemented; 2) perform a background study of the existing Summon API and other tools deemed necessary for the implementation; 3) perform design-implement-review-redesign cycles for the implementation of a service offering tagging services for library resources at two levels, viz., code/API level and user interface level; and 4) deploy the system on a publicly accessible server, gather the challenges in the development of this service and gather the results of usability testing of this service. These involved working closely, throughout the project, with several members of staff of the Digital Library office at the university, who are working in developer and usability expert roles. Therefore, this involved taking the initiative in gathering required materials, making certain semi-supervised decisions, offering available choices for the course of work to follow at weekly team meetings, and following up with the supervisor and staff to arrive at this synchronized piece of work called Tagus.

1.8. Thesis Outline

This work is organized as follows. In this chapter we introduced our work, outlining the purpose and background. In chapter 2 we present work performed to gather requirements for our project, including a background study of Summon's instance for our university. In chapter 3 we describe the architecture of the application developed, what constraints came into the picture during design and what design decisions have been made to encompass them. In chapter 4, we describe the implementation of the design, integrating the various developed components, attempts to make them cross-browser compatible and challenges faced during technical test driven software development. In chapter 5, we describe the evaluation methods used in this study and the results gathered from usability tests. The results have been presented in two categories, viz., one in the form of technical reports about using Summon's public API and other technologies and the other in the form of pilot user testing findings. Finally, in chapter 6, we summarize the results and challenges and propose some guidelines for future developers. We also present the limitations of this work and propose prospective extensions, customizations and new applications of these technologies for future work. Chapter 7 lists the references and chapter 8 is a summarized set of actual data gathered in the field and screenshots of the actual service, as deployed and in working order, for reference.

Please note that while the work itself was done using an iterative software development methodology, it is being presented section-wise in the chapters to follow.

2. Chapter 2. Initial Work

2.1. Motivation for this Chapter

In this chapter we describe the whole of the background study, involving all the work done before we started designing the system.
This also includes the requirements gathering process for our project, along with the challenges encountered during the process. Please note that this phase was carried out throughout the project, as development happened in weekly cycles. Also, kindly note that the gathering of requirements has itself been an integral part of the work description for this project, as there was no existing framework to just start working with. This was envisioned by the supervisor in the previous semester and we had accordingly planned for it. We start with a description of the information available before work on the project was started. We then present a brief study on the Summon instance for the University of Edinburgh. This background study helped build the scope of this project's outcomes. The subsequent section outlines the methods used in gathering the actual requirements. This is important in the light of the fact that the resources on hand at the start of the project were just 1) a single PHP-based API file to make calls to Summon and 2) the above mentioned Summon instance. The next couple of sections list the gathered and approved requirements, segregated into two types, viz., functional and non-functional. Finally, we present and highlight usability requirements, so as to arrive at a suitable set of usability tests for such a service, to be run at the end of its development. Since the development of such a customized service using new generation resource discovery systems is novel in academic library environments, as were the cases for the requirements and the code, there was no existing framework to base our usability tests on. In this light, it is worth mentioning the time and effort spent by the supervisor and the student over several meetings in fine-tuning and arriving at what has been presented below.

2.2. Pre-Requirements-Collection-Stage

Before beginning the study of the university's Summon instance, an agreement was made with Serials Solutions to obtain a unique API ID and Key pair. This involved the signing of Summon's terms and conditions by the supervisor. One of the key restrictions of this agreement was, in effect, not to run automated queries with their API. This was followed by studying the existing online documentation [25]. The API is broadly divided into three categories, viz., authentication, availability and search. The authentication category deals with generating headers for query requests made using the availability and search API categories. The availability category deals with generating requests and fetching results in various supported formats [XML, JSON, Streaming JSON]. Finally, the search API category provides means to query for resources present in the Digital Library, using query strings, unique resource identifiers, etc. Please note that the whole of this API category is too large to be described in its entirety and to fit within the scope of this document. Broadly, it supports extended or advanced searches based on concepts like commands, parameters, fields and tokens. The important takeaway from this section is that these advanced features help exploit the power of Summon in performing server-resource-intensive search operations and support advanced user interface functions like pagination of query results, filtering a set of results using several criteria at once to generate a subset, etc.
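To give a feel for how the authentication and search categories fit together, the listing below is a minimal sketch of signing and issuing one paginated search query from PHP. The host, path, header names, query parameter names and the HMAC-SHA1/Base64 digest recipe follow our reading of the Summon documentation [4] and should be verified against it before use; the credentials are the API ID and Key pair mentioned above.

<?php
// Minimal sketch, assuming the documented signing scheme: build an ID
// string from the request details, digest it with the API key, and send
// the digest in an Authorization header alongside the search query.
$apiId  = 'your-api-id';   // issued by Serials Solutions
$apiKey = 'your-api-key';
$host   = 'api.summon.serialssolutions.com';
$path   = '/search';
$params = array('s.q' => 'informatics', 's.ps' => '10', 's.pn' => '1'); // query words, page size, page number
ksort($params);                        // the ID string expects the parameters in sorted order
$query  = http_build_query($params);
$accept = 'application/json';
$date   = gmdate('D, d M Y H:i:s \G\M\T');
$idString = implode("\n", array($accept, $date, $host, $path, urldecode($query))) . "\n";
$digest = base64_encode(hash_hmac('sha1', $idString, $apiKey, true));
$ch = curl_init('http://' . $host . $path . '?' . $query);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Accept: ' . $accept,
    'x-summon-date: ' . $date,
    'Authorization: Summon ' . $apiId . ';' . $digest,
));
$results = json_decode(curl_exec($ch), true); // decoded JSON result set
curl_close($ch);

The page size and page number parameters in this sketch are what make the server-side pagination discussed later in this chapter possible.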
2.3. A Brief Study Of University of Edinburgh's Summon Instance

The university's Summon instance is hosted at http://ed.summon.serialssolutions.com/, on a server managed externally by Serials Solutions. The actual data displayed on searching via Summon is provided by the university. All the metadata, schema, storing and retrieval of actual data, indexing it, the search process mechanism, etc. are handled by Summon in a proprietary manner. The search interface via the browser shows two choices, viz., Basic Search and Advanced Search. The Basic Search feature is just like a modern day search interface, with a single search box and a search button provided. The Advanced Search feature provides a big form of several input boxes, one for each kind of constraint that the user wants to apply on the search, along with the words to be searched for. Using either choice, once a search has been made, the page navigates to a search results page. This results page contains a set of results and a set of options for further constraining or filtering the already displayed search results. Each result represents one resource in the Digital Library collection. For each result, several pieces of information are presented in the form of attribute=value pairs. Such displayed pairs do not represent all possible pairs. On hovering over a "full view" icon, all retrievable information is displayed in a mini popup pane. The results themselves are split over several pages [with page numbers starting from 1 up to n, where n is sufficient to accommodate all the results of the output of the search]. Clicking each page number, in turn, fetches the corresponding segment of the search results. While the results are displayed in a vertical list in the middle of the browser screen, the constraining options are displayed unobtrusively on the left side. Each constraining option is called a facet. Changing the value of a facet at any time applies the corresponding constraint on the displayed search results and redisplays the results, using a new pagination of the freshly generated results list. Where applicable and available, an icon associated with the resource, like a snapshot of the cover of a book, is displayed. A user can bookmark several resources by clicking the appropriate search result's "save item" button. This list is saved only for the current session of a non-logged-in user. The saved set of items can be accessed at the bottom of the screen and printed. Owing to a lack of permission from Summon, its screenshots are not being provided in this document. However, it is gently suggested to the reader that a visit to the above URL [to directly experience Summon's capabilities exposed via their user interface] would be helpful for relating to the details described in this document.

2.4. Requirements Collection

The primary requirements for the development of Tagus have been collected, under the supervision of Mr. Colin Watt, over a period of 6 weeks from June 01, 2011 to July 14, 2011. During this period, while "Del.icio.us" provided the base idea for the interaction model of this system, the other models, viz., storage, user interface, portability and authentication for this software, were extracted from various sources. These have been listed exhaustively below:

Table 2.1. Table of sources
1. Regular weekly meetings with Mr. Colin Watt
2. Specific usability expert review meetings with Ms. Angela Laurins
3. Previous communication sequences with Ms. Morag Watson, Mr. Simon Bains
4. Weekly developer to developer meetings with Ms. Ianthe Hind, Ms. Claire Knowles
5. Incremental iterative development and corresponding review, taking in other parallel projects [both running and deployed] of the IS section of the Digital Library, University of Edinburgh.

The next couple of sections list the requirements themselves in detail, followed by the evaluation plan derivable directly from these requirements. Kindly note that any constraints and special conditions have also been duly noted in the appropriate sections. Also, under non-functional requirements, a specific subsection has been devoted to usability requirements, in accordance with the importance the Digital Library gives to usability. Together, the following sections give an estimate of the size of the project, spanning the organization, management, development, deployment and testing areas.

2.5. Functional Requirements

For each functional requirement, its classification, its origin and a justification of the purpose it achieves have been given along with the description.

Table 2.2. Table of Functional requirements

Legend
SST: Supervisor, Student [Team]
DXT: Student, Digital Library Developers [Team]
UXT: Student, Usability Expert [Team]

1. Users should be able to login and logout of the service using individual login and password combinations.
Classification: authentication
Origin: SST
Justification: The service should be protected with an authentication procedure. This is to prevent anonymous usage of the service.

2. Users should be able to search for resources using plain words, starting from a single search box accepting input. This search functionality should perform similarly to the University of Edinburgh's Summon instance. It should display the list of all resources as returned by Summon.
Classification: summon's public api
Origin: SST
Clarification: Replicating the entire user interface, professionally developed over several years by full time employees of Serials Solutions, including support for advanced features like all facets, commands, filters, etc., is counter-productive in two ways.
Justification: Firstly, this project, Tagus, is about the concept of applying tagging in an academic library environment and not about redeveloping a new age library resource discovery system. Secondly, replicating all the user interface features would itself take several months to years and would be beyond the scope of the duration of this project.

3. A user should be able to add a personal annotation to a resource.

4. A personal annotation for a particular resource is displayed every time that resource comes up in the user's future searches. The annotation for the resource is visible only to the user who created it.

5. A user should be able to delete an existing personal annotation on a resource.
Classification: tagging
Origin: SST
Justification: Personal annotations should be visible only to the user who creates them. This allows for incorporating personalization with privacy in place for the user.

6. A user should be able to add a public tag to a resource.

7. A public tag for a particular resource is displayed every time that resource comes up in any user's future searches. The tag for the resource is visible to all users of the service.

8. A public tag, whenever and wherever it is displayed next to a particular resource, should be clickable. On clicking, a new search functionality should display the list of all the resources with the same public tag.
9. A user should be able to delete an existing public tag on a resource, provided he / she created that public tag for that resource.
Classification: tagging
Origin: SST
Justification: Public tags should be visible to all users of the service. This extends personalization to the community level and provides a new means of findability.

10. The data returned from Summon's API for each library resource should be displayed in the user interface.
Classification: user interface navigation
Origin: UXT
Clarification: One aim for Tagus is to have less clutter in the user interface while displaying results.

Beyond The Proposed Requirements

The following requirements have been gathered after the fifth and the sixth weekly cycles of iterative software development of Tagus and have been implemented, going well beyond the planned requirements set. These required additional work and have been successfully implemented.

11. A user should be able to view a list of all the public tags that he / she has created to date. This functionality would be referred to as the tag cloud.
Classification: user interface navigation
Origin: Student, Approved By UXT
Justification: Multiple routes to start a user's search for library resources using tags would help achieve user satisfaction.

12. A user should be able to search for resources marked with a particular public tag, by supplying the tag as input to a search box.
Classification: user interface navigation
Origin: Student, Approved By DXT
Justification: Starting with a search using plain words would result in a list of resources being displayed. Next to each resource, a list of public tags would be displayed. A user could click on any of these tags and this would result in a new search for all resources marked by the same tag. While this functionality definitely suffices for the originally proposed requirements, a use case has been visualized in which a user could want to generate a new search using a tag word which is not in the list of results.

13. Every time a search functionality gets executed, the results of the search should be displayed in a fixed format, split across several pages, showing a limited number of resources per page.
Classification: usability guidelines, user interface navigation
Origin: Student, Approved By DXT, UXT, SST
Justification: Summon's API supports server-side pagination. As a result, the user interface rendering logic and the usability of the UI are simplified by rendering a fixed number of results per page and displaying a list of pages. This is in line with the usability guidelines of modern day search engine user interfaces.
Clarification: The customized tagging API, which supports both standalone queries and integration with Summon's public API, should support a similar pagination feature. In case such a simulation is not feasible on the server, the client side should take care of the UI rendering in such a way that the user is oblivious to how the displayed results were produced. In other words, the search results should always be paginated and well segmented, irrespective of the method of searching for a resource, viz., search Summon using words, search public tags using words or search by clicking on an existing public tag.

14. Once a search functionality has been performed, a user should be able to export the top results of the search as a downloadable document file, with the results in it formatted in a fixed manner.
Classification: extra feature exportability
Origin: Student, Approved By SST
Justification: This functionality was originally reserved as a requirement of a prospective new tool, which would use the Tagus Standalone API [i.e., independent of Summon] to generate a list of resource identifiers of digital library resources. Being able to develop such a new tool would provide solid proof-of-concept evidence that the tagging service is capable of performing well even without Summon.
Clarification: Instead of waiting for another developer to use the newly developed Tagus Standalone API, this was suggested by SST for extra credit and has now become an extended feature available to users. However, for completeness, since just an exported list of resource ids would not be really usable for a user, the Summon API is invoked to generate a complete description of each resource in an exported list. In other words, exporting a page of displayed results generates a downloadable file for the user consisting of information fetched from Summon.

15. There should be a standalone API to this service, running as an independent service.
Classification: standalone api
Origin: Student, Approved By SST
Clarification: The tagging part of the service should be independent of the methods and technologies used to fetch data from Summon.
Justification: The tagging API should also work standalone as a subservice.

The next section lists the non-functional requirements, which were derived by analyzing the functional requirements.

2.6. Non-functional Requirements And Further Discussion

As referred to in the introduction section, resource discovery in an academic library environment is characterized by the following factors:
1. Speed
2. Availability
3. Findability
4. Personalization as a means of relevancy.

As discussed above, Summon claims to achieve the speed factor with its multitiered indexing approaches [5]. Over the course of this project, this has been verified to be true. Summon is very fast in its response times when responding to user queries via its public API. For future developers who would like to further increase the speed of Summon, please note that since it is hosted by Serials Solutions at their vendor site, improving speed is not achievable without access to the source. Availability of resources in the Digital Library is outside the scope of this project, as it involves the procurement and maintenance of resources and/or their required licenses on a periodical basis in the library.

The primary means of finding a resource in Summon is via the "single search box", further aided by a set of mechanisms for fine-tuning the results [5, 6]. There is no direct means of finding other related results from each output in the result set. Similarly, with the current system, there is no means of adding personalized relevant information, like a user's comments [relevant to only that user], to search results. This project provides a way of personalization to end users by introducing the concept of tagging in an academic library environment and exploits this personalization to provide a new means of findability. This project is called Tagus, and it achieves both these factors through the concept of community-owned-collective-responsibility tagging. This is elaborated below.

A tag, within the scope of this project, is defined as a word or a phrase describing a digital resource. A user tags a resource with one or more tag-words or tag-phrases, which would be stored in the system. As an illustration, the record stored for one such tagging action might look like the sketch below.
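The following is a minimal sketch of such a record, assuming an illustrative schema; the field names are not the actual Tagus schema, and the resource identifier stands in for whatever unique identifier Summon returns for the record.

<?php
// Illustrative tag record: one user attaching one tag-phrase to one
// library resource. 'type' distinguishes public tags from private
// annotations; all field names here are assumptions for illustration.
$tagRecord = array(
    'resource_id' => 'summon-record-id-12345', // unique identifier from Summon (hypothetical value)
    'user'        => 's1050984',
    'type'        => 'public',                 // or 'private' for a personal annotation
    'value'       => 'reading-list-msc-2011',  // the tag-word or tag-phrase
    'created'     => '2011-08-01T10:15:00Z',
);
echo json_encode($tagRecord); // JSON is the exchange format used by both Summon and ElasticSearch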
If a tagged resource is output as the result of a future search, the tags are displayed along with each resource's regular information [6]. The user could use a tag to find new resources, e.g., typically by clicking a tag, which results in the system outputting the set of resources tagged [previously by users] with the same tag-word or tag-phrase. As indicated in [13], "applications that allow users to add personal metadata, notes, and keywords (simple labels or "tags") to help manage, navigate, and share their personal collections help to improve digital libraries in three main ways: personalization, socialization, and integration". This project extends the scope of tagging [as popularized by Delicious [12]] by including the concepts of private/personal and public tags. A private/personal tagset for a particular resource by a particular user would be visible only to that user in future searches. This method of including privacy provides a means of annotating, to the user. The public tags, on the other hand, would be globally visible across users and across resources. The significance of tagging has been highlighted clearly in [13], viz., "The ability to share data and metadata in this way is becoming increasingly important as more and more science is done by larger and more distributed teams rather than by individuals. Such social bookmarking is already available on the Web site of publications such as the Proceedings of the National Academy of Sciences and the journals published by Oxford University Press.".

2.6.1. Implications and Constraints while choosing technologies for interaction with Summon

The requirements for tagging imply the storage of tags and their retrieval. Storage typically requires a database, and fast retrieval requires an index on the data in the database. Also, "Serials Solutions, the developer of Summon, claims to have over 500,000,000 items indexed from over 94,000 journal and periodical titles." [5]. Therefore, availability could be rated as "huge" in terms of projected data size, as potentially, the proposed application for tagging needs to deal with tags for such numbers of digital library resources. Since Summon's API is restricted to providing only read-based queries [4] to its own index, i.e., custom data cannot be written into Summon's database, an external database of tags is required. This follows from existing literature, e.g., from [13], "Like Alexandria, most digital libraries are currently read-only, allowing users to search and browse information, but not to write new information nor add personal knowledge". Also, since an instance of Summon is itself a vendor hosted solution [6], this database needs to be based on a fast system at the University of Edinburgh, external [w.r.t. Summon's remote hosting location] to it. Finally, tagging is an incremental activity w.r.t. users and searches. As users adopt Summon and search, more search results are tagged over time. This incremental nature should be considered as an important factor for deciding on the indexing technology to be used. Apache Lucene [9] provides the best open source indexing solution, with its own internal means of maintaining data. Though this was considered initially, ElasticSearch [7], an open source wrapper around Apache Lucene, has been chosen because it meets the "incremental nature" factor described above:

1. ElasticSearch is "fast", due to its being based on Apache Lucene
2. it provides searching with "free search schema" indexing of data
3. it uses "JSON over HTTP", also used by Summon and its API
4. it allows scaling by "starting with one machine and scale to hundreds" [7].
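The third point is what makes the tag store easy to drive from PHP: documents go in and queries come back purely as JSON over HTTP. The listing below is a hedged sketch that indexes one tag document and then searches it back from a local ElasticSearch instance; the index name, type name and document fields are illustrative assumptions, not the actual Tagus schema.

<?php
// Sketch: writing a tag document into ElasticSearch and querying it
// back, purely over JSON/HTTP. Assumes an instance on localhost:9200;
// 'tagus' / 'tag' and the fields are illustrative names.
$doc = json_encode(array('resource_id' => 'summon-record-id-12345',
                         'user'        => 's1050984',
                         'value'       => 'reading-list-msc-2011'));
$ch = curl_init('http://localhost:9200/tagus/tag/1');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'PUT'); // create/replace the document with id 1
curl_setopt($ch, CURLOPT_POSTFIELDS, $doc);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);
curl_close($ch);

// Find every resource carrying a given tag. With the default analyzer
// this matches on the tag's tokens; for exact whole-tag matching the
// 'value' field would be mapped as not_analyzed.
$query = json_encode(array('query' => array(
    'query_string' => array('query' => 'value:"reading-list-msc-2011"'))));
$ch = curl_init('http://localhost:9200/tagus/tag/_search');
curl_setopt($ch, CURLOPT_POSTFIELDS, $query); // sent as the POST body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$hits = json_decode(curl_exec($ch), true);
curl_close($ch);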
ElasticSearch provides means of configuring an index over a distributed set of machines, whose number could grow as the size of the tagged data grows. Summon itself is known to rely on Apache Lucene [9] at its core; thus it is reliable [5, 6]. An alternative has been identified in Apache Solr [8], also based on Lucene; but due to its additional technological dependencies on Java servlets and container software like Tomcat, it would serve as a backup, in case ElasticSearch's instances don't scale well enough in future.

2.6.2. Constraints due to standalone mode

As the tagging API should be available as a standalone service, the choice of technologies should be similar to what Serials Solutions itself uses to maintain Summon. In other words, the indexing component should be 1) dynamic [i.e., non-statically invoked] and 2) high performance, in terms of uptime. This implies that even if Summon's servers handling calls via its own public API are not accessible, a developer wanting to use the tagging API should be able to do so without waiting for Summon's servers to become accessible. Finally, please note that though this was meant as a pilot project, due to these constraints, choosing technologies such as ElasticSearch / Lucene / Solr provides a solid technology framework, making the standalone API very robust. Thus, the quality of the technology stack has been very high throughout the project.

2.6.3. Usability requirements

In this section, we look at the concept of usability as applicable to library resource discovery systems. We start with existing definitions of usability, and proceed to list the available options for usability criteria while justifying our choices. Also, in planning to evaluate the system after development, we list the types of evaluation possible and arrive at the reason why certain kinds of evaluation suit this project better than others.

The Digital Library stresses the importance of usability testing for every product / system it creates / deploys. This is true even for the existing resource discovery systems at the university, like Searcher and the catalogue search facility. In some cases, usability professionals are engaged in the process. For example, UserVision [http://www.uservision.co.uk] was employed to perform an exhaustive study on usability problems of the existing Searcher system. Due to a lack of permission, the results of this study are not being referenced in this document. However, the study is available on request [from the Digital Library] and highlights certain measures of usability while making suggestions to improve it.

2.6.3.1. Definitions of Usability, Kinds of Usability Criteria Available

The definition of usability varies according to the subject under investigation and is often adapted to fit the context of the problem under investigation. However, in the area of Human-Computer-Interaction [HCI], some criteria are broadly accepted for what could be viewed as usability.

ISO 9241 Part 11 [28]
ISO 9241 Part 11 defines usability as below [28]: "The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use".

Please note that the course website for HCI at the University of Edinburgh [35] also describes usability in the following manner, quoting the list of heuristics from Jakob Nielsen and the principles from the book authored by Alan Dix et al.

Nielsen's 10 Usability Heuristics [29]
These heuristics are listed below [29]:
"Visibility of system status
User control and freedom
Error prevention
Flexibility and efficiency of use
Help and documentation
Match between system and the real world
Consistency and standards
Recognition rather than recall
Aesthetic and minimalist design
Help users recognize, diagnose and recover from errors".

Usability Principles from HCI Textbook [30]
These are listed below [30]:
"Learnability
Flexibility
Robustness".

2.6.3.2. Choices Made From Available Usability Criteria

Within the scope of this project, we adopt the following definition of usability, while keeping it in line with the above standards [please note that, wherever applicable, keywords have been underlined in the original to indicate their association with the keywords in the above three standard descriptions of usability]:

Table 2.3. Table of Usability criteria chosen for Tagus with justification
Usability for Tagus comprises the following criteria:
1. speed of service, interaction — chosen to represent efficiency of implementation and the power of the underlying technology stack in bringing a fast user experience
2. ease of use, being intuitive to new users — chosen to represent user control and freedom, recognition rather than recall, standards
Please note that the course website for HCI at the University of Edinburgh [35] also describes usability in the following manner, quoting lists of heuristics from Jakob Nielsen and principles from the book authored by Alan Dix et al. Nielsen’s 10 Usability Heuristics [29] These heuristics are listed below [29]: “Visibility of system status User control and freedom Error prevention Flexibility and efficiency of use Help and documentation Match between system and the real world Consistency and standards Recognition rather than recall Aesthetic and minimalist design Help users recognize, diagnose and recover from errors”. Usability Principles from HCI Textbook [30] These are listed below [30]: “Learnability Flexibility 2.6.3.2. Robustness”. Choices Made From Available Usability Criteria Within the scope of this project, we adopt this definition of usability, while keeping it in line with the above standards: [please note that wherever applicable, keywords have been underlined to indicate their association with the keywords in the above three standard descriptions of usability]: Table 2.3. Table of Usability criteria chosen for Tagus with justification Usability for Tagus, comprises of the following criteria 1. speed of service, interaction has been chosen to represent efficiency of implementation and the power of underlying technology stack in bringing a fast user experience 2. ease of use, being intuitive to new users has been chosen to represent user control and freedom, recognition 35 rather than recall, standards 3. ease of learning has been chosen to represent visibility of system status, consistency, recovery from errors, learnability 4. findability of what a user is intending to find has been chosen to represent consistency and standards, effectiveness 5. minimal UI has been chosen to represent learnability, ease in repeating an already learnt task. These criteria are themselves, in line with the non-functional requirements described in the previous section. In this subsection, we have seen why certain criteria were chosen to represent usability for this project. While criteria provide a strong theoretical framework, they need to be planned for and implemented / measured in evaluation methods to arrive at a measure of usability of a system. In the next two subsections, we will look at two kinds of evaluation of approaches and see why certain criteria fit one evaluation approach and don’t fit the other. 2.6.3.3. Quantitative Evaluation, What Criteria Fit Into Quantitative Evaluation? Quantitative evaluation is an approach to evaluation of criteria in which the primary mode of measurements is using numbers and calculations, often directly obtained from the system under observation. From the above table, speed of service, interaction is a candidate for measurements using quantitative evaluation. This is because, the end system could be timed and durations of each activity / event / request-response-cycle / start-to-end of a user action could be obtained with a direct measurement. Table 2.4. Criteria qualifying for quantitative evaluation Criteria [Abstract concept] Transformation into evaluation [Concrete, measurable entity] speed of service, interaction speed of service in retrieving data and in displaying data 36 The rest of the criteria, viz., ease of use, ease of learning, findability and minimal UI are not really measurable by a directly observing the system and noting down numbers. Rather, each of them indicates that a feedback from the user is needed to measure them. 
2.6.3.4. Qualitative Evaluation, What Criteria Fit Into Qualitative Evaluation?

Qualitative evaluation is an approach to the evaluation of criteria in which the primary mode of measurement is answers to questions aimed specifically at a task at hand. These questions may be put to the user in the form of choosing one among many categories, user ratings, interpretation of answers using further questions, etc. The Digital Library already follows a set of guidelines in carrying out such evaluations for systems developed or used in the university. On discussing with our Usability Expert, Ms. Angela Laurins, the following criteria have been found to fit well with a qualitative approach to evaluation, thus lending further strength to our decision to apply qualitative evaluation.

Table 2.5. Criteria qualifying for qualitative evaluation
Criteria [Abstract concept] — Transformation into evaluation [Concrete, measurable entity]
1. ease of use, being intuitive to new users — Ratings for intuitiveness and ease of use of the UI
2. ease of learning — Ratings from users, and as measured from observation of user activity between first and second time usage
3. findability of what a user is intending to find — Ratings for the findability of a resource that a user intends to find in the library
4. minimal UI — Ratings for ease of creating reading lists. Though the system, and hence its UI, are designed to be used to find resources, this will help determine whether a user finds the UI small enough to quickly navigate it and find out how to perform secondary actions.

2.6.3.5. Why Not Other Criteria? Limitations, Constraints, Overlaps

Though we have arrived at a concrete method of evaluating each usability criterion, and we have justified these decisions / methods of obtaining measurements, we still have to justify why some other criteria were not enlisted in our scope of usability. Please note that, while they have been taken into account while developing the system, they are not being specifically evaluated, for the following reasons:

The primary reason is the original scope of the project: a pilot tool, a set of test data and a pilot usability testing report. The secondary reason is the time constraint of the project: work has been done full time every day throughout the duration of the project, including weekends. However, this project did not have a pre-existing, working piece of software as a starting point. At the beginning, it was not even known whether the development of such a tool was feasible in practice, starting with just one resource, the Summon API. The third reason is that carrying out a full end-to-end usability evaluation requires the full time work of an expert professional and often spans months of effort [known from inputs from the Digital Library Team about prior experience of working with similar systems at the Digital Library]. The first aim of this project is to study the feasibility of developing such a novel tool in academic library environments; the second aim is to proceed to develop the tool, if found feasible. Thus the limited time is a critical constraint in this study. Finally, the following text gives specific reasons why certain criteria were not taken into consideration:
1. Match between system and real world has not been included as a separate criterion, because it overlaps with intuitiveness and ease of learning within the scope of Tagus
2. Flexibility has not been included as a separate criterion, because it overlaps with user control and freedom within the scope of Tagus
3. Robustness has not been included, because it does not fit the concept of a pilot system. Robustness is ideally measured once a system has been deployed for real or simulated use and its performance has been monitored over a reasonably long period of time. Similarly, within the scope of Tagus, usability testing is really not about measuring how robust the system is to hacks, exceptions, denial of service attacks and the like.

2.7. Evaluation Plan
An evaluation plan was formulated for Tagus under the supervision of our usability expert, after three sessions of reviews, to be executed once its requirements had been implemented and preliminary testing activities had been completed. The complete set of documents, viz., the usability testing script and its plan [sheet] for each user [data from each test participant has been included in the repository], have been included in the appendix.

2.7.1. Constraints on automated testing
We conclude this chapter with a small note on why automated testing was not employed for Tagus. Automated generation of test data, and automated testing, would have been a well-suited method for this system, since there were no data and no users to start with. Automated queries to Summon could have helped create dummy tags with simple scripts exploiting the API nature of the new system. However, the terms and conditions signed with Summon actively prohibit such automated query generation and retrieval of metadata / data. This is understandable from the vendor's perspective: the server is externally hosted and maintained by them, and they would like to prevent unnecessary load and prohibit any prospective abuse of their system.

2.8. Guidelines to future developers
Requirements collection forms an important phase for novel pilot projects in academic library environments, especially when there are no existing systems with similar functionality, as in the case of Tagus. The guidelines for future developers from this section are to:
1. Start early
2. Revise, question the eligibility of each requirement, justify
3. Evaluate feasibility of implementation w.r.t. time and resources
4. Finalize the scope of each requirement, clearly declaring any assumptions, in a manner similar to the approach [described above for Tagus].

3. Chapter 3. Design
3.1. Motivation for this Chapter
In the previous chapters, we have described the background of this project, reviewed existing literature about new age library systems and the purpose of developing this service, and elaborated why and how requirements were collected, along with the justification for each decision made and approved. In this chapter, we describe the overall architecture of the system [top level view], followed by a list of designed structures [mid level view]. The source code for the system itself has been delivered separately in the Digital Library's SVN repository [low level view]. Please note that, where appropriate, design decisions have been justified with detailed descriptions of the constraints leading to those decisions. We wind up this chapter with a section dedicated to the design of the user interface of Tagus and, once again, a list of reasons for why some UI elements were chosen over others.
3.2. Architecture
The architecture diagram for Tagus is given below. Primarily, it can be viewed as multiple sets of servers and clients working together to provide a service to the end user. As described earlier, the university's Summon instance is hosted by Serials Solutions on an external vendor server; it is a black box for the developers of systems like Tagus. Referring to the diagram, everything below this black box is under the purview of Tagus. There are three broad components within this architecture, labeled in the diagram as 1, 2 and 3:
1. WEB SERVER
2. INDEXING SERVER
3. BROWSER / CLIENT.
Please note that 1 and 2 do not communicate directly with each other. This is a central aspect of the architecture and design of the components of Tagus. The roles of all three components are described below.

Figure 3.1. Architecture Of Tagus [diagram: the external server hosted by the vendor, a black box to the rest of the system, is reached only through the Summon public API via synchronous calls from the web server (1), which hosts the dynamic resources: the Summon interactor (Summon searcher, results fetcher and results encoder), the Tagus API, the list generator and authentication. The indexing server (2) hosts the Indexer-Tagus-Notation-Convertor with the tags adder, tags deleter, tags searcher and tags fetcher. The browser (3) loads external browser libraries, styles and images, and runs a main thread, a Tagus thread (tags fetcher, adder, searcher, deleter and renderer) and a Summon thread (encoders, decoders, utilities, session verifier, mini-parsers, events registration, events controller and handlers, Summon results decoder and renderer); the browser's calls to 1 and 2 are asynchronous.]

Further Discussion On Components Of The Architecture And Design Decisions

1. THE WEB SERVER
The web server hosts all the server components and applications needed for delivering data to the browser. How it achieves each functionality expected by the browser is private to itself; the only constraint on the web server is that the data exchange format between itself and the browser component is fixed. It acts as a layer between the vendor's external server and the user's browser, which it achieves via the public API provided by Summon. The communication between the external server and the web server is synchronous. Similarly, it supports certain functionalities, like the generation of user lists, once requested by the browser. A browser by itself cannot generate a format other than the markup language it supports; hence the decision was made to move the functionality to create lists in any required format to the web server. Finally, this component is also responsible for creating, maintaining, verifying and destroying user sessions, which is standard practice in user authentication techniques. The communication between the web server and the user's browser is asynchronous. This decision was made in the light of recent developments in browser technologies: technologies like AJAX [Asynchronous Javascript And Xml] enable developers to provide desktop-like interactivity and speed to end users. Thus, this component is highly cohesive in its functionality and is deployable [please refer to the deployment section below] independently of the other two components.
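To make the asynchronous browser-to-web-server communication above concrete, the following is a minimal sketch [in Javascript, using jQuery, which Tagus already loads] of how a client-side module might request a generated reading list from the web server without blocking the UI. The endpoint name listgenerator.php and the parameter and field names are hypothetical placeholders, not the actual Tagus URLs.

    // Minimal sketch: an asynchronous (AJAX) request from the browser to the
    // web server component. Endpoint and parameter names are hypothetical.
    function requestReadingList(tagName, onReady) {
        $.ajax({
            url: '/tagus/listgenerator.php',   // hypothetical web server endpoint
            type: 'POST',
            data: { tag: tagName, format: 'pdf' },
            dataType: 'json',
            // The browser is not blocked while the server works; this callback
            // fires once the list file has been generated.
            success: function (response) { onReady(response.downloadUrl); },
            error: function () { alert('List generation failed.'); }
        });
    }

Because the call is asynchronous, the user can continue searching while the server prepares the file; this is the desktop-like interactivity referred to above.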
2. THE INDEXING SERVER
The indexing server hosts all the data required for providing tag-related functionalities to any client that requests them. This includes fetching the tags for each library resource, as well as tags specific to a particular user, etc. This component has two subcomponents, the data maintainer and the data indexer. Owing to the high amount of study and reasoning put in during the proposal stage and the background work for this dissertation, the choice of technologies for implementing this "indexing server" component had already been reduced to a small set of three, viz., Apache's Solr, Apache's Lucene and Elastic Search. After a further refinement of the requirements, and looking forward to the scalability and maintainability of this service in future, Elastic Search scored over the other two in terms of its support for distributed systems. If, in future, the data serving load needs to be split across multiple servers while the clients requesting the data need not change their methods for issuing requests, then Elastic Search is a perfect choice for the situation: it provides means of "scaling from one machine to hundreds" [7].

Please note that a straightforward method for applications such as Tagus would be to split this component into two parts, viz., a database holding tags and related data, and a separate indexer which runs periodically, accumulating the changes and reindexing them. This method requires
a) a schema for the database
b) a separate server for the database
c) a separate indexer
d) an interoperability module for triggering the indexer, either on every change or once every day, etc.
e) a database administrator ready to monitor the system
f) a high maintenance cost, for example if, in future, one wants to add more data to a particular table with a fixed schema
g) database system upgrades, downtime, other management, etc.
h) a mapping between the user authentication system of Tagus and the database authentication system
i) finally, a mapping of user requests into SQL queries, and the usage of libraries to perform data format conversions between the formats supported by a database system and a browser.

Elastic Search solves a lot of the problems arising out of these activities and reduces maintenance costs by doing the following:
a) it itself stores the data
b) it performs an indexing activity for every change
c) the data to be stored and indexed is itself schemaless, that is, there is no fixed set of fields to be defined before starting to store data
d) it is fast, as per our experience in developing and testing Tagus
e) it [optionally] auto-generates unique identifiers for every block of data inserted
f) it provides an HTTP REST [26] interface, which implies that a developer who knows what an HTTP URL looks like and knows JSON [27] can easily turn complex requests into easy calls
g) we can utilize the authentication system already in Tagus for indexing
h) finally, JSON is supported by all modern browsers, which implies that there is no need for format convertors for Elastic Search to understand a browser's request.

Finally, since an indexer technology by itself [Elastic Search, in this case] is unaware of what a "tag" means and of how to interpret custom requests for retrieving data from its indexes, a subcomponent has been created specifically for this purpose, called the "Indexer-Tagus-Notation-Convertor". Just like the communication between the web server and the browser, the communication between the indexing server and the browser is also asynchronous; once again, this decision was made to provide a fast experience to the user. Thus, this component is highly cohesive in its functionality and is deployable [please refer to the deployment section below] independently of the other two components.
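As an illustration of points f) and h) above, the following minimal sketch shows a browser talking directly to an Elastic Search instance with jQuery: a schemaless JSON document is stored with one POST, and read back with one GET. The host, port and index names follow the Tagus notation described in chapter 4, but the function bodies here are simplified sketches, not the actual Tagus code.

    // Minimal sketch: storing and fetching a tag document directly against
    // Elastic Search over HTTP REST. Host, port and index names are assumed.
    var ES_BASE = 'http://localhost:9200';

    // Store a tag: no table or schema has to be defined beforehand, and
    // POSTing (rather than PUTting) lets ES auto-generate the unique id.
    function addTag(user, resourceId, tag, done) {
        $.ajax({
            url: ES_BASE + '/tagus_public/' + user,
            type: 'POST',
            data: JSON.stringify({ resource: resourceId, tag: tag }),
            success: done
        });
    }

    // Fetch documents indexed under this user (first page of hits); each
    // hit carries the stored JSON in its _source attribute.
    function getTags(user, done) {
        $.getJSON(ES_BASE + '/tagus_public/' + user + '/_search',
            function (result) { done(result.hits.hits); });
    }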
3. THE BROWSER
The browser component is where the majority of the control of the application takes place. In other words, every request starts with a user action in the browser; similarly, every request, after completing its several request-response cycles, finally culminates in an update to the UI in the browser. Thus, this component is highly cohesive in its functionality, and it is the only component which is coupled [loosely] with both of the other two components. Please note, in summary, that 1 communicates with 3 and 2 communicates with 3; 3 orchestrates the whole application and coordinates user events, actions and requests to be asynchronously mapped to either 1's server functionality, or 2's server functionality, or both. Finally, all the above design decisions make the components very loosely coupled. This makes the whole architecture extensible and maintainable in future; an example of such an extension is described in the future work section of chapter 6.

3.3. Design Of Core Services
We define "Core Services" for Tagus as the set of functionalities providing everything except the user interface modules, including

Table 3.1. Table of core services
1. the ability to fetch information on library resources from Summon
2. the ability to add or delete a tag / annotation
3. the ability to search the local data for listing resources marked by a certain tag
4. the ability to generate lists from results pages
5. the coordinating capability for utilizing 1, 2, 3 and 4 to drive a request.

These modules are split across the three components of the architecture described above. Thus, functionality wise, the view of the whole system can be pictured as below.

Figure 3.2. Modules of Tagus [diagram: in the browser, a combined coordination capability loads external libraries and configuration and issues requests, with authentication, to the combined coordination capability on the servers, which in turn invokes the summon interactor, the tagger and the list generator across the web server and the indexing server.]

3.4. Design Of Standalone API
From the table of core services, the subset of services provided by the standalone API is given by 2, 3 and 4: the ability to add or delete a tag / annotation, the ability to search the local data for listing resources marked by a certain tag, and the ability to generate lists from results pages. Since the modules for the core services were designed with low coupling, with each component invoked only when necessary, no separate design for the standalone API was necessary. Since these services represent only a subset of the above table, the view of the standalone version of the system becomes a subset of the above diagram.

Figure 3.3. Modules of Tagus Standalone API [diagram: a browser / API invoker with a combined coordination capability invokes the tagger and the list generator on the indexing server.]

3.5. Final Class Diagrams
1. PHP Classes

Summon [public API of Summon, external class]
Fields: $debug, $client, $host, $apiKey, $apiId, $sessionId
Methods: Summon(), getRecord($id), query($query, $filterList, $start, $limit, $sortBy, $facets), call($params, $service, $method, $raw) [private], _process($result), hmacsha1($key, $data)

SummonService [proxy class to the above class, carrying the University of Edinburgh's unique id and key]
Methods: SummonService(), searchFor($queryString, $pageNumber), getResourceFromId($resourceId), fetchResults(), getResults()

DataServiceWithApiKey [interfacing class which intercepts requests and issues responses]
Methods: Main()

Utilities [utilities class providing constants and common functions]
Methods: getCurrentURL(), getSessionManagerURL(), getUserHomePageURL(), getLoginPageURL(), getSummonServiceURL()

fpdf / PDF [external class from FPDF.ORG [34] to generate PDF files; adapted from an example class provided in [34]]
Methods: LoadData($array), BasicTable($header, $data), ImprovedTable($header, $data), FancyTable($header, $data)

ListGenerator [wrapper class to the above class, which abstracts out a lot of low level functionality; interfacing class which intercepts requests and issues responses]
Methods: Main()

AuthenticationDatabaseConnection [connects to a database and verifies the presence of a given username and password in it]
Methods: AuthenticationDatabaseConnection(), VerifyUsernameAndPassword($u, $hashofp)

AuthenticationService [a simple authentication service class which can be plugged into any kind of authentication mechanism; here we plug it in to work with the above database based authentication verification class]
Methods: AuthenticationService()

SessionManager [interfacing class which intercepts requests and issues responses]
Methods: SessionManager(), tryToCreateOrMaintainSession() [private], setSessionSpecificValue($value) [private], getSessionSpecificValue() [private], checkSession() [private], login($un, $pwd) [private], logout() [private]

Relationships [as in the figure]: AuthenticationDatabaseConnection is associated with AuthenticationService, Summon with SummonService, SummonService with DataServiceWithApiKey, AuthenticationService with SessionManager, SessionManager with Utilities, fpdf with PDF, and PDF with ListGenerator; the associations are mostly one-to-one, with SessionManager and ListGenerator on the many side of their associations.

Figure 3.4. Complete Class Diagram Of Tagus' Server
2. Javascript Files [Treated Equivalent To Classes]

Session
Methods: DecideIfLoginIsPossibleAndAct(data), DecideIfSessionIsStillValidAndAct(data), DoLogoutAndAct(data), SessionVerifier(), EnableLoginForElements(), EnableLogoutForElement()

SummonFieldsConfiguration
Methods: OpenUrlRenderFunction(dataArray), DefaultRenderFunction(dataArray), RenderSingleSummonResourceAsSummary(doc, element), RenderSingleSummonResourceAsFull(event, doc)

Main
Methods: main()

Summon
Methods: RenderJSONAsVerticalList(element, pageNumber, json), RenderResultsOfPage(resultsElement, query, pageNumber, dataToDisplay), PaginateAndRender(data, pagesElement, resultsElement, query), EnableSummonSearchForElement()

SummonRendererBasedOnIds
Methods: RenderJSONAsVerticalListBasedOnIds(element, pageNumber, json), RenderResultsOfPageBasedOnIds(resultsElement, pageNumber), PaginateBasedOnIds(arrayOfResourceIds, pagesElement, resultsElement), GetResourcesBasedOnIds(arrayOfResourceIds)

TaggerCore
Methods: ValidateAccessibility(accessibility), ValidateUser(user), ValidateResource(resource), ValidateTag(tag), AddTag(accessibility, user, resource, tag, callback), DeleteTag(accessibility, user, resourcetag_combo_id, callback), GetTags(accessibility, user, resource, elementToLoadTagsIn), GetAllTagsForUser(accessibility, user, elementToLoadTagsIn), SearchForResourcesMarkedWithPublicTag(tag, callback), SearchForResourcesMarkedByUser(accessibility, user), ListAllUsersResourcesAndTags(accessibility)

TaggerRenderer
Methods: RenderSingleTagWithoutDeleteTagButton(tagName, sizeToRenderIn, countOfTag, accessibility, listElementToRenderIn), RenderSingleTagAlongWithDeleteTagButton(tagName, tagId, accessibility, userName, listElementToRenderIn), RenderTagsAsList(setOfTags, accessibility, userName, resourceId, listElementToRenderIn), RenderAddTagButton(accessibility, userName, resourceId, elementToRenderIn, associatedTagsListElement), GetAndRenderAllTypesOfTags(element, resourceId), EnableTagusSearchForElement(), EnablePublicTagsForElement()

JqueryExtensions
Methods: main()

SafetyPrecautions
Methods: main()

UIElements
Methods: ExportListButton(), ButtonResultsTable(), ResultsTable(), PagesElement(), ResultsHeading(), ResourcesForm(), HelpFAQButton(), HelpVideoButton(), PublicTagsButton(), PublicTagsBox(), TagusSearchbutton(), TagusQueryField(), SummonSearchbutton(), SummonQueryField(), UserTagsBox(), UserTagsButton(), Main(), LogoutButton(), DisplayGuidelines(), DisplayGuidelinesButton(), AuthenticationLogoutForm(), LoginButton(), PasswordField(), LogMessage(), Loader(), HelpMessage(), Status(), UserNameField(), AuthenticationLoginForm()

UIElementsManipulations
Methods: HideLoader(), ShowLoader(), SetResultsHeading(message), SetLogMessage(message, cssclasslevel), ShowPreview(event, message), LoadUrlInPopup(url), ToggleGuidelines(), SetDisplayGuidelines(), HideElementsForNonLoggedInUser(), ShowElementsForLoggedInUser(), EnableHelpForElement(), SetExportFunctionalityOnElement(), InitializeTabsInSearchForms(), InitializeVisibilities(), InitializeHomeReset()

Utilities
Methods: getTAGUS_MAX_RESULTS_PER_SEARCH(), getTaggerProxyURL(), getTagusCommonSuffixURL(), getCurrentURL(), getSessionManagerURL(), getSummonServiceURL(), getSearchForResourcesUsingPublicTagURL(), getFetchTagsForResourceAndUserURL(), getAddTagsForResourceAndUserURL(), getListGeneratorURL(), getRemoveTagsForResourceAndUserURL(), getListDownloadableFileBaseURL(), setUserName(), invalidateUserName(), getUserName(), minimum(), removeGarbage(), addGarbage(), validateSummonData()
Relationships [as in the figure]: Main is associated one-to-one with TaggerRenderer, Summon, TaggerCore, SummonRendererBasedOnIds and SummonFieldsConfiguration, which in turn use UIElementsManipulations, UIElements, JqueryExtensions, Utilities, Session and SafetyPrecautions.

Figure 3.5. Complete Class Diagram Of Tagus' Client

3.6. Final Deployment Diagrams
The following diagram shows the current state of deployment in the Digital Library office.

Current Deployment Diagram: TAGUS-TEST.LIB.ED.AC.UK, hosting both the WEB SERVER and the INDEXING SERVER.

For the scope of this pilot project, a single instance of the indexing server suffices. However, if in future there is a need to scale the indexing server, then, because of the support in Elastic Search for distributed indexing, Tagus could be deployed as follows. The extra machines, as compared to the above diagram, need to run Elastic Search instances configured with the same unique identifier string before starting up [7].

Future Deployment Diagram [Prospective, Scaled]: TAGUS-TEST.LIB.ED.AC.UK, hosting the WEB SERVER and INDEXING SERVER 1, plus other systems within the same LAN as INDEXING SERVER 1, hosting INDEXING SERVER 2 and INDEXING SERVER 3.

Figure 3.6. Deployment Diagrams of Tagus

3.7. Design of the UI
3.7.1. Design Rationale
The design of the user interface itself deserves a separate subsection because of the number of challenges faced and the effort put into overcoming them. Further, the user interface needed to provide a fast user experience; it therefore required careful planning and testing of the available alternatives. Several design decisions were recorded in the process, the most challenging of which are presented below with their final solutions. The design of the UI for Tagus was governed by:
1. the Data Protection Policy [32] in force at the university
a. no explicit "Terms and Conditions" for new users
b. but no personal data has been collected from any user which could be used to identify a particular user
2. a subset of the Web Accessibility Guidelines [33]
a. we are aware of the full guidelines for accessibility in websites
b. however, please note that implementing the complete set of these guidelines in the limited time available for a pilot project was deemed unfeasible
3. W3C standards like XHTML and CSS.

3.7.2. Expert usability inputs
The process for developing the UI consisted of six design-implement-review-redesign cycles, spread evenly across the beginning and the end of the duration of this project. The reviews typically occurred within the Digital Library office, and inputs were received from almost everyone in the SST, UXT and DXT teams [Ref: Section 2.5]. The final approval for each UI decision was given by either the supervisor or the usability expert, with adequate time for implementation and technical testing. The reader is gently suggested to have a look at Appendix 8.1 [UI screenshots] before proceeding to the next section. We now present three challenging UI problems, the alternatives evaluated and the decisions made, along with their justifications.

3.7.3. Two column layout versus single column layout for results
The first challenging problem that we faced was in presenting the results to a user. Searches could be generated by user actions in two ways, viz., by searching Summon explicitly through a search box, and by clicking a public tag. If the user wanted to have access to both sets of results at the same time, through these two means, then a two column layout became a good candidate for a solution.
The search results pane was then divided into two panes: the left one would show the results of direct Summon textbox based searches, while the right one would show the results generated by clicking a public tag [that is, by searching for all resources marked with the same public tag].

Figure 3.7. Two Column Layout in UI

However, the drawbacks of this layout quickly became apparent with some preliminary usability testing, done daily throughout the development. They are:
1. The two views of search results need to be kept synchronized
a. for example, if a user adds a new tag for a particular library resource in the left pane and that resource exists in the currently viewed page of the right pane, then the right pane should get updated as soon as the left one does, and vice versa
b. the same example holds for deleting a tag, and for adding or deleting an annotation
2. If a new means of generating a search were added to the two existing means, a third search results pane would potentially have to be displayed, for the same reason
3. Finally, dividing the search results pane into multiple columns, with each column displaying results for its own differently generated search, divides the available width of the visible screen into multiple parts. As a result, the results' details begin to appear crammed when displayed, which brings up another usability problem while trying to solve the first.

Thus, this decision needed to be changed; however, the ease of access to results generated by different means still had to be preserved for the user. The final solution, after experimentation, thought and testing, came out in the form of tabs. Since each means of generating a search represents an option or an alternative to the user, an "OR STATE UI state holder"** element was necessary [** an OR STATE means a choice to be made from multiple options, with only one choice active at any one time; dropdown lists, radio buttons, tabs and multiple windows are all examples of such UI elements]. If such an element could be provided for the means of search, then there could be a single results pane [single column layout] displaying the results of the last search, irrespective of how it was generated. The OR STATE UI state holder element chosen was a "set of tabs"; another alternative could have been a radio button group, which would also allow only one choice to be active at a time. If a user wanted to go back to another means of search, simply clicking a tab other than the current one would take the user to a different means of generating a search. For example, if a user first clicks the "Search Summon" tab, types some words [e.g., "history of island"] in the input box and clicks the search button, a set of results is displayed below. If the user then clicks the "Search All Public Tags" tab and searches for all resources marked with a particular tag [e.g., "History101"], this action generates a new search. If the user now wants to come back to the earlier tab, he / she can just click it, and their last search criteria remain intact ["history of island"] in the input box; clicking the search button is the only action necessary. Going once again to the "Search All Public Tags" tab retains its last search's criteria ["History101"]; once again, a single click suffices to regenerate what the user was viewing earlier through the same means.

Figure 3.8. One Column Layout in UI using "Search Summon"
Figure 3.9. One Column Layout in UI using "Public tags"
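The following minimal sketch [in Javascript, with jQuery] illustrates how a set of tabs can act as such an OR STATE holder while remembering each search mode's last query, as described above. The element ids and function names are hypothetical placeholders, not the actual Tagus code.

    // Minimal sketch: tabs as an OR STATE holder. Only one search mode is
    // active at a time, and each tab remembers its last query, so a single
    // click on "search" regenerates what the user was viewing earlier.
    var lastQuery = { summon: '', publicTags: '' };
    var activeTab = 'summon';

    function switchTab(tabName) {
        // Save the query typed under the tab we are leaving...
        lastQuery[activeTab] = $('#query-input').val();
        activeTab = tabName;
        // ...and restore the last query of the tab we are entering.
        $('#query-input').val(lastQuery[tabName]);
    }

    $('#tab-summon').click(function () { switchTab('summon'); });
    $('#tab-public-tags').click(function () { switchTab('publicTags'); });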
3.7.4. Searchable annotations versus non-searchable annotations
The concept of a personal annotation was introduced in Tagus as a "personal note added by a user for a library resource", visible only to the user who creates that annotation. This annotation appears next to the resource every time that resource turns up in the same user's future searches. Its semantics were strictly limited to a "personal note" to the user, thereby making sure that annotations are not viewed by other users of Tagus.

Figure 3.10. Searchable hyperlinked personal annotations

Though it was technically feasible to make personal annotations searchable as well, doing so would challenge a core concept of a community-based-tagging system: the sharing of data across multiple users. The original sense in which tagging was introduced was that "if a resource is worth tagging with a particular word for one user, it is probably worth letting other users know that this resource could be associated with that word", thus implying that "if one individual tags a resource R with a tag T, then a whole community benefits in knowing about R through T". Making personal annotations searchable implies that a user could annotate a resource R with a word T and nobody in the community would benefit from the annotation, because nobody is aware of it except the user who created it. Thus, the final decision was to drop the idea of making "personal annotations" searchable.

Figure 3.11. Non-Searchable plain-text personal annotations

3.7.5. Tags List versus Tags Cloud
The concept of displaying the list of all public tags generated by a user for various resources over time was not in the original list of requirements for Tagus. It required additional thought and justification of its utility, and consequently implementation and sufficient testing, to ensure that the speed factor of the user experience was not compromised. Displaying all of a user's tags is a quick way to provide access to resources already accessed by the user previously. It is popular in modern web applications and captures a different view of the system from the user's perspective. Finally, it provides a third means of generating a search for the user, in addition to "Search Summon" and "Search all public tags". The original UI design for this concept was as given in the diagram below.

Figure 3.12. Tags list for a user

However, the introduction of this concept brought a few new problems to be solved. Though this meant going beyond the scope of a pilot project being developed by a single individual, adequate thought was given, and things were discussed and planned, before approval was given. First, screen space had to be allocated for displaying the list itself. Since screen space is a precious commodity, and this concept competes for it with the "Search Results" pane [irrespective of whether the results pane itself follows a single or a two column layout], the list had to be displayed to the right of the screen. Secondly, if a tag is long enough to cause wrapping of text, or is displayed in a non-wrapping style, it causes new usability problems for the user, with either "readability taking a hit" or "scrollbars appearing and presentability taking a hit". If the list were displayed to the left of the screen, similar problems would appear. Finally, displaying the tags in different sizes, according to the order of their frequency [a tag X used more often than a tag Y is rendered with a larger font-size than Y], affects the vertical list display once again, with either "increased vertical list width causing the results pane to reduce in width due to limited screen space" or "scrollbars beginning to appear". The solution was to use a horizontal tag cloud, with a horizontal scrollbar appearing as necessary. This removed the horizontal alignment problems caused by vertical lists and enabled the addition of features like varying font-sizes and tag-counts for the tags in the list. The list is now officially called the tag cloud of that particular user.

Figure 3.13. Tags cloud for a user
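The following minimal sketch shows one way such a horizontal tag cloud could be rendered with jQuery, with the font-size scaled by how often each tag has been used and the tag-count shown next to each tag. The size range and markup are assumptions for illustration, not the actual Tagus rendering code.

    // Minimal sketch: render a horizontal tag cloud in which a more
    // frequently used tag is drawn in a larger font.
    // tagCounts is e.g. { History101: 7, maps: 2 }.
    function renderTagCloud(tagCounts, container) {
        var counts = [], name;
        for (name in tagCounts) { counts.push(tagCounts[name]); }
        var max = Math.max.apply(null, counts);
        for (name in tagCounts) {
            var size = 10 + Math.round(10 * tagCounts[name] / max); // 10px to 20px
            $('<span></span>')
                .text(name + ' [' + tagCounts[name] + ']')          // tag with its count
                .css({ 'font-size': size + 'px',
                       'margin-right': '8px',
                       'white-space': 'nowrap' })                   // flow horizontally
                .appendTo(container);
        }
    }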
3.8. Guidelines to future developers
Before beginning to design the architecture for novel pilot projects in academic library environments, especially when there are no existing systems with similar functionality, as in the case of Tagus, the guidelines for future developers from this section are to:
1. Study existing systems with functionality as "close enough" as possible
2. Employ the rich experience of the existing developers available in such environments
3. Evaluate alternatives with the stakeholders [here, the supervisor and the usability expert]
4. Apply constraints and shortlist technologies in a manner similar to the approach [described above for Tagus]
5. Most importantly, iterate to arrive at a stable design
6. Redesign to allow for "high cohesion" and "low coupling" as much as possible, as with the way the web server and the indexing server do not interact directly with each other, but only with the browser. This helps maintain the system in future.

4. Chapter 4. Implementation
4.1. Motivation for this Chapter
In the previous chapters, we have seen how requirements were collected, analysed and filtered, and how the design was reviewed and justified, with solutions proposed for each problem faced. This chapter describes how these solutions were implemented in the final system.

4.2. Methodology
The methodology followed for the development of Tagus, as highlighted earlier, was iterative software development, with weekly cycles culminating in reviews of the work done each week.

4.3. Technologies
The final list of technologies used is as follows:

Table 4.1. Table of technologies
Web Browser UI: XHTML
Visual Styling: CSS
Dynamic Language For Client Side Processing: Javascript
Asynchronous Calls To Server Functionalities: Ajax
External Libraries Used: jQuery
Data Exchange Format Between Server And Client: JSON
Server Side Technologies: PHP
Protocol For Communication Between Server And Client: HTTP REST
Data Indexing And Data Management Technology: Elastic Search
Standalone API Invocation Tool: Curl
Web Server: Apache
Browser Platforms Supported: Mozilla Firefox 4.0.1 to 5.0, Google Chrome 12 to 13
Development Platforms Used: GVim, Eclipse with PHP plugin
Development Web Platform: XAMPP
Development OS: Windows XP
Documentation: Standard Comments In Source Code
Deployment Server: http://tagus-test.lib.ed.ac.uk
Deployment OS: Linux
Javascript Debugger Tool: Firebug

All source code is available in the Digital Library's repository at https://svn.ecdf.ed.ac.uk/repo/is/digitallibrary/Summon/tagus.

4.3.1. Criteria
Firstly, the technologies that have been used to implement the design described in the previous chapter are primarily open source.
Secondly, they have been chosen because the primary target platform for the usage of Tagus is the web browser. Web technologies have grown in leaps and bounds in the last decade and offer highly detailed fine-tuning capabilities which help developers implement their designs with considerable speed. Thirdly, to ensure that Tagus can be maintained by somebody other than the developer [the student, in this case], we needed to ensure that the technologies chosen for this project are either within the broad list of skills of the Digital Library Software Development Team or could be easily learnt. Some technologies were chosen due to requirements analysis and design constraints, as described in the previous couple of chapters.

4.3.2. Constraints
Please note that we are aware that Internet Explorer is supported, as standard, by other projects at the university. However, we learnt during the development of Tagus that while Mozilla Firefox and Google Chrome support asynchronous calls to Elastic Search, Internet Explorer blocks such calls. While the rest of the UI functions appropriately, the tagging service hence does not work in Internet Explorer. One potential reason identified is that while the rest of the services, like fetching results from Summon, list generation, etc., run on the default HTTP port 80, Elastic Search's services are accessible on port 9200 and above, as configured. Other issues were identified too, but those were solved after technical debugging sessions. Kindly note that the development of a tool such as Tagus, with just a single Summon API file as a starting point, is by itself a large undertaking, especially when there are no other tools to base our work upon and there were no requirements to start with. Supporting specific browsers [Internet Explorer as a platform] was added as a requirement request quite late in the cycle [when usability testing approaches were being planned]. Due to time constraints, a prospective solution has been thought about but could not be verified; it is described below. One workaround for Internet Explorer could be to write server side proxy classes which act through the default HTTP port 80, while internally forwarding the requests, along with all the passed parameters, to port 9200. While work on this is already in progress, we believe it is important to notify the reader about this current limitation. The website where Tagus has been deployed will issue a message regarding support for Internet Explorer as soon as it is made available. Please refer to chapter 6 for more discussion about support for Internet Explorer.

4.4. Further Technical Details
We describe, in a small note, how an Elastic Search instance was customized to fit the role of an indexing server.

Figure 4.1. Bootup of an Elastic Search instance

Elastic Search by itself is schemaless. It operates using JSON as a data exchange format [for both input and output]. Querying an instance is done through the PUT, GET, POST and DELETE requests supported by the HTTP REST standard. Elastic Search supports "a full query DSL [Domain Specific Language]" [7] and several advanced features. Here, however, we just describe how we interpret data in a schemaless JSON format so that it conforms to a particular schema. A sample query is given below.

http://localhost:9200/tagus_public/userid/<auto generated unique id>

Syntactically, this structure is interpreted as follows:

<protocol>://<hostmachine>:<portnumber>/tagus_<accessibility>/<userid>/<auto generated unique id>

Executing such a query results in data being sent to the requester in the following format:

{ resource : <FETCH-url-ed library resource unique id>, tag : abcde }

We apply the following constraints on each of these attributes:
1. accessibility = private or public; only one of these two values is allowed
2. user = a unique user id, as allowed by the authentication module of Tagus
3. resource = a resource id as retrieved from Summon; should not have any spaces
4. tag = any single word; can be alphanumeric; should not have any spaces in it.

For the purposes of tagging, we identify a combination of (accessibility, userid, resourceid, tag) as a unique set [equivalent to a row in a database]. All operations to insert, delete and search within the index of tagged resources are interpreted in this way.
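Based on this notation, the following minimal sketch shows what the community search looks like from the browser: finding every resource that any user has publicly marked with a given tag [compare SearchForResourcesMarkedWithPublicTag in the client class diagram]. The host and port are assumed, and the body is a simplified sketch rather than the actual Tagus implementation.

    // Minimal sketch: search the whole tagus_public index, across all users,
    // for documents whose tag attribute matches the given single word.
    function searchForResourcesMarkedWithPublicTag(tag, done) {
        $.getJSON('http://localhost:9200/tagus_public/_search?q=tag:'
                  + encodeURIComponent(tag),
            function (result) {
                var ids = [];
                $.each(result.hits.hits, function (i, hit) {
                    ids.push(hit._source.resource);  // the stored resource id
                });
                done(ids);  // these ids can then be resolved against Summon
            });
    }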
4.5. Workflow Chart of development [shown for one weekly cycle]

Figure 4.2. Iterative software development workflow [diagram: one weekly cycle runs from Background Study and Further Study through Requirements Collection, Further Requirements and Requirements Reviews, alongside a Technologies Broadlist, Evaluate Technologies and a Technologies Shortlist, into Design Decisions, UI Design, a Design Review and Re-Design Decisions, then Implementation, Technical Testing and Deployment towards the Final System, with "Discuss Constraints" feedback steps throughout.]

Please note that "Background Study" was performed only in the first two weekly cycles [it was already in progress before this weekly iterative mode was started]. Kindly also note that the details of technical testing have not been mentioned separately, as testing was done in parallel with implementation and reviews; a separate step has been shown in the above diagram for the sake of completeness in depicting the amount of work completed in this project.

4.6. Guidelines to future developers
Development of a system using web technologies is a very complicated task. Both server specific technologies and client specific ones have become quite powerful in the last decade. It is important to understand this complexity, to distribute the tasks of providing functionality between a server technology and a client technology in this light, and finally to provide a smooth user experience. Adding specific usability requirements to the mix makes the whole task very difficult. However, proper planning, iterative design and taking the help of a usability expert while getting the scope of a project approved make such difficult tasks manageable. The bulk of the effort for Tagus lay in the background study and design stages. Taking the help of the developers at the Digital Library to perform technical testing was invaluable, as was getting the scope of the project re-evaluated and approved by the supervisor. Finally, choosing Elastic Search over Lucene / Solr was a critical decision, and it proved successful in my early feasibility study of technologies; a well justified choice of technologies can very much speed up the implementation. Also, the practice of reusing code, using public libraries such as jQuery and its plugins like jQuery.cookie and jQuery.json, helped save time and let us achieve more functionality than originally planned for Tagus.

5. Chapter 5. Evaluation
5.1. Motivation for this Chapter
In the previous chapters, we have seen how and which decisions led to the final design of the new system. Appendix 8.1 gives the screenshots of the developed system, as viewed in a browser.
This chapter lists the procedures that were followed to perform usability testing with users, including the methods, the setup, the process of the interviews, who was recruited for the testing, and the outcomes of the evaluation.

5.2. Methods
5.2.1. Preliminary Steps
We have seen in section 2.6 how we derived our usability criteria from the non-functional requirements, matching them with the standards' definitions of usability. From abstract criteria, we proceeded to analyse and derive concrete, measurable criteria. We then created our evaluation plan by choosing quantitative evaluation for "speed of service, interaction" and qualitative evaluation for the rest, viz., ease of use, ease of learning, findability and minimal UI. However, in order to conduct usability testing with users, we still needed to convert the evaluation plan into a plan of action: in other words, user scenarios needed to be generated to be presented to users. This is in line with the standard practice followed by the Digital Library Usability Team. Generating possible real world scenarios and getting them reviewed by our usability expert formed the preliminary stage of usability testing.

5.2.2. The Interviewing Protocol Setup
The interviewing protocol was explained by Ms. Angela Laurins, our usability expert during this project. The complete protocol setup was customized to suit the needs of a pilot usability testing session. The protocol itself consists of briefing each participant with the background of the project, describing our purpose in conducting the session, and what is to be expected in the test scenarios. It also includes answering intermediate questions and clarifying doubts, to ensure the smooth progress of the session without crossing the time limit of one hour per user.

5.2.3. The Participant Selection and Contacting Process
Before the start of the project, in the first couple of meetings during our planning sessions with the supervisor, it was decided that participants would be contacted and selected from the team members of the Digital Library / Information Services, University of Edinburgh. Nielsen, a renowned usability expert, says in his guidelines, "Using insiders as test users works in pilot testing, and pilot testing only" [36]. But the underlying purpose in such a case, according to him, is "to improve the tests themselves" [36]. Therefore, instead of choosing our participants only from the developers' team, the participants were split across multiple teams. The prospective users were contacted via email, with the concept explained briefly. The test sessions were then timed according to slots, depending on the availability of each user on a particular day. The users were then invited to the test location. All tests took place in Meeting Room S7, Digital Library Office, 2 Buccleuch Place, University of Edinburgh's George Square Campus.

5.2.4. The Data Collection Method
As each testing session progressed, questions were put to the users, collecting data relevant to the items below [please note that these reference table 2.5 in section 2.6.3.4]:
Ratings for intuitiveness and ease of use of the UI
Ratings from users, and as measured from observation of user activity between first and second time usage
Ratings for the findability of a resource that a user intends to find in the library
Ratings for ease of creating reading lists.
Though the system, and hence its UI, are designed to be used to find resources, this will help determine whether a user finds the UI small enough to quickly navigate it and find out how to perform secondary actions like generating reading lists.

5.2.5. My preparation for the testing sessions
I used a standard test script, reviewed by our usability expert for the project, in line with standard practice at the Digital Library. Please find the "Test Script", as is, in the appendix. It was adapted from Steve Krug's "Rocket Surgery Made Easy" [31].

5.2.6. Test Scenarios
I used a set of test scenarios which reflect prospective real world uses of the system in future. These, once again, were used only after a couple of test runs and two complete reviews by our usability expert. Please find the details of access to the "Test Plan" documents, as executed and recorded for each of the four participants, in the appendix. Please note that during the sessions, users were allowed to visit both the websites http://tagus-test.lib.ed.ac.uk [deployed Tagus] and http://ed.summon.serialssolutions.com [University of Edinburgh's Summon instance] as necessary, without being given any tips or hints to complete their tasks other than what is already present in the test plan. The contents of our test plan are summarized below.

Ratings are taken for:*
1. speed of service in retrieving tags
2. usability of the user interface
3. ease of learning.

Comparative evaluation is done for:*
4. user ratings for findability w.r.t. Summon with tagging [Tagus website] vs Summon without tagging [Summon instance website]
5. ease of creating reading lists based on public tags by users using this service vs manual creation of such lists.

*All ratings are on a scale of 1 to 5. For difficulty levels found by users, 1 = very difficult ... 5 = very easy. Alternatively, for user satisfaction levels, 1 = very unsatisfied ... 5 = very satisfied.

5.3. Overview of Results Sections
I have organized the results of working on this project as described in this section. The reports are divided into two sections: those aimed at future developers working on similar projects or with similar / the same technologies, and those on usability, meant for the Digital Library. These could serve as inputs for taking up work on similar projects in future.

5.4. Reports For Future Developers
5.4.1. Report on working with Summon API
An API to invoke the functionalities of Summon is available. According to its documentation, available online at [18], "The API is an HTTP-based service and supports requests via the HTTP GET and POST methods. Currently there are two available response formats: XML and JSON". In keeping with modern web service development standards, this is good for two reasons. Firstly, it is an HTTP-based service: most modern web programming languages, libraries and tools have support for working with HTTP-based services, so a developer need not write new code to work with the service and can reuse code. Secondly, a web developer often works both on source code relevant to the server side and on that relevant to the client side, often a web browser; web developers therefore tend to be aware of the software skills needed to work on both sides. With the availability of new client side libraries like jQuery [16], which support the HTTP GET and POST methods, it becomes easy for the developer to invoke Summon's functionality and quickly check the feasibility or the working capability of a new idea.
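As a concrete example of such a quick feasibility check, the following Javascript sketch sends a user's query to a server-side script and logs the raw JSON response for inspection. Since Summon API requests must be signed on the server with the institution's key, the sketch goes through a local proxy endpoint whose name, summonservice.php, is a hypothetical placeholder, not the actual Tagus URL.

    // Minimal sketch: probe the Summon API via a (hypothetical) local
    // server-side proxy that signs and forwards the request, then inspect
    // the JSON response in the browser console (e.g. with Firebug).
    function trySummonSearch(queryString) {
        $.getJSON('/tagus/summonservice.php',
            { q: queryString, page: 1 },
            function (response) {
                console.log(response);  // explore the fields returned per document
            });
    }

    trySummonSearch('history of island');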
Thus, the API is conducive to rapid prototype development when used along with modern client side libraries. Serials Solutions, the company which develops and maintains Summon, also provides PHP and Ruby versions of the API, as listed at http://api.summon.serialssolutions.com/help/api/code. This helps simplify the development of extensions and new tools for developers who are already familiar with at least one of these languages; both are quite popular at the time of writing this document. Even if the chosen technologies incorporate neither PHP nor Ruby, developing a wrapper library which abstracts the mechanism [working with HTTP-based calls] behind a functionality-based API is feasible. For example, the API page http://api.summon.serialssolutions.com/help/api/authentication itself gives an example of Java code wrapping the code that works with ids and digests. Thus, porting and extending the API itself when moving from one technology to another is feasible.

There are quite a lot of advanced search features, like commands, facets, filters, re-pagination, etc. Each of them is described in detail in its respective section in the online documentation. For projects wishing to incorporate a direct "search the library with Summon" feature in their user interface, there are several challenges to be faced. Firstly, replicating all aspects of the user interface shown on the university's Summon instance website is not feasible in a short amount of time. Secondly, it is not practical to replicate all the user interface features, like "search boxes" and checkbox lists, needed to expose all the advanced search features. If a subset of the user interface still needs to be exposed, it requires careful planning by the stakeholders of the project. Thus the API is great to work with in the background [non-user-interface modules], as expected of most modern day web services, but the web developer still needs to write new code to support, in the UI, each advanced feature provided by the API.

The PHP version of the API was used for Tagus. It supports
1. retrieval of a resource's information given its unique id, and
2. searching the Summon index for all resources matching a search query [a set of words].
While this sufficed for the implementation of Tagus' requirements, a future project requiring the advanced search features [ignoring the additional UI code that might need to be developed] may find the PHP version falling short of expectations. In such cases, the developer might need to extend it with custom functions to tap into the advanced features.

Another feature of the API is its excellent error handling of incoming requests. There is a clear classification of the possible errors, as given at http://api.summon.serialssolutions.com/help/api/search/errors. This helps new developers quickly gain insight into the origin of a problem. In summary, for functionality, the API is very good to use when working with code at the level of raw HTTP requests. It is good to use when working at a functionality level [PHP version], but with limited support for the advanced features, which the developer can extend. Any UI feature linked to the advanced search features requires additional work.

5.4.2. Report on working with Elastic Search
Quoted from its own documentation, "ElasticSearch is a highly available and distributed search engine. Each index is broken down into shards, and each shard can have one or more replica.
By default, an index is created with 5 shards and 1 replica per shard (5/1). There are many topologies that can be used, including 1/10 (improve search performance), or 20/1 (improve indexing performance, with search executed in a map reduce fashion across shards)". More details are available online at https://github.com/elasticsearch/elasticsearch/blob/master/README.textile.

ElasticSearch [ES] has been around for nearly a couple of years at the time of writing this document, and is very clean and stable. It is built on top of Apache Lucene. It provides HTTP REST based services for indexing data and then searching the indexed data. The best feature of ES is that the data to be indexed need not conform to a particular schema. Even if a schema is required in a project, due to other requirements and constraints, it need not be defined before inserting the first data into the system. And even if a schema is required and there is a later need to extend, update or change it, ES itself need not be informed. The next best feature is that all communication with an ES instance takes the form of an HTTP URL followed by the data to be sent in JSON format. JSON [Java Script Object Notation] is very easy for a web developer to learn and use. Debugging source code is possible within browsers with tools like Microsoft Script Editor for Internet Explorer and Firebug for Mozilla Firefox and Google Chrome; development with code making calls to an ES instance therefore becomes easier, with early bug detection. Also, ES supports a query Domain Specific Language [DSL], which has several advanced features for searching against the indexer. Search queries also follow the JSON format, thus preventing the need to learn another query language with custom keywords and syntax. Finally, ES instances can be launched on many machines in a LAN, appropriately configured to provide the same service. When machines go down, ES continues functioning as long as at least one machine remains; its support for distributed indexing thus helps maintain the uptime of a deployed service. Other advantages include scaling to multiple instances dynamically when the data to be indexed grows beyond the capability of the current set of machines.

While ES has all these advantages, the available documentation is limited. For example, if a developer working with advanced search queries faces a problem, it can take up a lot of his / her time to figure out ways of creating the query; this was faced even in the development of Tagus. Please note that ES has been around for a very short time, with a very small number of developers working on it. There are APIs supported in multiple programming languages, on both the server side and the client side of web development. In summary, we have found ES excellent to work with, limited only by the documentation provided. Future developers are advised to refer to Clinton Gormley's excellent introduction to ES [presented at the Yet-Another-Perl-Conference, European Union, 2010], available online at http://clintongormley.github.com/ElasticSearch.pm/ElasticSearch_YAPC-EU_2010/.
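For illustration, the following is a minimal sketch of what a query in the JSON DSL mentioned above looks like: the search request itself is a JSON document POSTed to the _search endpoint. The index name and host follow the Tagus notation used earlier, the term value is only an example, and this is a sketch rather than Tagus' actual query code.

    // Minimal sketch: a term query expressed in ES's JSON query DSL.
    var query = {
        query: { term: { tag: 'history101' } },  // exact match on the tag field
        size: 50                                  // return up to 50 hits
    };
    $.ajax({
        url: 'http://localhost:9200/tagus_public/_search',
        type: 'POST',
        data: JSON.stringify(query),
        dataType: 'json',
        success: function (result) {
            console.log(result.hits.total + ' tagged resources found');
        }
    });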
5.5. Reports on usability For Digital Library's Decision Makers
5.5.1. speed of service in retrieving tags and speed of interaction
5.5.1.1. Quantitative Evaluation
From section 2.6.3.3, speed of retrieving tags fits quantitative evaluation. Please refer to the images given below for the actual data recorded in measuring the durations of various calls from the web browser to the indexer. The tool used to capture this information was Firebug 1.8.1. The average time for the retrieval of tags across various resources was
a) Mozilla Firefox: 159.65 milliseconds
b) Google Chrome: 384.20 milliseconds.
These times, well under half a second, are observed as "very fast".

5.5.1.1.1. Durations of request-response cycles in Mozilla Firefox
Figure 5.1. Timing of calls to Elastic Search's instance in browser one

5.5.1.1.2. Durations of request-response cycles in Google Chrome
Figure 5.2. Timing of calls to Elastic Search's instance in browser two

5.5.1.2. Qualitative Evaluation
From section 2.6.3.3, speed of interaction is the criterion which fits qualitative evaluation. Please refer to the appendix for the actual data recorded during our usability testing sessions. All the participants found the user interface "very fast" to use.

5.5.2. usability of user interface
5.5.2.1. Tasks & Questions
Please refer to the appendix for the actual data recorded during the user sessions. The tasks themselves, and the accompanying questions, can be summarized as below:
Task :: Add a tag / an annotation
Task :: Find a resource
Task :: Remove a tag / an annotation
Task :: Add a tag; tell a friend about it [to let the friend find it with this tag]; remove it
Given the tasks, the data recorded was: What helped? What did not help? What could have helped [expectations]?

5.5.2.2. Qualitative Evaluation
Overall, the users found the user interface between "usable" and "very usable", with scope for improvement. I have listed the problems that were noted during the sessions. Firstly, we found that users who were familiar with popular tagging systems like Facebook, Flickr, Delicious, etc. found the overall system "usable". Some users expected advanced features, like auto-suggest for resources tagged by people of the same group [such as all students taking the same course, or those in the same undergraduate year]. Secondly, one user specifically did not want to use the "public"-ness of tags and wanted to use tags only as "personal" information, not to be shared with others. Though this is functionality-related feedback, we found such expectations affecting their ratings on usability [bringing the level down from "very usable" to "usable"]. Finally, we found that if users were exposed to the Summon instance first and then introduced to Tagus, they rated the user interface "very usable".

Positive feedback: quick access via tabs is convenient; "less cluttered" than other search systems they were aware of; tag counts in the tag cloud help "immensely"; case-insensitive search of tags is supported; the image under "Home" gives a quick idea of what Tagus is about; students will find the system very easy to use; a demo video is not required, as "it is easy to figure out".

Negative feedback: the image under "Home" is not intuitive enough; pure keyboard navigation of the system [without a mouse] is not supported; users sometimes overlooked which input box they were typing into [annotations vs tags]; advanced search features like filtering options and auto-suggest are not available.

Neutral feedback: two users can add the same public tag to the same resource; when such a resource comes up in users' searches, the tag is displayed twice. Users found this a bit "different" and "awkward", but found the concept "useful", because it gives them a way of knowing that a particular tag is more relevant to the resource if it is present more times than the other tags.
Further, users' perception of usability increased with the amount of tagging functionality provided. For example, one user commented that it "can be useful to librarians to predict which resources would be in demand if they could know which ones are being tagged the most". Thus, overall, for a proof-of-concept tool [Tagus] built upon a resource discovery system [Summon], we believe that the provided user interface was between "usable" and "very usable", while noting that users familiar with popular "tagging" based websites found it more usable than users who were not, and that the provided functionality affected the "perceived usability" of the concept of tagging itself, which in turn led to their perceived usability of the user interface.

5.5.3. Ease of Learning :: First vs Second Time Usage

5.5.3.1. Qualitative Evaluation

The core procedure for evaluating ease of learning is to give a task to the user, make him / her repeat it in a different form, and observe the ratings given for the completion of the two attempts. The data to be observed is the difference between the rating given the first time and the rating given the second time; this difference should be either zero or positive. For example, if a user U gives a task X a rating of 3: Moderate the first time, then U has learnt the task in between the two attempts if, the second time, U gives X a rating of 3: Moderate, 4: Easy or 5: Very Easy. Tasks 4 and 5 were specifically designed to measure ease of learning.

[Graph 5.1. Ease of learning as measured for two specific tasks (Tasks 4 and 5), showing each test user's ratings*]

Ease of learning was also measured across the first six tasks, the commonality being "add / delete / search via" tags.

[Graph 5.2. Ease of learning as measured across all tasks (Tasks 1 to 6), showing each test user's ratings*]

Interpretation of the above graph: the complexity of the tasks increased from task 1 to task 6. This could be a factor in causing the dips, especially for task 3 and task 5, and is good feedback to incorporate when revising the tests themselves in future. In other words, to measure ease of learning, when a user performs a task for the second time there should not be any added complexity which could cause fluctuations in the measurements. To compensate for this, we also recorded how satisfied the users were with the results they obtained, which should give a better measure of ease of learning [the changing expectations of the user] as they spent more time with [learnt more about] the system. Task 1, being the first task, has no recorded data for the user to rate "how satisfied he / she was with the results". This is because the results of the first task set the user's expectations in practice, so the real measurements start from the second task.

[Graph 5.3. Ease of learning as measured across all tasks using "satisfaction with results" (Tasks 2 to 6), showing each test user's ratings*]

As we can observe, with the exception of the rating for task 5 by TestUser2, the users were able to learn the system. Also, the average rating for all tasks was 4.35 across all users.

*All ratings are on a scale of 1 to 5. For difficulty levels found by users, 1 = very difficult … 5 = very easy. Alternatively, for user satisfaction levels, 1 = very unsatisfied … 5 = very satisfied.
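The computation behind these graphs is deliberately simple; the sketch below restates it in PHP with hypothetical ratings (the recorded data is in the appendix). A non-negative difference between the repeat rating and the first rating is taken as evidence of learning.

    <?php
    // Sketch of the ease-of-learning measure: rating on the repeated task
    // minus rating on the first attempt should be zero or positive.
    // Ratings below are hypothetical, not the recorded data.
    $ratings = array(
        // user => array(first attempt, second attempt), on the 1-5 scale
        'UserA' => array(3, 4),
        'UserB' => array(4, 4),
        'UserC' => array(4, 2),
    );
    foreach ($ratings as $user => $pair) {
        $delta = $pair[1] - $pair[0];
        printf("%s: delta %+d -> %s\n", $user, $delta,
               $delta >= 0 ? 'learning indicated'
                           : 'added task complexity suspected');
    }

    // The overall figure quoted above is a plain mean of all
    // satisfaction ratings across users and tasks.
    $satisfaction = array(5, 4, 5, 4, 4, 5);   // hypothetical values
    printf("average: %.2f\n", array_sum($satisfaction) / count($satisfaction));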
This average indicates that users were highly satisfied with the results of performing the tasks; note that the minimum possible rating was 1 and the maximum was 5. Also please note the limitations of this measurement. In a real world scenario, a user is not limited to the test duration of one hour; he / she would have more time to explore the system. We should also consider the factor of community learning in Tagus, which maintains data collectively owned by all the users. If an experienced user of Tagus introduces the system to a new user, the new user's ease of learning could improve much more than if the new user explored Tagus by himself / herself for the first time.

5.5.4. Findability :: Summon [does not have tagging] vs Tagus [uses Summon]

For measuring the findability of a resource in Tagus as compared to the findability of that resource in Summon, we designed a specific task, Task 7; please find its details in the appendix. It involved finding a resource in Summon and also finding it in Tagus, in either order. However, when in Tagus, the user tags the resource with a public tag relevant to him / her. A time gap is then introduced: the user test plan sheet is taken away and the user is engaged in a general chat about the system for around five minutes. The task ends with the user trying to find the same resource again in both systems. The sheet is returned to the user to rate how findable the resource was in Tagus as compared to finding it in Summon the second time. Please note that Tagus is like an extension to Summon and works off its API, so this evaluation checks whether tagging would help improve findability if available alongside the regular functionality of Summon.

5.5.4.1. Comparative Evaluation :: Summon vs Tagus

[Graph 5.4. Findability: how much easier finding a resource is in Tagus vs in Summon (Task 7), showing each test user's ratings on the same 1 to 5 scale as above]

5.5.4.2. Report on the accuracy of public tagging affecting findability

Though one of the intentions of applying the concept of tagging in an academic library environment is to improve findability, tagging could adversely affect the findability of a resource. For example, assume a public tag called "book" is applied to many resources by a user. Such a highly general tag does not really convey anything about the resource to the user community. Yet another example would be to publicly tag a particular resource on the topic of history with, say, a tag called "geography". In the latter example, the goal of the user applying the irrelevant public tag might be to hinder other users from finding a book rather than to help them find it. However, this problem is not unique to academic library environments. It applies equally to other applications of tagging; say, a user tags a photo of a blank sheet of paper with "operatingsystem" or just "photo". Thus, it is a standard problem in the world of tagging based applications. One possible approach is the use of moderation: there could be a set of "moderators", users with super rights to override or remove inappropriate tags applied to resources.
Another approach is to have automatic bots monitoring the system for a known list of blocked words. Yet another approach is to let the user community moderate tags itself: a user can tag a library resource with a general tag called "book", and another user who finds this tag inapplicable to that resource could simply delete it. However, this last approach could cause new problems in cases where one user deletes genuinely relevant tags added by another user. Our suggestion is to take the filtered approach for Tagus, where a newly added tag is checked against a known list of blocked tags. This again depends on the concept of relevancy, but further discussion of relevancy per user per resource is out of the scope of this document.

5.5.5. Ease of Creating Reading Lists :: Manual vs Tagus

5.5.5.1. Cooperative Evaluation

Tasks 8 and 9 [appendix] were created exclusively with the aim of checking whether the user interface is small / minimal enough to be quickly navigable, as explained in section 2.6.3.4. The user needs to find a means of achieving the goals of the two tasks, viz., creating a direct reading list out of search results and creating a customized reading list. The two scenarios involve two role plays :: professor and student. Participants were instructed to think aloud during the testing sessions. All the participants found task 8 very easy to do and agreed upon the usefulness of the feature [exporting reading lists]. Task 9 required a bit of thinking and problem solving on the part of the users, as the solution was neither direct nor were any hints given. However, all the users managed to think of a way to apply tagging in a way new to them [users were asked whether they had faced such a situation before, e.g. on social networking websites where tagging was used].

This concludes the chapter on the evaluation of Tagus. We divided our reports into two appropriate sections; in each section we described our results, critically analysed their details, observed the limitations of both the tests as created and the test conditions as compared to real world situations, and suggested solutions wherever applicable.

Chapter 6. Conclusions & Future Work

6.1. Motivation for this Chapter

Implementing a 'Del.icio.us' like system in an academic library discovery environment is about each of the following:

1. taking the concept of "community owned personal tagging" and applying it in a new context
2. checking the feasibility of working with the public API of a new age library resource discovery system to create a new tool
3. gathering the requirements needed to create a tool that exploits tagging
4. figuring out a list of technologies that works within all the constraints
5. designing the architecture and all the modules, as there is no existing software in such an environment to serve as a standard model
6. exploring many possibilities at every stage and getting reviewed every week
7. designing the user interface, weighing options and justifying design decisions
8. designing usability test case scenarios and conducting usability test sessions
9. recording, critically analysing and reporting the results.

During this project, we also performed a literature review of recent publications and a technical review of working with the technologies involved / chosen, identified problems facing the development of such new applications, and suggested solutions wherever applicable.
This project involved a lot of work, primarily because its scope grew from a proof of concept into a strongly modularized framework, functional by itself but also extensible into further customized applications in future.

6.2. Limitations

The undertaking of this project would not have worked without the inputs of all the individuals on the acknowledgements page, and we strongly believe that this project was successful considering the short duration available. However, the project is still a pilot tool, as there is scope for improvement in terms of both functionality and usability. Google's search mechanism and Delicious's easy to use tagging system have set high standards of speed and usability for public resources on the internet. As a result, users are very conscious of speed and usability in similar applications elsewhere. However, to keep things in perspective and within the scope of this project :: as proposed, this is a pilot project, intended to aid in the evaluation of DigLib's candidate new search services. In the end, reports on speed, usability and service integration would be provided to the Digital Library and are expected to help in this evaluation.

Finally, this project can be customized in several ways, as suggested by members of DigLib, UoE, provided:

1. DigLib decides to go ahead with Summon
2. this project comes out of pilot / test mode and is deployed as a full time service [starting with an alpha / beta release and slowly becoming a public release]
3. more testing is conducted
4. the UI is improved
5. the development of more applications based on the standalone API / the web based system is undertaken. For example:
   a. All tags by all the students under a particular professor could be aggregated and made available as a web service to librarians and budget allocators to predict, say, which journal subscriptions to allocate more budget to, etc.
   b. Automated notification of course-code-tagged resources to course-subscribed students via mailing lists could be made available as a web service integration, etc.
   c. Resources used by previous academic years' students could be auto-suggested to current academic year's students based on tag-based theme similarity, etc.

6.3. Future Work

As observed in the feedback from the participants of our testing sessions, the concept of tagging is useful to users. The guidelines to future developers, in conjunction with the whole approach to the development of Tagus as described in earlier chapters of this document, serve as 1. a model to base new work on, 2. inputs to improve future test cases and 3. inputs to improve the criteria for usability itself. Looking forward to continuing my work at the Digital Library Office at the University of Edinburgh, the immediate work that I would be taking up is as follows:

1. Evaluating the proposed ideas for providing portability to Internet Explorer
2. Implementing suggestions from test participants / users to improve the user interface
3. Creating Help, Demo Video and FAQ sections on the website
4. Attempting to link the authentication module of Tagus with EASE [10]
5. Improving the format of generated reading lists

Although the results of the usability testing of Tagus provide good feedback, after a couple more iterations in the iterative development cycle, with all the above done, one direction for future work is to involve a greater number of users across a wide range of roles in the university.
This would be possible after the system has been stress / load tested and found robust enough to take up users' request load appropriately. A second direction for further work would be to focus on evaluating the feasibility of new ideas by quickly developing prototypes based on the framework delivered by this project; some ideas have been given as examples in the previous section. A third direction would be to create an alternative version of the system to work with other new age library resource discovery systems alongside Summon. This would require that the public APIs of these systems conform to a uniform interface, thereby providing / exposing functionality similar to Summon's.

Chapter 7. References

1. Bains [2010], The paper that went to Library Committee proposing a resource discovery procurement [Access Link: http://www.lib.ed.ac.uk/about/libcom/PapersFeb10/paperE100210.pdf] [Cited 18 August 2011] [Internet Source]
2. Stone [2010], Searching Life, the Universe and Everything? The Implementation of Summon at the University of Huddersfield [Access Link: http://liber.library.uu.nl/publish/articles/000489/article.pdf] [Cited 18 August 2011] [Internet Source]
3. The Summon Beta Evaluation Team [2009], An Evaluation of Serials Solutions Summon As a Discovery Service for the Dartmouth College Library [Access Link: http://www.dartmouth.edu/~library/admin/docs/Summon_Report.pdf] [Cited 18 August 2011] [Internet Source]
4. Klein [2010], Hacking Summon [Access Link: http://journal.code4lib.org/articles/3655] [Cited 18 August 2011] [Internet Source]
5. Summon by Serials Solutions [Access Link: http://www.serialssolutions.com/discovery/summon/] [Cited 18 August 2011] [Internet Source]
6. University of Edinburgh's instance of the Summon service [Access Link: http://ed.summon.serialssolutions.com/] [Cited 18 August 2011] [Internet Source]
7. ElasticSearch [Access Link: http://www.elasticsearch.org/] [Cited 18 August 2011] [Internet Source]
8. Apache Solr [Access Link: http://lucene.apache.org/solr/] [Cited 18 August 2011] [Internet Source]
9. Apache Lucene [Access Link: http://lucene.apache.org/] [Cited 18 August 2011] [Internet Source]
10. University of Edinburgh, EASE system for web based university wide authentication [Access Link: http://www.ed.ac.uk/schools-departments/information-services/services/computing/computing-infrastructure/authentication-authorisation/ease/overview] [Cited 18 August 2011] [Internet Source]
11. Webfeat [Access Link: http://www.webfeat.org/] [Cited 18 August 2011] [Internet Source]
12. Delicious [Access Link: http://www.delicious.com/] [Cited 18 August 2011] [Internet Source]
13. Hull D, Pettifer SR, Kell DB [2008], Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web. PLoS Comput Biol 4(10): e1000204. doi:10.1371/journal.pcbi.1000204 [Access Link: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000204]
14. George Macgregor, Emma McCulloch [2006], "Collaborative tagging as a knowledge organisation and resource discovery tool", Library Review, Vol. 55, Iss: 5, pp. 291-300 [Access Link: http://dx.doi.org/10.1108/00242530610667558, http://www.emeraldinsight.com/journals.htm?issn=00242535&volume=55&issue=5&articleid=1554177&show=pdf]
15. J. Alfredo Sánchez, Adriana Arzamendi-Pétriz, and Omar Valdiviezo [2007], Induced tagging: promoting resource discovery and recommendation in digital libraries. In Proceedings of the 7th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '07). ACM, New York, NY, USA, pp. 396-397.
DOI=10.1145/1255175.1255252 [Access Link: http://doi.acm.org/10.1145/1255175.1255252, http://portal.acm.org/ft_gateway.cfm?id=1255252&type=pdf&CFID=15726141&CFTOKEN=28450628]
16. jQuery [Access Link: http://jquery.com/] [Cited 18 August 2011] [Internet Source]
17. Curl [Access Link: http://curl.haxx.se/] [Cited 18 August 2011] [Internet Source]
18. Summon API [Access Link: http://api.summon.serialssolutions.com/help/api/] [Cited 18 August 2011] [Internet Source]
19. Lynn D. Lampert, Katherine S. Dabbour [2008], Librarian Perspectives on Teaching Metasearch and Federated Search Technologies, Internet Reference Services Quarterly, Vol. 12, Iss. 3-4. DOI: 10.1300/J136v12n03_02 [Access Link: http://www.tandfonline.com/doi/abs/10.1300/J136v12n03_02]
20. Lauridsen, Helle and Stone, Graham [2009], The 21st century library: a whole new ball game? Serials, 22 (2), pp. 141-145. ISSN 0953-0460. DOI: 10.1629/22141 [Access Link: http://dx.doi.org/10.1629/22141, http://eprints.hud.ac.uk/5156/]
21. The Success of Web-Scale Discovery in Returning Net-Gen Users to the Library: The Summon™ Service in Academic Libraries [Access Link: http://www.libraryjournal.com/lj/tools/webcast/883883388/the_success_of_web-scale_discovery.html.csp] [Cited 18 August 2011] [Internet Source]
22. Help with Summon [Access Link: http://www.library.usyd.edu.au/catalogue/summon/summonfaq.html] [Cited 18 August 2011] [Internet Source]
23. An article on Summon at PSU [Access Link: http://www.libraries.psu.edu/psul/itech/services/summon.html] [Cited 12 August 2011] [Internet Source]
24. What is Summon? [Access Link: http://www.adelaide.edu.au/library/help/summonabout.html] [Cited 12 August 2011] [Internet Source]
25. Summon API [Access Link: http://api.summon.serialssolutions.com/help/api/] [Cited 12 August 2011] [Internet Source]
26. RESTful Web services: The basics [Access Link: https://www.ibm.com/developerworks/webservices/library/ws-restful/] [Cited 12 August 2011] [Internet Source]
27. JSON [Access Link: http://www.json.org/] [Cited 12 August 2011] [Internet Source]
28. ISO 9241-11:1998, Ergonomic requirements for office work with visual display terminals (VDTs) -- Part 11: Guidance on usability [Access Link: http://www.userfocus.co.uk/resources/iso9241/part11.html] [Cited 12 August 2011] [Internet Source]
29. Jakob Nielsen's website on usability [Access Link: http://www.useit.com/] [Cited 12 August 2011] [Internet Source]
30. Alan Dix, Janet Finlay, Gregory Abowd, and Russell Beale [1997], Human-Computer Interaction. Prentice-Hall, Inc., Upper Saddle River, NJ, USA [Access Link: http://dl.acm.org/citation.cfm?id=249491; HCI textbook at the University of Edinburgh: http://www.hcibook.com/] [Cited 12 August 2011] [Internet Source]
31. Steve Krug, Rocket Surgery Made Easy [Access Link: http://www.sensible.com/rsme.html] [Cited 12 August 2011] [Internet Source]
32. University of Edinburgh, Data Protection Policy [Access Link: http://www.recordsmanagement.ed.ac.uk/InfoStaff/DPstaff/UoEDPPolicy.htm] [Cited 12 August 2011] [Internet Source]
33. University of Edinburgh, Web Accessibility Guidelines [Access Link: http://www.projects.ed.ac.uk/methodologies/Standards/Accessibility/AccessGuide.htm] [Cited 12 August 2011] [Internet Source]
34. FPDF class for PHP applications [Access Link: http://www.fpdf.org] [Cited 12 August 2011] [Internet Source]
35. University of Edinburgh, Human-Computer Interaction, course website [Access Link: http://www.inf.ed.ac.uk/teaching/courses/hci/] [Cited 12 August 2011] [Internet Source]
36. Jakob Nielsen, Try to Be a Test User Sometime [Access Link: http://www.useit.com/alertbox/being-a-test-user.html] [Cited 12 August 2011] [Internet Source]

Chapter 8. Appendix
8.1. Tagus Screenshots

[Screenshots of the Tagus user interface]

8.2. Test Script

MSc Project :: Tagus :: Implementing a 'Del.icio.us' like system in an academic library discovery environment
Tagus Testing :: User Testing :: Single Test
August 2011
Test Script (adapted from Rocket Surgery Made Easy © 2010 Steve Krug)

Clear the browsing history!!! Web browser minimised.

Hi, ___________. My name is Girish Ede and I'm going to be walking you through this testing session today. Before we begin, I have some information for you, and I'm going to read it to make sure that I cover everything. You probably already have a good idea of why you were asked here, but let me go over it again briefly. We're asking people to use Tagus to find out if it works as intended and to find out what you think about using it. The session will take no more than an hour.

The first thing I want to make clear right away is that we're testing Tagus, not you. You can't do anything wrong here. Please don't worry about making mistakes. There are no right or wrong answers. This is a proof of concept website :: it is not perfect, but your feedback is really going to help us.

As you use Tagus, I'm going to ask you to try to think out loud as much as possible. I'd like you to say what you're looking at, what you're trying to do, and what you're thinking; why you've decided to click on something. Tell me when you are confused or pleased that something has worked. This will be a big help to us. Also, please don't worry that you're going to say something you shouldn't. We're doing this testing to evaluate Tagus and help improve the Library services we offer, so we need to hear your honest reactions. If you have any questions when we're done, I'll try to answer them then. And if you need to take a break at any point, just let me know.

If you would, I'm going to ask you to sign a simple permission form for me. It just says that I have your permission to use the feedback you provide from the testing, which will be anonymised, and that the results will only be seen by the people working on the project.

Tell the participant that his / her data will be anonymised in your report.

Do you have any questions so far?

OK. Before we look at Tagus, I'd like to ask you just a few quick questions.

Do you use social networking websites? If yes, which ones do you use from the list below?
Facebook :: Facebook's photo tagging feature
Delicious :: Delicious's bookmark tagging feature
Flickr :: Flickr's photo tagging feature
Others :: any tagging feature you might have used earlier

What Library online resources do you use? (e.g., Searcher, the catalogue, databases, etc. at the University of Edinburgh) Yes / No. If no, are you familiar with any other similar tools?

I will now take you to a resource discovery service called Summon from Serials Solutions. Please take a couple of minutes to browse around the website. Feel free to click around and see what it does. What do you expect to do using Summon?

OK, great. We're done with the questions, and we can start looking at Tagus.

Maximize browser window: http://tagus-test.lib.ed.ac.uk/ (Tagus homepage).

First, I'm going to ask you to look at this page and tell me what you make of it: does anything stand out? What do you think you can do here? Tell me what you're thinking when you're looking around the page. You can move the mouse around if you want to, but don't click on anything yet.

Allow this to continue for three or four minutes, at most.

Thanks.
Now I'm going to ask you to try to complete some specific tasks using Tagus. I'm going to read each one out loud and give you a printed copy. There is no right or wrong answer: the task is complete when you are satisfied with your answer. It will help me if you can try to think out loud as much as possible as you go along. Tell me what you're thinking, what you think will happen when you click on links, what you like, what you dislike, if you expected something to happen, if you're pleased, displeased…. After each task I'll ask you to rate how satisfied you are with your result and how easy or difficult you found the task. This will help us in our evaluation.

Hand the participant the test plan sheet. Hand the participant the first scenario, and read it aloud. Allow the user to proceed until you don't feel like it's producing any value or the user becomes very frustrated. Repeat for each task or until time runs out. If time, ask for comment on the results screen.

Thanks, that was really helpful. Do you have any questions for me, now that we're done?

Thank them and show them out.

8.3. Test Plan

MSc Project :: Tagus :: Implementing a 'Del.icio.us' like system in an academic library discovery environment
Tagus Testing :: User Testing :: Single Test
Date: August 2011
Tester's copy
Participant ID:

Task 1: Adding a tag, after finding a resource

Part 1
Assume you are an undergraduate student at the university. Please login using the login / password combination given to you. Can you find the book "Computer Graphics" by "Francis S Hill"? Your professor had recommended this book for your course. Please add a public tag "Computerbook" to this book. Please add a personal annotation "to_read" to this book.

Part 2
Please repeat the activity by finding the following resources and adding the same tag and annotation to each of them:
Title: Fraunhofer Institute: building on a decade of computer graphics research, Author: Earnshaw
Title: 3-D Computer Animation, Author: Vince John

Did the participant complete this task successfully? Yes / No
On a scale of 1-5 where 1 is very difficult and 5 is very easy, please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy
Observations / comments:

Task 2: Finding a resource, given a public tag

You and your friend John had taken a course, "Software Engineering". However, being unwell, you could not attend today's class. The professor for the course has given a set of books and journals to read for the next assignment. Your friend searches for all the required books / journals in the website. He tags them publicly with "SEAssignment1". He then sends you this tag by SMS. Can you find the set of resources that the professor had suggested as reading material for the assignment?

Did the participant complete this task successfully? Yes / No
On a scale of 1-5, how satisfied are you with your result for this task?
Very unsatisfied 1 2 3 4 5 Very satisfied
On a scale of 1-5, please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy
Comments / observations:

Task 3: Removing a public tag / personal annotation

Part 1
You have a month to complete the first assignment for your course, "Software Engineering". Your professor has given a list of THREE books as required reading for the assignment and has marked THESE THREE with the public tag "Computerbook". Find them.

Part 2
Assume you have completed reading TWO books out of the THREE in the required reading list.
Can you remove your personal annotations from those books, which you had earlier marked as "to_read"?

Did the participant complete this task successfully? Yes / No
On a scale of 1-5, how satisfied are you with your result for this task?
Very unsatisfied 1 2 3 4 5 Very satisfied
On a scale of 1-5, please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy

**Please logout of Tagus now**

Observations / comments:

Task 4: Ease of learning, first usage vs second usage

**Please log back in to Tagus.**
Can you find the list of all the public tags you have added until now? Tip: one of the tags used previously was "computerbook".

Did the participant complete this task successfully? Yes / No
Please rate how satisfied you are with your results:
Very unsatisfied 1 2 3 4 5 Very satisfied
Please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy
Comments / observations:

Task 5: Ease of learning, community maintained tags

Part 1
You get a text message saying that some of your friends taking the same courses as you have also tagged some resources in the Digital Library. You've already used the tag "Computerbook" to tag useful resources. Assume that your friends also tagged some of their interesting findings with "Computerbook". Can you find a list of the books you or your friends tagged with "Computerbook"?

Part 2
Can you look around Tagus and find out how many resources YOU had tagged publicly as "Computerbook"?

Did the participant complete this task successfully? Yes / No
On a scale of 1-5, how satisfied are you with your results?
Very unsatisfied 1 2 3 4 5 Very satisfied
On a scale of 1-5, please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy
Observations / comments:

Task 6: Find resources based on others' tags and add your own tags to those resources

Part 1
Your friends, on the same course, have tagged some of their interesting findings with a public tag called "interesting_book". Can you find a list of all books that were tagged with "interesting_book"?

Part 2
You think that the "interesting_book" tag is a bit vague for all those resources. Can you now add your own public tag "Computerbook" to each of the results? You now think that these resources are more relevant to you as "computer books" rather than as "interesting books"!

Did the participant complete this task successfully? Yes / No
Please rate how satisfied you are with your results:
Very unsatisfied 1 2 3 4 5 Very satisfied
Please rate how you found this task:
Very difficult 1 2 3 4 5 Very easy
Comments / observations:

Task 7: Summon Only vs Summon + Tagus

Open a new browser window. Please visit the University of Edinburgh's installation of Summon :: http://ed.summon.serialssolutions.com/. You are a graduate student who is studying "Computer Graphics". Your professor for the course had asked you to refer to "3d And Multimedia On The Information Superhighway" by "Earnshaw". Please find this book using Summon.

Go back to Tagus. Please find this book using Tagus. Please tag it using an appropriate public tag in Tagus.

**Take away this sheet of paper from the user**

Assuming that you are now two months into the future, having forgotten the name of the book your professor suggested you read, how would you:
**Give the user the URL for Summon again**
a. find this book using Summon?
**Give the user the URL for Tagus again**
b. find this book using Tagus?
**Return this sheet of paper to the user**

As compared to the previous approach, do you find the second approach to be:
Very difficult 1 2 3 4 5 Very easy
Comments / observations:

Task 8: Reading Lists

Come back to Tagus. Now imagine you are the professor of the course "Computer Graphics". You want to compile a reading list for your course and then e-mail this list to your students. Members of Library staff have already tagged all the resources with a public tag called "CG101". Can you compile a reading list of these resources and save the list to your computer?

Did the participant complete this task successfully? Yes / No
Do you think you would use the exporting feature to send relevant materials to a class of students? Yes / No
Comments / observations:

Task 9: Customized lists for each user

You are a graduate student at the university. Your friends and you had, over the last semester, tagged several books you found in your searches with "for_holidays". Your friends want this list to be trimmed to a subset of around 10 books. Can you customize this list, export it as a file and send it to your friends?

Imagine you are a student: do you think this feature would be useful? Yes / No
Did the participant complete this task successfully? Yes / No
Comments / observations:

What features and cues in the user interface did you find useful in doing these tasks? What helped you do your task? What could have been better?
Did you think that the user interface was intuitive and easy to navigate? What confused you the most? Which task troubled you the most?
What did not seem obvious to you at first, but you understood as you started doing these tasks? Did you feel comfortable with Tagus, once you understood it?
Do you think this "tagging and annotating" feature is useful? Would you use it?
Would you prefer searching for a resource using its title and author every time you need to find it, OR searching for a resource once, tagging it, and then finding it using your tag?
Any other comments and feedback. Please list them here.

8.4. Access Details for the Location Containing the Actual Data Collected During Evaluation

I have already included the data collected, including feedback from users, as graphs and plain text in chapter 5 under the appropriate sections. Raw data sheets, collected "as they are", have been assembled at :: https://svn.ecdf.ed.ac.uk/repo/is/digitallibrary/Summon/tagus/RawDataSheets/. The main source code itself is available at :: https://svn.ecdf.ed.ac.uk/repo/is/digitallibrary/Summon/tagus/. All files and folders except RawDataSheets under the main "tagus" folder are part of the source code of the project. These can be downloaded to your local machine and imported as an Eclipse project [PHP, JavaScript]. The folder "RawDataSheets" has been added here for the sake of the student's dissertation. PLEASE do not remove this folder from here. Its size is only around 3.2 MB.

For access to this repository location, please contact the following people:
1. The supervisor of the project, Mr. Colin Watt
2. The usability expert of the project, Ms. Angela Laurins
3. Ms. Claire Knowles, Information Systems Developer at the Digital Library Office

Further contact details are available at :: http://www.ed.ac.uk/schools-departments/information-services/about/organisation/library-and-collections/who-we-are/staff-list.