IGroup: Presenting Web Image Search Results in Semantic Clusters

Shuo Wang, Feng Jing, Jibo He*, Qixing Du**, Lei Zhang
Microsoft Research Asia,
5F, Sigma Center, No.49, Zhichun Rd. Haidian Dist., Beijing 100080, P. R. China
{shuowang, fengjing, leizhang}@microsoft.com
*Department of Psychology, Peking University, Beijing 100871, P. R. China, jiboh@pku.edu.cn
** Tsinghua University, Haidian Dist., Beijing 100080, P. R. China, dqx05@mails.tsinghua.edu.cn
Problems in Web Image Search
ABSTRACT
Current web image search engines still rely on users typing
a textual description, that is, query words, for visual targets.
As the queries are often short, general, or even ambiguous, the
images in the resulting pages vary in content and style. Thus,
browsing these results is likely to be tedious,
frustrating, and unpredictable.
Though highly accessible, the relatively poor quality of
offerings is not surprising, since they reflect the randomness
and unevenness of the Web. The frequent irrelevancy of
results is also explicable, since the automated engines are
guessing at their images' visual subject content using
indirect textual clues [18].
IGroup, a proposed image search engine, addresses these
problems by presenting the results in semantic clusters. The
original result set is clustered into semantic groups, each with
a cluster name relevant to the user-typed query. Instead of
looking through the result pages or modifying queries,
IGroup users can narrow their findings to the sub-result
sets of interest with a navigational panel, where each cluster
(sub-result set) is listed with a cluster name and
representative thumbnails of the cluster.
Furthermore, the query formulation is problematic. There is
a natural gap between text description and visual
presentation. It could be hard to describe an image with a
proper query, even when the target is clear in mind [18].
Therefore, the results are often mixed up with undesired
images when the query is short (typically two or three
words [1]). These unrefined queries often lead to a large,
poor result set. Figure 1 shows that the results for “tiger”
mix images of “tiger woods” with images of the animal.
Similarly, general queries (like “Disney”) or ambiguous
queries (like “Apple”) also suffer from this problem.
In a substantial user study, we compared IGroup with a
general web image search engine, MSN, in terms of efficiency,
coverage, and satisfaction. Our tool shows significant
improvement on these criteria.
Author Keywords
Image search result clustering (ISRC), image search
interface, search result clustering (SRC), user test
ACM Classification Keywords
H5.m [Information interfaces and presentation]: Misc.
INTRODUCTION
Image search engines collect and index images from other
sites and attempt to give access to the wide range of images
available on the Internet. The existing services offered by
Google [15] and MSN [16] are typical examples. According
to [19], 12% of Google’s traffic comes from its image search.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise,
or republish, to post on servers or to redistribute to lists, requires prior
specific permission and/or a fee.
CHI 2007, April 28–May 3, 2007, San Jose, California, USA.
Copyright 2007 ACM 978-1-59593-593-9/07/0004...$5.00.
Figure 1. The result of “tiger” in MSN image search:
mixed with “tiger woods” and “tiger animal”.
In addition, showing the retrieved images as pages in a
scrolled list is not user-friendly [10]. The images presented
on the first page are not necessarily better than those in the
RELATED WORK
following pages in terms of their relevance to the query.
The list presentation is also insufficient when users want to
compare between different results of modified queries.
Automatically arranging a set of thumbnail images
according to their similarity is useful to designers,
especially when narrowing down to a desired subset [9].
It is also important to note that although large numbers of
results are reported, Google does not enable its users to
view more than 1,000 image results [19].
Labels may also be necessary to help the user understand the
structure of such an arrangement: a caption-based arrangement
helps to break down the set according to meaning, although its
usefulness depends on the level of detail in the available
captions [9].
Image Searching Behaviors and Needs
User studies have pointed out several important behaviors
and needs in image search, which further expose the
disadvantages of current web image search and suggest
possible improvements.
Several visual-based web image search result clustering
(ISRC) algorithms have recently been proposed in
academia. In [2, 7], the top result images were clustered
based on visual features so that images in the same cluster
are visually similar. Considering that global image features
cannot describe individual objects in the images precisely,
[11] proposed to use region-level image analysis. They
formalized the problem as salient image region pattern
extraction. According to the region patterns, images were
assigned to different clusters. Besides visual features,
link information was also considered for web image search
result clustering [4].
Difficulty in query formulation
Log analysis reveals that user queries are often too short
(generally two or three words) and too imprecise to express
users’ search needs [1]. Follow-up research further reveals
that it is hard for common web users to formulate proper
queries in image search [18].
Considering users’ difficulty in query formulation, it would
be worthwhile to offer query suggestions about images,
which can not only give users hints on other possible queries,
but also save effort by providing shortcuts to popular
queries.
However, the efficiency of these visual feature-based
approaches [2, 4, 7, 11] depends heavily on clustering
performance and on the quality of the representative images of
each cluster. As ISRC is an online process, clustering
hundreds of images using high-dimensional features is not
efficient enough for practical use.
Professional image search
Comparison studies of both professional and novice image
search engine users suggest that direct search with image
category labels may be more effective for image search.
Novice users prefer browsing through pages of image
results, while professional users search directly, formulating
more queries and navigating fewer pages [3]. Textual
category labels were emphasized as important elements in
professional image search [8].
On the other hand, some existing image search engines
attempt to assist image queries by offering text-based image
query suggestions. Picsearch [17] provides refinements of
the original query terms. For example, the suggested terms
for “superman” include “superman returns”, “superman
logo”, etc. In Ask.com [12], the same query gets
three categorized suggestions: (1) Narrow your search, e.g.,
“superman costume”; (2) Expand your search, e.g.,
“Supergirl”; and (3) Related names, e.g., “batman”. Although
they offer useful suggestions, those engines fail to take the
textual features to the next level: organizing the search
results into semantic clusters. This is exactly what we
explore with IGroup. Flickr.com [13] leverages tag
information to cluster the presentation of image search results.
However, this only works for images with accurate tags in its
own database, so it cannot be applied to general web
images as IGroup can. Furthermore, this approach requires
pairs of tags, which limits the output to a few clusters.
Contrast between professional and novice image search
engine users suggests the positive influence of image
category labels on results browsing.
Goals of the IGroup
The preceding analysis of problems in current web image search
engines, as well as of image searching behaviors and needs,
leads us to develop a new image search engine with image
clustering, IGroup. We expect IGroup to overcome
users’ difficulties in query formulation and to satisfy the needs
for both query suggestion and browsing by category labels in
image search. Therefore, we endeavor to endow IGroup
with the following features:
Cluster the original result set into sub-sets;
Provide an overview of the result set with
representative thumbnails and cluster names;
Specify the query by offering refined clusters;
Offer assistance with an easy-to-use interface.
IGROUP STRATEGIES
Beyond offering a small fraction of a large image search
result, IGroup gives the user an overview of the
collection of subsets (self-contained clusters). It also
allows users to browse each cluster, compare, and locate the
desired refinement within a few clicks. In addition, the number
of results accessible with IGroup is multiplied by the number
of clusters, which effectively expands the coverage.
Overview of the Result: Understanding the Big Picture
The phrases are ranked according to the salience score, and
the top-ranked phrases are taken as salient phrases. The
salient phrases are further merged according to their
corresponding documents.
The results of IGroup are organized structurally by
semantic clusters. An average of 10 clusters (up to 50
clusters for popular queries) gives the user an overview
of the contents of the results and guides users to a
self-contained result set within their chosen cluster. The first
impression of such a UI gives users more confidence and control
when selecting further or formulating a new query.
Merging and Pruning Cluster Names
Given the candidate cluster names, a merging and pruning
algorithm is used to obtain the final cluster names. First,
identical or similar candidates from different sources are
merged. Second, synonyms of “images”, e.g. “pictures”
or “photos”, are used to prune candidate cluster names
that would yield unhelpful clusters. Finally, the remaining
candidate cluster names are submitted as queries to an
image search engine, e.g. MSN image search [16], and the
number of resulting images is counted. Cluster names with
too many or too few resulting images are pruned further.
Each of the remaining cluster names corresponds to a cluster
that contains the images returned by the search engine using
the cluster name as the query.
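The three steps above can be sketched roughly as follows. This is an illustration only, not the implementation described in [5]: the function names, the `count_results` callback standing in for a query to an image search engine, the rule that a candidate containing a synonym of “images” is dropped, and the `min_hits`/`max_hits` thresholds are all assumptions, since the text gives no concrete values. Only exact duplicates (after normalization) are merged here; merging of merely similar names is left out.

```python
from typing import Callable, Iterable, List

# Assumed synonym list; the paper only names "pictures" and "photos" as examples.
IMAGE_SYNONYMS = {"image", "images", "picture", "pictures", "photo", "photos"}


def normalize(name: str) -> str:
    """Lower-case a candidate name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def merge_and_prune(
    candidates: Iterable[str],
    count_results: Callable[[str], int],  # hypothetical search-engine hit counter
    min_hits: int = 50,                   # hypothetical lower threshold
    max_hits: int = 100_000,              # hypothetical upper threshold
) -> List[str]:
    # Step 1: merge identical candidates coming from different sources.
    seen = set()
    merged = []
    for name in candidates:
        key = normalize(name)
        if key and key not in seen:
            seen.add(key)
            merged.append(key)

    # Step 2: drop candidates that contain a synonym of "images".
    kept = [n for n in merged if not IMAGE_SYNONYMS & set(n.split())]

    # Step 3: query each remaining name and prune by result count.
    return [n for n in kept if min_hits <= count_results(n) <= max_hits]
```

Passing the hit counter as a callback keeps the sketch independent of any particular search engine API.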
Refine the Search with Semantic Clusters
Within one cluster, the retrieved images are all related to
the same cluster name, which leads to visual resemblance.
The homogeneity of the cluster makes selection easier for
users and provides wider coverage of search goals.
The cluster names also serve as informative labels for the
images, explaining why the retrieved results are related to
the query. This is especially helpful when the images
are ambiguous or beyond the knowledge of the users.
The clustering of the retrieved images makes it feasible to
organize the results in a highly relevant structure with little
redundancy.
Considering that some resulting images may not belong to
any cluster, an “others” cluster is used to contain these
images. Currently, the clusters except the “others” cluster
are ranked according to the number of images they contain.
The “others” cluster is ranked at the bottom. Once the
click-through data is available, the clusters could be more
rationally ranked according to the number of times they
have been clicked.
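A minimal sketch of this ranking rule is given below. The function name and the dictionary inputs are assumptions made for illustration; the optional `click_counts` argument models the click-through ranking that the text describes as a future possibility.

```python
from typing import Dict, List, Optional


def rank_clusters(
    cluster_sizes: Dict[str, int],
    click_counts: Optional[Dict[str, int]] = None,
) -> List[str]:
    """Order cluster names for display, keeping "others" last.

    Without click data, clusters are ranked by the number of images they
    contain; with click data, by the number of times they were clicked.
    """
    scores = click_counts if click_counts is not None else cluster_sizes
    named = [name for name in cluster_sizes if name != "others"]
    ranked = sorted(named, key=lambda name: scores.get(name, 0), reverse=True)
    if "others" in cluster_sizes:
        ranked.append("others")  # always at the bottom, whatever its size
    return ranked
```

For example, `rank_clusters({"others": 900, "bengal tiger": 300, "tiger woods": 500})` places "tiger woods" first and "others" last even though "others" holds the most images.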
Wider Coverage
The general web search engine only offers a small portion
of the result [18], so the desired images might not
show up. For example, the “white tiger”, a minor subset
in the result of “tiger”, is buried under dominant subsets
like “tiger woods.” In IGroup, users can get
access to the “white tiger” cluster, whose result set is
generated from a separate search with the term “white
tiger”, as in Figure 3. Likewise, each of the other clusters
offers the result for its own cluster name. IGroup enables
users to browse beyond the 1,000-image scope of the
original query by aggregating the search results of all the
cluster names.
IMAGE SEARCH RESULT CLUSTERING ALGORITHM
In this section, our web image search result clustering
process is briefly introduced (Figure 2). For more details of
the algorithm including efficiency issue, please refer to [5].
Learning Candidate Image Cluster Names
Figure 2. Flow chart of the image search result
clustering algorithm.
The web is an interesting domain for image search, because
the images tend to have abundant text that could be used as
metadata, such as the textual captions assigned to the images.
Given the original query, the candidate image cluster names
are generated from the clustering results of MSN web page
search [16]. The clustering algorithm we used was proposed
in [6]. It transforms the clustering problem into a salient
phrase ranking problem.
USER INTERFACE
Below we explain the interface design and main features of
IGroup through a query example.
Navigational Panel
Suppose there is a creative writing assignment: writing an
essay with clip art for the Year of the Tiger. This leaves
the choice of content and figures open.
Given a query and the ranked list of search results, it first
parses the whole list of titles and snippets, extracts all
possible phrases (n-grams) from the contents, and calculates
several properties for each phrase such as phrase frequencies,
document frequencies, phrase length, etc. A regression
learning model from previous training data is then applied to
combine these properties into a single salience score.
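The phrase-ranking step just described can be illustrated with a small sketch. It is not the algorithm of [6]: the whitespace tokenizer, the three features, and the linear `weights` dictionary (which stands in for the trained regression model) are simplifying assumptions chosen only to make the idea concrete.

```python
from collections import Counter
from typing import Dict, Iterator, List


def ngrams(tokens: List[str], max_n: int = 3) -> Iterator[str]:
    """Yield every phrase of 1..max_n consecutive tokens."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def salient_phrases(
    snippets: List[str],
    weights: Dict[str, float],  # hypothetical stand-in for the learned model
    top_k: int = 5,
    max_n: int = 3,
) -> List[str]:
    phrase_freq: Counter = Counter()  # total occurrences of each phrase
    doc_freq: Counter = Counter()     # number of snippets containing it
    for text in snippets:
        phrases = list(ngrams(text.lower().split(), max_n))
        phrase_freq.update(phrases)
        doc_freq.update(set(phrases))

    scores = {}
    for phrase in phrase_freq:
        features = {
            "phrase_freq": phrase_freq[phrase],
            "doc_freq": doc_freq[phrase],
            "length": len(phrase.split()),
        }
        # Combine the properties into a single salience score.
        scores[phrase] = sum(weights[k] * v for k, v in features.items())

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_k]
```

In practice the weights would come from regression on labeled training data rather than being set by hand.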
After typing a query word such as “tiger” into IGroup, the user
is presented with an interface as in Figure 3.
Compared to other general web image search engines, the
most prominent difference is the navigational panel on the
left. It consists of three elements:
3
1. Cluster results. The image cluster information is
shown at the top of the navigational panel: “Results in
11 clusters” (Figure 3), indicating that each of the
clusters is a subset of the whole result. To avoid
overwhelming users, we cap the number of clusters
at 21. The clusters are currently ranked by
the number of images they contain.
with the augmented navigational panel. Below are the
dimensions of the major UI elements for reference.
Navigation panel: 260 x 767 pixels, 20% of the width of
the screen. Cluster thumbnail: 60 x 60 pixels. The panel
presents 8 clusters on one screen; if there are 8 or fewer
clusters, the scroll bar is not shown.
2. Cluster names. As explained in the previous section,
the names are generated from SRC. The names listed
are “tiger woods” (a golfer), “bengal tiger” (a kind of
tiger), “crouching tiger” (an award-winning
film) and “Mac OS” (“Tiger” is the name of a Mac OS X
release), etc. The last cluster, the eleventh in this case, is
called “others” and contains images that do not belong to
any of the other ten clusters. These cluster names follow the
concept of folksonomic tags [20]. Cluster name tags
can be found across all the views, including the title
bar as a “bread crumb”, and the image detail view as a
shortcut. Clicking on the tag “tiger woods” leads to
the corresponding cluster view, as in Figure 4.
USER STUDY
We carried out a user study comparing two web image
search engines, IGroup and MSN*. The focus of the study
is the usefulness of presenting web image search results in
semantic clusters, as IGroup does. MSN was chosen as a
state-of-the-art representative of web image search engines.
Because MSN and IGroup share exactly the same data
source, the user study has experimental control over the
search results.
The following three main hypotheses, covering effort
saving, performance efficiency, and user experience, were
tested to evaluate the usability of IGroup’s interface
design, which presents web image search results in semantic
clusters.
3. Representative thumbnails. The top three result
images in each cluster were assumed to be the most
representative. They were listed under cluster names as
a preview of the cluster. Clicking on any of the
thumbnails shows the cluster view to which it belongs.
Hypotheses
H1. IGroup saves the efforts of the users compared with
the general web image search engine.
With the comprehensive terms and explanatory thumbnails,
the user can get an overview of tens of “tiger”-related
themes with rich visual aids, and then find the clusters
of interest within a few clicks.
Although the search effort of IGroup includes an additional
term, the number of cluster names clicked, we assume that
IGroup saves users effort compared with
MSN for the following two reasons:
General View vs. Cluster View
The general view refers to the initial results IGroup
presented right after performing a search, without choosing
any of the clusters in the navigational panel (Figure 3).
(1) The suggested cluster names spare users the effort of
typing queries, which is more time-consuming than just
clicking the cluster names;
Cluster view refers to the result set within a cluster (Figure
4). Two ways of accessing the cluster view are clicking on
any of the three cluster thumbnails or the cluster name tag.
(2) Relevant images are organized in semantic clusters with
cluster names in IGroup, which ensures similarity or
homogeneity within clusters. IGroup users can find
enough satisfactory images without the effort of
browsing through several MSN pages.
H2. IGroup offers larger image coverage and more
closely fits into the search intention.
Because the cluster view is retrieved by searching the MSN
image search engine with the cluster name, the
result is the same as the outcome of using the cluster name to
search in MSN in terms of content and ranking.
H3. IGroup provides more satisfactory user experience.
However, the ranking in the general view is different. It is
populated by the most representative image of each cluster,
followed by the second most representative image of each
cluster, and so forth. For example, in Figure 3, images 1 to
11 are the first images of clusters 1 to 11; images 12 to 22
are the second images of each cluster, and so forth.
Therefore, the result is more diversified and representative:
the view will not be monopolized by a few popular results.
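This round-robin ordering can be sketched in a few lines. The function name and the list-of-lists input are assumptions for illustration; each inner list is taken to hold one cluster’s images, already sorted from most to least representative.

```python
from itertools import zip_longest
from typing import List, Sequence


def general_view(clusters: Sequence[Sequence[str]]) -> List[str]:
    """Interleave clusters: first image of each, then second of each, etc."""
    missing = object()  # sentinel marking clusters that have run out of images
    ordered: List[str] = []
    for rank_row in zip_longest(*clusters, fillvalue=missing):
        ordered.extend(img for img in rank_row if img is not missing)
    return ordered
```

With 11 clusters, positions 1 to 11 of the returned list hold the first image of each cluster and positions 12 to 22 the second image of each, matching the description above. Clusters with fewer images simply drop out of later rounds.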
The semantic clustering of IGroup avoids the randomness
of related images and thus improves the perceived relevance of
the search results. The additional navigational panel makes
IGroup more efficient and easier to use, and allows for more
user satisfaction.
Resolution Specifics
* The MSN image search traffic was shifted to live.com shortly
after the user study. The result set and clustering result of IGroup
may slightly vary due to this change.
The screen design is implemented on the resolution of 1280
X 1024 pixels. We are careful not to overload the interface
Figure 3. The screen of IGroup: the general view.
Figure 4. The screen of IGroup: the cluster view.
Experimental Design
by users. They may retrieve a large number of similar
images for the satisfaction of quality, or the similarity of the
retrieved images with their visual sample.
This experiment uses a 2 x 2 within-group design, in which
the same group of participants took part in all experimental
conditions.
(2) Theme-related search (B-1 and B-2): Search for images
under a general theme or topic, such as images related
to “Disney” and “Michael Jordan”, for example Walt Disney,
Disneyland, Disney princess, Disney logo, Disney cruise,
Michael Jordan’s wallpaper, Air Jordan, Jordan cologne,
Jordan collectibles, and Jordan dunk (Figure 5). Users were
concerned with both the coverage and the number of relevant
images, in order to get the full scope of the information.
The independent variables were Search Tool Type (IGroup
vs. MSN) and Task Type (theme-related search vs. target
image search).
The dependent variables were performance measures, which
included the number and category coverage of images retrieved
and Search Effort (an indicator of search effort, including
typing in new keywords and clicking on the page links or
group names). The clustering accuracy of IGroup and the
user experience of IGroup and MSN were also measured
with a post-task questionnaire.
Participants
Twenty-four participants, 18 males and 6 females,
were recruited for the user study from the student interns at
the Microsoft Research Asia lab. The age of the subjects
ranged from 22 to 26 with an average of 24. All participants
were web image search engine users with at least six months
of experience. The participants were rewarded with a
portable alarm clock for their cooperation in the
approximately 40-minute user study.
The experiment also used a Latin square design,
counterbalancing the sequence of systems and
tasks to minimize possible practice
effects and order effects.
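One possible way to generate such counterbalanced orders is sketched below. The paper does not state the exact assignment scheme, so the cyclic Latin square, the alternation of system order between blocks of participants, and all names used here are assumptions for illustration.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def latin_square(items: Sequence[str]) -> List[List[str]]:
    """Cyclic Latin square: row r is the item list rotated left by r."""
    n = len(items)
    return [[items[(r + c) % n] for c in range(n)] for r in range(n)]


def assign_orders(
    participants: Sequence[Hashable],
    tasks: Sequence[str],
    systems: Sequence[str],
) -> Dict[Hashable, Tuple[List[str], List[str]]]:
    """Map each participant to (system order, task order)."""
    square = latin_square(tasks)
    plan = {}
    for idx, person in enumerate(participants):
        task_order = square[idx % len(square)]
        # Alternate the system order between successive blocks of
        # len(square) participants so it is crossed with every task order.
        block = idx // len(square)
        system_order = list(systems) if block % 2 == 0 else list(reversed(systems))
        plan[person] = (system_order, task_order)
    return plan
```

With 24 participants, four tasks, and two systems, this scheme assigns each of the eight (system order, task order) combinations to exactly three participants.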
Tasks
Target image search and theme related search tasks were
designed to simulate actual web image search practice:
Apparatus
The PCs used in the test were Dell desktops (2 GHz)
equipped with LCDs of 1280 x 1024 pixels in resolution.
An automated test program was implemented to run the
experiment. The task sequence of all 24 subjects was
configured in an XML file. User behavioral data, including
clicks, clicked objects, and queries, were all tracked in a
log file.
A-1: Rice.
Procedure
After a demonstration and description of the search
tasks, participants were allowed to familiarize themselves
with the search engine to be tested by performing a series
of practice searches with the sample queries provided by us
or any other queries they were interested in.
A-2: Pentagon.
Figure 5. The sample target images of search task.
After a three-minute system familiarization process,
participants completed the four experiment tasks (including
two theme-related searches and two target image searches
with a counterbalance of task sequences) with the search
engine they had just tried. The task direction screen was
presented to the participants before each task and printed
sample images and task directions were also available for
their reference during the test.
B-1: Disney.
All the tasks were timed and limited to three minutes. The
experiment program shifted automatically to the next task
when time for the working task was over.
B-2: Michael Jordan.
The evaluation procedure for IGroup and MSN was
exactly the same as described above, and the sequence of
IGroup and MSN was also counterbalanced.
(1) Target image search (A-1 and A-2): Search for images of
a specific target with a clear visual sample in mind, such as
“rice plant” (Figure 5, A-1) and “pentagon shape” (Figure
5, A-2). The quality of the target images was emphasized
After finishing all search tasks, the participants were
asked to evaluate the clustering accuracy, comprehension
of cluster names, user satisfaction, and efficiency of both
IGroup and MSN with a 5-point Likert post-task
questionnaire.
Measures
To test the efficiency and user experience of presenting
retrieved images in semantic clusters,
several performance measures and self-report results were
collected through URL recording, experiment tasks, and
questionnaires. The performance measures included time of
completion, number and coverage of retrieved images, and
search effort, that is, the number of clicks on page
links and the number of query inputs. The self-report results
were measured on a five-point Likert scale, focusing on
clustering accuracy and user experience, such as user
satisfaction and ease of learning and use.
RESULTS
Measure  Systems  Mean    SD     F (1,173)  p
Effort   IGroup   12.033  0.670  149.099    0.000
         MSN      23.712  0.682
Link     IGroup   10.457  0.568  144.367    0.000
         MSN      20.192  0.578
Query    IGroup    1.576  0.191   50.793    0.000
         MSN       3.520  0.195
Table 1. Post Hoc test results of comparison of search
effort.
As shown in Figure 6 and the Post Hoc test (Table 1), users
spent less effort, clicked fewer page links, and tried fewer
queries with IGroup than with MSN, fully supporting the first
hypothesis that IGroup is effort saving.
Search Effort (E)
The search effort in our user study was defined as the
expenditure of physical or mental energy to accomplish
a search, including the formation and input of a query and the
clicking of page links or IGroup clusters.
Performance Efficiency
To test the above-mentioned hypothesis of wider
coverage and more images fitting the search intention with
IGroup, paired-sample t-tests were applied in SPSS. The
statistical results showed a significant difference between
MSN and IGroup in both the number and coverage of the
retrieved images (Figure 7).
The search efforts were calculated with the following
formulas for MSN and IGroup, respectively.
$E_{\text{MSN}} = N_{\text{query}} + N_{\text{page links clicked}}$
$E_{\text{IGroup}} = N_{\text{query}} + N_{\text{page links clicked}} + N_{\text{cluster name clicked}}$
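As a small worked example of these two formulas, the sketch below computes the effort score from per-session counts. The `SessionLog` record and its field names are hypothetical; the study’s actual log format is not described beyond clicks, clicked objects, and queries.

```python
from dataclasses import dataclass


@dataclass
class SessionLog:
    """Hypothetical per-task counts extracted from the interaction log."""
    queries: int
    page_links_clicked: int
    cluster_names_clicked: int = 0  # always 0 for MSN, which has no clusters


def search_effort(log: SessionLog) -> int:
    """E = N_query + N_page_links_clicked (+ N_cluster_name_clicked for IGroup)."""
    return log.queries + log.page_links_clicked + log.cluster_names_clicked
```

For instance, an MSN session with 2 queries and 5 page-link clicks yields an effort of 7, while an IGroup session with 1 query, 3 page-link clicks, and 4 cluster-name clicks yields 8.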
Theme-Related Search Task
The coverage of the images retrieved by IGroup (M=4.50,
SD=0.80) was significantly larger than that of MSN
(M=3.77, SD=1.04), t (47) = 4.216, p< 0.05;
(Note: N stands for the number of the items named in the subscript.)
MANOVA tests are applied to compare the search efforts of
MSN and IGroup.
The number of images retrieved by IGroup (M=11.79,
SD=2.58) was also larger than that of MSN (M= 9.83,
SD=2.96), t (47) = 4.079, p< 0.05.
The MANOVA tests show a significant main effect of system
type (F (2,172) =74.484, p<.05) and a significant main effect
of task (F (6,346) =7.655, p<.05). There is no significant
interaction effect (F (6,346) =1.941, p>.05), which implies
that the performance difference between MSN and IGroup is
independent of the tasks.
Figure 7. Comparison of the Number and Coverage of
retrieved images by IGroup and MSN. (A) Target
image search tasks; (B) Theme-related tasks.
Figure 6. Comparison of search efforts.
A Post Hoc test of the difference between IGroup and MSN
was also applied; the detailed statistical comparisons are
shown in Table 1.
Target-Image Search Task
Users with IGroup (M=13.90, SD=5.70) also retrieved
a significantly higher number of images than users with MSN
(M=12.33, SD=6.70), t (47) = 4.079, p< 0.05.
Please also note that in both the theme-related search task
and the target-image search task, the standard deviations (SD)
of users with IGroup are all smaller than those of MSN,
indicating a tendency of IGroup to bring the performance
of novices closer to that of experienced image search
engine users.
mainly three categories as follows:
Clustering: 14 of the 24 participants praised the
clustering of the retrieved images as an advantage of
IGroup; as one of them put it, “Classification is very
convenient for user to choose.” Conversely, eight
participants regarded the lack of clustering as a
disadvantage of MSN.
Query formulation: four participants believed IGroup
helps with query formulation, and six regretted the lack of
such help in MSN: “Users sometimes do not know how to
express their needs in query.”
Relevance of retrieved images: three participants thought
IGroup provides more relevant images, while irrelevant
images make up a significant portion of the MSN results.
User Feedback
Observed Behaviors
Several typical user behavior patterns were observed
during the test, revealing interesting findings.
Experienced image search engine users tend to
change query terms frequently to improve the result.
Even so, their positive ratings in the questionnaire do
not differ significantly, which indicates
that the UI is also well accepted among experienced
users.
The image clusters are currently ranked by the
size of their result sets. Some participants suggested
additional ranking methods, such as alphabetical order
and click rate.
Some participants wanted the clustering to continue
to the next level. We are cautious about the complexity and
inconsistency this may introduce to the interaction.
Figure 8. User feedback of IGroup and MSN.
Besides the objective search effort and performance
measures above, we conducted a post-task questionnaire with a
5-point Likert scale to evaluate subjective user experience,
including relevance of results, satisfaction, ease of use, and
efficiency. The user experience of IGroup and MSN was
compared using paired-sample t-tests,
as shown below:
Relevance of results
With respect to the relevance of the results, IGroup (M=4.13,
SD=0.85) was rated higher than MSN (M=3.54, SD=1.02),
t (23) = 3.685, p< 0.05.
Satisfaction
Users were more satisfied with IGroup (M= 4.38, SD=0.58)
than with MSN (M= 2.88, SD=0.99), t (23) = 6.660, p< 0.05. The
contrast in satisfaction between IGroup and MSN is quite
clear: IGroup was rated highly, while MSN received a
slightly negative evaluation.
Easy to use
IGroup (M= 4.46, SD=0.66) was rated highly for its
usability, in sharp contrast with the negative evaluation of
MSN (M= 2.67, SD= 1.09), t (23) = 5.731, p< 0.05.
Efficiency
The evaluation of efficiency also favored IGroup (M= 4.46,
SD=0.59) compared with the approximately neutral score of
MSN (M=2.92, SD=0.93), t (23) = 6.407, p< 0.05.
The consistently smaller standard deviation of IGroup compared
with MSN also implies that the users agreed more
with one another in evaluating IGroup than in evaluating MSN.
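For readers who want to reproduce this kind of comparison, a paired-sample t statistic can be computed as sketched below. The study used SPSS; this standard-library version is only an illustration, and any ratings passed to it in examples are made up, not data from the user study.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for paired samples a and b.

    t = mean(d) / (sd(d) / sqrt(n)), where d holds the per-participant
    differences. Raises ZeroDivisionError if all differences are equal.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_d / (sd_d / math.sqrt(len(diffs)))
    return t, len(diffs) - 1
```

With 24 participants each rating both systems, the degrees of freedom are 23, which matches the t (23) values reported above.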
UI Comprehension
Misunderstandings were rare in the test. The one exception
was a participant who took the navigational
panel for an image search history, like the history list of
Internet Explorer. In general, the function of IGroup was
well understood by the participants.
Positive Feedback
The cluster names are useful for the search tasks:
There was a strong learning effect when participants
searched with IGroup first. Users tended to reuse IGroup
cluster names as search keywords in the MSN search tasks,
such as “pentagon shape”, “wild rice” and “walt disney”.
IGroup provides useful query refinements in those cases.
A typical behavioral difference between IGroup and MSN
appeared when the result was not
satisfactory. With IGroup, participants tried different
clusters by clicking the cluster thumbnails; with MSN,
they changed the query words instead.
Once they found the “wild rice” and “pentagon shape” clusters
during tasks A-1 and A-2, participants ignored the other
clusters.
The answers to the open-ended questions after the
questionnaires, about the advantages, disadvantages, and
possible improvements of both IGroup and MSN, also favor
IGroup and are consistent with our expectations based on our
algorithms and user interface design. The 24 participants’
answers fall into
Negative Feedback
When the representative thumbnails fail to represent
the majority of the images in their groups, the user’s task
becomes harder. Several users missed the “Pentagon
content on the Web: organizing the result set of video,
audio with labeled clusters.
shape” cluster, as only one geometrical pentagon
appeared among its three thumbnails. Improving the
thumbnail quality would help effective browsing.
The left navigational panel disappears if the query is
refined enough. Some participants reported that this change
was unexpected.
One participant reported that the navigational panel caused
extra visual load. The maximum number of clusters should be
considered.
CONCLUSION & FUTURE WORK
To assist web image users with a practical and effective
image clustering tool, we leverage the SRC algorithm to
generate a set of semantic clusters and aggregate the
separate image search results for those terms.
The interface presents the results “breadth first” rather
than “depth first”, so all the groups are visible on the first
page (general view) without scrolling.
DISCUSSION
In the current implementation, IGroup performs ISRC only
once on the query word. The resulting cluster names are not
clustered further with the same treatment, so a
hierarchical structure is not offered yet. For example, as
one of the clusters of “tiger”, the results of “tiger woods”
are not clustered further. Though it could be an interesting
subject to explore, we are careful not to introduce extra
complexity to the system.
The user study showed significant improvements in image
search efficiency, coverage and satisfaction in the given
tasks. Participants preferred browsing image search results
with these visual, labeled clusters over traditional list
views of image search results. The navigational panel and
the cluster view were well accepted.
In the future, once the click-through data are available, the
clusters could be more rationally ranked according to the
number of times they have been clicked. We will also
improve the algorithm to produce high-quality cluster
names and representative thumbnails to enhance the
browsing experience.
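As an illustration of the click-based ranking idea, the following is a
minimal sketch and not part of the IGroup implementation. It assumes a
hypothetical click log that simply lists the names of clicked clusters;
the function name, the log format and the example cluster names other
than “tiger woods” are assumptions made for this sketch.

```python
from collections import Counter


def rank_clusters_by_clicks(cluster_names, click_log):
    """Return cluster names ordered by click count, most clicked first.

    cluster_names: names in their current display order.
    click_log: iterable of clicked cluster names (hypothetical format).
    Ties keep the current display order because sorted() is stable.
    """
    known = set(cluster_names)
    # Count only clicks that refer to clusters currently on display.
    clicks = Counter(name for name in click_log if name in known)
    return sorted(cluster_names, key=lambda name: -clicks[name])


clusters = ["tiger woods", "white tiger", "tiger shark"]
log = ["white tiger", "tiger shark", "white tiger", "bengal tiger"]
print(rank_clusters_by_clicks(clusters, log))
# -> ['white tiger', 'tiger shark', 'tiger woods']
```

Clusters that were never clicked keep their relative order at the end
of the list, so the ranking degrades gracefully while the log is small.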
One limitation of IGroup is that not all the retrieved images
are clustered. The final cluster, “others”, contains the
images which cannot be assigned to any other cluster.
However, this has two benefits: (1) the algorithm can be
more efficient by compromising on precision; (2) it avoids
overwhelming users with too many clusters.
IGroup is available for use at: http://igroup.msra.cn.
Another flaw is that semantically similar clusters present
similar or duplicated images. For example, among the
image clusters for the query “Britney Spears”, “Britney poster”
and “Britney wallpaper” share many identical images. With
visual content-based analysis and URL de-duplication,
similar clusters could be reasonably merged.
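The URL-based part of this idea can be sketched as follows. This is a
minimal illustration under stated assumptions, not the IGroup
implementation: it covers only URL overlap (no visual content
analysis), it measures overlap with the Jaccard coefficient, and the
0.5 threshold, the function names, the example URLs and the “Britney
concert” cluster are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_similar_clusters(clusters, threshold=0.5):
    """Greedily merge clusters whose image URL sets overlap strongly.

    clusters: dict mapping cluster name -> list of image URLs.
    threshold: assumed minimum Jaccard overlap for merging.
    Returns a list of {"names": [...], "urls": set(...)} entries.
    """
    merged = []
    for name, urls in clusters.items():
        url_set = set(urls)  # de-duplicate URLs within the cluster
        for entry in merged:
            if jaccard(entry["urls"], url_set) >= threshold:
                entry["names"].append(name)
                entry["urls"] |= url_set
                break
        else:
            # No existing group was similar enough: start a new one.
            merged.append({"names": [name], "urls": url_set})
    return merged


example = {
    "Britney poster": ["http://example.com/1.jpg", "http://example.com/2.jpg",
                       "http://example.com/3.jpg"],
    "Britney wallpaper": ["http://example.com/2.jpg", "http://example.com/3.jpg",
                          "http://example.com/4.jpg"],
    "Britney concert": ["http://example.com/5.jpg", "http://example.com/6.jpg"],
}
for group in merge_similar_clusters(example):
    print(group["names"], len(group["urls"]))
```

In this example the first two clusters share two of four distinct URLs,
which reaches the assumed threshold, so they are merged into one group
while the third stays separate.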
ACKNOWLEDGMENTS
This work has been greatly supported by the MSRA Web
Search and Data Mining group. The authors would like to
thank Like Liu, Dwight Daniel and all who have tested
IGroup and helped with their comments.
Because the grouping in IGroup is based on textual features,
it could easily be integrated with other classifiers, such as
content, color and resolution. Another advantage of this
approach is that it could also be applied to other multimedia
content on the Web: organizing the result sets of video and
audio with labeled clusters.
REFERENCES
1. B. J. Jansen, A. Spink, J. Bateman and T. Saracevic,
Real life information retrieval: a study of user queries on
the web, ACM SIGIR Forum, 32 (1998), 5-17.
2. B. Luo, X. G. Wang, and X. O. Tang, A World Wide
Web Based Image Search Engine Using Text and Image
Content Features, Proc. of IS&T/SPIE Electronic
Imaging 2003, Internet Imaging IV.
3. C. O. Frost, B. Taylor, A. Noakes, S. Markel, D. Torres,
and K. M. Drabenstott. Browse and search patterns in a
digital image database. Information Retrieval, 1(4):
287-313, 2000.
4. D. Cai, X. F. He, Z. W. Li, W. Y. Ma and J. R. Wen,
Hierarchical Clustering of WWW Image Search Results
Using Visual, Textual and Link Analysis, Proceedings of
the 12th annual ACM international conference on
Multimedia, 952-959.
5. Feng Jing, et al, IGroup: Web Image Search Results
Clustering, Proc. ACM Multimedia 2006.
6. H. J. Zeng, Q. C. He, Z. Chen, W. Y. Ma and J. W. Ma,
Learning to cluster web search results, Proc. of the 27th
annual international ACM SIGIR conference, 210-217.
7. H. Liu, X. Xie, X. O. Tang, Z. W. Li and W. Y. Ma,
Effective Browsing of Web Image Search Results,
Proceedings of the 6th ACM SIGMM international
workshop on Multimedia information retrieval, 84-90.
8. K. P. Yee, K. Swearingen, K. Li, and M. Hearst. Faceted
metadata for image search and browsing. In Proceedings
of ACM SIGCHI 2003, 401-408.
9. K. Rodden, W. Basalaj, D. Sinclair, and K. R. Wood.
Does organization by similarity assist image browsing?
In Proceedings of ACM SIGCHI 2001, 190-197.
10. S. Mukherjea, K. Hirata, and Y. Hara. Using clustering
and visualization for refining the results of a WWW
image search engine. In Proc. of ACM NPIV ’98, 29-35.
11. X. J. Wang, W. Y. Ma, Q. C. He and X. Li, Grouping
Web Image Search Result, Proceedings of the 12th
annual ACM international conference on Multimedia,
436-439.
12. Ask.com image search. http://images.ask.com
13. Flickr photo search by tags.
http://www.flickr.com/photos/tags/microsoft/clusters
14. Google. http://www.google.com/
15. Google Image Search. http://images.google.com
16. MSN Search. http://search.msn.com/
17. Picsearch. http://www.picsearch.com/
18. TASI image search engine review.
http://www.tasi.ac.uk/resources/searchengines.html
19. Using images to increase your search engine rankings.
http://www.thumbshots.org/article.pxf?artid=99
20. Wikipedia: Tag(metadata).
http://en.wikipedia.org/wiki/Tag_%28metadata%29